Java Lambdas and Streams
Review of Generics
Why Use Generics?
Generics allow us to use types (interfaces and classes) as parameters when defining classes, interfaces and methods. Much like formal parameters in method declarations, type parameters allow us to reuse the same code with different inputs. The difference is that the inputs (arguments) to formal parameters are values, while the inputs to type parameters are types.
Code that uses generics has many benefits over non-generic code:
Gives stronger type checking at compile time. A Java compiler applies strong type checking to generic code and issues errors if the code violates type safety. Fixing compile-time errors is much easier than fixing runtime errors.
Enables programmers to implement generic algorithms. By using generics, programmers can implement reusable, customisable, type-safe algorithms that can work with collections of different types in a pluggable manner.
Eliminates typecasts. The following code snippet without generics requires casting:
List list = new ArrayList();
list.add("Hello");
String s = (String) list.get(0); // casting required
When rewritten using generics, no casting is required:
List<String> list = new ArrayList<String>();
list.add("Hello");
String s = list.get(0); // no casting
Knowing how to use and apply generics is extremely important, especially when using Java 8 and above. Even novice Java programmers need to know how to use classes that support generics. For example, we can’t get the benefit of type checking and type safety when using collections such as List, Map, and Set without knowing how to use generics.
List<Employee> emps = new ArrayList<Employee>();
Map<String, Employee> empTable = new HashMap<String, Employee>();
Intermediate Java developers should be able to define classes or methods that support generics. In Java 7 and earlier, this was mostly left to advanced developers, but it is much more common from Java 8 onwards, because lambda expressions and stream processing rely heavily on generics. Both generics and lambda expressions aim to make code safer and more reusable, which is a goal that all programmers share.
Syntax for Generic Classes and Methods
When declaring a class or method that supports generics, we need to define the type parameter section, which is delimited by angle brackets (<>). It either follows the class name or, for a generic method, precedes the method’s return type. It specifies the type parameters (also called type variables) T1, T2, …, Tn. These identifiers stand for types, not for variables.
The following are some example code snippets that show type parameter usage:
// generic interfaces and classes
public interface Iterator<E> { ... }
public interface Map<K,V> { ... }
// generic (parameterized) classes and their methods
public class Stack<E>
{
...
public synchronized void push(E item) { ... }
public synchronized E pop() throws StackEmptyException { ... }
...
}
// generic methods in non-generic classes
public static <T> T randomElement(T[] array) { ... }
public static <T> T lastElement(List<T> elements) { ... }
As can be seen above, the widely-used type naming convention is to use single uppercase letters, usually:
- E for element
- K for key
- V for value
- N for number
- R for result
- T for type
- S, U, V for second, third, fourth types, etc.
This is just a convention, so any valid and relevant identifiers can be used.
Generic Method Examples
The following class is not a generic class, but its methods are generic, which means that the type parameter <T> is not at the class declaration level, but only on the method declarations. The firstMatch() method takes a List of T objects and returns a T object. The <T> at the beginning of the method declaration means T is not a real type, but a type parameter that the Java compiler will determine from the context in which it is used, either from the types of the arguments of a method call, or from the instantiation of an object (if it is a generic class).
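The class listing itself is not reproduced here. A minimal sketch that fits the description might look like the following. It is an assumption, not the original code: the class name MatchingUtils is taken from the calls in this section, but the second parameter is elided in the text, so a java.util.function.Predicate<T> is used here purely for illustration, and returning null when nothing matches is also a guess.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical reconstruction: the Predicate<T> parameter and the
// null return value are assumptions, not taken from the original class.
class MatchingUtils
{
    // <T> before the return type makes this a generic method:
    // T is inferred separately at every call site.
    public static <T> T firstMatch(List<T> candidates, Predicate<T> test)
    {
        for (T candidate : candidates)
        {
            if (test.test(candidate))
                return candidate;
        }
        return null; // nothing matched
    }

    public static void main(String[] args)
    {
        List<String> words = Arrays.asList("hi", "hello", "hey");
        // T is inferred as String, so no cast is needed.
        String match = firstMatch(words, w -> w.length() > 2);
        System.out.println(match); // prints "hello"
    }
}
```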
We could use the generic firstMatch() method as follows:
List<Person> people = ...;
Person matchedPerson = MatchingUtils.firstMatch(people, ...);
List<Book> books = ...;
Book matchedBook = MatchingUtils.firstMatch(books, ...);
The following additional example is again for a non-parameterized class, but with a generic method. The method returns a random element from a generic array that is passed to it.
import java.util.Random;
public class RandomUtils
{
private static Random r = new Random();
public static <T> T randomElement(T[] array)
{
return array[r.nextInt(array.length)];
}
}
The T in the randomElement() method declaration refers to the type which Java will infer from examining the parameters of the method call. Even if there was an existing class called T, it is irrelevant here, because T is a placeholder for a type to be passed in as a parameter later. The method takes in an array of T objects and returns a T object. For example, if we pass in an Integer array, an Integer object will be returned; if we pass in a Person array, a Person object will be returned. No typecasts are necessary.
We could use the RandomUtils class as follows:
String[] names = { "Tom", "Dick", "Harry" };
String name = RandomUtils.randomElement(names);
...
Integer[] nums = { 2, 4, 6, 8, 10 }; // must be an Integer[], not int[]
int num = RandomUtils.randomElement(nums);
...
Color[] colors = { Color.RED, Color.GREEN, Color.BLUE };
Color color = RandomUtils.randomElement(colors);
...
Person[] people = {
    new Person("Fred Bloggs", 35),
    new Person("John Smith", 42),
    new Person("John Doe", 27),
    new Person("Jane Doe", 56),
};
Person person = RandomUtils.randomElement(people);
...
Note again that typecasting is not required to convert to String, Color, Person, or Integer. Auto-unboxing allows us to assign an element from the Integer[] array to an int, but the array passed to randomElement() must be Integer[], not int[], since type parameters can only stand for reference types, not primitive data types.
Generic Class Example
The following example is for a very simple generic (parameterized) stack class with push() and pop() methods. The class is generic, and its methods use the class’s type parameter E. For comparison, there is a full Stack class in the java.util package.
public class Stack<E>
{
private E stack[];
private int sp; // stack pointer to next empty position
private boolean isEmpty;
private boolean isFull;
// constructor
public Stack(int size)
{
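// Java does not allow "new E[size]", so create an Object[] and cast it (unchecked cast)
stack = (E[]) new Object[size];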
sp = 0;
isEmpty = true;
isFull = false;
}
public synchronized void push(E item)
{
while (isFull)
{
try
{
wait();
}
catch (InterruptedException e)
{
}
}
if (sp < stack.length)
{
stack[sp++] = item;
isEmpty = false;
}
if (sp == stack.length)
isFull = true;
notifyAll();
}
public synchronized E pop () throws StackEmptyException
{
// similar code...
...
}
...
}
Methods in the class can now refer to E both for arguments and for return values, where E doesn’t refer to an existing type. Instead, it refers to whatever type was defined when a stack was created. In the following code, E would refer to a String, the push() method would accept a String parameter, and the pop() method would return a String object:
Stack<String> words = new Stack<String>(10);
words.push("Hello");
...
String s = words.pop();
System.out.println(s);
...
In the same way, if we created Stack<Person>, the push() method would accept a Person and the pop() method would return a Person object. No typecasts would be required when using push() and pop().
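Because the listing above elides pop() and depends on a StackEmptyException that is not shown, it cannot be run as it stands. The following is a minimal single-threaded sketch of the same idea, under stated assumptions: BoundedStack is a hypothetical name, the synchronization and wait()/notifyAll() logic is left out, and the standard java.util.NoSuchElementException stands in for the missing exception class.

```java
import java.util.NoSuchElementException;

// Hypothetical, single-threaded simplification of the generic stack idea.
class BoundedStack<E>
{
    private final E[] items;
    private int sp = 0; // index of the next empty position

    @SuppressWarnings("unchecked")
    public BoundedStack(int size)
    {
        // Generic array creation is not allowed, so allocate Object[] and cast.
        // The array never leaves this class, which keeps the cast safe.
        items = (E[]) new Object[size];
    }

    public void push(E item)
    {
        if (sp == items.length)
            throw new IllegalStateException("stack is full");
        items[sp++] = item;
    }

    public E pop()
    {
        if (sp == 0)
            throw new NoSuchElementException("stack is empty");
        E item = items[--sp];
        items[sp] = null; // drop the reference so it can be garbage collected
        return item;
    }

    public static void main(String[] args)
    {
        BoundedStack<String> words = new BoundedStack<String>(2);
        words.push("Hello");
        words.push("World");
        String top = words.pop(); // no cast needed
        System.out.println(top);  // prints "World"
    }
}
```

Throwing on a full stack instead of blocking is a deliberate simplification; a thread-safe version would need the waiting logic shown in the original listing.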
Type Inference (Diamond) Operator
From Java 7, we can replace the type arguments when invoking a constructor of a generic class with an empty set of type arguments (<>) as long as the compiler can determine, or infer, the type arguments from the context. This pair of angle brackets is informally called the diamond operator.
Using the previous code:
Stack<String> words = new Stack<String>(10);
We can use the <> operator with the constructor, because the compiler will be able to infer the type from the usage context:
Stack<String> words = new Stack<>(10);
The Java compiler uses a type inference algorithm to look at each method invocation and the corresponding declaration, to determine the type argument(s) that can apply to the invocation and, if available, the type of the returned result. The compiler takes advantage of target typing to infer the type arguments of a generic method invocation. The target type of an expression is the data type that the compiler expects, based on the context. Finally the inference algorithm tries to find the most specific type that works with all of the arguments.
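As a rough illustration of the kinds of inference described above, the sketch below shows the diamond, target typing, and argument-based inference side by side. InferenceDemo and pick() are hypothetical names introduced only for this example; Collections.emptyList() is the standard library method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class InferenceDemo
{
    // Hypothetical generic method: T is inferred from the arguments.
    static <T> T pick(T a, T b)
    {
        return b;
    }

    public static void main(String[] args)
    {
        // Diamond: the type argument is inferred from the declared type.
        List<String> names = new ArrayList<>();
        names.add("Tom");

        // Target typing: T is inferred as String from the assignment target,
        // because emptyList() has no arguments to infer from.
        List<String> none = Collections.emptyList();

        // Argument-based inference: T is inferred as String.
        String s = pick("a", "b");

        // Explicit type argument, equivalent to the inferred call above.
        String t = InferenceDemo.<String>pick("a", "b");

        System.out.println(names.size() + " " + none.size() + " " + s + " " + t);
        // prints "1 0 b b"
    }
}
```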
Multiple Type Parameters
A generic class can have multiple type parameters. For example, if we wanted to model a key:value pair, where both the key and the value could be of any type, we might create a generic Pair class:
public class Pair<K, V>
{
private K key;
private V value;
public Pair(K key, V value)
{
this.key = key;
this.value = value;
}
public K getKey() { return key; }
public V getValue() { return value; }
}
The following statements instantiate a few objects of the Pair class:
Pair<String, String> pr1 = new Pair<String, String> ("Hello", "World");
Pair<String, Integer> pr2 = new Pair<String, Integer> ("Two", 2);
Person p = new Person("Joe Bloggs", 37, Gender.MALE);
Pair<String, Person> pr3 = new Pair<String, Person> ("Joe", p);
From Java 7 we can use the diamond operator to reduce the amount of typing:
Pair<String, String> pr1 = new Pair<> ("Hello", "World");
Pair<String, Integer> pr2 = new Pair<> ("Two", 2);
Person p = new Person("Joe Bloggs", 37, Gender.MALE);
Pair<String, Person> pr3 = new Pair<> ("Joe", p);
Type Erasure
Adding generics to Java created a problem for backwards compatibility, which has always been an important issue when adding new features to the language. The problem was how to allow older, non-generic collection classes to be used alongside newer generic collections.
The designers decided to do this with typecasts:
// Given a non-generic list...
List myList = getMyList();
// This is an unsafe cast, but we can do it if
// we know that myList contains String objects.
List<String> myStringList = (List<String>) myList;
This means that, on some level, List and List<String> are compatible as types. Java achieves this compatibility by type erasure, which means that generic types are only visible at compile time and are stripped out by the compiler. All that is left after type erasure is the raw type of the container; in this case myStringList has the type of List.
Non-generic types such as List are referred to as raw types. It is still perfectly legal to work with raw types, but we lose the strict type checking that the compiler gives us, and their use is generally a sign of poor quality code.
Compile and Runtime Typing
Consider the following statement:
List<String> list = new ArrayList<String>();
We might be surprised to learn that the type of list is different at compile time to runtime.
- At compile time, the javac compiler sees list as a List-of-String, and uses that information for strict type checking, but the compiler doesn’t know the concrete type of list; it just knows that list is compatible with the List interface.
- At runtime, the JVM sees list as a raw ArrayList because of type erasure. The type information about the actual contents has been erased and the resulting runtime type is just a raw type.
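One way to observe erasure directly is to compare the runtime classes of two lists with different type arguments. The sketch below does that; ErasureDemo and sameRuntimeClass() are illustrative names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo
{
    // Compares the runtime classes of two lists, whatever their type arguments.
    static boolean sameRuntimeClass(List<?> a, List<?> b)
    {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args)
    {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();

        // The compiler treats these as different types, but after erasure
        // both objects are plain ArrayList instances.
        System.out.println(sameRuntimeClass(strings, numbers)); // prints "true"
        System.out.println(strings.getClass().getName());       // prints "java.util.ArrayList"
    }
}
```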
Wildcards and Bounds
In generic code, the question mark symbol ? is called a wildcard, and represents an unknown type. The wildcard can be used in a variety of situations: as the type of a parameter, field, or local variable; and occasionally as a return type. The wildcard is never used as a type argument for a generic class instance creation, generic method invocation, or a supertype.
There are three major ways to use wildcards: unbounded, upper bounded and lower bounded.
- An unbounded wildcard uses the syntax <?> and represents all types. It is used as an argument for instantiations of generic types, and is useful in situations where no knowledge about the type argument of a parameterized type is needed. Unbounded wildcards allow the broadest conceivable argument set, because the unbounded wildcard <?> stands for any type without any restrictions.
- An upper bounded wildcard uses the syntax <? extends T> and represents all types that are subtypes of T, including type T. T is called the upper bound.
- A lower bounded wildcard uses the syntax <? super T> and represents all types that are supertypes of T, including type T. T is called the lower bound.
Bounded wildcards are used as arguments for instantiation of generic types. They are useful where only partial knowledge about the type argument of a parameterized type is needed, but where unbounded wildcards carry too little type information. A bounded wildcard stands for a family of types: with an upper bound, the bound is a supertype of every type in the family; with a lower bound, the bound is a subtype of every type in the family.
We can specify an upper bound for a wildcard, or we can specify a lower bound, but we cannot specify both at the same time.
Unbounded Wildcards
The unbounded wildcard type is specified using the wildcard character ?, for example, List<?>. This is called a list of unknown type. There are two scenarios where an unbounded wildcard is useful:
- When we are writing a method that only uses functionality from the Object class.
- When we are using methods in the generic class that don’t depend on the type parameter. For example, List.size() or List.clear(). In fact, Class<?> is so often used because most of the methods in Class<T> do not depend on T.
Consider the following printList() method:
public static void printList(List<Object> list)
{
for (Object element : list)
System.out.print(element + " "); // print, not println, keeps the elements on one line
System.out.println();
}
The obvious goal of printList() is to print a list of any type, but unfortunately it can only print a list of Object instances; it can’t print List<Integer>, List<String>, List<Person>, etc., because they are not subtypes of List<Object>.
To write a generic printList() method, we must use the wildcard syntax List<?> as follows:
public static void printList(List<?> list)
{
for (Object element : list)
System.out.print(element + " "); // print, not println, keeps the elements on one line
System.out.println();
}
This works because List<T> is a subtype of List<?> for any concrete type T. That means we can use printList() to print a list of any type:
List<Integer> iList = Arrays.asList(1, 2, 3);
List<String> sList = Arrays.asList("one", "two", "three");
printList(iList);
printList(sList);
It’s important to remember that List<Object> and List<?> are not the same. We can add an Object, or any subtype of Object, into a List<Object>. But we can only add null into a List<?>.
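The difference can be seen in a small sketch. UnknownListDemo and countNulls() are hypothetical names used only here; the commented-out line shows the kind of call the compiler rejects.

```java
import java.util.ArrayList;
import java.util.List;

class UnknownListDemo
{
    // Reading from a List<?> is fine, but each element is only known to be an Object.
    static int countNulls(List<?> list)
    {
        int count = 0;
        for (Object element : list)
        {
            if (element == null)
                ++count;
        }
        return count;
    }

    public static void main(String[] args)
    {
        List<String> strings = new ArrayList<String>();
        strings.add("one");

        List<?> unknown = strings; // List<String> is a subtype of List<?>
        unknown.add(null);         // allowed: null fits every reference type
        // unknown.add("two");     // compile-time error: the element type is unknown

        System.out.println(unknown.size());      // prints "2"
        System.out.println(countNulls(unknown)); // prints "1"
    }
}
```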
Upper Bounded Wildcards
We can use an upper bounded wildcard to relax the restrictions on a variable. For example, let’s suppose we want to write a method that works on List<Integer>, List<Double>, and List<Number>. We can do this by using an upper bounded wildcard.
An upper bounded wildcard restricts the unknown type to be a specific type or a subtype of that type and is written as: <? extends T> where T is the upper bound. In this context, extends is used in a general sense to mean either implements (as in interfaces) or extends (as in classes).
To write a method that works on lists of Number and its subtypes, such as Integer, Double, etc., we would specify List<? extends Number>. The term List<Number> is more restrictive than List<? extends Number> because List<Number> matches a list of type Number only, whereas List<? extends Number> matches a list of type Number or any of its subclasses.
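A short sketch of such a method follows. The names UpperBoundDemo and sumOfList() are chosen for illustration; the method only reads from the list, which is all that an upper bounded wildcard safely permits.

```java
import java.util.Arrays;
import java.util.List;

class UpperBoundDemo
{
    // Accepts List<Number>, List<Integer>, List<Double>, and so on.
    static double sumOfList(List<? extends Number> list)
    {
        double sum = 0.0;
        for (Number n : list) // every element is at least a Number
            sum += n.doubleValue();
        return sum;
    }

    public static void main(String[] args)
    {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Double> doubles = Arrays.asList(1.5, 2.5);

        System.out.println(sumOfList(ints));    // prints "6.0"
        System.out.println(sumOfList(doubles)); // prints "4.0"
    }
}
```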
Lower Bounded Wildcards
A lower bounded wildcard restricts the unknown type to be a specific type or a supertype of that type and is written as: <? super T> where T is the lower bound.
Suppose we would like to write a method that puts Integer objects into a list. For flexibility, we’d like the method to work with List<Integer>, List<Number>, and List<Object>, i.e. anything that can hold Integer objects.
To write the method that works on lists of Integer and its supertypes, we specify List<? super Integer>. The term List<Integer> is more restrictive than List<? super Integer> because List<Integer> matches a list of type Integer only, whereas List<? super Integer> matches a list of any type that is a supertype of Integer.
The following code adds the numbers 1 through 16 to the end of a list:
public static void addNumbers(List<? super Integer> list)
{
for (int i = 1; i <= 16; ++i)
list.add(i);
}
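To see which argument types the method accepts, it can be called as in the following sketch. LowerBoundDemo is a hypothetical wrapper class that repeats addNumbers() so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

class LowerBoundDemo
{
    // Same method as above, repeated so this example compiles on its own.
    static void addNumbers(List<? super Integer> list)
    {
        for (int i = 1; i <= 16; ++i)
            list.add(i);
    }

    public static void main(String[] args)
    {
        List<Number> numbers = new ArrayList<Number>();
        List<Object> objects = new ArrayList<Object>();

        addNumbers(numbers); // Number is a supertype of Integer
        addNumbers(objects); // so is Object
        // addNumbers(new ArrayList<Double>()); // compile-time error: Double is not a supertype of Integer

        System.out.println(numbers.size()); // prints "16"
        System.out.println(objects.get(0)); // prints "1"
    }
}
```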
Wildcard Guidelines and PECS
One of the more confusing aspects when learning to program with wildcards is determining when to use an unbounded wildcard, an upper bounded wildcard or a lower bounded wildcard.
Joshua Bloch coined a mnemonic for this in his book Effective Java: PECS, which stands for Producer Extends, Consumer Super.
- Producer Extends: If we need a List to produce T values (we want to read Ts from the list), we need to declare it with <? extends T>, e.g. List<? extends Integer>. But we cannot add to this list! It only produces objects of that type or that can be upcast into that type.
- Consumer Super: If we need a List to consume T values (we want to write Ts into the list), we need to declare it with <? super T>, e.g. List<? super Integer>. We can write into this list, but there are no guarantees what type of object we may read from the list.
- If we need to both read from and write to a list, we need to declare it exactly without wildcards, e.g. List<Integer>.
Here is a simple example of copying a source list to a destination list. Note how the source list src (the producing list) uses extends, and the destination list dest (the consuming list) uses super:
public class Collections
{
public static <T> void copy(List<? extends T> src,
List<? super T> dest)
{
for (int i = 0; i < src.size(); ++i)
dest.set(i, src.get(i));
}
}
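A usage sketch for this method is shown below. CopyDemo is a hypothetical class that repeats copy() so the example compiles on its own. Because copy() calls set(), which replaces existing elements, the sketch assumes the destination already holds at least as many elements as the source; otherwise set() would throw an IndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CopyDemo
{
    // Same method as above: src produces T values, dest consumes them.
    static <T> void copy(List<? extends T> src, List<? super T> dest)
    {
        for (int i = 0; i < src.size(); ++i)
            dest.set(i, src.get(i));
    }

    public static void main(String[] args)
    {
        List<Integer> src = Arrays.asList(1, 2, 3);

        // The destination is pre-filled, since set() overwrites rather than appends.
        List<Number> dest = new ArrayList<Number>(Arrays.asList(0.0, 0.0, 0.0, 9.5));

        copy(src, dest); // Integers are read from src and written into a List<Number>

        System.out.println(dest); // prints "[1, 2, 3, 9.5]"
    }
}
```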
Here is another way to remember when to use super or extends, if we think in terms of a type X:
- If we have a list and want to read an X from that list, it has to be a list of X or a list of things that can be upcast to X as they get read out, i.e. anything that extends X, which is written List<? extends X>.
- If we want to write X into a list, that list needs to be either a list of X or a list of things that X can be upcast to, i.e. any superclass of X, which is written List<? super X>.
This all boils down to:
- Use extends when we only want to get values from a data structure.
- Use super when we only want to put values into a data structure.
- Use an explicit type when we have to do both.
Summary
- Generics (parameterized types) enforce type safety and reduce coding effort.
- Classes, interfaces and methods can be created that accept parameterized types.
- When using wildcards, remember PECS — Producer Extends, Consumer Super.
2018-05-19: Edited [jjc]
2018-03-28: Revised [lsc]
2018-03-24: Edited. [jjc]
2018-03-24: Created. [lsc]