Arrays and Collections
Creating and Using Aggregate Data Structures
PREREQUISITES — You should already…
- have access to a JDK 8, and be able to create and run simple Java Programs;
- understand the fundamentals of types, variables and static functions;
- have familiarity with the concepts of scope and lifetime.
Arrays
Arrays affords a simple shorthand syntax to create a mutable collection of data, all having the same type. The items in an array, are traditionally called elements, though the Java Specification on arrays prefers the term components). Although arrays are reference types, you can avoid calling the new
operator under some circumstance supported by shorthand syntax.
Elements in an array can be accessed in constant time. This means you can access any element in the same time, regardless of the size of the array. Arrays, like any class, have constructors, methods and a few public fields. Although the elements are mutable, the size of an array cannot change after created. The java.util.Array
class, is part of the Java Collections Framework, and provides are large number of static
functions that perform various operations on arrays.
Technically arrays are not a class type, but acts as if, for all intents and purposes, it inherits from Object
. Arrays have the same element type, shares the same internal Class
object.
Defining Array Variables
Like any variable definition in Java, the type of the array must appear before the array variable identifier. In addition to a ‹type›, an empty set of square brackes ([]
) are required to indicate that this is any array of ‹type›; meaning each element is of a single ‹type›.
Syntax — Array Variable Definition
- ‹type›
[]
‹ident›;
define an array variable.- ‹type› ‹ident›
[];
alternative syntax to define an array variable.- ‹type›
[]
‹ident›=
{
‹init-list›};
define, allocate and initialise array variable.- ‹type›
[]
‹ident›=
new
‹type›[
‹size›];
define variable; allocate space.- ‹init-list› initialiser list; series of values, separated by comma.
- ‹size› integral number of elements to create space for.
When space for an array is created with ‘new
‹type›[
‹size›]
’, the elements are initialised to default values for the type. For the primitive types, this means the space is filled with zero, and for class types (reference type), the values will be null
.
int[] ia1; //←traditional form to define array,
// still has no memory or elements.
int ia2[]; //←alternative synonym for above.
int[] ia3 = { 11, 22, 33, }; //←define, allocate & initialise.
int[] ia4 = new int[3]; //←define & allocate memory.
ia4[0] = 11; //←assign `11` to first element.
ia4[1] = 22; //←assign `22` to second element.
ia4[2] = 33; //←assign `33` to third element.
Array variable definition observations
- Variables
ia1
andia2
are uninitialised and do not reference any elements.- Variable
ia3
uses a shorthand syntax to callnew
and initialise elements.- For
ia4
, Java provides no syntax to also initialise the elements. They must be individually assigned as necessary.- An initialiser list may optionally have a comma after the last expression (see
ia3
).
At any point, you can allocate a new array object with ‘new
‹type›[
‹size›]
’, where ‹size› can be any expression producing an int
value. Normally, you would assign the result to an array variable of the right type (‹type›[]
). Each element will be zero’d by new
, or made null
for reference types.
Arrays can be defined final
inside a function, which means it must be initialised. This does not mean the elements are final
; it just means that you cannot let the array variable reference another array object. At class level, an array member can also be final
, and even static
, or a combination of both.
Arrays Are Reference Types
Logically, we visualise an array as a contiguous collection of values in memory. The number of values cannot change, but the elements can change. Once an element is selected from a ‹type›[]
array, the resulting type is ‹type›. You can do anything with this result, that you could do with a simple ‹type› variable.
It does not matter which syntax was chosen to define and populate the elements of ia
above, the memory representation and layout will remain the same. We really must understand this well: the variable ia
, which you may abstractly think of as your array, is not in reality — it only stores a reference to the real array in memory. Consider the following:
From above, you can see that ib[1]
and ia[1]
, for example, reference the exact same value in memory; that is because both ia
and ib
contain the same value, and thus reference the same memory.
Subscripting/Indexing
As we have seen by example, array variable can use a subscript operator to select (or reference) a single element (or component) in the array object. The integer operand between the square brackets, must be positive, and is treated as an offset, although many call it an index, or subscript.
Offset 0
references the first element, when used with the subscript operator. It stands to reason then, that the last element is at offset ‘‹arr›.length-1
’, where ‹arr› is any array expression, like an array variable. We can use this fact in a for
loop, to iterate through the elements of an arbitrary-sized array:
double[] da = { 1.2, 2.3, 3.4, 4.5 };
for (int i = 0; i < da.length; ++i) {
da[i] *= 1.50;
out.format("da[%d] = %.2f\n", i, da[i]);
}
The code above increases the values of each element by 50%
, which represents the rate of inflation in some locales.
If you were to use a negative offset, or an offset that is bigger or equal to an array’s length
, Java will throw and IndexOutOfBoundsException
. This does add some runtime overhead, but is a fair price, considering the disasters that will result otherwise — just ask the C/C++ programmers.
For-Each Loop and Arrays
We can also use the “For-Each Loop” to iterate through the elements. However, this kind of loop cannot be used to modify the elements, since the for-each loop makes successive copies of each element. It is also not useful, if you need to keep track of the offset of each element. However, it is the most generic pattern to use, since the for-each loop not only work with arrays, but with all Java collections, whereas the subscript operator is limited to arrays.
double[] da = { 1.2, 2.3, 3.4, 4.5 };
for (double i : da) {
i *= 1.50; //←modify a *copy*.
out.format("new i = %.2f\n", i);
}
for (double i : da) //←show *original* values.
out.format("org i = %.2f\n", i);
For arrays then, a for-each loop is not that useful, but slightly more succinct, especially if you do not have a use for the offset. Even with an array of reference types, this statement still applies.
Pass/Return Arrays to/from Functions
We can pass arrays to functions. The code extract above can be placed in a function, so that the function can be used to arbitrarily modify each element, of any array, by some percentage, which we can pass as an extra argument:
ArrayFunction.java
— Passing Array to a Function
import static java.lang.System.*;
public class ArrayFunction {
public static void main (String[] args) {
double[] da = { 1.2, 2.3, 3.4, 4.5 };
double[] db = new double[da.length + 2];
int j = 0;
for (int i = 0; i < da.length; ++i, ++j)
db[i] = da[i];
db[j++] = 5.6;
db[j++] = 6.7;
applyPercentage (da, 1.50); //←increase by 50%.
for (int i = 0; i < da.length; ++i)
out.format("da[%d] = %.2f\n", i, da[i]);
applyPercentage (db, 0.50); //←decrease by 50%.
for (int i = 0; i < db.length; ++i)
out.format("db[%d] = %.2f\n", i, db[i]);
exit(0);
}
static void applyPercentage (double[] arr, double pct) {
for (int i = 0; i < arr.length; ++i)
arr[i] *= pct;
}
}//class
da[0] = 1.80 │ db[0] = 0.60
da[1] = 3.45 │ db[1] = 1.15
da[2] = 5.10 │ db[2] = 1.70
da[3] = 6.75 │ db[3] = 2.25
│ db[4] = 2.80
│ db[5] = 3.35
It should be clear that a copy of a reference to an array is passed to the applyPercentage
function, and because the elements are mutable, we can modify the elements referenced via the parameter (arr
), which contains the copy of the reference.
WARNING — Array Elements Modifiable in Functions
Because passing and array, means passing a reference, a function can modify the elements of the array. This can be useful, but is a bug when elements are unintentionally modified. If a function is designed not to modify elements, care should be taken in its implementation. Similarly, if a function is designed to modify an array, the caller should manually make a copy of the original elements if needed.
Returning Arrays
Just as we can pass arrays, we can return arrays, but should remember, that we will be returning a reference. This is not as big an issue as it may sound, if you are not yet that comfortable with references.
ReturnArray.java
— Function Returning an Array
import static java.lang.System.*;
public class ReturnArray {
public static void main (String[] args) {
char[] dashes = makeChars('-', 20);
String line = new String(dashes);
out.println(line);
out.println(dashes);
exit(0);
}
static char[] makeChars (char chr, int len) {
char[] result = new char[len];
for (int i = 0; i < len; ++i)
result[i] = chr;
return result;
}
}//class
On a practical note, depending on what you want to do with an array of char
values, you do not necessarily have to convert it to a String
as we have done above. Both out.println
statements above, will output the same value, since println
has an overloaded version that can take a char
array as argument.
Select Array Operations
Although you write any algorithm possible using subscripting and iteration, the java.util.Arrays
class may have what you need. There may appear to be many static
functions in this class, but considering that almost all have overloaded versions for different ‹type› of arrays, it is not really overwhelming.
An easy function, which is useful for learning and debugging, is toString
. It will return a formatted String
containing all the array elements, enclosed in square brackets.
int[] ia = { 11, 22, 33, 44, 55, 66 };
String stria
= java.util.Arrays.toString(ia);
out.format("ia = %s\n", stria);
out.println(ia);
ia = [11, 22, 33, 44, 55, 66]
[I@42a57993
As you can see, the toString
that arrays inherit from Object
, does not produce anything useful, unlike …Arrays.toString()
. If the formatting suits you, or for debugging, you may find this useful. We used an int[]
(int
array), but instead of int
, you could use any of the primitive types, or an array of any ‹type› really, even Object[]
:
Object[] oa = { "ABC", 22, 'D', 44.5, 55L };
String[] sa = { "GHI", "JKL", "MNO" };
String str = java.util.Arrays.toString(oa);
out.format("oa = %s\n", str);
str = java.util.Arrays.toString(sa);
out.format("sa = %s\n", str);
oa = [ABC, 22, D, 44.5, 55]
sa = [GHI, JKL, MNO]
Instead of writing a loop to copy elements from one array to another, we can simply use one of the copyOf
functions.
double[] da = { 1.2, 2.3, 3.4, 4.5, 5.6, 6.7 };
double[] db = java.util.Arrays.copyOf(da, da.length);
double[] dc = java.util.Arrays.copyOfRange(da, 0, da.length);
double[] dd = java.util.Arrays.copyOfRange(da, 3, da.length);
out.format("da = %s\n", java.util.Arrays.toString(da));
out.format("db = %s\n", java.util.Arrays.toString(db));
out.format("dc = %s\n", java.util.Arrays.toString(dc));
out.format("dd = %s\n", java.util.Arrays.toString(dd));
out.format("da == db: %s\n", java.util.Arrays.equals(da, db));
da = [1.2, 2.3, 3.4, 4.5, 5.6, 6.7]
db = [1.2, 2.3, 3.4, 4.5, 5.6, 6.7]
dc = [1.2, 2.3, 3.4, 4.5, 5.6, 6.7]
dd = [4.5, 5.6, 6.7]
da == db: true
Note that for the copyOfRange
versions, the upper bound is exclusive; the last offset used, will be one less than the upper bound. Notice that we also used another useful function: equals
, that can compare two arrays. You can also use fill
to set all the elements of an existing array to a particular value.
Two other potentially useful functions, are sort
to sort arrays in-place, and binarySearch
to efficiently find a key in a sorted array.
NOTE — Importing java.util.Arrays
If you work with java.util.Arrays
a lot, you may want to add import java.util.Arrays;
at the top of your source file; then you can use: Arrays.toString(…)
, for example.
Or, less commendable, add: import java.util.*
to use all java.util
classes without the java.util
prefix. You can go further, and add: import static java.util.Arrays.*;
(but still will have to add import java.util.Array;
at least before), and then you can use most of the functions without prefix:
double[] da = { 1.2, 2.3, 3.4, 4.5, 5.6, 6.7 };
double[] db = copyOf(da, da.length);
double[] dd = copyOfRange(da, 3, da.length);
out.format("da = %s\n", Arrays.toString(da));
out.format("da == db: %s\n", Arrays.equals(da, db));
You cannot use toString
or equals
without qualification, as shown.
Collections
There are so many collections, the Java documentation calls it the Collection Framework. The reason we need more than arrays, is a matter of performance related to the kind of operations your algorithm require to be efficient. For example, you can insert an element in the beginning of an array (losing the last one), but the larger the array, the slower the process will be; an array is therefore not the best solution if your algorithm depends fast insertions in a list of values.
ArrayList
An java.util.ArrayList<E>
is a generic type (which effectively can work with any elements of type E
), is very efficient with arbitrary deletions and insertions of items in a collection. Furthermore, unlike an array, which is of a fixed size, the number of items in an ArrayList
can increase or decrease arbitrarily.
One major caveat: generic types, including ArrayList
does not work well with primitive types: you have to use primitive wrappers, or if speed is important, look for a third party library, like trove4j.
To define an ArrayList<E>
variable, you must replace E
with a (non-primitive) ‹type›, for example: Integer
, String
or Double
, etc. To add elements, you call the .add(
‹item›)
instance method. You cannot use subscript.
To continue from above, to change an existing element, the .set(
‹offset›,
‹item›)
method can overwrite and existing item with the new ‹item› at the specified ‹offset›.
for (int i = 0; i < ald.size(); ++i) {
double val = ald.get(i); //←use `.get(i)` instead of `[i]`.
out.format("%.2f ", val);
ald.set(i, val * 10.0); //←use `.set(i, ‹value›)` to change.
}
out.println();
As with arrays, if the ‹offset› is out of bounds, an exception will be thrown. The current number of items in the collection, can be determined with the .size()
method — which can be 0
for an object containing no items. You can also use the .isEmpty()
method to determine if a list is empty.
As promised, the for-each loop works well with any collection, include ArrayList
expressions:
for (double item : ald) //←“for-each” loop works as expected.
out.format("%.2f ", item);
out.println();
Although slightly advanced, we do show two examples using the .forEach(…)
method as an alternative to physical for(…)
statements:
ald.forEach(System.out::println); //←pass reference to `forEach` method.
ald.forEach(
e -> out.format("%.2f ", e) //←pass a lambda expression, which is
); // more flexible, but newer syntax.
out.println();
You can easily .remove()
items from a collection by offset, or by value. You can find the offset of an item with .indeOf()
— it finds the first, while lastIndexOf()
will search from the back.
From java.util.Collection
, ArrayList
inherits the .stream()
method, which is the closest equivalent to LINQ (Language INtegrated Query) in C# (fluent syntax). The java.util.stream
interface is implemented by ArrayList
, and allows you to chain method calls together, since many methods return a stream.
ArrayListStream.java
— Chaining Stream Methods on ArrayList
import static java.lang.System.*;
import java.util.ArrayList;
import java.util.Arrays;
public class ArrayListStream {
public static void main (String[] args) {
ArrayList<Integer> ali = new ArrayList<Integer>();
ali.add(66); ali.add(22); ali.add(77); ali.add(44);
ali.add(11); ali.add(33); ali.add(88); ali.add(55);
int[] result = ali.stream() //←get a stream.
.filter(i -> i % 2 == 0) //←chain `filter` to find even numbers.
.mapToInt(i -> i) //←convert to primitive `int`.
.sorted() //←sort the result.
.toArray(); //←convert to array for convenience.
out.println("Original ArrayList : " + ali);
out.println("Sorted even numbers: " + Arrays.toString(result));
exit(0);
}
Original ArrayList : [66, 22, 77, 44, 11, 33, 88, 55]
Sorted even numbers: [22, 44, 66, 88]
We could have called .sorted()
before .mapToInt()
, and would still have achieved the desired result. Afficionados were very exited when streams appeared in Java 8, and you can see why.
Although an ArrayList
can grow and shrink (via trimToSize()
) arbitrarily, and reasonably efficiently, it is on par with normal arrays as far as constant time item access is concercerned. It is a good all-rounder, since it has a good selection of list and array characteristics. An ArrayList
’s main weakness, is inserting items at the start, which is still slow, relatively speaking.
LinkedList
When arbitrary insertions at the beginning or at the end of a collection are common tasks for a certain algorithm, a java.util.LinkList<E>
might be the best option. Like ArrayList
, it also inherits from java.util.Collection
, implements similar inferfaces, and as a result have very similar operations.
Accessing items in the List<E>
by offset, gets slower the larger the number of items, since it has to count from the start of the list. In general then, you can easily replace an ArrayList
with a List
and vice versa, without affecting the rest of your code. So in ArrayListStream.java
above, you only have to replace the variable type ArrayList
, with LinkedList
, while the rest of the code remains the same:
Just for interest’s sake, here is an alternative solution to initialising a collection, making use of the fact that most collections have a constructor that can accept a list of some kind:
It does beat having a series of .add(…)
method calls.
Sets
The main characteristic of any set, is that it can contain no duplicates, and that is exactly what many programmers use it for: to eliminate duplicates, for adding an item that already exists, has no effect. There are different implementations of sets, but like ArrayList
and LinkedList
, they all inherit from java.util.Collection
, and implement the java.util.Set<E>
interface.
HashSet vs SortedSet
These are two implementations of java.util.Set<E>
, each having different characteristics. The implementation of java.util.HashSet<E>
, uses hash table, that overall makes it the best performer. The advantage of java.util.SortedSet<E>
, is that the elements are in a natural ordering; this does require more work, but the advantage is that your items in the set are always sorted, regardless of the order in which you added the items.
Intersection and Union
Now, if your were looking for set intersection, union, difference, etc. methods, you will be disappointed. At least we can simulate a union with .addAll()
and an intersection with .retainAll()
from Collection
(which all collection classes inherit).
If you are to use those statements a lot, you could of course wrap it in a function. This answer on Stack Overflow has examples. Here is a number of solutions to obtain a union and an intersection between two sets:
SetUnionIntersect.java
— Set Union and Intersection Examples
import static java.lang.System.*;
public class SetUnionIntersect {
public static void main (String[] args) {
java.util.LinkedList<String> tennis
= new java.util.LinkedList<String>(java.util.Arrays.asList(
"petra", "garry", "john", "peter", "vera"));
java.util.ArrayList<String> golf
= new java.util.ArrayList<String>(java.util.Arrays.asList(
"sandra", "louis", "john", "garry"));
out.println("Who plays tennis: " + tennis);
out.println("Who plays golf : " + golf);
/* 1a) find persons playing both sports (intersection)
*/ {
java.util.List<String> both_sports =
new java.util.ArrayList<String>();
for (String s : tennis)
if (golf.contains(s))
both_sports.add(s);
out.println("1a) Both sports : " + both_sports);
}
/* 1b) find persons playing *any* sport (union)
*/ {
java.util.Set<String> play_sport
= new java.util.HashSet<String>();
play_sport.addAll(tennis);
play_sport.addAll(golf);
out.println("1b) Play a sport: " + play_sport);
}
/* 2a) find persons playing both sports (intersection)
*/ {
java.util.Collection<String> both_sports =
new java.util.TreeSet<String>(tennis);
both_sports.retainAll(golf);
out.println("2a) Both sports : " + both_sports);
}
/* 3a) find persons playing both sports (intersection)
*/ {
java.util.List<String> both_sports = tennis.stream()
.filter(golf::contains)
.collect(java.util.stream.Collectors.toList());
out.println("3a) Both sports : " + both_sports);
}
/* 3b) find persons playing *any* sport (union)
*/ {
java.util.List<String> play_sport
= java.util.stream.Stream.concat(tennis.stream(), golf.stream())
.distinct()
.collect(java.util.stream.Collectors.toList());
out.println("3b) Play a sport: " + play_sport);
}
exit(0);
}
}//class
We have fully qualified all the names, but you are of course welcome to add import
statements to reduce the vebosity considerably. At least this way, you do not have to guess where each method or class resides.
The one advantage is that the sets do not have to be the same type. We made one an ArrayList
and the other a LinkedList
, yet it caused no problems. By the same token, the results were different types of collections, just to show how generic the classes can be.
Maps or Dictionaries
Java prefer the term “map”, but you may know the concept as “dictionary”, or a collection of “key-value pairs”. Java’s java.util.Map<K,V>
interface is implemented by a number of classes, which means they work similarly, but have different performance characteristics for the same operations.
The java.util.HashMap<K,V>
class uses a hash table, which is overall quite efficient, but the order in which the keys are retrieved is not specified. The java.util.TreeMap<K,V>
uses a red-black tree, which keeps the keys ordered, with slight overhead. The K
represents the type of the key, and V
the type of the value. For example, if we wanted to associated keys of type Integer
to values of type String
, we could define a variable ishm
(integer-string hash map) as follows:
Instead of and .add()
method as with sets, all maps provide the .put(K,V)
method to either modify the value for an existing key, or add a new entry in the map. We could add a number of entries (“items” if you like), into the ishm
map we defined above, and print out all the values with a for-each loop, using a set of keys obtained from .entrySet()
:
ishm.put(1000, "Value for 1000");
ishm.put(1010, "Value for 1010");
ishm.put(1020, "Value for 1020");
ishm.put(1030, "Value for 1030");
ishm.put(1040, "Value for 1040");
ishm.put(1020, "New value for 1020");
for (java.util.Map.Entry<?,?> kv : ishm.entrySet())
out.format("%04d: %s\n", kv.getKey(), kv.getValue());
1040: Value for 1040
1010: Value for 1010
1030: Value for 1030
1000: Value for 1000
1020: New value for 1020
As you can see, the second ishm.put(1020,…)
modifies the original value for key 1020
.
Instead of: …Map.Entry<?,?>
…, you may want to use: …Map.Entry<Integer,String>
…, but for the purpose of our code above, the simpler version is fine. The java.util.Map.Entry
interface represents a single key-value pair (an “entry” or “item”).
You can obtain the value for a given key with the .get(K)
method. This method returns null
if either the value for the key is null
, or the key does not exist. To check if a key does not exist, you can use the .containsKey(Object)
to check if a key exists.
To delete an entry, you can use .remove(Object)
, or remove all entries with .clear()
. Checking if a map is empty, requires the .isEmpty()
method, while .size()
returns the number of entries in the map. Below, we remove the entry with the key 1020
, and check if it still present:
ishm.remove(1020);
if (!ishm.containsKey(1020))
out.println("Successfully removed entry with key 1020");
A list of keys can be obtained with the .keySet()
method. For a HashMap
, the keys are retrieved in arbitrary order, and for a [TreeHash
][od-util-hashtree], the keys are in sorted order. The type of value returned by the .keySet()
method is java.util.Set<K>
. We can use this to iterate through the map, using the set of keys:
java.util.Set<Integer> keys = ishm.keySet();
out.println("Map in arbitrary order:");
for (Integer k : keys)
out.format("%d: %s\n", k, ishm.get(k));
Map in arbitrary order:
1040: Value for 1040
1010: Value for 1010
1030: Value for 1030
1000: Value for 1000
Often, we want the keys in sorted order. There are several ways to accomplish this, but there is no method that returns ordered keys; we have to either use a TreeMap
from the start, or assign the set to some other type which can be sorted with, for example, java.util.Collections.sort()
. We show three possibilities below:
/* 1) get keys in a `TreeSet` (sorted), and print key/values. could also
** have used: `TreeSet<K> keys = new TreeSet<K>(); keys.addAll(ivp);`
*/ {
java.util.Set<Integer> keys
= new java.util.TreeSet<Integer>(ishm.keySet());
out.println("1) Ordered by key:");
for (Integer k : keys)
out.format("%d: %s\n", k, ishm.get(k));
}
/* 2) alternative way to get key/values in sorted order (by key)
*/ {
java.util.List<Integer> keys
= new java.util.ArrayList<Integer>(ishm.keySet());
out.println("2) Ordered by key:");
java.util.Collections.sort(keys);
for (Integer k : keys)
out.format("%d: %s\n", k, ishm.get(k));
}
/* 3) yet another way to get sorted key/value pairs (by key). this is
** only possible since Java 8 (courtesy of `stream`s).
*/ {
java.util.List<Integer> keys = ishm.keySet().stream()
.sorted().collect(java.util.stream.Collectors.toList());
out.println("3) Ordered by key:");
java.util.Collections.sort(keys);
for (Integer k : keys)
out.format("%d: %s\n", k, ishm.get(k));
}
This about covers the major tasks we tend to perform on maps/dictionaries. Java provides numerous other collections, and if high-performance is important, you can try the collection utilities set from the highly-acclaimed Guava Core Libraries.
File Input/Ouput
Before long you may be required to read and/or write files. This requires iteration statements, special classes and a handful of concepts. Understand that reading an writing files are not related to “file handling”, which concerns delete, renaming, copying and moving files.
Prior to Java 8, file IO used to be somewhat cumbersome, but now much improved and more concise. However, you should unfortunately still have some grasp of the older NIO and NIO2 classes, since a) they are not deprecated, and b) you will still find it in older programs which you may have to maintain.
Portability is a concern when dealing with files, because different operating systems have incompatible file naming rules, different path separators, different permissions, etc. This you have to deal with to some extent in your programs; Java does not entirely shield you from it.
The java.io
package represents the original file I/O classes, but has mostly (though not entirely) superseded by the java.nio
package since Java 1.4. Subsequent versions of the JDK added incremental enhancements to java.nio
. A full treatment would involve combining these I/O classes with the Java 8 java.util.stream
package, and lambdas (although older code may use anonymous classes). For beginners, this is quite overwhelming, so this section will focus on practical and typical use, not all the fancy detail.
Dealing With Paths
The java.nio.file
package houses, amongst others, the Files
class, which provides a number of static
methods to manipulate files and directories. Many of these methods return or require as arguments, a Path
interface, implemented by the Paths
class, all in the same package.
A Path
is an abstraction of file system naming. It may be absolute, which means it has a root, or relative, which represents a relation to another directory, commonly the current working directory. You can see several examples in the Path Operations section of the official Java Tutorials.
Current Working Directory
When your program starts, either is was opened from the command line, or via a shortcut. Either way, the directory in the shortcut, or the current working directory in your shell, becomes your programs current directory. This is not necessarily the same directory where your program code resides. The following seems to reliably do the trick:
But you can also just query a system property with java.utils.Properties.getProperty()
. The relevant property is "user.dir"
, but you should also be aware of
Other mechanisms are available, but one of the above two should suffice for most purposes.
Application Directory
The get the application’s directory, we need the location of the class that contains the main
function. This ‘.class
’ file could reside in a ‘.jar
’ file, and we generally do not care about the .jar
file name, for this purpose.
First, we have to obtain an object of a java.lang.Class<T>
type, which we can get from the getClass
method, inherited by all types from java.lang.Object
, if we had an instance of our main class. Or we can hard code the class name, which is generally not acceptable. The alternative is a bit more convoluted, and involves even more reflection. Either way, now it is possible for exceptions to be thrown, which means we must deal with it.
You must replace the ‹main-class-name›
class literal part below, with your main class name (the one containing the main
function):
try{
String appdir = new java.io.File(
‹main-class-name›.class.getProtectionDomain()
.getCodeSource().getLocation().toURI().getPath()
).toString();
if (appdir.endsWith(".jar"))
appdir = appdir.substring(
0, appdir.lastIndexOf(java.io.File.separatorChar));
out.format("APPDIR = %s\n", appdir);
}
catch (java.net.URISyntaxException ex) {
/* ignore */
}
}
To get a Class<T>
object, without hard coding a name, we can use a “trick”: get the main
function by inspecting the current stack trace, an select the last item in the list (practically, that means the first stack trace element).
java.lang.StackTraceElement[] stacktrace
= java.lang.Thread.currentThread().getStackTrace();
java.lang.StackTraceElement mainfunc
= stacktrace[stacktrace.length - 1];
String classname = mainfunc.getClassName();
Class<?> mainclass;
try{
mainclass = java.lang.Class.forName(classname);
String appdir = new java.io.File(
mainclass.getProtectionDomain()
.getCodeSource().getLocation().toURI().getPath()
).toString();
if (appdir.endsWith(".jar"))
appdir = appdir.substring(
0, appdir.lastIndexOf(java.io.File.separatorChar));
out.format("5) APP = %s\n", appdir);
}
catch (java.lang.ClassNotFoundException ex) {
/* ignore */
}
catch (java.net.URISyntaxException ex) {
/* ignore */
}
This should ultimately be wrapped in some function, probably in some utility class, since one would normally only call it once during application initialisation. We do that in the following complete listing, and also add a function to return the name of the executable (.class
or .jar
file).
AppDir.java
— Find Application Directory & Name
import static java.lang.System.*;
public class AppDir {
public static void main (String... args) {
String apppath = getAppPath();
String appname = getAppName();
out.format("App Filename = `%s`\n", appname);
out.format("App Directory = `%s`\n", apppath);
exit(0);
}
public static final String getAppName () {
Class<?>[] ctxlist = getContextArray();
Class<?> mainclass = ctxlist[ctxlist.length - 1];
String appfile = new java.io.File(
mainclass.getProtectionDomain()
.getCodeSource().getLocation().getFile()).toString();
if (appfile.endsWith(".jar"))
appfile = appfile.substring(
appfile.lastIndexOf(java.io.File.separatorChar)+1);
else
appfile = mainclass.getName() + ".class";
return appfile;
}
public static final String getAppPath () {
Class<?>[] ctxlist = getContextArray();
Class<?> mainclass = ctxlist[ctxlist.length - 1];
String appdir = null;
try{
appdir = new java.io.File(
mainclass.getProtectionDomain()
.getCodeSource().getLocation().toURI().getPath()
).toString();
if (appdir.endsWith(".jar"))
appdir = appdir.substring(
0, appdir.lastIndexOf(java.io.File.separatorChar));
}
catch (java.net.URISyntaxException ex) {
/* ignore */
}
return appdir;
}
public static final Class<?>[] getContextArray() {
return new SecurityManager() {
protected Class<?>[] getContextArray(){
return super.getClassContext();
}
}.getContextArray();
}
}//class
File Operations
2018-01-02: Created. [brx]