Size Maps When Possible
Java stores Map
s as entries into buckets of singly linked lists or further buckets. Its max storage is constrained by
the JVMs memory allocation, but theoretically can hold up to 231 values. Resizing a Map
is an expensive
operation, as existing buckets may have to be redistributed, or new buckets may need to be created. Dependent on the
number of insertions to the Map
, this can happen many times, requiring a large number of repositions.
Some key terms
- Capacity: The number of buckets (length of the array of nodes) in the hash table. The initial capacity is set to 16 (24) and doubled each time it is reached.
- Load factor: The measure that decides when to increase the capacity of the Map. By default, this is set to 0.75 ( 75%) of capacity.
Indexes
Map
indexes in Java are initialised and extended in powers of 2 - and the load factor decides how many elements can be
inserted before the map must be resized. Given a load factor of 0.75, for an initial capacity of 16, up to 12 values can
be inserted before a resize occurs.
If you know that you will be inserting at least 40 values, it is optimal to select an initial capacity (of 64) that won't require a resize:
Exponent (n) | Capacity (2n) | Resize at |
---|---|---|
4 | 16 | 12 |
5 | 32 | 24 |
6 | 64 | 48 |
Benchmark
Definition
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 200, time = 1, timeUnit = TimeUnit.MILLISECONDS)
@Fork(1)
@State(Scope.Benchmark)
public class HashMapBenchmark {
private Map<Integer, Integer> map;
@Param({"8", "16", "32", "64"})
public int capacity;
@Param({"8", "16", "32"})
public int numberOfInserts;
@Setup
public void setup() {
map = new HashMap<>(capacity);
}
@Benchmark
public Map<Integer, Integer> insert_to_hash_map() {
for (int i = 0; i < numberOfInserts; i++) {
map.put(i, i);
}
return map;
}
}
Results
As can be seen from the results below - for an initial map capacity of 8, inserting 32 keys takes 11.5% longer than the same amount of inserts to a 32 capacity map.
Benchmark | Capacity | No. of Inserts | Mode | Count | Score | Error | Units |
---|---|---|---|---|---|---|---|
HashMapBenchmark.insert_to_hash_map | 8 | 8 | avgt | 200 | 99.360 | ± 3.490 | ns/op |
HashMapBenchmark.insert_to_hash_map | 8 | 16 | avgt | 200 | 198.932 | ± 9.344 | ns/op |
HashMapBenchmark.insert_to_hash_map | 8 | 32 | avgt | 200 | 400.597 | ± 15.615 | ns/op |
HashMapBenchmark.insert_to_hash_map | 16 | 8 | avgt | 200 | 93.933 | ± 3.532 | ns/op |
HashMapBenchmark.insert_to_hash_map | 16 | 16 | avgt | 200 | 198.945 | ± 7.892 | ns/op |
HashMapBenchmark.insert_to_hash_map | 16 | 32 | avgt | 200 | 378.605 | ± 15.225 | ns/op |
HashMapBenchmark.insert_to_hash_map | 32 | 8 | avgt | 200 | 94.030 | ± 3.626 | ns/op |
HashMapBenchmark.insert_to_hash_map | 32 | 16 | avgt | 200 | 190.595 | ± 7.072 | ns/op |
HashMapBenchmark.insert_to_hash_map | 32 | 32 | avgt | 200 | 358.341 | ± 15.057 | ns/op |
HashMapBenchmark.insert_to_hash_map | 64 | 8 | avgt | 200 | 92.010 | ± 4.958 | ns/op |
HashMapBenchmark.insert_to_hash_map | 64 | 16 | avgt | 200 | 187.479 | ± 8.061 | ns/op |
HashMapBenchmark.insert_to_hash_map | 64 | 32 | avgt | 200 | 359.244 | ± 13.924 | ns/op |
Although this is not always possible at runtime, ensuring to handle the cases where it is will speed up your Java functions.