Super Fast Garbage Collectors in Java

Introduction

Garbage collection: There are many technical definitions, but in layman’s terms it’s just an automated way to reclaim unreferenced or unused (garbage) objects from memory, so that the application can use the available memory efficiently.

Urghh!! Yet another technical definition..

From here on let’s save some space and use “GC” as an abbreviation for Garbage Collection.

Evolution of GC

Before Java, in C or C++, we had to allocate and free memory explicitly using malloc()/calloc()/realloc()/free(), new/delete, destructors etc. Since deallocation was managed by hand, any mistake made applications prone to memory leaks. Automatic GC is older than Java (Lisp had it decades earlier), but Java brought it into mainstream application development.

Glossary

Throughput: The share of total run time spent executing application code rather than GC. For example, 99% throughput means 1% of the time goes to GC.

Latency: The amount of time the application is paused (unresponsive) because of GC.

Heap: Region in memory where Java objects are stored.

Young generation: Region in heap where newly created objects are stored.

Old generation (tenured): Region in heap where long lived objects are stored.

Minor GC: Most objects in Java are short-lived, meaning they are created and die young. So most objects in the young generation are reclaimed by a minor GC. It’s called minor because it only collects the young generation, which makes it cheap. It usually still pauses the application, but only briefly.

Major GC: Objects that survive enough minor GCs are promoted from the young generation to the old generation. Once the old generation’s occupancy reaches a threshold, a major GC is triggered. In most collectors this is a stop-the-world (STW) operation: it pauses all application threads and adds to latency. Hence understanding the different types of GC and choosing the correct one for our application is very important.
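
To make the throughput definition concrete, here is a minimal sketch that asks the running JVM how much time it has spent in GC and turns that into a throughput figure. The class name GcThroughput and its helper methods are made up for this illustration. It also assumes the reported collection time is a fair stand-in for pause time, which is only an approximation (concurrent collectors may report time during which the application was still running).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcThroughput {

    // Sums the collection time (ms) reported by every collector in this JVM.
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Throughput = fraction of elapsed time NOT spent in GC (1.0 means no GC time at all).
    public static double throughput(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 1.0;
        }
        return 1.0 - Math.min(1.0, (double) gcMillis / elapsedMillis);
    }

    public static void main(String[] args) {
        // Allocate short-lived arrays so the collector has some work to do.
        long allocated = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] junk = new byte[256];
            allocated += junk.length;
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("allocated=%d bytes, uptime=%d ms, gc=%d ms, throughput=%.2f%%%n",
                allocated, uptime, gc, 100.0 * throughput(gc, uptime));
    }
}
```

The numbers will differ from run to run and from collector to collector, which is exactly why they are worth comparing.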

So, enough of these basic terms and definitions. We are here to understand super fast garbage collectors, aren’t we ?? But wait !!! How do we quantify super fast if we don’t know how fast the existing GC mechanisms are? Let’s spend a couple more minutes discussing the existing GC mechanisms.

Existing GC

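At its simplest, GC works in 2 steps, Mark and Sweep. Mark walks the object graph from the roots (thread stacks, static fields and so on) and marks every object it can reach as live. Sweep reclaims the memory of everything that was left unmarked by the “Mark” step.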
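
The two steps can be sketched on a toy object graph. This is only an illustration of the idea, not how the JVM implements it: the ToyMarkSweep class, its Node type and the list standing in for the heap are all invented for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ToyMarkSweep {

    // A toy heap object: a name plus its outgoing references.
    public static final class Node {
        public final String name;
        public final List<Node> refs = new ArrayList<>();
        public Node(String name) { this.name = name; }
    }

    // Mark: everything reachable from the roots is live.
    public static Set<Node> mark(List<Node> roots) {
        Set<Node> live = new HashSet<>();
        Deque<Node> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (live.add(n)) {          // true only on the first visit
                work.addAll(n.refs);
            }
        }
        return live;
    }

    // Sweep: drop every heap object that was not marked; returns how many were reclaimed.
    public static int sweep(List<Node> heap, Set<Node> live) {
        int before = heap.size();
        heap.removeIf(n -> !live.contains(n));
        return before - heap.size();
    }

    // Builds a -> b -> c (reachable from root a) plus an unreachable cycle d <-> e.
    public static int demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        Node d = new Node("d"), e = new Node("e");
        a.refs.add(b);
        b.refs.add(c);
        d.refs.add(e);
        e.refs.add(d);
        List<Node> heap = new ArrayList<>(List.of(a, b, c, d, e));
        return sweep(heap, mark(List.of(a)));
    }

    public static void main(String[] args) {
        System.out.println("reclaimed objects: " + demo());
    }
}
```

Note that d and e reference each other and are still reclaimed: what matters is reachability from the roots, not whether something points at the object.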

Serial GC

Uses a single thread to perform GC and freezes all application threads while it runs (STW), which makes it a poor fit for low latency applications and multi-threaded applications.

java -XX:+UseSerialGC -jar Application.jar

Applications which can afford some pause time can enable this GC using the argument above. It is mostly suited to single threaded applications.
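
When trying out the switches in this post, it helps to confirm which collector the JVM actually picked. A minimal sketch, assuming a HotSpot-style JVM that exposes its collectors and memory pools through the standard java.lang.management beans (the WhichGc class name is made up for this example):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

public class WhichGc {

    // Names of the collectors registered in the running JVM.
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + collectorNames());
        // Memory pools show how the heap is split up (young/old spaces, regions, ...).
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println("Pool: " + pool.getName() + " (" + pool.getType() + ")");
        }
    }
}
```

On JDK 11 or later the file can be run directly with a GC switch, for example:

java -XX:+UseSerialGC WhichGc.java

The exact names printed depend on the JVM and version, so treat them as a hint rather than a stable identifier.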

Parallel GC

Well, this was the default GC of the JVM up to JDK 8 (G1 has been the default since JDK 9). Unlike Serial GC, it uses multiple threads to perform GC, hence it’s faster than Serial GC, but again it freezes the application threads while running (STW). This can be enabled as:

java -XX:+UseParallelGC -jar Application.jar


Suitable for Multi threaded applications.

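However, if we want to use this GC, we can use the following options to limit the impact of STW.

-XX:ParallelGCThreads: Number of threads to be used for parallel GC.

-XX:MaxGCPauseMillis: Target maximum pause time during GC. This is a goal rather than a guarantee. In order to meet it, the GC can make adjustments to other parameters such as heap and generation sizes.

-XX:GCTimeRatio: Throughput goal, expressed as the ratio of GC time to application time. With a value of N, the GC aims to spend at most 1/(1+N) of the total time collecting.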
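
The arithmetic behind -XX:GCTimeRatio is easy to get backwards, so here is a tiny sketch of the 1/(1+N) rule. The GcTimeRatioGoal class is just a calculator written for this post, not something the JVM provides.

```java
public class GcTimeRatioGoal {

    // For -XX:GCTimeRatio=n the goal is to spend at most 1 / (1 + n) of total time in GC.
    public static double gcFraction(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("ratio must not be negative: " + n);
        }
        return 1.0 / (1.0 + n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {9, 19, 99}) {
            System.out.printf("-XX:GCTimeRatio=%d -> GC time goal %.1f%%, throughput goal %.1f%%%n",
                    n, 100.0 * gcFraction(n), 100.0 * (1.0 - gcFraction(n)));
        }
    }
}
```

So a larger N asks for less time in GC, i.e. higher throughput.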

CMS (Concurrent Mark & Sweep)

This uses multiple garbage collector threads and does most of its marking and sweeping concurrently with the application. It’s designed for applications that prefer shorter garbage collection pauses. If more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, then an OutOfMemoryError is thrown. If necessary, this feature can be disabled by adding the option -XX:-UseGCOverheadLimit to the command line. CMS can be enabled using

java -XX:+UseConcMarkSweepGC -jar Application.jar


Shorter pause times compared to Serial and Parallel GC

CMS has been deprecated since Java 9 and prints a warning if we use it. In Java 14, it was removed completely.

G1 GC

G1 stands for Garbage First. Available since JDK 7, the G1 collector, unlike the collectors above, partitions the heap into a set of equal-sized heap regions, each a contiguous range of virtual memory. G1 performs a concurrent global marking phase to determine the liveness of objects throughout the heap. Once marking completes, G1 knows which regions are mostly empty. It collects these regions first, which usually yields a significant amount of free space. This is why this method of garbage collection is called Garbage-First.

G1 is designed for applications running on multi-processor machines with large memory space. It tries to meet a pause time goal while still keeping throughput high. It can be enabled as :
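
The “collect the emptiest regions first” idea can be sketched as a simple selection problem. This is a toy model, not G1’s real policy: the GarbageFirstSketch class, the fixed 1024 KB region size and the budget expressed as “KB of live data we can afford to copy” are all assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class GarbageFirstSketch {

    // Every region has the same size in this toy model.
    public static final long REGION_KB = 1024;

    public static final class Region {
        public final int id;
        public final long liveKb;   // data that must be copied out if the region is collected
        public Region(int id, long liveKb) { this.id = id; this.liveKb = liveKb; }
        public long garbageKb() { return REGION_KB - liveKb; }
    }

    // Picks regions with the most garbage first, as long as their live data fits the copy budget.
    public static List<Integer> selectCollectionSet(List<Region> regions, long copyBudgetKb) {
        List<Region> byGarbage = new ArrayList<>(regions);
        byGarbage.sort(Comparator.comparingLong((Region r) -> r.garbageKb()).reversed());
        List<Integer> chosen = new ArrayList<>();
        long remaining = copyBudgetKb;
        for (Region r : byGarbage) {
            if (r.liveKb <= remaining) {
                chosen.add(r.id);
                remaining -= r.liveKb;
            }
        }
        return chosen;
    }

    // Five regions with 900, 50, 0, 500 and 100 KB of live data, and a 200 KB copy budget.
    public static List<Integer> demo() {
        long[] live = {900, 50, 0, 500, 100};
        List<Region> heap = new ArrayList<>();
        for (int i = 0; i < live.length; i++) {
            heap.add(new Region(i, live[i]));
        }
        return selectCollectionSet(heap, 200);
    }

    public static void main(String[] args) {
        System.out.println("regions to collect, in order: " + demo());
    }
}
```

Regions 2, 1 and 4 are chosen: they are almost entirely garbage, so collecting them frees the most space for the least copying work, while the mostly live regions 3 and 0 are left alone.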

java -XX:+UseG1GC -jar Application.jar


Low GC pause times for applications with large heaps

Phew !! Finally, it’s time to read about Super fast garbage collectors !!!

Super fast GC

Until now, all the GCs we have seen stop the world for at least part of their work (CMS and G1 do some of it concurrently, but still pause the application for the rest), which causes issues with application latencies. With the latest versions of Java (JDK 11 and later), two new GCs have been introduced, Shenandoah and ZGC. These GCs allow Java applications to keep running while garbage is collected concurrently. Let’s have a look.

Shenandoah

Introduced in JDK 12, Shenandoah’s key advance over G1 is to do more of its garbage collection cycle work concurrently with the application threads. G1 can evacuate its heap regions, that is, move objects, only when the application is paused, while Shenandoah can relocate objects concurrently with the application. To achieve the concurrent relocation, it uses what’s known as a Brooks pointer (a forwarding pointer). In Shenandoah’s original design, this pointer is an additional field that each object in the Shenandoah heap has, and it initially points back to the object itself.

Shenandoah does this because when it moves an object, it also needs to fix up all the objects in the heap that have references to that object. When Shenandoah moves an object to a new location, it leaves the old copy in place and updates its Brooks pointer to point to the new location of the object. When the object is accessed through an old reference, the application follows the forwarding pointer to the new location. Eventually the old object with the forwarding pointer needs to be cleaned up, but by decoupling the cleanup operation from the step of moving the object itself, Shenandoah can more easily accomplish the concurrent relocation of objects.
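
The forwarding pointer idea can be modelled in a few lines. This is a single-threaded toy, not Shenandoah’s implementation: the BrooksPointerSketch class, the Obj type and the resolve/relocate helpers are invented here, and a real collector installs the forwarding pointer atomically and lets JIT-generated barriers do the resolving.

```java
public class BrooksPointerSketch {

    // A toy object with one payload field and a forwarding (Brooks) pointer.
    public static final class Obj {
        public int value;
        public Obj forwardee = this;   // initially points back to the object itself
        public Obj(int value) { this.value = value; }
    }

    // Barrier: every access goes through the forwarding pointer.
    public static Obj resolve(Obj ref) {
        return ref.forwardee;
    }

    // Relocation: copy the object, then make the old copy forward to the new one.
    public static Obj relocate(Obj old) {
        Obj copy = new Obj(old.value);
        old.forwardee = copy;          // a real collector would do this with an atomic CAS
        return copy;
    }

    // A write through a stale reference still lands in the new copy.
    public static int demo() {
        Obj original = new Obj(42);
        Obj staleRef = original;       // some other object still holds the old address
        Obj copy = relocate(original);
        resolve(staleRef).value = 43;  // follows the forwarding pointer to the copy
        return copy.value;
    }

    public static void main(String[] args) {
        System.out.println("value seen in the relocated copy: " + demo());
    }
}
```

Because stale references keep working through the forwarding pointer, the collector can fix them up later instead of pausing the application to update them all at once.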

The GC pause times stay consistent across heap sizes, i.e. the pause time with a 200 MB heap would be roughly the same as with a 200 GB heap. Yes, you read that right !!

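To use Shenandoah in your application from Java 12 onwards, enable it with the following options (from JDK 15, where Shenandoah became a production feature, -XX:+UnlockExperimentalVMOptions is no longer needed):

java -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -jar Application.jar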

If you haven’t yet migrated to JDK 12, Shenandoah has also been backported to JDK 8 and JDK 11. Also, Shenandoah isn’t included in the JDK builds that Oracle ships, but several other OpenJDK distributors include it in their builds.

Use Shenandoah when low latencies are desired and the application runs on an OpenJDK build that includes it

ZGC

Introduced as an experimental feature in Java 11, ZGC is a low latency and scalable garbage collector. Its design goal is for GC pauses never to exceed 10 ms, with typical pauses of around 2 ms, which makes it super fast, and it works with heaps that are TBs in size. Yes, in TBs !!

ZGC pause time does not increase with an increase in heap size. ZGC allows a Java application to continue running while it performs all garbage collection operations except thread stack scanning (which also became concurrent in JDK 16). Since JDK 12, it also supports concurrent class unloading. Performing concurrent class unloading is complicated and, therefore, class unloading has traditionally been done in a stop-the-world pause. Determining the set of classes that are no longer used requires performing reference processing first. Reference processing is expensive and in the worst case would require the whole heap to be scanned. Well, ZGC does all this concurrently, hence there is no penalty on latencies during class unloading. ZGC can be enabled in the below manner (the -XX:+UnlockExperimentalVMOptions flag is only needed before JDK 15, where ZGC became a production feature):

java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -jar Application.jar

What is the “Z” in ZGC ? Nothing, it’s just a “Z” !!

Below are some comparative results measured with the SPECjbb benchmark

ZGC is a good fit for applications that require large amounts of memory, such as with big data. However, ZGC is also a good candidate for smaller heaps that require predictable and extremely low pause times.

Summary

That’s all about the new super fast garbage collectors. They are super fast because they do almost all of their work, including marking and moving objects, concurrently, without stalling the application threads for long. Choose the garbage collector depending on your application needs. Don’t forget to benchmark your application’s performance with different GC mechanisms; you will be blown away to see the impact one JVM switch can have.
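
As a starting point for that kind of comparison, here is a rough sketch of a pause meter: it allocates in a tight loop and records the largest gap between two consecutive iterations. The PauseMeter class is invented for this post, and the gap it reports is only a crude proxy, since OS scheduling and JIT compilation show up in it alongside GC pauses. A real benchmark should use your own workload.

```java
public class PauseMeter {

    // Allocates for roughly durationMillis and returns the largest gap (ns)
    // observed between two consecutive loop iterations.
    public static long maxGapNanos(long durationMillis) {
        byte[][] ring = new byte[1024][];   // keeps recent allocations reachable for a while
        long start = System.nanoTime();
        long deadline = start + durationMillis * 1_000_000L;
        long last = start;
        long maxGap = 0;
        int i = 0;
        while (last - deadline < 0) {
            ring[i++ & 1023] = new byte[1024];
            long now = System.nanoTime();
            maxGap = Math.max(maxGap, now - last);
            last = now;
        }
        return maxGap;
    }

    public static void main(String[] args) {
        long durationMillis = args.length > 0 ? Long.parseLong(args[0]) : 2000;
        long gap = maxGapNanos(durationMillis);
        System.out.printf("largest observed stall: %.3f ms%n", gap / 1_000_000.0);
    }
}
```

Run it a few times with each GC switch from this post and compare the reported stalls. Treat the results as indicative only.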

Hope these 2 cents on GC were helpful !!

Bhavik Banchpalliwar
In Main thread, Software Engineer @Amazon and in parallel thread, pursuing MTech from BITS Pilani