Notes about Garbage Collection

  • 24 August 2021

GC is a complicated subject, at least for the people who implement gc algorithms.  :)  But from the outside, observing gc activity, there are a couple of things we can use to frame the discussion. When you set -Xmx and -Xms you're "giving" this memory to the JVM (and the gc algorithms) to use as necessary.  How that memory is managed is supposed to be opaque to us as developers or JVM users.  Still, having some visibility into what's going on inside the JVM can help us debug systems and build more efficient ones through better memory utilization.

Probably the most important fact is that gc algorithms do not usually act proactively. They are mostly reactive: they operate asynchronously from the application and respond to triggers that differ from one algorithm to another.
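To make the -Xms / -Xmx point concrete, here is a minimal sketch that reads the heap figures the JVM exposes through the standard `java.lang.Runtime` API. The class and method names are made up for illustration; only the `Runtime` calls are real.

```java
public class HeapSettings {
    /** Upper bound the JVM will try to use for the heap; roughly tracks -Xmx. */
    public static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently reserved by the JVM; starts near -Xms and may grow toward -Xmx. */
    public static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Part of the committed heap that is not currently occupied by objects. */
    public static long freeHeapBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        System.out.println("max (-Xmx):  " + maxHeapBytes() / mb + " MB");
        System.out.println("committed:   " + committedHeapBytes() / mb + " MB");
        System.out.println("free:        " + freeHeapBytes() / mb + " MB");
    }
}
```

Running it with different settings, for example `java -Xms256m -Xmx1g HeapSettings`, should show the max and committed figures move accordingly, although the exact numbers depend on the JVM and the collector in use.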

 
Memory in the heap can be categorized into three states.  The first is used memory: these objects are in use and have a path of references back to a gc root. The second is free memory, which is available when we need to create a new object.  The last category is garbage: objects that can no longer be reached from any gc root, so the application code cannot reach or use them.  They are waiting to be collected and added back to free memory.
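The difference between used memory and garbage comes down to reachability from a gc root. The sketch below is a toy model of that idea, not how any real collector is implemented: the object graph, the ids and the class name are all hypothetical. It simply walks references from the roots and treats everything it never visits as garbage.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReachabilitySketch {
    /**
     * Returns the ids of all objects reachable from the given roots by following references.
     * Any allocated object that is not in this set counts as garbage in this toy model.
     */
    public static Set<String> reachable(Map<String, List<String>> references, List<String> roots) {
        Set<String> seen = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String id = pending.pop();
            if (seen.add(id)) {
                // First visit: queue everything this object points to.
                pending.addAll(references.getOrDefault(id, List.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical graph: A -> B -> C hangs off a root; D and E only reference each other.
        Map<String, List<String>> refs = Map.of(
                "A", List.of("B"),
                "B", List.of("C"),
                "D", List.of("E"),
                "E", List.of("D"));
        Set<String> used = reachable(refs, List.of("A"));
        Set<String> garbage = new HashSet<>(refs.keySet());
        garbage.add("C");
        garbage.removeAll(used);
        System.out.println("used:    " + used);
        System.out.println("garbage: " + garbage);
    }
}
```

Note that D and E still reference each other, yet both are garbage because no path leads to them from a root.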


The most important condition influencing gc activity is low free heap space.  When used memory exceeds a certain threshold, gc kicks in.  When a minor gc cycle can't reclaim enough memory by taking garbage objects and adding them back to free memory, a full gc kicks in.  A full gc cycle usually has at least one stop-the-world phase. When heap space is very low and no objects (or very few) can be freed, you'll see many repeated full gc cycles, and the application takes a massive performance hit from all the time spent in stop-the-world phases.  Often this results in OutOfMemoryErrors in the sdc.log.  Eventually, when the gc algorithm determines that no new object can be created, the JVM generates a heap dump (if configured) and then terminates.  Essentially, the heap dump and JVM crash are side effects of failing to allocate new objects.  This screengrab is a graph of a memory leak causing full garbage collections and eventually a heap dump and JVM termination.

Screen_Shot_2020-05-01_at_9.41.38_AM.png 
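If you don't have a monitoring graph like this one, the JVM's own management beans give a rough view of the same activity. The following is a minimal sketch using the standard `java.lang.management` API; the class name and helper methods are illustrative only. A collection count that climbs quickly while the application makes little progress is the pattern described above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcActivity {
    /** Total collections across all collectors; beans reporting -1 (undefined) are skipped. */
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds across all collectors. */
    public static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Bean names depend on the gc algorithm the JVM was started with.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

The heap dump mentioned above is normally enabled with the HotSpot flags `-XX:+HeapDumpOnOutOfMemoryError` and `-XX:HeapDumpPath=<directory>`; where those flags are set for your installation is deployment-specific.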

There are two other conditions that can influence gc activity.  

The first is the rate of change - essentially the application's demand for new objects.  As demand for new objects increases, free memory decreases.  Objects that have been used and released, but not yet collected, are the actual garbage.  If demand is very high and free memory decreases rapidly, you'll find that gc kicks in repeatedly in full gc mode, in an effort to return the maximum amount of unreferenced but not yet collected memory to the free pool.

The second factor is the "busyness" of the JVM.  There are cases where, when the JVM is not "too busy", the gc algorithm will perform a minor collection.

Since gc algorithms are not usually proactively "cleaning up" but instead clean up at specific intervals or in response to certain events and thresholds, the result is the sawtooth pattern you can see in the screengrab below.  The red triangles represent full gc events.  This is a normal gc profile and is not related to pipeline activity in the JVM.  gc runs asynchronously whenever some condition related to the heap triggers it.

Screen_Shot_2020-05-01_at_9.40.44_AM.png
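The sawtooth can be reproduced on a small scale by sampling used heap while a loop creates short-lived objects. This is only a sketch under simple assumptions: the class name, the 1 MB allocation size and the 5 ms sampling interval are arbitrary choices, and how many "teeth" appear depends entirely on the heap size and gc algorithm.

```java
import java.lang.management.ManagementFactory;

public class SawtoothSampler {
    /** Heap currently occupied by objects, live or not yet collected. */
    public static long usedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /** Counts the "teeth": samples where used heap fell relative to the previous sample. */
    public static int countDrops(long[] samples) {
        int drops = 0;
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] < samples[i - 1]) {
                drops++;
            }
        }
        return drops;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] samples = new long[200];
        for (int i = 0; i < samples.length; i++) {
            byte[] shortLived = new byte[1024 * 1024]; // unreachable after this iteration
            shortLived[0] = 1;
            samples[i] = usedHeapBytes();
            Thread.sleep(5);
        }
        System.out.println("drops in used heap: " + countDrops(samples));
    }
}
```

Each drop corresponds to a collection returning garbage to the free pool; between drops, used heap climbs as the loop keeps allocating.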


There is a lot of additional detail about the JVM and gc on the web.  Some good starting points:
* the e-book from plumbr.io - https://plumbr.io/java-garbage-collection-handbook
* for G1GC specific details, this article is old but has a good overview of G1GC - https://www.dynatrace.com/news/blog/understanding-g1-garbage-collector-java-9/

 

Bob Plotts

April 27, 2021 19:42
