There have been cases reported particularly when using JDBC Multi-Table Origin - that SDC will run out of heap space and will lead to OutOfMemoryErrors as well as an eventual heap dump (if SDC is configured with:
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/root/streamsets-datacollector-3.11.0/log/sdc_heapdump_1578400042.hprof
- these are both set by default when SDC starts.)
Looking at the heap dumps, it is determined that MS SQL Server JDBC driver had taken all the memory and has not released it. The buffers look like data records in the origin. This problem occurs because some versions of the MS SQL JDBC Driver will buffer the entire result set in memory. The rationale for this is described in this page:
https://docs.microsoft.com/en-us/sql/connect/jdbc/using-adaptive-buffering?view=sql-server-ver15
In part, the page mentioned above says this:
Normally, when the Microsoft JDBC Driver for SQL Server executes a query, the driver retrieves all of the results from the server into application memory. Although this approach minimizes resource consumption on the SQL Server, it can throw an OutOfMemoryError in the JDBC application for the queries that produce very large results.
To change this behavior and to buffer the result set on the server, fetching records as needed. add the following option to the “Additional JDBC Configuration Properties” fields in the JDBC tab.
selectMethod = cursor
Note:
There is another parameter in the MS SQL Server JDBC driver docs - "responseBuffering". Some additional testing with “responseBuffering = adaptive” instead of using “selectMethod=cursor” to the “Additional JDBC Configuration Properties” does not seem to work. “responseBuffering=adaptive” displays the same memory use profile - and eventually leads to OutOfMemoryErrors.