The StreamSets pipeline fails with java.lang.OutOfMemoryError: Java heap space

  • 11 August 2022
  • 6 replies

The StreamSets pipeline fails on a regular basis with the error below. Please advise if you have run into this issue and how you resolved it. We have increased the heap size a couple of times, but it has not helped.

“ERROR    A JVM error occurred while running the pipeline, java.lang.OutOfMemoryError: Java heap space”

Hi @sujatha mogili 

The error basically means that the process ran out of memory. It could be due to a variety of things; it could simply be that you are trying to use more memory than you have allocated for it.

This is the kind of error that usually requires in-depth troubleshooting, so if you are a customer, the best option would be to reach out to support.

Assuming you are not a StreamSets customer, the following questions might help…

  1. Is it a Data Collector or Transformer pipeline?
  2. Which version are you using?
  3. How much memory do you have allocated?
  4. How many pipelines do you have?
  5. And which are the most common stages that you are using?


Sorry for the late response. Hopefully the answers below help.

  1. Is it a Data Collector or Transformer pipeline? --- Data Collector
  2. Which version are you using? --- StreamSets Data Collector 3.16.1
  3. How much memory do you have allocated? --- 1 GB
  4. How many pipelines do you have? --- 57 pipelines, of which 4 are currently running.
  5. And which are the most common stages that you are using? --- Origin, Processor, Executor, and Destination.

@sujatha mogili,

My first recommendation would be to increase the allocated memory. 1 GB might be too low: some stages are very memory intensive, depending on the data, since the data is basically held in memory.

Apart from that, 3.16 is a pretty old version. I’ve fixed some memory issues myself since then, and I’m pretty sure there are a ton of improvements and features you would benefit from simply by upgrading to a newer version. My recommendation would be to move to the 5.x line if possible.

Thank you Alex. We will look into the recommendations.

We saw a new error today. Any thoughts on this? unsupported operand type(s) for -: 'datetime.datetime' and 'NoneType'

@sujatha mogili,

That seems totally unrelated. Please ask it as a separate question; you might get better help that way.

The error looks like it comes from a missing null check: a value that is None is being subtracted from a datetime. Is that possible?
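
To illustrate what I mean, here is a minimal sketch of that kind of null check. It assumes the subtraction happens in Python script code (for example in a Jython Evaluator stage) and that one of the two timestamps comes from a field that can be empty. The function and variable names are hypothetical, not taken from your pipeline.

```python
from datetime import datetime, timedelta


def elapsed_since(now, previous):
    """Return now - previous, or None when the previous timestamp is missing."""
    if previous is None:
        # Hypothetical handling: the caller decides what a missing value means
        # (skip the record, send it to error, use a default, ...).
        return None
    return now - previous


now = datetime(2022, 8, 11, 12, 0, 0)

# Unguarded subtraction raises the same kind of TypeError as reported.
try:
    now - None
    message = ""
except TypeError as exc:
    message = str(exc)

# Guarded subtraction: a missing value yields None instead of an exception.
guarded = elapsed_since(now, None)
delta = elapsed_since(now, datetime(2022, 8, 11, 11, 0, 0))
```

If the check returns None, the record can be handled explicitly instead of failing the whole batch; which handling is right depends on what the empty value means in your data.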

Hi, I am facing an issue with the StreamSets scheduler: it is not running the pipeline as per the schedule, and the StreamSets UI is also not loading.