When you try to stop a pipeline, it may become stuck in the STOPPING state. This can happen for several reasons, for example when a stage takes too long to process data or when there is a bug in Data Collector (SDC).
To help us understand why Data Collector got into this state, you should do the following:
TAKE A THREAD DUMP:
PLEASE NOTE: This must be done while the pipeline is still in the STOPPING state. Replace <pid> with the process ID of the Data Collector Java process.
jcmd <pid> Thread.print
If you have an open support ticket about a pipeline stuck in the STOPPING state, please attach the output of the jcmd command to the ticket.
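To make the output easy to attach to a ticket, the jcmd call can be wrapped in a small script that writes the dump to a timestamped file. The following is only a minimal sketch: the function names and the output file name are hypothetical, and it assumes jcmd is on the PATH and that the script runs as the same user as the Data Collector process.

```python
import subprocess
import time
from pathlib import Path


def build_jcmd_command(pid):
    """Return the argument list equivalent to `jcmd <pid> Thread.print`."""
    return ["jcmd", str(pid), "Thread.print"]


def save_thread_dump(pid, out_dir="."):
    """Run jcmd against the given JVM and write the dump to a timestamped file.

    The file name pattern is an illustrative choice, not an SDC convention.
    """
    result = subprocess.run(
        build_jcmd_command(pid),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if jcmd exits with an error
    )
    stamp = time.strftime("%Y%m%d-%H%M%S")
    out_path = Path(out_dir) / f"sdc-thread-dump-{pid}-{stamp}.txt"
    out_path.write_text(result.stdout)
    return out_path
```

For example, save_thread_dump(12345) would write the dump for process 12345 into the current directory and return the path of the file to attach.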
To get the pipeline out of the STOPPING state, follow these steps:
FORCE STOP:
- If a pipeline remains in the STOPPING state, you can force Data Collector to stop the pipeline immediately. From the Home page, click the More icon for the pipeline, and then click Force Stop. From the pipeline canvas, click Force Stop.
- When you force stop a pipeline, the batch being processed may be interrupted abruptly and its offset might not be stored. When the pipeline is restarted, the incomplete batch is processed again from the beginning, which can produce duplicate data.
- If Force Stop does not resolve the issue because Data Collector lacks the resources to stop the pipeline, you need to change the pipeline state manually.
MANUALLY CHANGING THE PIPELINE STATE:
- You can manually stop the pipeline by opening $SDC_DATA/runInfo/<pipeline_name>/0/pipelineState.json, changing the value of status to STOPPED, and then restarting Data Collector:
"status":"STOPPED"
- If you are using Cloudera Manager, the pipelineState.json file is located under /var/lib/sdc/data.
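The manual edit described above can also be scripted. The sketch below is not an official tool: the function name and the .bak backup are hypothetical additions, and it assumes pipelineState.json holds a single JSON object with a top-level "status" key. It keeps a copy of the original file before rewriting it.

```python
import json
import shutil
from pathlib import Path


def mark_pipeline_stopped(state_file):
    """Set "status" to "STOPPED" in a pipelineState.json file.

    Returns the status the file held before the change.
    """
    path = Path(state_file)

    # Keep a copy of the original file next to it (an added safety step,
    # not part of the documented procedure).
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)

    state = json.loads(path.read_text())
    previous = state.get("status")
    state["status"] = "STOPPED"  # all other fields are left untouched
    path.write_text(json.dumps(state))
    return previous
```

Call it with the full path to the pipeline's pipelineState.json file, then restart Data Collector as described above so that it picks up the new state.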
January 23, 2020 14:29