Solved

Switch to another data collector in case of any error

  • 14 December 2022
  • 7 replies
  • 82 views

Hello Team,

Could you please help with a solution that lets a job switch from Data Collector A to Data Collector B and retry there if the job hits an error partway through?


Best answer by Ranjith P 16 December 2022, 12:23




Hi @yogesh0590 
Please go through the document: https://docs.streamsets.com/portal/controlhub/latest/onpremhelp/controlhub/UserGuide/Jobs/PipelineFailover.html?hl=job%2Cfailover

This should help you achieve your use case. Let us know if you have any questions.

@Ranjith P:

Could you please share the configuration for this? We are getting the following error with S3 as the destination:

S3_21 - Unable to write object to Amazon S3, reason : com.amazonaws.SdkClientException: Unable to execute HTTP request: Broken pipe (Write failed)

Our configuration:

Failover Retries per Data Collector: 1

Global Failover Retries: 1

But the job still tries to connect to the same Data Collector.

PS: Our use case is very simple: SFTP to S3.

@yogesh0590 Could you please let us know the value configured for the Retry Attempts property in the pipeline?

@Ranjith P:

Edited:

In the S3 stage (destination), “Retry Count” is set to 3.

For the pipeline, “Retry Attempts” is set to -1.


@yogesh0590 Could you please try setting Retry Attempts = 2 on the pipeline and see whether the job fails over from one SDC to another?
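
One possible reading of why this setting matters, sketched as a toy model: if a pipeline only becomes eligible for failover once its own retries are exhausted, then Retry Attempts = -1 (retry indefinitely) would keep the job on the same Data Collector forever. The function `collector_after_failures` and its parameters are hypothetical, not part of any StreamSets API, and the model deliberately ignores the "Failover Retries per Data Collector" and "Global Failover Retries" settings.

```python
def collector_after_failures(retry_attempts, failures, collectors):
    """Return the collector a job is on after `failures` consecutive errors.

    Toy model (an assumption, not Control Hub code): the pipeline retries on
    the same collector until its retry budget is used up, and only then moves
    to the next collector. retry_attempts == -1 means "retry indefinitely".
    """
    index = 0          # position of the current collector in `collectors`
    retries_used = 0   # retries spent on the current collector
    for _ in range(failures):
        if retry_attempts == -1 or retries_used < retry_attempts:
            retries_used += 1  # retry in place, no failover
        else:
            # Retry budget exhausted: move on, staying on the last collector
            # if there is no further one to try.
            index = min(index + 1, len(collectors) - 1)
            retries_used = 0
    return collectors[index]


collectors = ["A", "B", "C"]
print(collector_after_failures(-1, 10, collectors))  # never leaves the first collector
print(collector_after_failures(2, 3, collectors))    # third error triggers a failover
```

In this model a finite retry count is what allows the job to leave Data Collector A at all, which matches the behaviour reported in this thread.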

@Ranjith P Yes, it is switching to another Data Collector now. But I have the following question.

Suppose that:

On the first error, the job switches from “Data Collector A” to “Data Collector B”.

On the second error, the job switches from “Data Collector B” to “Data Collector C”.

How does the offset work here? Let’s say the offset is “0” at the beginning.

Is the offset saved after the first error, so that the job resumes from there when it switches Data Collectors? Or does the new Data Collector start the job from “0”?

I would appreciate your help with this.


@yogesh0590 Whenever the job fails over from one SDC to another (i.e., A→B or B→C), it always starts from the last saved offset.

You can refer to the documentation for more details:

https://docs.streamsets.com/portal/controlhub/latest/onpremhelp/controlhub/UserGuide/Jobs/Jobs-Managing.html#concept_y2t_ll3_hy
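
To make "starts from the last saved offset" concrete, here is a minimal simulation. It assumes the offset is stored somewhere that outlives any single Data Collector and is only advanced after a batch completes. The function `run_with_failover`, the `fail_at` parameter, and the batch size are all hypothetical names chosen for illustration; none of this is StreamSets code.

```python
def run_with_failover(records, collectors, fail_at, batch_size=2):
    """Process `records` in batches, failing over once at each offset in `fail_at`.

    Returns (log, final_offset), where each log entry is
    (collector, starting_offset, batch). Toy model under stated assumptions.
    """
    offset = 0                     # last saved offset, shared across collectors
    index = 0                      # current collector
    pending_failures = set(fail_at)
    log = []
    while offset < len(records):
        if offset in pending_failures:
            # The batch starting here fails: the offset is NOT advanced,
            # and the job moves to the next collector (or stays on the last).
            pending_failures.discard(offset)
            index = min(index + 1, len(collectors) - 1)
            continue
        batch = records[offset:offset + batch_size]
        log.append((collectors[index], offset, batch))
        offset += len(batch)       # save the offset only after the batch succeeds
    return log, offset


log, final_offset = run_with_failover(list(range(6)), ["A", "B", "C"], fail_at={2, 4})
for collector, start, batch in log:
    print(collector, start, batch)
```

In this sketch Data Collector B picks up at offset 2 and Data Collector C at offset 4; neither restarts from 0, and no completed batch is processed twice.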

 
