Issue:
A pipeline with a Kafka Producer destination stops processing data with the following exception:
ERROR ProductionPipelineRunner - Pipeline execution failed
com.streamsets.pipeline.api.StageException: KAFKA_50 - Error writing data to the Kafka broker: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
This issue occurs when a producer client sends a request to the wrong broker, i.e. to a follower instead of the partition leader (or to a broker that is no longer even a follower); that broker rejects the send request. This typically happens when partition leadership has changed but the producer still holds outdated cached metadata about which broker is the leader for the partition.
Solution:
This exception causes the pipeline to stop.
The Kafka producer does, however, provide a 'retries' configuration property to cover transient errors such as this one. In Kafka 1.0, the default value of the producer 'retries' property is 0 (see the Kafka 1.0 producer configuration documentation). To have the producer client retry transient errors instead of failing immediately, set this property to a value greater than zero.
To prevent the producer from retrying immediately, you can also set the associated 'retry.backoff.ms' property, which defines the wait time between retry attempts (default: 100 ms).
You can configure both properties in the Kafka Producer destination --> Kafka tab --> Kafka Configuration.
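For example, to retry up to three times with a 500 ms backoff between attempts (illustrative values, not prescriptive), add the following key/value entries under Kafka Configuration:

retries = 3
retry.backoff.ms = 500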
retries - Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries without setting max.in.flight.requests.per.connection to 1 will potentially change the ordering of records because if two batches are sent to a single partition, and the first fails and is retried but the second succeeds, then the records in the second batch may appear first.
retry.backoff.ms - The amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.
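For reference, below is a minimal sketch of the equivalent configuration on a plain Kafka Java producer. The broker addresses, topic name, and property values are illustrative placeholders, not taken from this article; within a StreamSets pipeline you only need the two key/value entries shown above.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class RetryingProducerSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder broker list; replace with your cluster's brokers.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // Retry transient failures such as NotLeaderForPartitionException
        // instead of failing the send immediately (Kafka 1.0 default: 0).
        props.put("retries", "3");
        // Back off between attempts so the producer has time to refresh its
        // metadata and discover the new partition leader (default: 100 ms).
        props.put("retry.backoff.ms", "500");
        // Optional: allow only one in-flight request per connection so that
        // retries cannot reorder records within a partition.
        props.put("max.in.flight.requests.per.connection", "1");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Block until the send completes; transient errors are retried
            // internally before this call would fail.
            producer.send(new ProducerRecord<>("my-topic", "key", "value")).get();
        }
    }
}

Note that setting max.in.flight.requests.per.connection to 1 trades some throughput for strict per-partition ordering; omit it if reordering during retries is acceptable for your use case.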