STARTING to RUNNING state of a pipeline

12 September 2022

Pradeep
StreamSets Employee

When a Data Collector pipeline is in the STARTING state, it validates each stage, creates the required clients, and performs initialization.

Each stage has an init() method that initializes the objects the stage needs to run.
Validation runs static checks. For example, a JDBC origin stage runs its configured query with limited results to validate the query, its schema, the offset column, and so on. Note that validation is specific to each stage.
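
As a rough illustration of the startup sequence described here, the sketch below models a pipeline that calls init() on each stage and only moves to RUNNING when no stage reports a problem. It is not the Data Collector implementation: the Stage class, the start_pipeline function, and the START_ERROR outcome are hypothetical names used only for this example.

```python
from enum import Enum


class State(Enum):
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    START_ERROR = "START_ERROR"  # illustrative failure state for this sketch


class Stage:
    """Hypothetical stage whose init() returns a list of issue strings."""

    def __init__(self, name, issues=None):
        self.name = name
        self._issues = issues or []
        self.initialized = False

    def init(self):
        # A real stage would validate its configuration and create clients here.
        if not self._issues:
            self.initialized = True
        return list(self._issues)


def start_pipeline(stages):
    """Initialize every stage and return the resulting state plus any issues."""
    issues = []
    for stage in stages:
        issues.extend(f"{stage.name}: {msg}" for msg in stage.init())
    # The pipeline leaves STARTING only after all stages have been initialized.
    state = State.START_ERROR if issues else State.RUNNING
    return state, issues
```

For example, start_pipeline([Stage("origin"), Stage("target")]) returns State.RUNNING with an empty issue list, while a stage constructed with issues keeps the pipeline from reaching RUNNING.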

If a pipeline is slow to move from STARTING to RUNNING, validation may be taking some time. This usually happens when the configured query covers a large dataset, and it is typical of JDBC or other relational database stages. If the slowness is unacceptable, consider disabling query validation.
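
To make the idea of validating a query with limited results more concrete, here is a minimal sketch that uses Python's built-in sqlite3 module in place of a JDBC connection. The validate_query function, its skip_validation parameter, and the orders table are assumptions made for this example and do not correspond to actual Data Collector code or property names.

```python
import sqlite3


def validate_query(conn, query, offset_column, skip_validation=False):
    """Return a list of issues found by running the query capped at one row."""
    if skip_validation:
        # Skipping avoids the startup cost, but errors surface only at run time.
        return []
    try:
        # The wrapper limits the rows returned; the database may still do
        # most of the work for an expensive query, which is where slowness
        # during startup can come from.
        cursor = conn.execute(f"SELECT * FROM ({query}) LIMIT 1")
    except sqlite3.Error as exc:
        return [f"query failed: {exc}"]
    columns = [col[0] for col in cursor.description]
    if offset_column not in columns:
        return [f"offset column '{offset_column}' not in {columns}"]
    return []


# Usage with a throwaway in-memory database (hypothetical table and columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
print(validate_query(conn, "SELECT id, total FROM orders", "id"))
print(validate_query(conn, "SELECT total FROM orders", "id"))
```

The trade-off is the one described above: with validation skipped, startup is faster, but a bad query or a missing offset column is only discovered once the pipeline is running.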
