
I am using the Python SDK to build a pipeline with Start Jobs as a processor, but I am not able to set the “Identifier” field of the Job ID configuration through the SDK. Also, how do I set the authentication type to username and password using the SDK itself?

Hi @pranay_bhoyar ,

You can find SDK tutorials here https://github.com/streamsets/tutorials/tree/master/sdk-tutorials/sch/tutorial-jobs/start-monitor-a-specific-job

It is not clear to me exactly what you wish to do.

Would you mind explaining in a bit more detail, please?

e.g.

  1. Build a pipeline
  2. Start a job
  3. I am not clear what exactly you meant by identifier

Regards-

Kirti


@pranay_bhoyar 

 

May I know if you are looking for the config details below, to set the user and password for job execution?

Once you manage to connect to SDC, you can retrieve your job and execute it.

from streamsets.sdk import ControlHub
sch = ControlHub(credential_id='your_credential_id', token='your_token_id')
sdc = sch.data_collectors.get(url='http://your_data_collector_hostname:18630')


Job configuration:

pipeline = sdc.pipelines.get(title='Pipeline Name')

start_job_processor = pipeline.configuration['processors']['job name']

start_job_processor['configuration']['jobId']['identifier'] = 'job_id'

 

 



Actually, I am adding “Start Jobs” as a processor with the StreamSets SDK, and I want to set the configuration field named “Identifier” using the Python SDK. How do I set that attribute?


@pranay_bhoyar I understand you are looking for something like the following:

from streamsets.sdk import ControlHub

# SCH_URL, CRED_ID, CRED_TOKEN, ENGINE_ID and ENGINE_TYPE are placeholders for your own values.
ControlHub.VERIFY_SSL_CERTIFICATES = False
sch = ControlHub(server_url=SCH_URL, credential_id=CRED_ID, token=CRED_TOKEN)
pipeline_builder = sch.get_pipeline_builder(engine_id=ENGINE_ID, engine_type=ENGINE_TYPE)

# Each entry identifies one job to start, here by its job ID.
jobs = [
    {
        "jobIdType": "ID",
        "jobId": "334cecc8-9b95-477a-8940-7a4857758068:cd4694f6-2c60-11ec-988d-5b2e605d28aa"
    }
]

dev_raw_data_source = pipeline_builder.add_stage('Dev Raw Data Source')
start_job_processor = pipeline_builder.add_stage('Start Jobs')
start_job_processor.set_attributes(task_name='my_job',
                                   control_hub_url=SCH_URL,
                                   jobs=jobs,
                                   auth_id=CRED_ID,
                                   password=CRED_TOKEN)
trash = pipeline_builder.add_stage('Trash')

# Wire the stages together, then build and publish the pipeline.
dev_raw_data_source >> start_job_processor >> trash
pipeline = pipeline_builder.build('Sanju_StartJob_Test')
sch.publish_pipeline(pipeline, commit_message='Testing start job processor')

Please refer to the approach described at https://github.com/streamsets/tutorials/tree/master/sdk-tutorials/find-methods-fields to figure out the available methods and attributes for a given stage. Hope this helps.
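
In case it helps, here is a rough sketch of the general idea behind finding which attributes a stage exposes, using only Python's built-in `dir()`. It is not taken from the linked tutorial. `FakeStartJobsStage` and `public_attributes` are hypothetical names made up for this illustration, and the class is only a stand-in so the snippet runs without a Control Hub connection. With the real SDK you would pass the object returned by `pipeline_builder.add_stage('Start Jobs')` instead, and the names it reports may differ.

```python
def public_attributes(obj):
    """Return the sorted names of public, non-callable attributes of obj."""
    names = []
    for name in dir(obj):
        if name.startswith('_'):
            continue  # skip private and dunder names
        if callable(getattr(obj, name, None)):
            continue  # skip methods, keep only data attributes
        names.append(name)
    return sorted(names)


class FakeStartJobsStage:
    """Hypothetical stand-in for a stage object; NOT the real SDK class."""

    def __init__(self):
        self.task_name = None
        self.jobs = []
        self.control_hub_url = None

    def set_attributes(self, **kwargs):
        # Mimics the keyword-style setter used earlier in this thread.
        for key, value in kwargs.items():
            setattr(self, key, value)


stage = FakeStartJobsStage()
print(public_attributes(stage))  # ['control_hub_url', 'jobs', 'task_name']
```

Once you know the attribute names, you can pass them as keyword arguments to `set_attributes`, as in the example above.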

