Good afternoon, I'm currently using StreamSets Data Collector 3.14 (the last open source version of SDC). I see that it is possible to connect SDC to a Control Hub, and there are many advantages to doing so. There is also documentation for installing an on-prem Control Hub from an archive: https://docs.streamsets.com/portal/controlhub/latest/onpremhelp/controlhub/UserGuide/Install/InstallingDPM.html#concept_exg_11p_hbb

Use the following command to extract the tarball:

tar xzvf streamsets-dpm-<version>.tar.gz

There was even a tutorial on how to build your own SCH on-prem from a GitHub repo; however, the repo is no longer available: http://github.com/streamsets/domainserver

Would it be possible to get access to the repository or to the archive again? Thank you.
Question: How are the SDC metrics collected and what do they represent?

Answer: The graphs on the SDC Metrics page in StreamSets Data Collector, and on the Metrics tab of the Execution Engine page in Control Hub, are system-level metrics collected from standard Java core libraries. They should roughly correspond with metrics reported by other system-level tools, such as top and uptime at the command line, as well as external monitoring tools that show system-level metrics. Note that the numbers reported by different tools won't match each other exactly, due to differences in reporting intervals and other factors, but there should be a correlation.
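For instance, the load averages printed by top and uptime can be read directly from the OS for a quick sanity check against the graphs; a minimal sketch in Python (Unix only):

```python
# Cross-check SDC's system-level graphs against what the OS itself reports.
# os.getloadavg() returns the same 1/5/15-minute load averages that
# `uptime` and `top` print at the command line.
import os

load1, load5, load15 = os.getloadavg()
print(f"load averages: {load1:.2f} (1m), {load5:.2f} (5m), {load15:.2f} (15m)")
```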
Hi, I have a pipeline with one runtime parameter, for a table name. I created a job on top of this pipeline, so the job also has that runtime parameter, and when creating the job I provided a default value.

I am trying to use the Python SDK to start the job. If I run it like below, the job runs with the default parameter (TEST_TABLE1) defined in the pipeline and job, which is expected:

job = sch.jobs.get(job_name="MY_SDC_JOB")
sch.start_job(job)

However, when I run it like this, intending to override the default runtime parameter, it still uses the default (TEST_TABLE1). How do I make it run with the given parameter (TEST_TABLE2)?

sch.start_job(job, TABLE_NAME='TEST_TABLE2')
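One pattern that is sometimes suggested, sketched below rather than a confirmed fix: set the parameters on the job object and push the update before starting. This assumes the SDK's Job exposes a runtime_parameters dict and that ControlHub has an update_job() method, as in recent releases of the streamsets SDK; the credential values are placeholders.

```python
# A hedged sketch: override the runtime parameter on the job object,
# persist the change, then start the job.
from streamsets.sdk import ControlHub

sch = ControlHub(credential_id='<CRED_ID>', token='<CRED_TOKEN>')  # placeholders
job = sch.jobs.get(job_name='MY_SDC_JOB')
job.runtime_parameters = {'TABLE_NAME': 'TEST_TABLE2'}  # replaces the default
sch.update_job(job)   # push the updated parameter value to Control Hub
sch.start_job(job)    # this run should now use TEST_TABLE2
```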
Hi, I am following the steps to generate an API credential, and I ran the content in the green frame directly; however, it returns the error below.

curl -X GET https://eu01.hub.streamsets.com/security/rest/v1/currentUser -H "Content-Type:application/json" -H "X-Requested-By:curl" -H "X-SS-REST-CALL:true" -H "X-SS-App-Component-Id: $CRED_ID" -H "X-SS-App-Auth-Token: $CRED_TOKEN" -i

HTTP/1.1 403 Forbidden
content-length: 19
content-type: text/plain
content-security-policy: object-src 'none';script-src 'self' https://cdn.cookielaw.org https://privacyportal.onetrust.com https://geolocation.onetrust.com https://app.intercom.io https://widget.intercom.io https://js.intercomcdn.com https://js.userflow.com https://cdn.userflow.com;style-src 'self' https://fonts.googleapis.com h
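For what it's worth, one common cause of a 403 on this call is that $CRED_ID and $CRED_TOKEN were never exported in the shell, so the auth headers go out empty. A minimal sketch of the same request in Python using the requests library, with the credentials set explicitly to rule that out (values are placeholders):

```python
# Replicate the documented curl call with explicit header values.
import requests

headers = {
    "Content-Type": "application/json",
    "X-Requested-By": "curl",
    "X-SS-REST-CALL": "true",
    "X-SS-App-Component-Id": "<CRED_ID>",    # paste the credential ID here
    "X-SS-App-Auth-Token": "<CRED_TOKEN>",   # paste the auth token here
}
resp = requests.get(
    "https://eu01.hub.streamsets.com/security/rest/v1/currentUser",
    headers=headers,
)
print(resp.status_code)
print(resp.text)
```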
I want to create a parameter that I can use across multiple pipelines. For example, I have multiple pipelines that are configured to run on an EMR cluster. Instead of providing the same EMR configuration values for all the pipelines, is there a way for me to create parameters globally that I can refer to in each of the relevant pipelines? Thanks!
Is there a way to create a pipeline that successfully calls StreamSets API endpoints without leveraging the Python SDK for StreamSets? I have tried the REST Service and HTTP Client origins, and even attempted (badly) a Jython Scripting origin, to no avail. I am fairly certain that authentication is the problem, since the API calls work when I run them manually through the SCH RESTful API section, and I receive the following error when attempting to use a pipeline stage:

com.fasterxml.jackson.core.JsonParseException: Unexpected character ('<' (code 60)): expected a valid value (JSON String, Number, Array, Object or token 'null', 'true' or 'false') at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 1, column: 2]

My ask for a UI-only solution is because 1) it would allow less experienced admins to troubleshoot without immediately calling on me if problems arise, and 2) it would allow us to build sequences for business- or management-oriented users to
Organization 5c4e8df2-f1c4-11ee-9efd-39b2ca65500f already has 0 pipelines which is the maximum allowed. Please contact support : PIPELINE_STORE_07
Hi all, is there a StreamSets Postman collection available? If yes, would someone please share the download link? Thank you,
I am trying to build a pipeline by following the steps in the StreamSets docs (https://docs.streamsets.com/portal/platform-controlhub/controlhub/UserGuide/GettingStarted/Try.html#task_q3r_p2x_k4b), but I am getting an issue while writing to the local file system. The issue is in the Directory Template:

Directory Template: /HDFS_output/${YYYY()}-${MM()}-${DD()}-${hh()}-${every(5,mm())}
Error: HADOOPFS_41 - Base directory path could not be created

Can anyone help? Is there any configuration that needs to be in place before running StreamSets?
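One thing worth checking (a guess, not a confirmed diagnosis): HADOOPFS_41 reports that the base directory could not be created, which often comes down to the path not existing and the Data Collector process user not having permission to create it. A minimal pre-check in Python, assuming the base directory is /HDFS_output as in the template above, run as the user that runs Data Collector:

```python
# Verify the Directory Template's base path can be created and written.
import os

base_dir = "/HDFS_output"  # taken from the template in the post
try:
    os.makedirs(base_dir, exist_ok=True)  # raises PermissionError if not allowed
    print("writable:", os.access(base_dir, os.W_OK))
except PermissionError as err:
    print("cannot create base directory:", err)
```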
I was able to configure an EMR cluster from my Transformer pipeline and start the job, but the job does not finish. It fails with the error:

ERROR Client: Application diagnostics message: Shutdown hook called before final status was reported.

Any ideas? Thanks!

Regards,
Srinivasa Nanduru