Team, our DataOps deployment runs on AWS EC2, and we have one Data Collector engine running on that deployment. We are seeing this message frequently, i.e. we are losing the connection to our Data Collector engine quite often. How do I check whether my Data Collector is running fine on the EC2 machine (Linux), to confirm that this is a connection issue (Control Hub talking to the Data Collector engine)? Any idea why we lose engine connectivity every few minutes or every hour? Any help would be appreciated. Cheers, Srini
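One basic check, as a sketch: from the EC2 machine itself you can verify that the Data Collector process is up and that its web service answers locally, which helps separate an engine crash from a Control Hub connectivity problem. The snippet below assumes the default engine URL http://localhost:18630; adjust it to whatever port your engine is configured to listen on.

```python
import requests

# Hypothetical local engine URL; replace with the port your Data Collector
# is actually configured to listen on.
SDC_URL = "http://localhost:18630"

try:
    # Any HTTP response (even a 401 asking for credentials) means the engine
    # process is up and its web server is reachable locally.
    resp = requests.get(SDC_URL, timeout=5)
    print("Data Collector answered locally with HTTP", resp.status_code)
except requests.exceptions.ConnectionError:
    print("No response on", SDC_URL, "- the engine process may be down.")
except requests.exceptions.Timeout:
    print("Engine did not respond within 5 seconds - possible resource pressure.")
```

If the engine answers locally but Control Hub still reports lost connectivity, the problem is more likely networking, a proxy, or resource pressure on the EC2 instance than the engine process itself.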
We are using multithreading to ingest bulk data from the source using a Databricks notebook, and we run the notebook as a job. We have a requirement to pause the job and resume it from where it left off (since we can only ingest at particular times). I want to know how to run the Databricks notebook job so that it pauses at a certain time and later resumes from where it left off, instead of starting from scratch again.
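One common pattern, sketched below under assumptions: instead of literally pausing the job, persist the last processed offset (for example the max ingested key or batch number) to durable storage after each unit of work, stop cleanly when the ingestion window closes, and have the next scheduled run read that offset and continue. The checkpoint path, window cutoff, and ingest_batch helper are hypothetical placeholders.

```python
from datetime import datetime, time

# Hypothetical checkpoint location and ingestion-window cutoff.
CHECKPOINT_PATH = "dbfs:/checkpoints/bulk_ingest_offset"
WINDOW_END = time(6, 0)

def read_offset():
    """Return the last committed offset, or 0 on the very first run."""
    try:
        return int(dbutils.fs.head(CHECKPOINT_PATH))  # dbutils is available inside Databricks notebooks
    except Exception:
        return 0

def write_offset(offset):
    """Persist progress so the next scheduled run resumes from here."""
    dbutils.fs.put(CHECKPOINT_PATH, str(offset), overwrite=True)

def ingest_batch(start_offset, batch_size=10000):
    """Hypothetical placeholder: ingest one slice of the source starting at
    start_offset and return the new offset (e.g. the max source key processed)."""
    # ... real multithreaded ingestion logic goes here ...
    return start_offset + batch_size

offset = read_offset()
while datetime.now().time() < WINDOW_END:   # stop cleanly when the window closes
    offset = ingest_batch(offset)
    write_offset(offset)                    # commit progress after every batch
```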
Hi all, I am reading a CSV file created by an instrument. The file contains header information, then a collection of fields with values for each element tested. I want to write each element value set to JDBC as a single row. Example record: machine_id, date, time, run_number, element, element_value, element_error, element, element_value, element_error (50 of these sets). For each set of element, value, and error I need to write machine_id, date, time, run_number, element, element_value, element_error to the database. I can't figure out how to loop through the record, so from the one CSV record I need to write 50 JDBC records. Is this possible in StreamSets? Thanks
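For what it's worth, a scripting processor can fan one record out into many before the JDBC destination. A minimal Jython Evaluator sketch is below; the repeated-column naming convention (element_1, element_value_1, element_error_1, ...) is an assumption, so adjust it to how the delimited parser actually names your fields, and note that on some Data Collector versions the script bindings live under an sdc object instead.

```python
# Jython Evaluator sketch: emit one output record per element/value/error set.
COMMON = ['machine_id', 'date', 'time', 'run_number']
SETS = 50

for record in records:
    try:
        for i in range(1, SETS + 1):
            # Create a new record per element set, carrying the common columns.
            new_record = sdcFunctions.createRecord(record.sourceId + '::element-' + str(i))
            new_record.value = {}
            for f in COMMON:
                new_record.value[f] = record.value[f]
            # Assumed repeated-column naming: element_1, element_value_1, element_error_1, ...
            new_record.value['element'] = record.value['element_' + str(i)]
            new_record.value['element_value'] = record.value['element_value_' + str(i)]
            new_record.value['element_error'] = record.value['element_error_' + str(i)]
            output.write(new_record)
    except Exception as e:
        error.write(record, str(e))
```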
There is a function, str:splitKV, which creates a map from a string, provided the key/value pairs are encoded in the string. But how does one create a map from a group of individual fields? I suppose one could concatenate the values into a single string with the necessary key/value pairings and then use str:splitKV, but such a conversion seems excessive and might require extra type conversions. Is there a better way?
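One alternative sketch: a scripting evaluator can assemble the map field directly, with no string round trip. Minimal Jython Evaluator example, assuming the source fields are /key1, /key2, /key3 and the target map field is /attributes (all hypothetical names):

```python
for record in records:
    try:
        # Build a map field directly from existing scalar fields.
        record.value['attributes'] = {
            'key1': record.value['key1'],
            'key2': record.value['key2'],
            'key3': record.value['key3'],
        }
        output.write(record)
    except Exception as e:
        error.write(record, str(e))
```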
We are ingesting data from Oracle to Databricks. While ingesting, I can see that some of the staging files (CSV) in the S3 bucket are failing to insert into Databricks; they show up as stage errors. Is there a way to move these staging files to a different bucket and retry them?
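If you end up moving the failed staging files yourself, a small boto3 sketch like the one below can copy them to a quarantine bucket for reprocessing. The bucket names and prefix are hypothetical, and it assumes AWS credentials are already configured in the environment.

```python
import boto3

s3 = boto3.client("s3")

SOURCE_BUCKET = "my-staging-bucket"        # hypothetical
RETRY_BUCKET = "my-staging-retry-bucket"   # hypothetical
PREFIX = "stage-errors/"                   # hypothetical prefix of the failed files

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=SOURCE_BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        # Copy the staging file to the retry bucket, then remove the original.
        s3.copy_object(Bucket=RETRY_BUCKET, Key=key,
                       CopySource={"Bucket": SOURCE_BUCKET, "Key": key})
        s3.delete_object(Bucket=SOURCE_BUCKET, Key=key)
        print("moved", key)
```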
While ingesting data from Oracle to Databricks, I can see that a few staging files (in the S3 bucket) are going to stage errors. Error: 'DELTA_LAKE_32 - Could not copy stage file <filename>: Error running query, at least one column must be specified for the table'. When we inserted the same record manually in Databricks, the record was inserted successfully, and no issues were found in the data itself. Can you please give suggestions on the error we are receiving?
After launching a pipeline using the SDK, if I need to make any changes to the pipeline, I want to make them from the SDK instead of the UI, and those changes have to be reflected in the UI. How can I achieve this without launching the pipeline again? Some more queries: 1. How do I preview a pipeline using the SDK? 2. How do I know a stage has no errors in the SDK?
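A rough sketch with the StreamSets SDK for Python, assuming the pipeline lives in Control Hub (the Platform SDK; older SDK versions authenticate differently). The pipeline name, stage label, and property name are placeholders, so check them against the SDK reference for your version. Publishing a new commit from the SDK is what makes the change visible in the UI; the running job then has to be restarted or upgraded to pick up the new version.

```python
from streamsets.sdk import ControlHub

# Hypothetical credentials and names - replace with your own.
sch = ControlHub(credential_id='MY_CRED_ID', token='MY_TOKEN')

pipeline = sch.pipelines.get(name='my_pipeline')   # fetch the existing pipeline
stage = pipeline.stages.get(label='Trash 1')       # hypothetical stage label

# Stage configuration properties are exposed as attributes on the stage object;
# the exact attribute name depends on the stage type, e.g.:
# stage.some_property = 'new value'

sch.publish_pipeline(pipeline, commit_message='Updated from SDK')  # new commit, visible in the UI
```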
Hi, we know an SDC job can be invoked through an HTTP endpoint, but can a set of parameters be passed as input? Effectively we want the following: originating system X invokes an SDC job with a set of parameters > the SDC job looks up a database table with the passed parameters > it returns the results to a different external system Y [note: not the originating system X] > it sends a response back to the originating system with a pass/fail status. Any demo pipeline would help. Are we advised to use the REST Service origin, per https://docs.streamsets.com/portal/controlhub/latest/help/datacollector/UserGuide/Microservice/Microservice_Title.html? Regards, Anirban
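As a rough illustration of the microservice approach: if the pipeline starts with a REST Service origin, the originating system simply POSTs its parameters as the request payload, the pipeline can use them (for example in a JDBC Lookup), and a Send Response to Origin destination returns the pass/fail status. The host, port, path, application ID, and field names below are all assumptions.

```python
import requests

# Hypothetical endpoint exposed by the microservice pipeline's REST Service origin.
ENDPOINT = "http://sdc-host:8000/lookup"
HEADERS = {"X-SDC-APPLICATION-ID": "my-app-id"}  # only needed if an application ID is configured

payload = {"customer_id": "12345", "region": "EMEA"}  # parameters for the lookup

resp = requests.post(ENDPOINT, json=payload, headers=HEADERS, timeout=30)
resp.raise_for_status()

# Whatever the pipeline's "Send Response to Origin" stage emits comes back here,
# e.g. a pass/fail status for the originating system.
print(resp.status_code, resp.json())
```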
I am reading a fixed-width file from S3 and trying to parse it with a Jython Evaluator in Data Collector. When I try to run the pipeline, I get the issue below, even though I am reading the file with the UTF-8 option: Script error while processing batch: javax.script.ScriptException: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Cannot create PyString with non-byte value in <script> at line number 60
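For what it's worth, this error usually appears when the script forces a Java string containing non-Latin-1 characters into a Jython byte string (PyString). A hedged sketch of one workaround is to keep the value as unicode while slicing the fixed-width columns; the field name /text and the column offsets below are assumptions.

```python
for record in records:
    try:
        # Keep the raw line as unicode so multi-byte UTF-8 characters are not
        # forced into a byte-oriented PyString.
        line = unicode(record.value['text'])

        # Hypothetical fixed-width layout - adjust the offsets to your file.
        record.value['machine_id'] = line[0:10].strip()
        record.value['reading'] = line[10:25].strip()

        output.write(record)
    except Exception as e:
        error.write(record, str(e))
```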
Hi, I'm trying to fetch data from Shopify using the HTTP Client origin with pagination set to "Link in HTTP Header". Per Shopify (see https://shopify.dev/api/usage/pagination-rest#link-headers), the "Link" header that comes back contains the page_info parameter that we should use when querying the next page. With the StreamSets HTTP Client origin, the "Link in HTTP Header" pagination option is not working; I think this is because SDC expects the parameter name to be "page". Has anybody faced this before, and if so, how did you make progress? Can anyone from StreamSets comment on this?
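For comparison, this is the pagination behaviour Shopify expects: the next request simply reuses the full URL (including page_info) returned in the Link header. A small requests sketch, with a hypothetical shop, API version, and access token:

```python
import requests

# Hypothetical shop and token - replace with your own.
BASE = "https://my-shop.myshopify.com/admin/api/2023-04/products.json?limit=250"
HEADERS = {"X-Shopify-Access-Token": "shpat_xxx"}

url = BASE
while url:
    resp = requests.get(url, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    for product in resp.json().get("products", []):
        print(product["id"])
    # Shopify returns the full next-page URL (with page_info) in the Link header;
    # requests exposes the parsed header via resp.links.
    url = resp.links.get("next", {}).get("url")
```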
How do I fix an error saying my syntax does not correspond with my MariaDB/MySQL version? First error: SQLState: 42000, Error Code: 1064.
My StreamSets load gets this error, but I don't know how to fix it. In detail, I have pasted the error and the pipeline image here. Many thanks for helping. com.streamsets.pipeline.api.StageException: JDBC_77 - SQLSyntaxErrorException attempting to execute query 'SELECT ID AS product_id, AMOUNT AS amount, MODIFIED_DATE FROM TBL_PRODUCT WHERE MODIFIED_DATE > 2021-12-27 17:31:43 ORDER BY MODIFIED_DATE'. Giving up after 1 errors as per stage configuration. First error: SQLState: 42000 Error Code: 1064 Message: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '17:31:43 ORDER BY MODIFIED_DATE' at line 6. This error appeared after I validated the pipeline and it had run; the MariaDB database received the full data successfully, but then this error came up.
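For context, MariaDB rejects this query because the datetime offset is interpolated into the WHERE clause without quotes, so 2021-12-27 17:31:43 is parsed as a number followed by stray tokens. A sketch of the usual fix is to quote the offset expression in the origin's query; the ${OFFSET} placeholder below follows the JDBC Query Consumer convention, so adjust it to however your origin injects the offset:

```sql
SELECT ID AS product_id, AMOUNT AS amount, MODIFIED_DATE
FROM TBL_PRODUCT
WHERE MODIFIED_DATE > '${OFFSET}'   -- quoting makes the datetime a string literal
ORDER BY MODIFIED_DATE
```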
Hi - while processing a CSV file via the Directory origin, I encountered the following error: "SPOOLDIR_01 - Failed to process file '/tmp/out/customer_daily_incremental/CUSTOMER/CUSTOMER_20092021.csv' at position '177394': com.streamsets.pipeline.lib.dirspooler.BadSpoolFileException: java.io.IOException: (line 1138) invalid char between encapsulated token and delimiter". I now know that the record on line 1138 doesn't comply with the expected format (I see there is an additional double quote). But I don't want SDC to completely abort this file. Is there a configuration available where I can opt to log the record with an error message and proceed to the next line in the file?
The data pipeline is JDBC → Hive Metadata → Hadoop FS and Hive Metastore. Data moves from an Oracle database to the Hadoop file system. The schema is getting created, but there is no data in the tables. I tried changing the idle timeout setting to 10 seconds. Any help in this regard will be greatly appreciated.
We are ingesting a table using StreamSets which has a column containing XML data. We added an extra stage to create a hash value for all the records coming through. In our target we can see that all the XML data is getting suffixed with the hash value, in addition to the separate hash column. How do we get only the XML value in the XML column, without the hash value suffixed to it?
How can we check a condition in a Stream Selector for the case where the service gives no response because of some exception at the service end? If there is an exception at the service end, the HTTP Client shows a "No output records produced." message. Based on this, how can we branch further: if a success response comes, we move the record to Trash, and if the "No output records produced." message comes, we insert into our SQL table. How can we identify this situation in a Stream Selector, or is there another way?
I am reading a file from an S3 bucket and doing a lookup against data from a Postgres DB using a JDBC Lookup processor in Data Collector. The source has about 341k records, against 341 records in the Postgres DB. My observations are: 1. It is taking 30 minutes to process 50k records. 2. Some records are going to error even though a matching record is present in the DB. 3. I have tried enabling the local cache.
When Microsoft D365 data is landed in an Azure Data Lake, the CSV files do not contain a header record; the schema is actually stored in a separate JSON file. In SDC, is there a way to apply a schema (mainly, field names) to a pipeline, either manually (i.e. by supplying the field names) or by reading them from a separate file?
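One workaround sketch, in case it helps: a Jython Evaluator can rename the positional fields that the delimited parser produces when there is no header line. The field-name list below is a hypothetical stand-in for the names you would read out of the D365 schema JSON.

```python
# Hypothetical field names - in practice these could be loaded from the
# separate JSON schema file (e.g. in the init script) instead of hard-coded.
FIELD_NAMES = ['account_id', 'account_name', 'modified_on']

for record in records:
    try:
        new_value = {}
        # With no header line, the delimited parser names fields '0', '1', '2', ...
        for i, name in enumerate(FIELD_NAMES):
            new_value[name] = record.value[str(i)]
        record.value = new_value
        output.write(record)
    except Exception as e:
        error.write(record, str(e))
```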