Hi, I am creating a new Connection with the SFTP protocol and trying to connect to the SFTP server using a private key. I entered the following while creating the Connection:

- Authentication: Private Key
- Private Key Provider: Plain Text
- Private Key: the text I copied from the .ppk file
- Username: the correct one
- Passphrase: none

When I hit Test Connection, I get the error below:

Stage 'com_streamsets_pipeline_lib_remote_RemoteConnectionVerifier_01' initialization error: java.lang.NullPointerException: Cannot invoke "net.schmizz.sshj.userauth.keyprovider.KeyProvider.getPublic()" because "this.kProv" is null (CONTAINER_0701)

However, if I enter the same values in the Credentials tab of the pipeline and preview, I can successfully connect to the SFTP server and read the file there. The problem only occurs when I create the same configuration as a Connection. Kindly help me resolve this issue.
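One detail worth probing (my assumption from the stack trace, not a confirmed diagnosis): the net.schmizz.sshj KeyProvider named in the error parses OpenSSH/PEM-format keys, while a PuTTY .ppk file is a different format. A quick local check with paramiko can tell you whether the pasted key text is in a format standard SSH libraries will load (the file path here is hypothetical):

import paramiko

# Hypothetical path holding the same key text pasted into the Connection.
try:
    key = paramiko.RSAKey.from_private_key_file("sftp_key.pem")
    print("Loaded as OpenSSH/PEM RSA key:", key.get_name())
except paramiko.SSHException as exc:
    # A .ppk pasted as-is typically fails to parse and would need converting
    # (e.g. with puttygen) to OpenSSH/PEM format first.
    print("Key did not parse as PEM/OpenSSH:", exc)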
I have a stored procedure that works when I call it manually, both within Snowflake and via snowsql. I am trying to automate the call of this procedure via StreamSets, but it continues to fail with this ambiguous error message:

Technical details: An exception has arisen while executing the query 'CALL ETS_METRICS.TABLEAU_METADATA_COLLECTION();': SQL compilation error: Unknown user-defined function ETS_METRICS.TABLEAU_METADATA_COLLECTION

I am aware of this post, and it is not relevant, since it only covers how to do this, not how to troubleshoot the error. That said, I did try making the query a stop event, and the same error is thrown. What boggles my mind the most is that I am specifying the schema even though the defined connection already connects to the correct schema, and the same role that runs all our automation is used in the connection (it also happens to be the owner of the stored procedure). Any ideas you might be able to offer are appreciated. Thank you in advance.
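For comparison, here is a minimal sketch of the same call made outside StreamSets with the Snowflake Python connector, fully qualifying the procedure with the database name as well. All connection parameters and the database name MY_DB are placeholders I made up; Snowflake resolves a procedure by name plus argument signature, so a check like this can help rule out both qualification and signature mismatches:

import snowflake.connector

# Placeholder connection parameters, not values from the post.
conn = snowflake.connector.connect(
    account="my_account",
    user="automation_user",
    password="...",
    role="AUTOMATION_ROLE",
    warehouse="MY_WH",
    database="MY_DB",
    schema="ETS_METRICS",
)
cur = conn.cursor()
try:
    # Fully qualify database.schema.procedure; Snowflake matches on
    # name AND argument types, so an empty arg list must match the definition.
    cur.execute("CALL MY_DB.ETS_METRICS.TABLEAU_METADATA_COLLECTION();")
    print(cur.fetchone())
finally:
    cur.close()
    conn.close()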
Hi All, I have a requirement to run an aggregation query (using match, group, sum, min, max, and count) against a MongoDB Atlas collection in a StreamSets pipeline, probably in a lookup processor, by passing the input fields. Is it possible? Thanks, Mahender
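For reference, this is the shape of aggregation being described, sketched with pymongo outside StreamSets. The connection string, collection, and field names are invented for illustration:

from pymongo import MongoClient

# Hypothetical connection; an Atlas cluster would use a mongodb+srv:// URI.
client = MongoClient("mongodb://localhost:27017")
orders = client["shop"]["orders"]

pipeline = [
    {"$match": {"status": "shipped"}},       # filter (match)
    {"$group": {
        "_id": "$customer_id",               # group key from an input field
        "total": {"$sum": "$amount"},        # sum
        "min_amount": {"$min": "$amount"},   # min
        "max_amount": {"$max": "$amount"},   # max
        "order_count": {"$sum": 1},          # count
    }},
]
for doc in orders.aggregate(pipeline):
    print(doc)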
How can I deduplicate records and keep only the latest one, based on a timestamp field, in StreamSets (IBM)?
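In plain terms, the technique being asked about is "keep the record with the greatest timestamp per key." A minimal sketch of that logic in Python, independent of any particular StreamSets stage (the field names are assumptions):

# Keep only the latest record per key, by timestamp.
records = [
    {"id": 1, "ts": "2024-01-01T10:00:00", "val": "old"},
    {"id": 1, "ts": "2024-01-02T10:00:00", "val": "new"},
    {"id": 2, "ts": "2024-01-01T09:00:00", "val": "only"},
]

latest = {}
for rec in records:
    key = rec["id"]
    # ISO-8601 strings compare correctly as strings; parse to datetime otherwise.
    if key not in latest or rec["ts"] > latest[key]["ts"]:
        latest[key] = rec

print(list(latest.values()))  # id 1 keeps 'new', id 2 keeps 'only'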
Hi, I have an API which takes page_no and page_size as parameters for pagination. The API response is not JSON data and does not contain records in a list/array; it is a binary byte array as a string, because the API returns a zip file. The response header contains total_pages, page_no, and page_size. In StreamSets, pagination by page number requires a result field path to increment to the next page number, which is not possible in my case. Refer to this doc: https://learn.ocp.ai/guides/exports-api Could anyone please suggest a solution for this?
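Outside of the HTTP Client stage, the loop being described would look roughly like this: drive the page number yourself and stop when the total_pages header says you are done. A sketch with requests, where the endpoint URL and auth are placeholders and the parameter/header names come from the post:

import requests

# Hypothetical endpoint; page_no/page_size and the total_pages header
# are the ones described in the post.
BASE_URL = "https://api.example.com/exports"
page_no, page_size = 1, 100

while True:
    resp = requests.get(BASE_URL, params={"page_no": page_no, "page_size": page_size})
    resp.raise_for_status()

    # The body is a zip file (binary), not JSON, so write the raw bytes.
    with open(f"export_page_{page_no}.zip", "wb") as f:
        f.write(resp.content)

    total_pages = int(resp.headers["total_pages"])
    if page_no >= total_pages:
        break
    page_no += 1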
Hi, I'm working with StreamSets and have three pipelines that each consume from a different Kafka topic. Each pipeline writes to its own dedicated table (e.g., table1, table2, table3), but all three also write to a shared table. I'm using update-or-insert logic to avoid duplicates. My concern is about concurrent writes: if all three pipelines try to write to the same row of the shared table at the same time, will StreamSets or the database handle it safely? I'm also wondering how to properly configure retry attempts, error handling, and backpressure in StreamSets to ensure that no data is lost or skipped if database locks or contention occur. What are best practices for configuring the JDBC stage and pipeline error handling for this kind of multi-pipeline write scenario?
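As background to the concurrency question: the usual way the database itself makes concurrent update-or-insert safe is an atomic upsert plus a retry with backoff on transient lock errors. A self-contained sketch of that pattern, using SQLite only so it runs anywhere (the table and columns are invented, and this illustrates the general technique, not StreamSets' JDBC stage internals):

import sqlite3, time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shared (id INTEGER PRIMARY KEY, val TEXT, ts TEXT)")

def upsert_with_retry(rec, attempts=3, backoff=0.5):
    """Atomic upsert; retry with exponential backoff on lock contention."""
    for attempt in range(attempts):
        try:
            conn.execute(
                "INSERT INTO shared (id, val, ts) VALUES (?, ?, ?) "
                "ON CONFLICT(id) DO UPDATE SET val = excluded.val, ts = excluded.ts",
                (rec["id"], rec["val"], rec["ts"]),
            )
            conn.commit()
            return
        except sqlite3.OperationalError:  # e.g. 'database is locked'
            time.sleep(backoff * (2 ** attempt))
    raise RuntimeError("upsert failed after retries; route record to error handling")

upsert_with_retry({"id": 1, "val": "from pipeline A", "ts": "2024-01-01"})
upsert_with_retry({"id": 1, "val": "from pipeline B", "ts": "2024-01-02"})
print(conn.execute("SELECT * FROM shared").fetchall())  # one row, latest values win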
We have an HTTP Client origin that pulls from our ServiceNow site the records that would be returned if we ran a report in ServiceNow directly. At first blush the records pulled appear to all be accounted for, but unfortunately the origin just keeps looping and pulling every record over and over. I reduced the batch size to as low as 100 records without effect (i.e., it still loops endlessly). The URL with parameters is:

https://<intentionally_redacted>.servicenowservices.com/api/now/table/task?sysparm_query=closed_atISNOTEMPTY%5Eclosed_at%3E%3Djavascript%3Ags.dateGenerate('2023-01-01'%2C'00%3A00%3A00')%5Eclosed_by.department.nameSTARTSWITHPBO%20-%20ETS%5Eassignment_group!%3Daeb1d3bc3772310057c29da543990ea2%5Eassignment_group!%3D4660e3fc3772310057c29da543990e0b%5EnumberNOT%20LIKEGAPRV%5EnumberNOT%20LIKERCC%5Esysparm_display_value=true%5Esysparm_limit=100%5Esysparm_offset=0

I have the stage set to pull in Batch mode, and for pagination I have tried all 5 modes, including “None”. Since
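For contrast, here is what a terminating offset walk over that ServiceNow table API looks like when driven by hand: bump sysparm_offset by the page size and stop once a page comes back short or empty. A sketch with requests, where the instance URL, auth, and simplified query are placeholders but sysparm_limit/sysparm_offset are the real parameters from the URL above:

import requests

# Placeholder instance and credentials; sysparm_* names are from the post's URL.
URL = "https://example.servicenowservices.com/api/now/table/task"
auth = ("api_user", "api_password")
limit, offset = 100, 0

while True:
    resp = requests.get(
        URL,
        auth=auth,
        params={"sysparm_limit": limit, "sysparm_offset": offset,
                "sysparm_query": "closed_atISNOTEMPTY"},  # simplified query
    )
    resp.raise_for_status()
    rows = resp.json().get("result", [])
    for row in rows:
        pass  # process the record

    if len(rows) < limit:  # short or empty page => no more data, stop looping
        break
    offset += limit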
Environment: StreamSets Data Collector v4.3; Control Hub v3.51.4 (inactivity period for session termination = 30 min).

Issue: A credentials timeout has been observed on long-running streaming pipelines that use the Start Jobs stage. This can happen when using the User & Password authentication type. The following error, captured from the Control Hub logs, indicates that the token has expired:

START_JOB_04 - Failed to start job template for job ID: 1238759hvbv949:Test, status code '401': {"ISSUES":[{"code":"SSO_01","message":"User not authenticated"}]}

Solution: After an internal investigation, we found that we are encountering COLLECTOR-1125, which has been addressed and resolved in Data Collector v5.1.
Is it possible to query the SCH for the longest running jobs? For example, say you have 100 jobs, and you want to know the 10 longest running jobs out of the total (I’d happily take just reporting on averages if that is all there is). It seems like this would be a useful report or API endpoint to have.
Hello!! Currently, I'm using the SDK to extract metrics from some jobs. However, instead of having only {run_count, input_count, output_count, total_error_count} from sch.job.metrics, I would like to have some other metrics as well, such as a count of records for each stage of my job. Is this possible? How can I achieve it? I'm also looking to get the last received record (for each pipeline) that we can see in the real-time summary. How can I get this metric from Control Hub using the SDK?
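For context, here is the kind of call being described, sketched with the StreamSets Platform SDK for Python. The credential values are placeholders, and the metric attributes printed are the four the question already gets back; whether per-stage record counts are exposed beyond these is exactly what is being asked:

from streamsets.sdk import ControlHub

# Placeholder API credentials (SDK 4.x-style authentication).
sch = ControlHub(credential_id="MY_CRED_ID", token="MY_TOKEN")

job = sch.jobs.get(job_name="My Job")   # look up a job by name
metrics = job.metrics                    # as in the question

# These are the job-level attributes the question mentions; per-stage
# record counts are not among them.
print(metrics.run_count, metrics.input_count,
      metrics.output_count, metrics.total_error_count)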
I am attempting to use Postman to query the runtime history of all jobs so that I can produce a list of the X slowest-running jobs. I have been unable to configure Postman correctly for this GET call, and would like to know from the hivemind what I am doing wrong here. I am trying to use this API endpoint:

https://{Control_Hub_URL}.streamsets.com/jobrunner/rest/v1/job/{jobId}/history

Auth is set to “No Auth” based on another forum post I read, which stated we should include the ID and key for the API credentials in the header, though I have also tried it with API Key auth and providing the credentials there. I have read through the 14 questions and 4 conversations that come up when searching here for “Postman” and tried a few of the proffered solutions, but without success. I have no parameters, and the headers are where I have been trying to define the API credentials. Note these credentials are used by automation in the Control Hub, so they are valid. I think my issue lies in what header names I need to use.
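To make the header question concrete, here is the same GET made with Python's requests library. My understanding (worth verifying against the Control Hub REST API docs) is that API credentials go in the X-SS-App-Component-Id and X-SS-App-Auth-Token headers, alongside X-Requested-By and X-SS-REST-CALL; treat the header names here as an assumption to confirm, and the host and job ID as placeholders:

import requests

# Placeholders for the Control Hub host, job ID, and API credentials.
url = "https://cloud.streamsets.com/jobrunner/rest/v1/job/MY_JOB_ID/history"

headers = {
    "Content-Type": "application/json",
    "X-Requested-By": "postman",             # any non-empty value
    "X-SS-REST-CALL": "true",
    "X-SS-App-Component-Id": "MY_CRED_ID",   # API credential ID
    "X-SS-App-Auth-Token": "MY_CRED_TOKEN",  # API credential token
}

resp = requests.get(url, headers=headers)
resp.raise_for_status()
for run in resp.json():
    print(run)  # inspect start/finish times to compute run durations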
I am using Data Collector 5.8.1. I have selected JDBC in my pipeline start event, and I am using this query in the start event:

DECLARE @XYZWorkFlowID INT = (SELECT TOP 1 WorkFlowID FROM dbo.RefWorkFlow WHERE WorkFlowName = 'XYZ')
INSERT INTO JobTracker (JobName, WorkFlowID, Campaign, LastExecutedTime, JobStatus)
VALUES ('XYZ_ABC_Export', @XYZWorkFlowID, 'Dialer', GETDATE(), 'InProgress')

I need to get the identity value generated by the JobTracker table and assign it to a 'JobTrackerId' pipeline parameter, then use the same parameter in the stop event JDBC query to update the JobTracker table. I tried adding:

${JobTrackerId} = SELECT @@Identity

I'm pretty sure this is the wrong approach. Can anyone guide me through the right one?
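Leaving aside how the value gets into a pipeline parameter (which is the real question here), this sketch shows the usual T-SQL way to capture the generated identity: an OUTPUT clause on the INSERT itself (SCOPE_IDENTITY() would also work). It is run via pyodbc with a placeholder connection string, and it assumes the identity column is named JobTrackerId:

import pyodbc

# Placeholder connection string; table and columns are from the post.
conn = pyodbc.connect("DSN=my_sqlserver;UID=user;PWD=pass")
cur = conn.cursor()

# OUTPUT INSERTED.<col> returns the generated identity with the INSERT,
# avoiding @@IDENTITY (which can pick up identities created by triggers).
cur.execute(
    "INSERT INTO JobTracker (JobName, WorkFlowID, Campaign, LastExecutedTime, JobStatus) "
    "OUTPUT INSERTED.JobTrackerId "
    "VALUES (?, ?, 'Dialer', GETDATE(), 'InProgress')",
    ("XYZ_ABC_Export", 42),  # 42 stands in for the looked-up WorkFlowID
)
job_tracker_id = cur.fetchone()[0]
conn.commit()
print(job_tracker_id)  # the value you'd want in ${JobTrackerId}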
Docs and at least one community thread have suggested that the following expression should work in a stream selector condition:

${record:exists('/ids') && length(record:value('/ids')) > 0}

This returns:

ELException: No function is mapped to the name "length"

The field in question is a list of integers. What am I not understanding? Thanks
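If no list-length EL function turns out to be available in your version, one workaround (a sketch, not a confirmed fix) is to compute the length in a scripting processor upstream and branch on the result. For example, a Jython Evaluator could write an /ids_count field for the Stream Selector to test with ${record:value('/ids_count') > 0}. The script below uses the classic records/output binding; newer SDC versions expose sdc.records and sdc.output instead:

# Jython Evaluator script (classic binding): add /ids_count to each record.
for record in records:
    try:
        ids = record.value.get('ids')
        record.value['ids_count'] = len(ids) if ids is not None else 0
        output.write(record)
    except Exception as e:
        error.write(record, str(e))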