StreamSets Platform

Get inspired and gain all the knowledge you need

1243 Topics

asked in Community Articles and Got a Question?

Oracle CDC Can't Run - Missing redo log files for time range

Issue: The Oracle CDC origin cannot reach the running state because it reports the error "Missing redo log files for time range". However, when I checked the log at TRACE level, it did find the redo log files:

TRACE LogMinerSession - Find logs (2022-04-28T04:58:59, 2022-04-28T04:58:59): +FRA_DG/ONLINELOG/group_1.18814.1103108341, seq: 1/46137, start: 11638474806971 (2022-04-28T02:02:25), end: 11638474807092 (2022-04-28T02:02:28), status: INACTIVE, online: true, archived: true, dictionary: no, discarded

TRACE LogMinerSession - Find logs (2022-04-28T04:58:59, 2022-04-28T04:58:59): +FRA_DG/ONLINELOG/group_2.16414.1103109069, seq: 1/46138, start: 11638474807092 (2022-04-28T02:02:28), end: 11638475179615 (2022-04-27T21:58:59.367866), status: CURRENT, online: true, archived: false, diction

asked in Community Articles and Got a Question?

I want to send an email when a specific job changes its status from Active to Inactive or Inactive Error. Please help.

Rishi (StreamSets Employee)

posted in Community Articles and Got a Question?

How to delete header attribute in JavaScript Evaluator

You can use the JavaScript Evaluator processor to read, update, or create record header attributes. Use a map when creating or updating a header attribute: if the attribute exists, the script updates its value; if it does not exist, the script creates the attribute and sets it to the specified value. Deleting works differently because of the underlying Java interpretation (Nashorn engine API): you need to use the Map API within Nashorn to remove elements, i.e.

records[i].attributes.remove('delete_this_attribute')

asked in Community Articles and Got a Question?

JDBC_23 - Can't coerce '2022-04-25 09:46:50' of type 'STRING' to column 'last_update_date(DATETIME)'

My configuration: "Use Multi-Row Operation" is off and "Max Cache Size Per Batch (Entries)" is "-1".

Origin:

    {
      "table": "tableName",
      "op_type": "I",
      "op_ts": "2022-04-25 09:01:56.000000",
      "current_ts": "2022-04-25T09:02:01.014000",
      "pos": "00000000020000939589",
      "after": {
        "ROW_ID": "123456",
        "LAST_UPD": "2022-04-25 01:01:54"
      }
    }

The destination is JDBC Producer. I cannot update LAST_UPD into the MySQL column "last_update_date". Please give me some advice.
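The error message says a STRING value is being written to a DATETIME column, so the value presumably needs a type conversion before the write. As a minimal sketch outside StreamSets, assuming the record shape from the question and a `YYYY-MM-DD HH:MM:SS` layout for `LAST_UPD`, the conversion looks like this in Python. The function name `parse_last_upd` is hypothetical and is not part of any StreamSets API.

```python
import json
from datetime import datetime

# Record shape taken from the question, trimmed to the relevant fields.
RAW = """{
  "table": "tableName",
  "op_type": "I",
  "after": {"ROW_ID": "123456", "LAST_UPD": "2022-04-25 01:01:54"}
}"""


def parse_last_upd(raw_json: str) -> datetime:
    """Return after.LAST_UPD as a datetime instead of a string."""
    record = json.loads(raw_json)
    # Assumed layout: date and time separated by a space, no fraction.
    return datetime.strptime(record["after"]["LAST_UPD"], "%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(parse_last_upd(RAW).isoformat())
```

Inside a pipeline the equivalent would be a field type conversion step placed before the JDBC Producer; which stage fits best is not stated in the thread.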

kapil24nagar (Fan)

asked in StreamSets Academy

Lab: Build a JDBC Pipeline (zomato database is not found)

Where can I get the SQL used to create the review tables in the zomato database? Or is there a way or script to make it available in the mysqldb Docker container?

asked in Community Articles and Got a Question?

Guidance with authentication token job

We have two StreamSets jobs. One job generates a token continuously. The other job moves data from a processed Kafka topic to an API endpoint and sends a token in the request header. The endpoint checks the token: if it is valid, the data is processed; if it is not valid, the data is lost. The token is regenerated every 8 minutes, so if data reaches the endpoint more than 8 minutes later, the token it carries will be invalid. We want a solution that provides near-zero data loss. One possible solution we are considering is below. Please advise if there is a better option.

Retry in the Kafka-topic-to-API-endpoint job using the 'HTTP Client' processor. Write errored records to a local file, S3, or an Oracle/HDFS table. Run a scheduled job every 3 hours to move data from the file or table to the API endpoint.
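The retry-then-park idea described above can be sketched in plain Python, independent of StreamSets. This is an illustration under stated assumptions: the endpoint is assumed to answer 401 for an expired token and 200 on success (the post does not give status codes), and `get_token` and `post` are hypothetical callables standing in for the token job and the HTTP call.

```python
from typing import Callable, List, Tuple


def send_with_fresh_token(
    records: List[dict],
    get_token: Callable[[], str],
    post: Callable[[dict, str], int],
    max_attempts: int = 3,
) -> Tuple[List[dict], List[dict]]:
    """Send each record; on a 401, fetch a new token and retry.

    Returns (delivered, parked). Parked records are the ones to write to
    an error store (file, S3, table) so a later job can replay them.
    """
    delivered: List[dict] = []
    parked: List[dict] = []
    token = get_token()
    for record in records:
        for _ in range(max_attempts):
            status = post(record, token)
            if status == 200:
                delivered.append(record)
                break
            if status == 401:
                # Assumed meaning: token expired, so refresh before retrying.
                token = get_token()
        else:
            # All attempts failed: keep the record instead of dropping it.
            parked.append(record)
    return delivered, parked
```

Refreshing the token at send time, rather than attaching it when the record is read, keeps slow records from arriving with a stale token; the parked list covers the remaining failures.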

sreenivasaraovaka (Fan)

asked in Community Articles and Got a Question?

Control Hub Job Runner REST API documentation

Can you please provide documentation for the column names in the /jobrunner/rest/v1/job/{jobId} API response? I want to know how to process each response output column. Thanks, Sreenivas

asked in Community Articles and Got a Question?

ODBC driver for Db2 on z/OS version 12 M500

Hello, does StreamSets support Db2 databases? If yes, which ODBC driver version is needed for Db2 on z/OS version 12 M500? Please point me to the right download location or tell me which version to download.

ajinkya (StreamSets Employee)

posted in Community Articles and Got a Question?

How to change labels of jobs using the Python SDK?

Please make sure all the jobs are in an inactive state. Change the following parameters to match your environment:

<CONTROL_HUB_URL> = Control Hub URL
<USERNAME> = admin username
<PASSWORD> = password
<OLD_LABEL> = old label
<NEW_LABEL> = new label

    from streamsets.sdk import ControlHub, DataCollector

    # Turn off SSL certificate verification for Data Collector connections.
    DataCollector.VERIFY_SSL_CERTIFICATES = False

    sch = ControlHub('<CONTROL_HUB_URL>', username='<USERNAME>', password='<PASSWORD>')

    # Find every job that currently carries the old label.
    jobs_with_existing_label = sch.jobs.get_all(data_collector_labels=['<OLD_LABEL>'])
    print(f'jobs_with_existing_label = \n {jobs_with_existing_label}')

    # Replace the label on each job and save the change.
    for job in jobs_with_existing_label:
        job.data_collector_labels = ['<NEW_LABEL>']
        sch.update_job(job)

asked in Community Articles and Got a Question?

Error when trying to deploy StreamSets in a Kubernetes cluster

I'm trying to deploy StreamSets in an AWS EKS cluster using several approaches, but none of them seem to work. First I tried to deploy a control agent following the steps here: https://streamsets.com/blog/deploy-dataflow-pipelines-kubernetes-streamsets-control-agent/

When I run the startup script for EKS with the required parameters, I receive the following message:

Running common-login.sh
Failed to authenticate with SCH :(

If I debug the script, I can see the following error returned by the login service:

{ "EXCEPTION": { "rawMessage": "java.lang.IllegalArgumentException: Invalid/unsupported hash version 'UNKNOWN'", "className": "java.lang.IllegalArgumentException", "message": "Invalid/unsupported hash

ApoorvaSharma (Fan)

asked in Community Articles and Got a Question?

Table name for Oracle JDBC origin in Transformer pipeline

I am using a Transformer pipeline to connect to a JDBC origin. I need to fetch the tablename attribute, which I have added to the pipeline parameters. The table name should be compared to a value in a Stream Selector so that, when it matches, the table is saved in Hive. I am not able to fetch the table name that I passed in the parameters. Please help me retrieve the table name using Spark, or explain how to use the table attribute in the Stream Selector when the origin is Oracle JDBC.

posted in Community Articles and Got a Question?

Hello, I'm trying to capture time series data for the pipeline. I'm thinking about using the REST API to pull the data and store it in InfluxDB. If we set the pipeline/job configuration to write the pipeline statistics to Control Hub, can the statistics be accessed so that they can be moved to InfluxDB, or are they written to InfluxDB automatically? Thanks, Eric
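For the "pull and store" route, whatever the REST API returns would have to be turned into InfluxDB line protocol before it is written. A minimal sketch of that formatting step in Python, assuming the statistics have already been fetched into a dict. The measurement, tag, and field names are hypothetical, and the sketch assumes names and tag values contain no spaces or commas, which would otherwise need escaping.

```python
def format_field(value) -> str:
    """Render one field value in InfluxDB line protocol syntax."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    raise TypeError(f"unsupported field type: {type(value).__name__}")


def to_line_protocol(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Build one line: measurement,tag=value field=value timestamp."""
    head = measurement
    if tags:
        head += "," + ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    body = ",".join(f"{k}={format_field(v)}" for k, v in fields.items())
    return f"{head} {body} {timestamp_ns}"


if __name__ == "__main__":
    # Hypothetical statistics for one pipeline at one point in time.
    print(to_line_protocol(
        "pipeline_metrics",
        {"pipeline": "demo"},
        {"input_records": 100, "error_rate": 0.5},
        1651100000000000000,
    ))
```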

asked in Community Articles and Got a Question?

curl: (56) Received HTTP code 407 from proxy after CONNECT

Getting this error when I run cur ……. Any idea how to fix the proxy issue? Thanks, James.

Java 1.8 detected; adding $SDC_JAVA8_OPTS of "-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Djdk.nio.maxCachedBufferSize=262144" to $SDC_JAVA_OPTS
Data Collector Engine:
DIST : /root/.streamsets/install/dc/streamsets-datacollector-4.4.1
HOME : /root/.streamsets/install/dc/streamsets-datacollector-4.4.1
CONF : /root/.streamsets/install/dc/streamsets-datacollector-4.4.1/etc
DATA : /root/.streamsets/install/dc/streamsets-datacollector-4.4.1/data
LOG : /root/.streamsets/install/dc/streamsets-datacollector-4.4.1/log
SDC_EXTERNAL_RESOURCES : /root/.streamsets/install/dc/streamsets-datacollector-4.4.1/externalResources
RESOURCES :

asked in Community Articles and Got a Question?

One REST request gets turned into multiple outbound requests

I have a pipeline that is basically a pass-through REST service. It takes the input, routes based on URL with a stream splitter, sends out a REST call to another service of ours, gets the response and returns it to the sender. I am getting the output as expected.

However, I noticed that a call to the Data Collector REST endpoint results in 100+ HTTP calls going out to our service (Call Deals / Generate Coupons in the image below). When reviewing the snapshot, only 1 record shows. I can see from the service's logs that many calls are coming in and giving a 200 response.

I am seeing the same behavior for both splits in the stream, so this is somehow related to the pipeline's setup. Any ideas on what I could check? Thanks!

rvlozano (StreamSets Employee)

asked in Community Articles and Got a Question?

How to connect JDBC Consumer Stage to MS SQL Server via AAD?

Question: How to connect JDBC Consumer Stage to MS SQL Server via AAD or Active Directory?

Answer: Our product teams are currently evaluating whether this could be added natively, and native support is on the future product roadmap. For now, you will get the following error message if you try to use "authentication=ActiveDirectoryPassword" in your JDBC URL.

JDBC_06 - Failed to initialize connection pool: com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: Failed to load MSAL4J Java library for performing ActiveDirectoryPassword authentication.: hikariConfigBean.connectionString

This is because the out-of-the-box Microsoft SQL Server drivers come directly from Microsoft, and Microsoft does not package support for the ActiveDirectoryPassword authentication method with them. For this to work you must acquire MSAL4J and its dep

asked in StreamSets Academy

Engine is not created when a deployment is successfully created

As part of the DataOps Platform Fundamentals course, I was trying to set up environments and create deployments. When I run the auto-generated install script from my deployment to provision the engines, I get Exit :1 and see no engines created in my StreamSets console. Kindly suggest.

Pradeep_Bala (Fan)

asked in Community Articles and Got a Question?

Scala Spark Job "createOrReplaceTempView" error in Transformer job

Hi, we have a Scala Spark job where we read a couple of CSV files into data frames and create temporary views ("createOrReplaceTempView") on those data frames. Then we write business rules logic in Spark SQL using the temporary views created earlier and write the result into another data frame. We tried to incorporate this Scala Spark job in Transformer using the Scala stage, but we are facing an error when calling "createOrReplaceTempView". Could you please help resolve the issue and let us know how we can incorporate a Scala Spark job in StreamSets Transformer? Thanks, Pradeep

asked in Share your Best Practices

Hello, I am new to StreamSets. My question: on one server, can I install 2 DCs by creating 2 environments (DevEnvDC and UATEnvDC), or is it only 1 DC installation per server? Thanks, James.

asked in Community Articles and Got a Question?

Store Event Generated using Jython into Postgres

Hi Team, I am subscribing to a platform event in Salesforce. Once the events are over, I would like to generate an event to stop the pipeline, and also use the same event, append some fields, and insert it into Postgres. I followed the article below:

https://stackoverflow.com/questions/61800532/how-to-transition-a-streamsets-pipeline-to-finished-state-if-the-origin-does-not

I was able to generate the event using Jython and stop the pipeline. When I try to add fields to the event using the Expression Evaluator, it throws java.lang.IllegalArgumentException. I request assistance on how to get this fixed.

Ashish_chand (Fan)

asked in Community Articles and Got a Question?

zomato source directory

I am not sure how to get the zomato source directory. Could you please help?

Srinivasan Sankar (Headliner)

asked in Community Articles and Got a Question?

Ingest data from files that reside in a SharePoint location

Hi there, what options do we have to ingest data from Excel files that reside in SharePoint using StreamSets Data Collector? Thanks and regards, Srini

asked in Share your Best Practices

Informatica to StreamSets

Hi Team, we have a requirement to convert Informatica mappings into StreamSets. Since there are more than 200 mappings, it is not feasible to create each one separately in Transformer. Can you suggest the best approach and any related links or documentation? We are planning to use the Python SDK, but there are many challenges with it.

asked in Community Articles and Got a Question?

Performance is very slow when using a large number of JDBC Producers

We need some input on how we can improve performance.

HEMANTH14194 (Fan)

asked in Community Articles and Got a Question?

Performance is very slow when I am using JDBC Lookup processor in Data Collector

I am trying to filter records by looking up a specific value from a table using the JDBC Lookup processor. My source has around 100K records and my JDBC lookup table has only 1 record, but the pipeline takes 2 minutes to write 1000 records to Hive. Overall the execution time is around 200 minutes for 100K records, which is very slow. I request your help with this.

asked in Community Articles and Got a Question?

Denodo JDBC Multi table origin offset issue

Hi, I am using a JDBC Multitable origin to connect to Denodo and extract data. The pipeline runs fine at first but hangs after 5-10 minutes. I am getting the error below for the offset column I am using. Please let me know if I have to change any settings. The same setup works perfectly fine when I use a single-table JDBC consumer.

Pipeline Status: RUNNING_ERROR: JDBC_75 - Jdbc Runner Failed. Reason java.util.concurrent.ExecutionException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.NumberFormatException: For input string: "2020-04-05 19:18:11.0"
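The NumberFormatException indicates that a timestamp string reached code that expected a number. A minimal Python sketch, outside StreamSets, shows why the value from the error message fails a numeric parse and how the same string maps to epoch milliseconds. Treating the timestamp as UTC is an assumption, since the post gives no time zone, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

OFFSET_VALUE = "2020-04-05 19:18:11.0"  # value from the error message
EPOCH = datetime(1970, 1, 1)


def is_numeric(text: str) -> bool:
    """Return True only if the text parses as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def to_epoch_millis(text: str) -> int:
    """Parse 'YYYY-MM-DD HH:MM:SS.f' and return milliseconds since 1970-01-01.

    The timestamp is treated as UTC (an assumption).
    """
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    return (parsed - EPOCH) // timedelta(milliseconds=1)


if __name__ == "__main__":
    print(is_numeric(OFFSET_VALUE))
    print(to_epoch_millis(OFFSET_VALUE))
```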
