Recently active
Hi, we are processing S3 files with a batch size of 1000, and we plan to store the output in S3 as well. Since the input file has 10,000 records, we are seeing 10 output files in S3. Per the client's requirement, we need to produce a single file. Is there a way to create a single S3 output file from StreamSets?
My pipeline is configured to pick up data from a JDBC Multitable Consumer origin and write it to a Hadoop FS destination. My requirement is to rename the output file at the destination as TableName_TimeStamp. I am able to get the timestamp using an Expression Evaluator. How do I get the table name from the event passed from Hadoop FS so it can be used in the HDFS File Metadata executor?
In the setup deployment step of the StreamSets tutorial, when I run the update-nodes.sh script in the Strigo environment, I get the following errors every time:
Error response from daemon: endpoint with name wonderful_volhard already exists in network streamsets-core
Error response from daemon: endpoint with name wonderful_volhard already exists in network streamsets-integrations
Error response from daemon: endpoint with name wonderful_volhard already exists in network streamsets-cooked
Also, the tutorial says I should see 3 engines in Control Hub, but I only see 2.
Hi, I have a product table named ‘product’ in MySQL as follows:

product_id | Product | FieldName
1 | Milk | milk
2 | Water | water
3 | Coffee | coffee

Then I have a source, fully de-normalized table named ‘raw_transaction’ as follows:

transaction_Id | Date | customer | milk | water | coffee
1 | 1/1/2021 | John | 1
2 | 1/1/2021 | Mary | 1 | 1
3 | 1/1/2021 | Anna | 1

Can you give me a hint on how I can create a pipeline in StreamSets that uses the product table as metadata to build a dynamic query and populate a ‘FactCustomerProduct’ table as follows:

For each product in products
INSERT INTO FactCustomerProduct (product_id, date_id, customer_id, transaction_id, quantity)
SELECT p.product_id, r.date_id, customer_id, r.transaction_id, r.<fieldName>
FROM ‘raw_transaction’ r [...] WHERE r.<fiel
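A minimal sketch of the dynamic-query idea, assuming the product metadata has already been read into a list of (product_id, field_name) pairs. The table and column names simply mirror the ones in the question, the JDBC connection is left out, and the WHERE predicate is a guess since the original query is truncated:

```python
# Sketch only: build one INSERT ... SELECT per product row, using the
# product table as metadata. How the statements are executed (JDBC Query
# executor, a scripting processor, etc.) depends on your pipeline design.
products = [
    (1, "milk"),
    (2, "water"),
    (3, "coffee"),
]

statements = []
for product_id, field_name in products:
    statements.append(
        "INSERT INTO FactCustomerProduct "
        "(product_id, date_id, customer_id, transaction_id, quantity) "
        f"SELECT {product_id}, r.date_id, r.customer_id, r.transaction_Id, r.{field_name} "
        # The filter below is an assumption; the original WHERE clause was truncated.
        f"FROM raw_transaction r WHERE r.{field_name} IS NOT NULL"
    )

for stmt in statements:
    print(stmt)
```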
The configuration is set up according to the linked Oracle CDC documentation (https://docs.streamsets.com/portal/datacollector/3.17.x/help/datacollector/UserGuide/Origins/OracleCDC.html#concept_rs5_hjj_tw), but it runs into the error below:
JDBC_52 - Error starting LogMiner
Caused by: com.streamsets.pipeline.api.StageException: JDBC_603 - Error while retrieving LogMiner metadata: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
How does JDBC destination handle fields without matching columns? I am encountering a situation where it appears that the fields without matching columns are ignored, but this is not defined in the documentation and seems counter to what I would expect (an exception complaining about a column not existing in the table). Please provide a deeper description of what is happening here.
Is there a maximum size for a pipeline title?
A directory was created in IIS (Windows) and published via FTPS. When trying to use this directory in StreamSets with the SFTP/FTP/FTPS component, it returns the following error: "REMOTE_11 - Unable to connect to remote host 'ftps://ftps.hostname.net:921/PLV' with given credentials. Please verify if the host is reachable, and the credentials and other configuration are valid. The logs may have more details. Message: Could not list the contents of "ftps://ftps.hostname.net:921/PLV" because it is not a folder. : conf.remoteConfig.remoteAddress" The credentials are OK, since the directory can be opened using the LFTP client on Linux. Is there some StreamSets configuration missing to fix this problem?
Hi, how do we extract all the pipeline names authored and jobs committed by a user from Control Hub? Regards, Anirban
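A rough sketch of how this might look with the StreamSets Python SDK. The credential arguments and the attribute names used for filtering (committer, creator) vary by SDK version and are assumptions here, so check the SDK reference for your release:

```python
# Sketch only: list pipelines and jobs associated with one user via the
# StreamSets Python SDK. Credential arguments and the attribute names
# used for filtering are assumptions; adjust to your SDK version.
from streamsets.sdk import ControlHub

sch = ControlHub(credential_id='<credential_id>', token='<token>')  # assumed auth style

user = 'anirban@example.com'  # hypothetical user id

# Pipelines whose last commit was made by the user (attribute name assumed).
user_pipelines = [p.name for p in sch.pipelines if getattr(p, 'committer', None) == user]

# Jobs created by the user (attribute name assumed).
user_jobs = [j.job_name for j in sch.jobs if getattr(j, 'creator', None) == user]

print(user_pipelines)
print(user_jobs)
```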
I'm using Oracle as a source and S3 as the destination. I am ingesting records from the source and adding the table name as a column through an Expression Evaluator. I want to use this table name to create a folder in S3 dynamically before dropping those records into the S3 bucket, so the folder name should be created dynamically by StreamSets from the name of the table. What should be the approach for this? For example, if I'm fetching records from table abc, I need to create a folder called “abc” and drop all the records inside that folder.
Hi, could you please provide a snippet showing how to establish a connection to MongoDB from a Groovy script?
When the JDBC Query executor is used to empty table data, the pipeline does not stop after the task starts. Note: a Pipeline Finisher has been attached to the JavaScript component. SQL query: delete from depart_passenger_info
Can you please help me with the points below. How can we find out how many times StreamSets has retried failed records, and what their data is? What value should we give in the Base Backoff Interval field, and what other settings do we have to configure? I ask because the incoming data to StreamSets does not match the processed record count, and the difference between the two keeps increasing. Can you please suggest something on this.
Hi, I have tried to copy all the files from one folder to another within the same S3 bucket using a StreamSets job, but only 1 or 2 files are copied into the destination folder compared to the source folder (for example, if there are 7 files in the source folder, I see only 1 or 2 copied to the destination). Can anyone help me with this issue? Thanks, Murali
In Control Hub I can see which values are available for the Action property of the Field Remover, but how do I find them through the SDK?
field_remover = pipeline_builder_14.add_stage('Field Remover')
For field_remover.action, how do I know through the SDK which values are allowed? Thanks, Ashok.
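One way to poke at this from a Python session is plain introspection. The snippet below only uses built-in Python tools and reuses the pipeline_builder_14 object from the question; the idea that the generated docstring lists the permitted enum values (e.g. matching the Control Hub drop-down) is an assumption to verify against your SDK version:

```python
# Sketch only: inspect a stage object from the SDK with built-in Python
# introspection to see its configurable attributes and documentation.
# Assumes pipeline_builder_14 from the question already exists.
field_remover = pipeline_builder_14.add_stage('Field Remover')

# List the attributes the SDK generated for this stage.
print([a for a in dir(field_remover) if not a.startswith('_')])

# The generated class documentation often describes each property and,
# for enum-style properties such as 'action', the permitted values
# (assumed to match the Control Hub drop-down).
help(type(field_remover))
```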
Lookups (into a Delta table) give extremely bad performance (sometimes the pipeline stays in the pre-execution stage forever) when used in Transformer with an origin of 1000 records, although it works reasonably well in streaming mode, which I guess is due to the smaller number of incoming records.
Hi team, how do we add new stages to an existing pipeline using the StreamSets Python SDK?
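A hedged sketch of one approach with the Platform Python SDK. Whether a fetched pipeline object supports add_stage() directly depends on the SDK release, so treat the method and argument names here as assumptions to check against the SDK documentation:

```python
# Sketch only: fetch an existing pipeline, add a stage, and publish the
# change back to Control Hub. Method availability varies by SDK version.
from streamsets.sdk import ControlHub

sch = ControlHub(credential_id='<credential_id>', token='<token>')  # assumed auth style

pipeline = sch.pipelines.get(name='My Existing Pipeline')  # hypothetical pipeline name

# Newer SDK releases allow editing a fetched pipeline in place (assumed).
expression_evaluator = pipeline.add_stage('Expression Evaluator')

# Re-wire the flow as needed (e.g. origin >> new stage >> destination),
# then publish a new pipeline version.
sch.publish_pipeline(pipeline, commit_message='Added Expression Evaluator via SDK')
```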
I want to extract multiple fields from JSON/XML using the XML Parser, etc. I am able to extract them with Groovy, but I want to achieve it as follows: read a file from S3 using the XML data format, then in step 2 extract multiple fields from the XML:
<body><head>1</head><m>3</m><tail>2</tail></body>
In step 2 I want to have 2 values in my output without using any Groovy; I want to achieve this using the XML Parser or Field Mapper, etc. As of today I can only extract one value, e.g. /body/head, but I want to extract both /body/head and /body/tail.
I’m looking for a tool to help prevent missing job version updates when moving code from one environment to another (development / UAT / production). If we update a pipeline and job in our development environment but move ancillary code to our test environment a week later, I’ve seen that it is easy for our team to miss the test environment job update, resulting in wasted testing time. It would be helpful if we could see at a glance the differences in job versions before doing a deployment, to help validate that we have the correct list of jobs to update as part of that deployment. I see the REST API and could put together a script to compare versions, but I was wondering if there is any way we could visually see that in Control Hub, or even a way to build a pipeline or report that could be run to give this information. Thanks in advance! -Spyder
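Short of something visual in Control Hub, a small SDK script along these lines could flag mismatches. The two-organization setup, the credential arguments, and the pipeline_commit_label attribute are all assumptions to verify for your environment and SDK version:

```python
# Sketch only: compare job pipeline versions between two environments
# (e.g. dev and test organizations) and report jobs whose versions differ.
# Credentials and the 'pipeline_commit_label' attribute are assumptions.
from streamsets.sdk import ControlHub

dev = ControlHub(credential_id='<dev_credential_id>', token='<dev_token>')
test = ControlHub(credential_id='<test_credential_id>', token='<test_token>')

dev_versions = {j.job_name: getattr(j, 'pipeline_commit_label', None) for j in dev.jobs}
test_versions = {j.job_name: getattr(j, 'pipeline_commit_label', None) for j in test.jobs}

# Print jobs whose pipeline version in test lags behind (or is missing from) dev.
for name, dev_version in sorted(dev_versions.items()):
    test_version = test_versions.get(name)
    if test_version != dev_version:
        print(f'{name}: dev={dev_version} test={test_version}')
```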
A pipeline’s origin is an S3 bucket. Error records are configured to “Send Response to Origin.” What exactly happens to the error records in this instance?
I am trying to create a Transformer pipeline using the Python SDK but am unable to connect to the Transformer engine; I am getting two ids and URLs from the sch.transformers command. Please help me.
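If sch.transformers returns two engine entries, the pipeline builder needs to be pointed at the one you actually want. A rough sketch follows; the keyword arguments to get_pipeline_builder differ between SDK versions and are assumptions here:

```python
# Sketch only: pick one registered Transformer engine explicitly and use
# it to seed a pipeline builder. Keyword argument names vary across SDK
# versions and are assumptions.
from streamsets.sdk import ControlHub

sch = ControlHub(credential_id='<credential_id>', token='<token>')  # assumed auth style

# Inspect the registered Transformer engines to find the reachable one.
for transformer in sch.transformers:
    print(transformer.id, transformer.url)

# Choose the engine whose URL the SDK host can actually reach (placeholder id).
engine = sch.transformers.get(id='<transformer_engine_id>')

builder = sch.get_pipeline_builder(engine_type='transformer', engine_id=engine.id)
```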
Hi team, to fix the Log4j vulnerability, I upgraded StreamSets from streamsets/datacollector:3.18.1 to streamsets/datacollector:4.2.0. After that, I am not able to create a new pipeline or import one; the user interface asks me to connect to Control Hub or enter an activation code, which was not the case in version 3.18.1.
Hi, I have tried to copy all the files from one folder to another within the same S3 bucket using a StreamSets job, but I am seeing more files in the destination folder than in the source folder (for example, if there are 7 files in the source folder, I see more than 7 in the destination, like 8, 10, or 12). This issue only occurs the first time each day; if I run the same job again later that day, the count matches between source and destination. Can anyone help me with this issue? Thanks, Murali
I would like to learn how to use the python SDK. How do I go about getting an activation key for use with a personal account?