Got a Question?
Can't find what you're looking for? Ask it here!
Hi, yesterday I created a pipeline for an Aurora PostgreSQL CDC Client origin with a JDBC connection, and it worked fine. Today, all of a sudden, I started getting the error below:

JDBC_00 - Cannot connect to specified database: org.postgresql.util.PSQLException: ERROR: replication slot "sdc" is active for PID 22025

I checked connectivity with the nc command and it succeeds. I have not been able to find a resolution for this error. Please advise.
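For context, this error means another connection (PID 22025 here) is still holding the replication slot. A minimal diagnostic sketch, assuming direct psycopg2 access to the Aurora instance (host, credentials, and the autocommit handling are placeholders/assumptions):

```python
import psycopg2  # assumes psycopg2 is installed and the DB is reachable

# Placeholders: point this at the Aurora PostgreSQL writer endpoint.
conn = psycopg2.connect(
    host="aurora-endpoint", dbname="postgres", user="user", password="pass"
)
conn.autocommit = True

with conn.cursor() as cur:
    # Which process is holding the "sdc" slot?
    cur.execute(
        "SELECT slot_name, active, active_pid "
        "FROM pg_replication_slots WHERE slot_name = %s",
        ("sdc",),
    )
    slot_name, active, pid = cur.fetchone()
    print(slot_name, active, pid)

    if active:
        # Terminate the stale holder so a new CDC client can reattach.
        cur.execute("SELECT pg_terminate_backend(%s)", (pid,))
```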
Hi team, we have a situation where the backend server intermittently fails to respond in time (network latency), and the HTTP Client processor stops execution and throws the error below. The error comes back with a NULL HTTP status code, which prevents us from setting up a Per-Status Action for this scenario. Could you please suggest a better way to handle this error and retry the request, so we don't lose data and can continue processing?

com.streamsets.pipeline.api.base.OnRecordErrorException: HTTP_03 - Error fetching resource. HTTP-Status: NULL Reason: java.util.concurrent.ExecutionException: javax.ws.rs.ProcessingException: java.net.SocketTimeoutException: Read timed out
  at com.streamsets.pipeline.stage.processor.http.HttpProcessor.processResponse(HttpProcessor.java:825)
  at com.streamsets.pipeline.stage.processor.http.HttpProcessor.process(HttpProcessor.java:358)
  at com.streamsets.pipeline.api.base.SingleLaneProcessor.proces
It often happens that a developer creates a pipeline and a job but forgets to share them with the team, so the pipeline/job cannot be seen by others. When such a job fails in PRD, we have a big problem. It would be nice to use a Python script to query the REST API and retrieve the permissions on each pipeline/job, so that we can fix sharing problems proactively. Can the permission info for each pipeline/job be retrieved from the REST API? If the API cannot provide it, can the info be retrieved by other means? Thanks a lot in advance! Have a great day!
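Control Hub's services each expose a Swagger-documented REST API (the StreamSets SDK for Python wraps the same calls), so a script along these lines should be possible. Treat the endpoint paths, header names, and ACL response shape below as assumptions to confirm in your Control Hub's Swagger UI; a sketch:

```python
import requests

# Assumptions: base URL, auth headers, and endpoint paths must be
# confirmed against your Control Hub version's Swagger UI.
SCH = "https://cloud.streamsets.com"
HEADERS = {
    "X-Requested-By": "SCH",
    "X-SS-REST-CALL": "true",
    "X-SS-App-Component-Id": "YOUR_COMPONENT_ID",  # placeholder credentials
    "X-SS-App-Auth-Token": "YOUR_AUTH_TOKEN",
    "Content-Type": "application/json",
}

# Assumed endpoints: list jobs, then fetch each job's ACL.
jobs = requests.get(f"{SCH}/jobrunner/rest/v1/jobs", headers=HEADERS).json()
for job in jobs:
    job_id = job["id"]
    acl = requests.get(
        f"{SCH}/jobrunner/rest/v1/job/{job_id}/acl", headers=HEADERS
    ).json()
    # Flag jobs that only the owner can see (assumed ACL shape).
    if len(acl.get("permissions", [])) <= 1:
        print(f"{job['name']} ({job_id}) is not shared")
```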
I am upgrading StreamSets Data Collector from an older version to 3.22.3. Initially it was asking for registration; I added SDC_CONF_http_authentication=form as an env variable, after which it worked with admin/admin creds and did not ask for registration. Now I have configured it with LDAP authentication and it is asking for registration again. Is there any way to skip the registration, or to automate it on first run? I'll be running this on Kubernetes and won't be able to register manually each time the pod gets deployed.
Hi, building a microservice pipeline is part of the Yellow Belt certification syllabus, but this exercise is not part of the lab provided by StreamSets Academy. I would like to know: is there a REST API built into the Strigo lab environment? If yes, what is its endpoint? Looking forward to your reply. Regards, Ankur
Hi! I have a few problems with my pipelines regarding the random data generator when trying to write to both Postgres and MongoDB. Creating the pipeline works, and sending data to the destinations works. But after I switched to a new Data Collector because of some errors, I hit a few new UI bugs that prevent me from adding more records with the random data generator (screenshot attached). The UI has changed in all my browsers, both Chrome and Safari. Furthermore, when I click anything in the pipeline after this, I get an unexpected error. Regards, Fredrik
I am using the Directory origin to read *.log.gz files from the server, with lexicographically ascending file names as the read order. The data format is set to 'Text', the compression format is 'Compressed File', and the max line length is 1024000000. I get the SPOOLDIR_01 error code with the message: Failed to process file '….log.gz' at position '0': com.streamsets.pipeline.lib.dirspooler.BadSpoolFileException: com.streamsets.pipeline.lib.parser.DataParserException: TEXT_PARSER_00 - Cannot advance file '….log.gz' to offset '0'. I am not sure why this happens; does anyone have an idea?
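TEXT_PARSER_00 at offset 0 often points at input the parser cannot read at all, for example a corrupt or mis-labeled gzip archive. A quick way to rule that out, as a minimal sketch (the file path is a placeholder):

```python
import gzip

# Placeholder path: substitute one of the failing *.log.gz files.
path = "example.log.gz"

try:
    n = 0
    with gzip.open(path, "rt", encoding="utf-8", errors="strict") as f:
        for n, _line in enumerate(f, start=1):
            pass
    print(f"{path}: OK, {n} lines decompressed")
except (OSError, UnicodeDecodeError) as exc:
    # gzip.BadGzipFile (an OSError subclass) => not a valid gzip stream;
    # UnicodeDecodeError => the decompressed content is not plain text.
    print(f"{path}: unreadable -> {exc!r}")
```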
In my Field Renamer processor I have 6 input columns: 'result.name', 'result.adress.street', 'result.adress.city', 'result.adress.state', 'result.adress.zip', 'result.phone'. I want the output column names to be 'name', 'street', 'city', 'state', 'zip', 'phone', and I want to achieve this with a single Field Renamer processor.
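A single rename rule with a regex capture group may cover all six fields; the exact Field Renamer syntax (quoting of path segments in the Source Field Expression) is an assumption to verify in your SDC version. The snippet below just demonstrates, using Python's regex engine, how the pattern maps the six field paths:

```python
import re

# Candidate Field Renamer rule (an assumption to verify in your SDC version):
#   Source Field Expression: /result/(?:adress/)?(.+)   ->   Target: /$1
# The same pattern, demonstrated with Python's regex engine:
pattern = re.compile(r"^/result/(?:adress/)?(.+)$")

fields = [
    "/result/name",
    "/result/adress/street",
    "/result/adress/city",
    "/result/adress/state",
    "/result/adress/zip",
    "/result/phone",
]

for f in fields:
    print(f, "->", pattern.sub(r"/\1", f))
# /result/name          -> /name
# /result/adress/street -> /street
# ...and so on for city, state, zip, phone.
```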
We are using the Snowflake executor to truncate the table before loading data into it. We have configured the Snowflake destination and the executor with the same credentials, using a private key file. If I remove the executor, the pipeline loads data into Snowflake, but when I add the Snowflake executor it fails with the error "password can't be empty" for the executor. Kindly help. Thanks in advance.
Hi, I am a newbie with StreamSets and its Data Collector. I am currently trying to connect to Cumulocity IoT and stream device measurements into a file. Cumulocity offers a public REST API, and it is possible to subscribe to real-time measurements once the client is connected. What I am trying to achieve in my pipeline is:
1. Create a handshake with Cumulocity IoT.
2. Subscribe to the measurement channel with the client ID generated in the response of step 1.
3. Receive the measurements and store them.
For step 1, I used the HTTP Client as the origin. It does connect to Cumulocity and does generate a clientId. However, I would like the origin to send just one request and stop sending any further handshake requests afterwards. At the moment it sends handshake requests continuously, and therefore keeps generating different clientIds, which is not what I want. How do I limit this stage to send only one request? As a temporary workaround I put the stage in Batch mode, with a batch size of 1 an
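For reference, Cumulocity's real-time notifications follow the Bayeux (CometD) protocol, where only the connect step loops and the handshake happens exactly once. The sketch below shows that flow in Python; the endpoint path (/cep/realtime), tenant URL, device ID, and credentials are assumptions/placeholders to verify against your tenant's documentation:

```python
import requests

# Placeholders: tenant URL, credentials, endpoint path, and device id.
BASE = "https://TENANT.cumulocity.com"
RT = f"{BASE}/cep/realtime"

session = requests.Session()
session.auth = ("user", "password")

# Step 1: handshake once -> one clientId for the whole session.
resp = session.post(RT, json=[{
    "channel": "/meta/handshake",
    "version": "1.0",
    "supportedConnectionTypes": ["long-polling"],
}]).json()
client_id = resp[0]["clientId"]

# Step 2: subscribe to the measurement channel with that clientId.
session.post(RT, json=[{
    "channel": "/meta/subscribe",
    "clientId": client_id,
    "subscription": "/measurements/DEVICE_ID",
}])

# Step 3: poll /meta/connect repeatedly; only this step loops,
# so the handshake is sent exactly once.
while True:
    messages = session.post(RT, json=[{
        "channel": "/meta/connect",
        "clientId": client_id,
        "connectionType": "long-polling",
    }]).json()
    for m in messages:
        if m.get("channel", "").startswith("/measurements"):
            print(m["data"])
```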
I am using a SQL Server Change Tracking origin in the pipeline, and there are many tables in the SQL source. If I don't turn on the "Produce Events" option, the pipeline processes data from all tables, as expected. However, because there is no event, the pipeline must run continuously, which I would like to avoid. If I turn on the "Produce Events" option, the pipeline stops after processing data from only one table. This is a problem because I need to process data from all tables. What can I do to make it process all tables before producing the event? Thanks a lot for your attention! I hope to hear from you soon!
Hi guys, can anyone help me? I have a pipeline that gets some IDs from BigQuery and sends them to an API, but I keep receiving a "Too Many Requests" error, even on the first record. I've configured the HTTP stage to wait 1000 ms between requests, which is supposed to be more than enough to respect the API's rate limits.
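For reference, the usual client-side pattern for 429 responses is exponential backoff, honoring the Retry-After header when the server sends one. A minimal sketch (the URL is a placeholder):

```python
import time
import requests

def get_with_backoff(url, max_retries=5):
    """GET with exponential backoff on HTTP 429 (Too Many Requests)."""
    delay = 1.0
    for attempt in range(max_retries):
        resp = requests.get(url)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        # Honor the server's Retry-After header if it is provided.
        retry_after = resp.headers.get("Retry-After")
        time.sleep(float(retry_after) if retry_after else delay)
        delay *= 2  # exponential backoff
    raise RuntimeError(f"still rate-limited after {max_retries} attempts")

# Example usage (placeholder URL):
# resp = get_with_backoff("https://api.example.com/resource")
```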
I have a pipeline that gathers records from an HTTP Client origin. At one point in the pipeline, a Stream Selector filters out records that I don't want and sends them to Trash, while the remaining records get written to an S3 destination. At the end is a Pipeline Finisher executor. The issue arises when there are no remaining records after the Stream Selector, i.e. the data contained no relevant records. In this case the pipeline job keeps running indefinitely, and I have to stop it manually. How can I stop the pipeline when there are no records left after the Stream Selector?
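One common pattern, worth verifying for your origin, is to drive the Pipeline Finisher from the origin's event stream rather than from the data path: if the origin emits a no-more-data event, connect that event stream to the finisher and add a precondition so only that event triggers it, regardless of how many records survive the Stream Selector:

```
${record:eventType() == 'no-more-data'}
```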
Hi, I am trying in my lab to connect to MySQL. I did the following: in the Deployment menu, I configured the JDBC library; in the engine, I downloaded the mysql-connector JAR and restarted the engine; in the connection, I provided the connection string, username, and password. However, when I test the connection I get the following error: JDBC_00 - Cannot connect to specified database: java.sql.SQLException: No suitable driver found for jdbc:mysql://mysqldb:3306/zomato. Is there anything I am missing here?