Got a Question?
Can't find what you're looking for? Ask it here!
I'm trying to enable Kerberos for my SDC RPM installation, but when I start SDC I get the following exception:

java.lang.RuntimeException: Could not get Kerberos credentials: javax.security.auth.login.LoginException
Caused by: javax.security.auth.login.LoginException: Unable to obtain password from user
    at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:897)
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760)
    at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)

How do I move forward?
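This error usually means the JVM's Kerberos login could not read a keytab and fell back to prompting for a password. A minimal sketch of the Kerberos block in $SDC_CONF/sdc.properties, assuming an RPM install (the principal, realm, and keytab path are placeholders; the keytab must be readable by the sdc user):

    kerberos.client.enabled=true
    kerberos.client.principal=sdc/_HOST@EXAMPLE.COM
    kerberos.client.keytab=/etc/sdc/sdc.keytab

You can also sanity-check the keytab outside of SDC with kinit -kt /etc/sdc/sdc.keytab sdc/your-host@EXAMPLE.COM before restarting the service.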
I have a StreamSets Data Collector running in Docker, and when I run a pipeline with a Kafka Consumer I am seeing these error messages:

The configuration = was supplied but isn't a known config
The configuration schema.registry.url = was supplied but isn't a known config

How do I get past this error?
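For what it's worth, these lines are warnings emitted by the Kafka client library itself whenever it is handed properties it does not recognize; the empty name before the first '=' suggests a blank key/value row in the stage's Kafka Configuration, and removing any row with an empty key should silence it. A sketch of what well-formed extra entries look like (the property names are only examples of valid Kafka consumer settings, not required ones):

    auto.offset.reset=earliest
    max.poll.records=500

The schema.registry.url warning is typically harmless: that property is consumed by the Avro deserializer rather than the core consumer, so the consumer logs it as unknown.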
I have a few JDBC-based stages in my pipeline (origin, JDBC Lookup, etc.). When I try to replace the existing JDBC-based origin with another (for example, Oracle CDC with MySQL Binary Log), validation fails on just the lookup processors with a “Failed to get driver instance with multiple JDBC connections” error, even though I haven't changed anything on those processors. Here's the stack trace:

java.lang.RuntimeException: Failed to get driver instance for jdbcUrl=jdbc:oracle:thin:@connection_URL
    at com.zaxxer.hikari.util.DriverDataSource.<init>(DriverDataSource.java:112)
    at com.zaxxer.hikari.pool.PoolBase.initializeDataSource(PoolBase.java:336)
    at com.zaxxer.hikari.pool.PoolBase.<init>(PoolBase.java:109)
    at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:108)
    at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:81)
    at com.streamsets.pipeline.lib.jdbc.JdbcUtil.createDataSourceForRead(JdbcUtil.java:875)
    at com.streams
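HikariCP raises "Failed to get driver instance" when java.sql.DriverManager cannot find a registered driver for the JDBC URL, which often means the driver JAR is missing from (or no longer visible to) the stage library the lookup processors now run under. A standalone sketch to confirm the driver loads and accepts the URL (the driver class and URL are placeholders; adjust for your environment):

    // Standalone check that the JDBC driver is on the classpath and accepts the URL.
    import java.sql.Driver;
    import java.sql.DriverManager;

    public class DriverCheck {
        public static void main(String[] args) throws Exception {
            Class.forName("oracle.jdbc.OracleDriver");   // throws if the JAR is missing
            Driver d = DriverManager.getDriver("jdbc:oracle:thin:@//db-host:1521/ORCL");
            System.out.println("Matched driver: " + d.getClass().getName());
        }
    }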
I am trying to write to an Amazon S3 destination with its Authentication Method set to AWS Keys, but when I run the pipeline I get an “Unable to write object to Amazon S3: The request signature we calculated does not match the signature you provided. Check your key and signing method.” error. Here's the entire stack trace:

Caused by: com.streamsets.pipeline.api.StageException: S3_21 - Unable to write object to Amazon S3, reason : com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we calculated does not match the signature you provided. Check your key and signing method. (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch; Request ID: 12345678915ABCDE; S3 Extended Request ID: xyzxyzxyzxyzxyzxyz=; Proxy: null), S3 Extended Request ID: xyzxyzxyzxyzxyzxyz=
    at com.streamsets.pipeline.stage.destination.s3.AmazonS3Target.write(AmazonS3Target.java:182)
    at com.streamsets.pipeline.api.base.configurablestage.DTarget.write(DTarget.java:34)
    at com.streamsets.datacoll
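SignatureDoesNotMatch usually comes down to the credentials themselves: a secret key pasted with trailing whitespace or a line break, a secret containing characters that got mangled along the way, or significant clock skew on the SDC host. One way to rule out the keys is to try the same pair outside of Data Collector with the AWS CLI (the profile name is a placeholder):

    aws configure --profile sdc-check          # paste the same access key and secret
    aws sts get-caller-identity --profile sdc-check
    aws s3 ls s3://your-bucket --profile sdc-check

If those calls succeed, re-entering the keys in the stage (or regenerating the key pair) is a common next step.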
Hi, I'm currently using SDC 3.21 and I'm hitting the error that is also mentioned in this thread: https://issues.streamsets.com/plugins/servlet/mobile#issue/SDC-12129. Any suggestions on how to resolve this issue permanently? At present the only workaround I have is to restart StreamSets. I did that in my development (local) environment, but that's not an option in production.
Regards,
Swayam
Is there a prebuilt processor/component that captures the number of records processed through stages and other logging events? We have a requirement to capture the number of records processed, plus other logging events, and possibly store them in log files or MySQL.
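There is no dedicated record-counter stage in 3.x that I know of, but Data Collector exposes per-stage input/output/error counts through its REST metrics endpoint, which an external script can poll and persist to files or MySQL. A sketch (the endpoint path and credentials are assumptions for SDC 3.x; verify against Help > RESTful API in your Data Collector):

    curl -u admin:admin \
      "http://sdc-host:18630/rest/v1/pipeline/MyPipelineId/metrics?rev=0"

The response includes counters and meters per stage that you can parse and store however you like.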
While trying to ingest XML data from S3 into Snowflake, I'm facing the error below:

S3_SPOOLDIR_01 - Failed to process object 'UBO/GSRL_Sample_XML.xml' at position '0': com.streamsets.pipeline.stage.origin.s3.BadSpoolObjectException: com.streamsets.pipeline.api.service.dataformats.DataParserException: XML_PARSER_02 - XML object exceeded maximum length: readerId 'com.dnb.asc.stream-sets.us-west-2.poc/UBO/GSRL_Sample_XML.xml', offset '0', maximum length '2147483647'

The XML file is 4 MB. The properties used for the Amazon S3 component are attached; I have also increased Max Record Length to its maximum.
S3 properties:
- Max Record Length: 2147483647
- Data Format: XML
Can you please advise? Is there a size-related constraint involved? We have successfully loaded smaller files from S3 to Snowflake.
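One thing worth checking: besides the stage-level Max Record Length, Data Collector enforces an overall parser record-size limit (1 MB by default, if memory serves), and a stage cannot actually exceed it regardless of its own setting. A sketch of raising it in $SDC_CONF/sdc.properties, assuming your SDC version honors this property (the value shown allows roughly 10 MB records); restart SDC afterwards:

    parser.limit=10485760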
Dear StreamSets,
We have a requirement to transform complex XML data into JSON using XSLT. This needs to be done in Data Collector. The incoming file will contain millions of records, and for each record we need to apply XSLT and write the output to an S3 location. I could not find anything on XSLT support in the Data Collector documentation. Could you please help me with this query?
Note 1: We also have a similar use case to transform JSON data to XML. Does StreamSets support the use of FreeMarker in a Data Collector pipeline?
Note 2: Both XSLT and FreeMarker use external Java functions to support the transformation.
Note 3: Both XSLT and FreeMarker are compiled once per run for better performance.
Regards,
Varadha
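Data Collector has no dedicated XSLT stage, but since it runs on the JVM, one approach (an assumption on my part, not an official feature) is a scripting processor such as the Groovy Evaluator that compiles the stylesheet once in its init script and reuses it per record. A standalone Java sketch of that compile-once pattern using JAXP, which ships with the JDK (the stylesheet path and record handling are placeholders):

    import javax.xml.transform.Templates;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;
    import java.io.File;
    import java.io.StringReader;
    import java.io.StringWriter;

    public class XsltOnce {
        public static void main(String[] args) throws Exception {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Compile once; Templates is thread-safe and reusable across records.
            Templates templates = factory.newTemplates(
                new StreamSource(new File("record-to-json.xsl")));

            String xmlRecord = "<event><type>online</type></event>";
            Transformer t = templates.newTransformer();   // cheap per-record handle
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xmlRecord)),
                        new StreamResult(out));
            System.out.println(out);
        }
    }

Inside SDC the compile step would live in the processor's init script and the transform in the per-record script.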
Hi, I have an XML as shown below:

<events>
  <event>
    <type>online</type>
    <event_date>1-Jan-21</event_date>
    <feedback_status>Closed</feedback_status>
  </event>
  <event>
    <type>online</type>
    <event_date>1-Jan-20</event_date>
    <feedback_status>Closed</feedback_status>
  </event>
  <event>
    <type>online</type>
    <event_date>1-Aug-21</event_date>
    <feedback_status>Open</feedback_status>
  </event>
  <event>
    <type>offline</type>
    <event_date>1-Mar-21</event_date>
    <feedback_status>Closed</feedback_status>
  </event>
  <event>
    <type>offline</type>
    <event_date>1-Feb-20</event_date>
    <feedback_status>Closed</feedback_status>
  </event>
</events>
Hi StreamSets, I would like to know whether there is an initiative to introduce standard origins for various popular SaaS apps like Shopify, Magento, Branch, etc. If there is a space where we can vote for these connectors so they can be prioritised for development, please do share that information.
I am using a legacy version and recently ran into this build issue. I was trying to build datacollector-edge-oss version 3.14 from source with the command

gradlew goClean dist publishToMavenLocal --build-cache --stacktrace --info --scan

and the build is failing, per the scan results: gradle scan link
It seems the Bitbucket API is not accessible, and a fork of the same inflect library is present at the volatile tech link. Kindly let me know how I can solve this.
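Since Data Collector Edge is written in Go, one workaround pattern (an assumption about this particular build, not a verified fix) when a dependency's host goes away is to redirect the module to a reachable fork with a replace directive in go.mod; the module paths and version below are placeholders:

    // go.mod -- sketch only; assumes the build uses Go modules and that a
    // reachable fork of the inflect package exists
    replace bitbucket.org/pkg/inflect => github.com/your-fork/inflect v0.0.1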
Hi team, I am facing an issue reading data through the S3 origin within Transformer (I am able to read through the S3 origin in Data Collector). I am trying to read data from the S3 origin and copy it to a different location using the S3 destination, with EMR as the compute engine. The job runs for several minutes on EMR and completes successfully, and there are no errors in the logs (both the EMR and StreamSets pipeline logs). I do get the warning below, but I'm not sure whether it is the cause:

java.nio.file.NoSuchFileException: /data/transformer/runInfo/testRun__9e731964-6f21-4956-99fa-82206f3451f5__149e11c1-f697-11eb-b9dc-fd846d33049d__56e36c1c-f8c6-11eb-9295-0fa62e75e081@149e11c1-f697-11eb-b9dc-fd846d33049d/run1630923519827/driver-topLevelError.log

I have verified the staging directory as well; all the required files are populated there and are eventually read by spark-submit. In the end, the Transformer pipeline finishes with status START_ERROR: Job completed successfully. This is a show stopper.
We are using 3.21 OSS StreamSets in a unique way: we always create pipelines as templates. We have put a custom UI in the customer's hands to pick their preferred source, and based on their choice we call the StreamSets REST APIs to create customer-specific pipelines in real time. That works great for us, and I believe this is a unique way of building pipelines that others may find interesting. I'm thinking of writing a blog post, and I'm curious whether StreamSets suggests following any specific templates.
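For readers interested in the approach, here is a minimal sketch of the kind of REST call involved (the endpoint path, query parameter, and auth are assumptions for SDC 3.x; verify against Help > RESTful API in your Data Collector):

    curl -u admin:admin -X POST \
      -H "Content-Type: application/json" \
      -H "X-Requested-By: sdc" \
      -d @template-pipeline.json \
      "http://sdc-host:18630/rest/v1/pipeline/CustomerPipeline/import?autoGeneratePipelineId=true"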
I have the JSON file of a StreamSets pipeline, and I need two copies of this pipeline with different names in a single SCH. We tried renaming the title in the JSON file to the first name and imported it into SCH. The next time, I changed the title, but on import into the same SCH, the pipeline with the first name was updated to a new version rather than a new pipeline being created. Please provide some insight into having two copies of the same pipeline with different names in the same SCH.
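The behavior you describe suggests SCH matches an imported pipeline by an ID embedded in the JSON rather than by its title, so changing only the title updates the existing pipeline. A sketch that gives the copy a fresh identity before import (the field names pipelineId and title are assumptions; inspect your exported JSON for the actual keys):

    jq --arg id "$(uuidgen)" \
       '.pipelineId = $id | .title = "My Pipeline Copy"' \
       exported.json > copy.json

Importing copy.json should then create a separate pipeline rather than a new version of the first one.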