Build it with Brenna: Transformer for Snowflake
I am using the Field Type Converter to convert the column into a timestamp and write it into MySQL, but it only writes the 2023 rows. How can I also write the 1970 rows, or fix this with another processor? This is my configuration; the 1970 data is not written to MySQL, only the 2023 data gets written.
When writing to MySQL it throws the error shown below.
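One thing worth checking (an assumption here, since the configuration and error are not shown in full): MySQL's TIMESTAMP type only accepts values from '1970-01-01 00:00:01' UTC onward, so epoch-zero or earlier values fail while 2023 values succeed; a DATETIME column does not have that restriction. The Python sketch below only illustrates that boundary check; the epoch-millisecond sample values are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical epoch-millisecond values, similar to what the Field Type Converter receives.
samples_ms = [0, 1_000, 1_672_531_200_000]  # 1970-01-01 00:00:00, 1970-01-01 00:00:01, 2023-01-01

# MySQL TIMESTAMP lower bound; DATETIME accepts years 1000-9999 instead.
MYSQL_TIMESTAMP_MIN = datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc)

for ms in samples_ms:
    value = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
    fits = value >= MYSQL_TIMESTAMP_MIN
    print(f"{value.isoformat()} -> fits MySQL TIMESTAMP: {fits}")
```

If the 1970 values are legitimate, switching the target column to DATETIME is usually the simpler fix rather than adding another processor.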
I am reading an Avro file from a custom cloud store similar to S3 and trying to convert it into a Parquet file using the whole file evaluator, but it gives this error: Record1-Error Record1 CONVERT_01 - Failed to validate record is a whole file data format : java.lang.IllegalArgumentException: Record does not contain the mandatory fields /fileRef,/fileInfo,/fileInfo/size for Whole File Format.
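For context, this CONVERT_01 message generally means the record reaching the stage is not a whole-file record, i.e. the origin is not reading with the Whole File data format, so the /fileRef and /fileInfo header fields are missing. As a way to sanity-check the file itself outside the pipeline, here is a minimal sketch, assuming the fastavro and pyarrow packages and a local copy of the file (the paths are hypothetical):

```python
# Convert one Avro file to Parquet locally to confirm the file is readable.
from fastavro import reader
import pyarrow as pa
import pyarrow.parquet as pq

avro_path = "input.avro"         # hypothetical local path
parquet_path = "output.parquet"

with open(avro_path, "rb") as f:
    records = list(reader(f))    # deserialize the Avro records into dicts

table = pa.Table.from_pylist(records)  # build an Arrow table from the dicts
pq.write_table(table, parquet_path)    # write it out as Parquet
```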
Hi guys, I have a requirement where I have thousands of JSON file URLs in a file (all files are of the same format). I need to process the data for every file and load the data into a destination. Example: local_dir/file_list.txt
https://example1.json
https://example2.json
https://example3.json
https://example4.json
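One common approach in Data Collector is to read the list file with an origin and fetch each URL downstream (for example with an HTTP Client processor). As a plain illustration of the loop itself, here is a hedged Python sketch; the file path and the requests package are assumptions, and the destination call is left as a placeholder:

```python
import requests

with open("local_dir/file_list.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

for url in urls:
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    record = response.json()   # every file shares the same JSON format
    # load `record` into the destination here (e.g. a database insert or an API call)
    print(url, "->", type(record).__name__)
```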
Hi, I am new to StreamSets and started using it today. I have a task to back up some data to Hive and Hadoop using the JDBC Multitable Consumer. The problem is that there is a table name with '-' instead of '_', for example call-outcomme_id, when the correct one should be call_outcomme_id. I used the Field Remover with 'Keep Listed Fields' to maintain the table name, but when I start the pipeline an error occurs; when I select 'Remove Listed Fields' instead, the column that I targeted is deleted and it works. Thanks in advance.
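A Field Renamer, or an expression in the destination's table-name configuration, is the usual place to normalize such names, since unquoted Hive table names generally cannot contain '-'. The line below just shows the substitution itself, with the sample name taken from the post; it is not StreamSets-specific:

```python
# Sanitize the source table name before it is used as a Hive table name.
raw_table = "call-outcomme_id"              # name as it arrives from the source
hive_table = raw_table.replace("-", "_")    # -> "call_outcomme_id"
print(hive_table)
```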
I have 98 columns/fields and 250,000 rows, and every time I run it I get an error. What I'm doing now is reducing:
Max Batch Size (Records) = 10
Max Clob Size (Characters) = 10
Max Blob Size (Bytes) = 10
Fetch Size = 10
because with the default of 1000 an error occurs. How do I handle that problem?
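A rough back-of-the-envelope estimate can show why the default batch size causes trouble and whether shrinking it (or increasing the Data Collector Java heap) is the right lever. The average field size below is an assumption; CLOB/BLOB columns can be far larger, which is usually what blows the memory budget:

```python
# Hypothetical estimate of how much heap one batch can consume.
columns = 98
batch_records = 1000        # the default Max Batch Size mentioned above
avg_field_bytes = 100       # assumption; adjust for real data, especially CLOB/BLOB fields

bytes_per_batch = columns * batch_records * avg_field_bytes
print(f"~{bytes_per_batch / 1024 / 1024:.1f} MiB per batch")   # ~9.3 MiB at these assumptions
```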
I have created a subscription as suggested in the related topic, but that subscription is not triggered. I am able to create an incident at the pipeline level when the pipeline fails, using the notification tab and providing webhook details for the incident table. I also want some information about the job failure in the incident table, e.g. a short description like "job 'abc' failed" along with its error details. Is this possible using a subscription? If yes, please help me with the same.
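For reference, a job-status subscription with a webhook action can typically pass the job name and error message into the request body. The sketch below only illustrates the shape of such a webhook call against a ServiceNow-style incident table; the endpoint, credentials, and payload field names are assumptions, not StreamSets or ServiceNow defaults:

```python
import requests

incident_url = "https://example.service-now.com/api/now/table/incident"  # hypothetical endpoint

payload = {
    "short_description": "Job 'abc' failed",                       # job name from the event
    "description": "Error details reported by the job failure",    # error text from the event
}

resp = requests.post(incident_url, json=payload, auth=("user", "password"), timeout=30)
resp.raise_for_status()
print(resp.status_code)
```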
Issue: In the Azure Synapse stage, there can be errors that initially look like a problem loading/writing data to Synapse but are actually caused by a connection timeout:
AZURE_STORAGE_07 - Could not get a Stage File Writer instance to write the records: 'AZURE_STORAGE_11 - Could not load file to Azure Storage.
AZURE_STORAGE_02 - Azure Synapse load request failed: 'AZURE_DATA_WAREHOUSE_09 - Could not merge staged file com.streamsets.pipeline.api.StageException: AZURE_DATA_WAREHOUSE_00 - Could not perform SQL operation
To determine the root cause, turn on debug logging by adding the following loggers to the log4j configuration:
logger.l5.name = com.streamsets.pipeline.stage.common.synapse
logger.l5.level = DEBUG
logger.l6.name = com.streamsets.pipeline.stage.destination.datawarehouse
logger.l6.level = TRACE
The debug log may then give more hints and show whether the problem is related to a connection timeout, such as:
Connection is not available, request timed out after 30000ms. Channel
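Once the loggers above are enabled, scanning the Data Collector log for the timeout hint can be scripted; the sketch below is a small Python illustration, and the log path is an assumption that should be replaced with your actual sdc.log location:

```python
# Scan the Data Collector log for the connection-timeout hints mentioned above.
log_path = "/path/to/sdc.log"   # assumption: point this at your sdc.log

with open(log_path, errors="replace") as f:
    for line_no, line in enumerate(f, 1):
        if "request timed out" in line or "Connection is not available" in line:
            print(line_no, line.rstrip())
```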
How to troubleshoot: JDBC Lookup Performance Issues
When facing performance issues with the JDBC Lookup stage, there are a variety of variables that may cause or contribute to the problem. Below are some initial techniques you can add to your tool belt to help diagnose it. (Note: while this article targets JDBC Lookup issues specifically, the same principles in steps 1 and 2 apply to the JDBC Query Consumer.)
Step 1: Enabling DEBUG Logging
Enabling DEBUG can often provide much of the information you need to determine the next step of the investigation. It can add context to error messages you have already observed, and it may also surface errors that were not previously visible. It can also provide information which, while not an obvious error, gives crucial insight into the pipeline's operation, such as statistical data. Debug can be configured in Data Collector via the
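Alongside DEBUG logging, it often helps to time the lookup query directly against the database to see how much of the latency is on the database side rather than in the pipeline. The sketch below is only a generic illustration of that measurement: sqlite3 and the table/query are stand-ins so it runs as-is; in practice you would use your database's Python driver, the actual lookup SELECT, and representative key values.

```python
import sqlite3
import time

# Stand-in database and lookup table so the sketch is runnable on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lookup (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO lookup VALUES (?, ?)", [(i, f"v{i}") for i in range(10_000)])

query = "SELECT value FROM lookup WHERE id = ?"   # stand-in for the real lookup SQL
keys = list(range(1000))                          # representative key values

start = time.perf_counter()
for key in keys:
    conn.execute(query, (key,)).fetchone()
elapsed = time.perf_counter() - start
print(f"{len(keys)} lookups in {elapsed:.3f}s ({elapsed / len(keys) * 1000:.2f} ms each)")
```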
In comparison with Informatica and SSIS, I am puzzled by connection management in StreamSets. If I build a pipeline that reads from MSSQL, and a connection for that server and database is created, I can use it as the source. However, at the beginning I only want to connect to a DEV instance, and the connection is dedicated to it. After I finish testing and want to move on to user acceptance testing (UAT), I can't use this connection. It feels as if my pipeline would have to be modified to use a connection dedicated to UAT. But this isn't right: the pipeline shouldn't need to be changed! In Informatica or SSIS, when I need to switch from DEV to UAT, my "pipeline" doesn't need to change; there is always a mechanism that lets the connection be switched from DEV to UAT seamlessly. I imagine StreamSets also has a way to enable seamless switching of environments with regard to connections, but I can't find it. I would appreciate it very much if someone could share how this is done.
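For what it's worth, the usual answer is runtime parameters: stage configurations can reference ${PARAM_NAME} values that are supplied per job or per run, so one pipeline definition can be pointed at DEV or UAT without editing it. The sketch below is only a generic illustration of that substitution idea, not StreamSets code; the host and database names are hypothetical:

```python
from string import Template

# One pipeline "definition" with placeholders, plus per-environment parameter values.
jdbc_url_template = Template("jdbc:sqlserver://${DB_HOST}:1433;databaseName=${DB_NAME}")

environments = {
    "DEV": {"DB_HOST": "dev-mssql.internal", "DB_NAME": "sales_dev"},
    "UAT": {"DB_HOST": "uat-mssql.internal", "DB_NAME": "sales_uat"},
}

for env, params in environments.items():
    print(env, "->", jdbc_url_template.substitute(params))
```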
I want to evaluate the job metrics field of the orchestrator tasks JSON, which is the output of the Start Jobs origin in an orchestration pipeline. The started job runs a pipeline that moves 5 records from a CSV file in a directory to a Snowflake table. I need to confirm from the job metrics field that the pipeline's output count equals its input count. But when I preview the orchestration pipeline, both the input and output record counts show 0. Is there any additional configuration required that I have missed? I have attached the preview screenshot along with the job instance pipeline.
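The comparison itself can be done in an Expression Evaluator or a scripting processor once the metrics are populated. As a plain illustration of the check, here is a hedged Python sketch; the field names and JSON layout are assumptions, not the actual orchestrator task record structure, and should be adapted to what the Start Jobs origin really emits:

```python
# Hypothetical metrics fragment; adapt the keys to the real orchestrator tasks record.
metrics = {
    "jobMetrics": {"inputRecords": 5, "outputRecords": 5}
}

input_count = metrics["jobMetrics"]["inputRecords"]
output_count = metrics["jobMetrics"]["outputRecords"]
assert input_count == output_count, f"mismatch: {input_count} in vs {output_count} out"
print("counts match:", input_count)
```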
Hello, I am new to StreamSets. How do I install StreamSets Data Collector on an Ubuntu 20.04 server?
Hi team, I have a few queries on architecture and feature support:
When using StreamSets Cloud (SaaS), can I deploy the control plane in our network, or does the control plane stay within StreamSets' boundaries while the processing takes place in the client's AWS account? For the latter case, a follow-up question: will the client need to install an agent to communicate with the control plane, or does the control plane require direct access via some kind of cross-account role to spin up, manage, and spin down resources like EMR?
Does any data (in preview or debug mode) go back to the control plane or StreamSets cloud infrastructure?
Is CDC supported for MongoDB, DynamoDB, PostgreSQL, AuroraDB?
With Kafka, does it support Kerberos-based authentication and authorization?
Can I replay the data at any point in the pipeline?
Does it offer connectivity to on-premise databases over the TCPS protocol?
Does it offer push-based processing for sources like Oracle, SQL Server, Snowflake?
Finally, does StreamSets support
Hi, I was trying to connect the OPC UA Client origin to different free OPC UA servers such as Ignition, the Prosys OPC UA Simulation Server, and the Integration Objects OPC UA Server Simulator, but I was unable to connect; the StreamSets OPC UA client kept refusing the connection. However, when I connected the Integration Objects OPC client to its own server, it worked. Can someone guide me through the steps to connect the OPC UA Client origin, and also suggest OPC UA servers that are known to work with it?
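When a stage refuses to connect, a common cause is a mismatched endpoint URL or security policy/mode, so it can help to first confirm the endpoint with a standalone client. Below is a hedged sketch using the python-opcua package (pip install opcua); the endpoint shown is the Prosys simulator's commonly used default and is an assumption that may differ in your installation:

```python
from opcua import Client

endpoint = "opc.tcp://localhost:53530/OPCUA/SimulationServer"  # assumed Prosys default

client = Client(endpoint)
client.connect()                 # anonymous session, no security policy
try:
    print("connected; namespaces:", client.get_namespace_array())
finally:
    client.disconnect()
```

If this standalone connection succeeds, the issue is more likely in the origin's endpoint/security configuration than in the server itself.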