I am new to StreamSets and working on a project that has some specific requirements.
We need to use Transformer.
We have to run several SQL scripts that create tables and insert data into them from another table, using StreamSets Transformer. The tables need to be created on AWS S3.
I am using the Spark SQL Query processor, but when I try to run 2-3 SQL scripts within one Spark SQL Query component, it gives an error.
I would appreciate your help with the following:
1. Can we run multiple scripts within one Spark SQL Query component? If yes, how?
2. What is the best way to execute a set of SQL scripts that need to run sequentially in StreamSets Transformer?
Any help or advice would be much appreciated.
Thanks a lot in advance.
Running SQL scripts using StreamSets Transformer
In Transformer, every processor generates one or more DataFrames that are passed as input to the next processor. The Spark SQL Query processor therefore expects an input DataFrame, executes the query on it, and produces a new DataFrame.
If you want to run multiple queries, you will have to use multiple processors, one per query.
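For reference, the same one-statement-at-a-time pattern can be sketched in plain Python. This is only an illustration, not a Transformer feature: `split_statements` and `run_sequentially` are hypothetical helper names, and `run_sql` stands in for whatever executes a single statement (for example `spark.sql`, assuming you have a `SparkSession` available in a custom code stage). The splitting is deliberately naive.

```python
from typing import Callable, List


def split_statements(script: str) -> List[str]:
    """Split a SQL script on ';' and drop empty fragments.

    Naive on purpose: it does not handle semicolons that appear
    inside string literals or comments.
    """
    return [part.strip() for part in script.split(";") if part.strip()]


def run_sequentially(script: str, run_sql: Callable[[str], object]) -> List[str]:
    """Run each statement in order through run_sql.

    run_sql is an assumption: any callable that executes one SQL
    statement, such as spark.sql. An exception raised by run_sql
    propagates, so later statements are not executed.
    """
    executed = []
    for statement in split_statements(script):
        run_sql(statement)
        executed.append(statement)
    return executed
```

With a real session this could be called as `run_sequentially(script_text, spark.sql)`, where `script_text` holds the contents of the script. Because each statement runs only after the previous one returns, a CREATE TABLE is finished before the INSERT that depends on it starts.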