Kafka Stream Processor combined with Push-down ETL on Snowflake and Databricks Delta Lake

  • 17 February 2022


Here's a Kafka Stream Processor pipeline that reads events from a "raw" topic, performs streaming transforms, and publishes the transformed events to a "refined" topic that multiple downstream clients subscribe to:

 

Kafka Stream Processor Pipeline
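
For readers who think in code, here is a minimal sketch of the kind of per-event transform such a pipeline applies between the "raw" and "refined" topics. It is not the StreamSets pipeline itself: the event fields (`user_id`, `action`) and the function names `refine` and `process` are assumptions made up for illustration, and the Kafka consumer and producer are replaced by plain Python iterables so the example runs on its own.

```python
import json
from typing import Iterable, Iterator, Optional


def refine(raw_value: bytes) -> Optional[bytes]:
    """Turn one raw event payload into a refined payload, or None to drop it.

    The field names used here are hypothetical, not taken from the pipeline.
    """
    try:
        event = json.loads(raw_value)
    except ValueError:
        # Malformed JSON (or undecodable bytes) is dropped instead of failing the stream.
        return None
    if not isinstance(event, dict) or "user_id" not in event:
        return None
    refined = {
        "user_id": str(event["user_id"]),
        "action": str(event.get("action", "unknown")).lower(),
    }
    return json.dumps(refined, sort_keys=True).encode("utf-8")


def process(raw_events: Iterable[bytes]) -> Iterator[bytes]:
    """Stand-in for: consume from "raw", transform, publish to "refined"."""
    for value in raw_events:
        out = refine(value)
        if out is not None:
            yield out
```

In a real deployment the input iterable would be a Kafka consumer on the "raw" topic and each yielded value would be produced to the "refined" topic, where any number of downstream consumers can subscribe independently.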

 

Here is where the Stream Processor pipeline sits within a StreamSets Topology, which lets you visualize and monitor the end-to-end data flow. Last-mile pipelines move the refined events into Delta Lake, Snowflake, Elasticsearch, S3 and ADLS:

Kafka Stream Processor with Multiple Consumers

 

Users can extend the Topology with Transformer for Snowflake, which performs push-down ETL on Snowflake using Snowpark, and with Transformer running on Databricks Spark, which performs ETL on Delta Lake:

 

Push-down ETL on Snowflake and Databricks Delta Lake
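
To make "push-down" concrete: the transformation is expressed as a query and executed inside the data platform, so the rows never leave it. The sketch below illustrates only that idea. It uses Python's built-in `sqlite3` as a stand-in engine, not Snowflake, Snowpark or Databricks, and the table names (`refined_events`, `action_counts`) and the function `push_down_aggregate` are hypothetical.

```python
import sqlite3


def push_down_aggregate(conn: sqlite3.Connection) -> None:
    """Run the aggregation inside the engine; no rows are pulled into Python."""
    conn.execute(
        "CREATE TABLE action_counts AS "
        "SELECT action, COUNT(*) AS n FROM refined_events GROUP BY action"
    )


# In-memory database standing in for the warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE refined_events (user_id TEXT, action TEXT)")
conn.executemany(
    "INSERT INTO refined_events VALUES (?, ?)",
    [("7", "click"), ("7", "view"), ("9", "click")],
)

push_down_aggregate(conn)

# Only the small aggregated result is read back by the client.
rows = conn.execute(
    "SELECT action, n FROM action_counts ORDER BY action"
).fetchall()
```

The client only submits the statement and reads back the result table; the scan and the grouping happen where the data lives, which is the property push-down ETL relies on.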

 

 

 

 

