Question

Data Collector vs Transformer

  • January 3, 2023
  • 1 reply
  • 252 views

lakshmi_narayanan_t
Discovered Fame

Can anyone provide a quick overview of the pros and cons of Data Collector vs Transformer, beyond what the StreamSets documentation covers?

1 reply

saleempothiwala
Headliner

@lakshmi_narayanan_t 

Data Collector is an ingestion engine that reads data from a source (A) and writes it to a destination (B), applying some transformations to the data along the way. The typical use case is to read from multiple sources and write to a landing area, or to deliver data to a target system. Data is ingested as a stream of batches.


Transformer is an engine that uses Apache Spark to provide ETL at scale. Data is generally processed in batch mode, and transformations operate on whole datasets rather than on individual records.
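
To picture the difference, here is a rough sketch in plain Python. It is not StreamSets or Spark code, and every name in it (`read_in_batches`, `collector_style`, `transformer_style`, the sample records) is made up for illustration. The first function mimics the Data Collector idea of taking small batches as they arrive and transforming each record on its way to the destination. The second mimics the Transformer idea of an operation that only makes sense over the whole dataset, such as a total per customer.

```python
from collections import defaultdict
from itertools import islice


def read_in_batches(source, batch_size):
    """Yield successive lists of up to batch_size records from an iterable."""
    iterator = iter(source)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def collector_style(source, batch_size=2):
    """Hypothetical record-level pipeline: handle one small batch at a time
    and transform each record independently of all the others."""
    destination = []
    for batch in read_in_batches(source, batch_size):
        for record in batch:
            # A per-record change: add a derived field, then deliver.
            destination.append({**record, "amount_cents": record["amount"] * 100})
    return destination


def transformer_style(dataset):
    """Hypothetical dataset-level job: a group-by total, which needs to see
    every row for a customer before the answer is complete."""
    totals = defaultdict(int)
    for row in dataset:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


# Made-up sample data for the illustration.
records = [
    {"customer": "a", "amount": 5},
    {"customer": "b", "amount": 7},
    {"customer": "a", "amount": 3},
]

delivered = collector_style(records)
totals = transformer_style(records)
```

In this toy version the per-record step could start delivering output after the first batch, while the per-customer total is only final once all rows have been read. That is roughly why ingestion and light transformation fit the first style, and joins or aggregations at scale fit the second.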

