Question

Need help getting data from an API and writing it in Parquet format.

  • 24 May 2022

Hi, I am trying to get some data from a SOAP API, and I receive the data in the form of an XML response. The requirement is to convert the data to Parquet format and store it in ADLS Gen2 storage.

As far as I understand, I can use Data Collector to write files in Avro format and then convert the Avro files to Parquet using the Whole File Transformer processor.

I know that Transformer can write Parquet directly, so is there any way for me to skip the intermediate Avro file creation?


1 reply

Hi @Paawan, in SDC we don’t have any destination that can create Parquet files directly.

https://docs.streamsets.com/portal/platform-datacollector/latest/datacollector/UserGuide/Apx-DataFormats/DataFormat_Title.html#concept_jn1_nzb_kv

So I don’t think you can skip the intermediate Avro file creation.
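
As a side note, whichever file format ends up being written, the XML response first has to be turned into flat records. Below is a minimal sketch of that step in plain Python, outside SDC. It is only an illustration: the sample response, the element names (GetOrdersResponse, Order, Id, Amount) and the helper flatten_records are all hypothetical, since the real SOAP payload is not shown in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical SOAP response; the element names are invented for illustration.
SAMPLE_RESPONSE = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetOrdersResponse>
      <Order><Id>1</Id><Amount>9.50</Amount></Order>
      <Order><Id>2</Id><Amount>12.00</Amount></Order>
    </GetOrdersResponse>
  </soap:Body>
</soap:Envelope>"""


def flatten_records(xml_text, record_tag):
    """Return one dict per <record_tag> element, mapping child tag to text.

    Hypothetical helper: assumes each record is a flat element whose
    children hold scalar values and carry no XML namespace.
    """
    root = ET.fromstring(xml_text)
    records = []
    for element in root.iter(record_tag):
        # Every child element becomes one field; values stay as strings.
        records.append({child.tag: (child.text or "").strip() for child in element})
    return records


records = flatten_records(SAMPLE_RESPONSE, "Order")
for record in records:
    print(record)
```

All values come out as strings, so a schema with proper types would still need to be applied before the records are written to Avro or Parquet.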
