I am using StreamSets Data Collector (SDC) to make an API call that returns an XML file. My final requirement is to write the XML output as a Parquet file on Azure Data Lake.
My plan was to convert the XML into Avro and then use a whole file converter to convert the Avro into Parquet. However, the schema generator used to create the Avro file fails with the following error:
SCHEMA_GEN_0007 - Map '/XMLData' have different schemas for items. First schema: '"string"', Second schema: '{"type":"array","items":{"type":"map","values":{"type":"array","items":{"type":"map","values":{"type":"array","items":{"type":"map","values":"string"}}}}}}'
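To illustrate what I think the error means: an Avro map needs one schema shared by all of its values, but `/XMLData` seems to hold both a plain string and a deeply nested list. Below is a minimal Python sketch of that check. It is not SDC's actual schema generator code; the `infer_schema` helper and the field names `status`, `items`, `item`, `detail`, and `name` are made up for illustration, since the real field names come from the XML.

```python
import json


def infer_schema(value, path=""):
    """Infer a simplified Avro-style schema for nested str/list/dict data.

    Hypothetical helper for illustration only; it mimics the rule that a
    map (or array) must use a single schema for all of its items.
    """
    if isinstance(value, str):
        return "string"
    if isinstance(value, (list, dict)):
        is_map = isinstance(value, dict)
        if not value:
            raise ValueError(f"cannot infer a schema for empty value at '{path or '/'}'")
        if is_map:
            schemas = [infer_schema(v, f"{path}/{k}") for k, v in value.items()]
        else:
            schemas = [infer_schema(v, f"{path}[{i}]") for i, v in enumerate(value)]
        first = schemas[0]
        for other in schemas[1:]:
            if other != first:
                kind = "Map" if is_map else "List"
                raise ValueError(
                    f"{kind} '{path}' has different schemas for items. "
                    f"First schema: '{json.dumps(first, separators=(',', ':'))}', "
                    f"Second schema: '{json.dumps(other, separators=(',', ':'))}'"
                )
        if is_map:
            return {"type": "map", "values": first}
        return {"type": "array", "items": first}
    raise TypeError(f"unsupported value type at '{path or '/'}': {type(value).__name__}")


# Assumed record shape: one plain string field next to a nested list field.
record = {
    "XMLData": {
        "status": "ok",
        "items": [{"item": [{"detail": [{"name": "a"}]}]}],
    }
}

try:
    infer_schema(record)
except ValueError as err:
    print(err)
```

If that reading is right, the mismatch comes from `/XMLData` being treated as a map while its children have different shapes, so the fix would involve either making the children uniform or describing `/XMLData` as a record with named fields instead of a map.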
How do I resolve this error?
Attaching the XML preview below -
Also, if there is a more efficient way to do all of this, I'd really appreciate the input.