
I am using SDC to make an API call that returns an XML file. My end goal is to write the XML output as a Parquet file on Azure Data Lake.

My plan was to convert the XML into Avro and then use a whole file converter to convert the Avro into Parquet. However, the Schema Generator used for the Avro file creation fails with the following error:

SCHEMA_GEN_0007 - Map '/XMLData' have different schemas for items. First schema: '"string"', Second schema: '{"type":"array","items":{"type":"map","values":{"type":"array","items":{"type":"map","values":{"type":"array","items":{"type":"map","values":"string"}}}}}}'

 

How do I resolve this error?

Attaching the XML preview below:

Also, if there is a more efficient way to do all this, I’d really appreciate the input.

Hi @Paawan,

Based on the error, it looks like the items inside the '/XMLData' map have different schemas (one is a string, another is a nested list of maps), so the Schema Generator cannot build a single Avro schema for that map.
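
To illustrate why a map whose values have different shapes cannot be described by one Avro map schema, here is a minimal Python sketch. It is not the Schema Generator's actual implementation. The function names (infer_schema, single_schema) and the field names in the sample data (id, orders, order, line, sku) are hypothetical, chosen only to reproduce the two shapes named in the error.

```python
import json


def single_schema(schemas, kind):
    """Return the one schema shared by all items, or raise if they differ."""
    if not schemas:
        # Assumption for this sketch: treat an empty container as strings.
        return "string"
    first = schemas[0]
    for other in schemas[1:]:
        if other != first:
            raise ValueError(
                f"{kind} has different schemas for items. "
                f"First schema: {json.dumps(first)}, "
                f"Second schema: {json.dumps(other)}"
            )
    return first


def infer_schema(value):
    """Infer a simplified Avro-style schema from strings, lists and dicts."""
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        items = [infer_schema(v) for v in value]
        return {"type": "array", "items": single_schema(items, "Array")}
    if isinstance(value, dict):
        # An Avro map allows only ONE schema for all of its values.
        values = [infer_schema(v) for v in value.values()]
        return {"type": "map", "values": single_schema(values, "Map")}
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


# Every value is a string, so a single map schema exists.
uniform = {"a": "x", "b": "y"}
print(infer_schema(uniform))

# Hypothetical data: one string value next to a nested list of maps.
mixed = {"id": "1", "orders": [{"order": [{"line": [{"sku": "abc"}]}]}]}
try:
    infer_schema(mixed)
except ValueError as err:
    print(err)
```

In this sketch the map is accepted only when every value has the same shape. Whether the right fix in the pipeline is to move the string value out of the map or to restructure the nested lists depends on the actual XML.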


Hi @alex.sanchez, thanks for taking the time.

The record basically has three lists, each of which contains maps. I tried removing the string field and passing only the three lists, but that didn’t work either. I am hoping to create a single Avro file for the whole record. How should I go about debugging this? I am not sure where to start.


Hi @Paawan

Your Avro schema should match the input data.

In your case, the data could not be handled by the schema defined for it.

Please modify your schema and retry; I hope that works.

Thanks & Regards,

Bikram_


Hey @Bikram, as you can see in the screenshot, I am not specifying the schema manually. Since I expect data with a dynamic schema, I am using the Schema Generator component to create the schema. The error occurs WHILE the schema is being generated.

