This article covers how to convert an Avro file with an embedded schema to a Parquet file.
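Before building the pipeline, it can help to confirm that the input file really is an Avro file with an embedded schema. The following is a minimal sketch, not part of the pipeline itself. It relies only on the Avro object container layout (the 4-byte magic `Obj\x01` followed by a metadata map that holds the `avro.schema` entry). The function names and the path in the usage note are hypothetical.

```python
import json


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag encoded) from a binary stream."""
    shift = 0
    result = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("unexpected end of file while reading a long")
        byte = raw[0]
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            break
        shift += 7
    # Undo zigzag encoding to recover the signed value.
    return (result >> 1) ^ -(result & 1)


def read_bytes(stream):
    """Read a length-prefixed Avro bytes/string value."""
    return stream.read(read_long(stream))


def read_avro_schema(path):
    """Return the embedded schema of an Avro container file as a Python object."""
    with open(path, "rb") as f:
        if f.read(4) != b"Obj\x01":
            raise ValueError("not an Avro object container file")
        metadata = {}
        while True:
            count = read_long(f)
            if count == 0:
                break  # a zero count ends the metadata map
            if count < 0:
                read_long(f)  # a negative count is followed by the block size in bytes
                count = -count
            for _ in range(count):
                key = read_bytes(f).decode("utf-8")
                metadata[key] = read_bytes(f)
    if "avro.schema" not in metadata:
        raise ValueError("no embedded schema found in the file header")
    return json.loads(metadata["avro.schema"])
```

Calling `read_avro_schema("/path/to/input.avro")` (a placeholder path) returns the schema as a dictionary, or raises `ValueError` if the file is not an Avro container or has no schema in its header.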
STEP 1
Set up your pipeline as follows:
Directory - Whole File Transformer - Local FS
STEP 2
Configure your Directory origin as follows:
GENERAL TAB
FILES TAB
Adjust any fields that depend on your own setup, e.g. File Name Pattern. Keep the rest as shown in the screenshot.
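To get a feel for what a glob-style File Name Pattern would select, you can try the pattern against a list of file names first. This is a rough sketch using Python's `fnmatch` module. It assumes the pattern is a glob such as `*.avro`, and Python's matching may differ in detail from the Directory origin's. The file names and the pattern are made up for illustration.

```python
from fnmatch import fnmatchcase


def matching_files(names, pattern):
    """Return the file names that a glob-style pattern selects (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical directory listing and pattern, for illustration only.
names = ["users.avro", "users.avro.tmp", "orders.avro", "notes.txt"]
print(matching_files(names, "*.avro"))  # prints ['users.avro', 'orders.avro']
```

Note that `users.avro.tmp` is not selected, because the pattern has to match the whole file name.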
POST PROCESSING TAB
DATA FORMAT TAB
STEP 3
Configure the Whole File Transformer stage as follows:
GENERAL TAB
JOB TAB
*Amend the file directory as necessary.
AVRO TO PARQUET TAB
If you preview the pipeline, be sure to select the following boxes, which allow you to click into each record and schema:
You should now have a .parquet file converted from your original Avro file.
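As a quick sanity check on the result, you can confirm that the output file has the Parquet magic bytes (`PAR1`) at both its start and its end. This is only a minimal sketch: it does not validate the file contents, and the function name and the path in the usage note are hypothetical.

```python
import os

PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes."""
    # The smallest possible layout is: magic (4) + footer length (4) + magic (4).
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

For example, `looks_like_parquet("/path/to/output/file.parquet")` (a placeholder path) should return `True` for the file written by the Local FS destination. For a deeper check, open the file with a Parquet reader of your choice.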