Skip to main content
Question

JSON Flattening with Nested elements


Hello Team,

Could you please assist me with below JSON flattening,
 

[
  {
    "ID1": "Test_value",
    "ID2": "Test_value2",
    "ID3": {
      "ID3_1": 1,
      "ID3_2": 2,
      "MAIN_1": {
        "OUTPUT": [
          {
            "OUT1": "1",
            "OUT2": "2",
            "OUT3": "3",
            "OUT4": "4"
          },
          {
            "OUT1": "6",
            "OUT2": "7",
            "OUT3": "8",
            "OUT4": "9"
          },
          {
            "OUT1": "10",
            "OUT2": "11",
            "OUT3": "12",
            "OUT4": "13"
          }
        ]
      },
      "MAIN_2": {
        "OUTPUT": [
          {
            "OUT1": "1",
            "OUT2": "2",
            "OUT3": "3",
            "OUT4": "4"
          },
          {
            "OUT1": "6",
            "OUT2": "7",
            "OUT3": "8",
            "OUT4": "9"
          },
          {
            "OUT1": "10",
            "OUT2": "11",
            "OUT3": "12",
            "OUT4": "13"
          }
        ]
      },
      "MAIN_3": {
        "OUTPUT": [
          {
            "OUT1": "1",
            "OUT2": "2",
            "OUT3": "3",
            "OUT4": "4"
          },
          {
            "OUT1": "6",
            "OUT2": "7",
            "OUT3": "8",
            "OUT4": "9"
          },
          {
            "OUT1": "10",
            "OUT2": "11",
            "OUT3": "12",
            "OUT4": "13"
          }
        ]
      }
	 }
   }
]

 

Expected Output :

 

ID1 ID2 ID3.ID3_1 ID3.ID3_2 ID3.MAIN_1.OUTPUT.OUT1 ID3.MAIN_1.OUTPUT.OUT2 ID3.MAIN_1.OUTPUT.OUT3 ID3.MAIN_1.OUTPUT.OUT4 ID3.MAIN_2.OUTPUT.OUT1 ID3.MAIN_2.OUTPUT.OUT2 ID3.MAIN_2.OUTPUT.OUT3 ID3.MAIN_2.OUTPUT.OUT4

 

Also, we have not fixed number of “MAIN_*” tags, it is not fixed.

 

saleempothiwala
Headliner
Forum|alt.badge.img

Hi @yogesh0590 

 

Have you tried using ‘Field Flattner’ processor and try to flatten ‘entire record’

You can also specify the separator when the flattening happens.


saleempothiwala
Headliner
Forum|alt.badge.img

 

 


Thanks @saleempothiwala , I tried Field Flattener which is good if I have only one set of fields but in case of below structure how can I proceed. (Output should be two rows)

 

 

[
  {
    "ID1": "Test_value",
    "ID2": "Test_value2",
    "ID3": {
      "ID3_1": 1,
      "ID3_2": 2,
      "MAIN_1": {
        "OUTPUT": [
          {
            "OUT1": "1",
            "OUT2": "2",
            "OUT3": "3",
            "OUT4": "4"
          },
          {
            "OUT1": "6",
            "OUT2": "7",
            "OUT3": "8",
            "OUT4": "9"
          },
          {
            "OUT1": "10",
            "OUT2": "11",
            "OUT3": "12",
            "OUT4": "13"
          }
        ]
      },
      "MAIN_2": {
        "OUTPUT": [
          {
            "OUT1": "1",
            "OUT2": "2",
            "OUT3": "3",
            "OUT4": "4"
          },
          {
            "OUT1": "6",
            "OUT2": "7",
            "OUT3": "8",
            "OUT4": "9"
          },
          {
            "OUT1": "10",
            "OUT2": "11",
            "OUT3": "12",
            "OUT4": "13"
          }
        ]
      },
      "MAIN_3": {
        "OUTPUT": [
          {
            "OUT1": "1",
            "OUT2": "2",
            "OUT3": "3",
            "OUT4": "4"
          },
          {
            "OUT1": "6",
            "OUT2": "7",
            "OUT3": "8",
            "OUT4": "9"
          },
          {
            "OUT1": "10",
            "OUT2": "11",
            "OUT3": "12",
            "OUT4": "13"
          }
        ]
      }
	 }
   },
{
    "ID1": "Test_value2",
    "ID2": "Test_value4",
    "ID3": {
      "ID3_1": 1,
      "ID3_2": 2,
      "MAIN_1": {
        "OUTPUT": [
          {
            "OUT1": "1",
            "OUT2": "2",
            "OUT3": "3",
            "OUT4": "4"
          },
          {
            "OUT1": "6",
            "OUT2": "7",
            "OUT3": "8",
            "OUT4": "9"
          },
          {
            "OUT1": "10",
            "OUT2": "11",
            "OUT3": "12",
            "OUT4": "13"
          }
        ]
      },
      "MAIN_2": {
        "OUTPUT": [
          {
            "OUT1": "1",
            "OUT2": "2",
            "OUT3": "3",
            "OUT4": "4"
          },
          {
            "OUT1": "6",
            "OUT2": "7",
            "OUT3": "8",
            "OUT4": "9"
          },
          {
            "OUT1": "10",
            "OUT2": "11",
            "OUT3": "12",
            "OUT4": "13"
          }
        ]
      },
      "MAIN_3": {
        "OUTPUT": [
          {
            "OUT1": "1",
            "OUT2": "2",
            "OUT3": "3",
            "OUT4": "4"
          },
          {
            "OUT1": "6",
            "OUT2": "7",
            "OUT3": "8",
            "OUT4": "9"
          },
          {
            "OUT1": "10",
            "OUT2": "11",
            "OUT3": "12",
            "OUT4": "13"
          }
        ]
      }
	 }
   }
]

 

 

Output

 

ID1    ID2    ID3.ID3_1    ID3.ID3_2    ID3.MAIN_1.OUTPUT.OUT1    ID3.MAIN_1.OUTPUT.OUT2    ID3.MAIN_1.OUTPUT.OUT3    ID3.MAIN_1.OUTPUT.OUT4    ID3.MAIN_2.OUTPUT.OUT1    ID3.MAIN_2.OUTPUT.OUT2    ID3.MAIN_2.OUTPUT.OUT3    ID3.MAIN_2.OUTPUT.OUT4
Test_value    Test_value2    1    2    1    2    3    4    6    7    8    9
Test_value2    Test_value4    1    2    1    2    3    4    6    7    8    9
 


saleempothiwala
Headliner
Forum|alt.badge.img

@yogesh0590 

 

Just add a field pivoter that converts the list of records to separate records then your flattener will flatten two records.

 

Just add ‘/’ to the fields to pivot

 

 


@saleempothiwala : My bad , expected output is like this ,

 

ID1 ID2 ID3.ID3_1 ID3.ID3_2 ID3.MAIN_1.OUTPUT.OUT1 ID3.MAIN_1.OUTPUT.OUT2 ID3.MAIN_1.OUTPUT.OUT3 ID3.MAIN_1.OUTPUT.OUT4 ID3.MAIN_2.OUTPUT.OUT1 ID3.MAIN_2.OUTPUT.OUT2 ID3.MAIN_2.OUTPUT.OUT3 ID3.MAIN_2.OUTPUT.OUT4
Test_value Test_value2 1 2 1 2 3 4 1 2 3 4
Test_value Test_value2 1 2 6 7 8 9 6 7 8 9
Test_value Test_value2 1 2 10 11 12 13 10 11 12 13
Test_value2 Test_value4 1 2 1 2 3 4 1 2 3 4
Test_value2 Test_value4 1 2 6 7 8 9 6 7 8 9
Test_value2 Test_value4 1 2 10 11 12 13 10 11 12 13

saleempothiwala
Headliner
Forum|alt.badge.img

@yogesh0590 

Tinker with the pivoter and use relevant level for pivoting to achieve the desired output. 


Bikram
Headliner
Forum|alt.badge.img+1
  • Headliner
  • January 9, 2023

@yogesh0590 

Please find attached the pipeline , the result is as given above but due to multiple pivot processors in the pipeline the record count has been increased.

 

You can use scripting processors to adjust your data .

 

Please let me know if it helps .

 

Thanks & Regards

Bikram_


@Bikram @saleempothiwala 

 

Thank You for your response but element “MAIN_*” are dynamic , it can be 1 or more than 1. So need to apply some stage so that it can handle it dynamically.


Bikram
Headliner
Forum|alt.badge.img+1
  • Headliner
  • January 10, 2023

@yogesh0590 

can you try to handle it in scripting processor. 


saleempothiwala
Headliner
Forum|alt.badge.img

@yogesh0590, as I said earlier, play around with the Pivoter to get the desired output. You might have to use * for wildcarding


@saleempothiwala 

 

It seems in Pivoter we can’t process more than one field. And I tried using “*” as wild card but it is not supporting getting below error,
 

com.streamsets.pipeline.api.base.OnRecordErrorException: LIST_PIVOT_01 - Record 'mybucket/test.json::0' does not contain field '/ID3/*/OUTPUT'

Fredlau
Fan
  • Fan
  • January 12, 2023

@yogesh0590

you have to add a new Pivoter for each stage in the pipeline, like the pipeline Bikram sent as a zip file in thread. 

And also you can take a look at the documentation for pivoter here

https://docs.streamsets.com/portal/platform-datacollector/latest/datacollector/UserGuide/Processors/ListPivoter.html


Reply