Question

Skip rows while reading excel


I am trying to read an Excel file whose data starts at row 3. My requirement is to take the column names from row 3 and the data from row 4 onward, and to skip columns 1 and 2.


7 replies


Hi @shiva - Thank you for reaching out to the StreamSets Community Platform.

You can try the Field Remover processor to skip the fields, as shown below:

 

Sample Data
Select the field that needs to be skipped.
Final Output

 

Let me know if it fits your use case.

Thank you - AkshayJ

In my case, I am receiving Excel files from a client where the 1st row is empty, the 2nd row has the header names, and the data starts from the 3rd row.

I want to ignore the 1st row, set the 2nd row as the column header, and treat the 3rd row onward as data.
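
For illustration only, the requirement above can be sketched in plain Python, independent of any StreamSets stage. The function name `promote_header`, its parameters, and the sample rows are all hypothetical; the sketch assumes the sheet has already been loaded as a list of rows, with `None` for empty cells.

```python
def promote_header(rows, header_row=1, skip_columns=()):
    """Use rows[header_row] as column names and return the later rows as dicts.

    header_row is a 0-based index, so 1 means the 2nd row of the sheet.
    skip_columns holds 0-based column indexes to leave out of the result.
    """
    header = rows[header_row]
    keep = [i for i in range(len(header)) if i not in skip_columns]
    records = []
    for row in rows[header_row + 1:]:
        # Short rows are padded with None so every record has the same keys.
        records.append({header[i]: (row[i] if i < len(row) else None) for i in keep})
    return records


# Hypothetical sample: 1st row empty, 2nd row header, data from the 3rd row.
rows = [
    [None, None, None],
    ["id", "name", "city"],
    [1, "Asha", "Pune"],
    [2, "Ravi", "Delhi"],
]

records = promote_header(rows)
without_first_two_columns = promote_header(rows, skip_columns=(0, 1))
```

Passing `skip_columns=(0, 1)` covers the variant from the original question, where the first two columns are not wanted.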


@shiva - Could you try using a Field Remover stage after the Directory origin to remove the fields that are not important to you? This should fix the issue.


Hi @shiva - Just wanted to check whether you were able to set up the Field Remover as mentioned above.

Let us know if you need any further help.


@shiva 

Please find attached the pipeline, which will help you fetch data from the Excel template and ignore the cells that have no values.

As part of the testing, I created a file for your use case in which the first 2 rows have no values, the 3rd row has the header values, and the remaining rows have data.

The pipeline skips the cells having null values and reads the cells having data.
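
The attached pipeline itself is not reproduced in this thread. As a rough sketch of the same idea in plain Python (not the pipeline's actual logic), the snippet below drops rows whose cells are all empty and treats the first remaining row as the header. The helper names `is_blank`, `drop_blank_rows`, and `read_table`, and the sample data, are assumptions made for illustration.

```python
def is_blank(value):
    """Treat None and whitespace-only strings as empty cells."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def drop_blank_rows(rows):
    """Remove rows in which every cell is empty."""
    return [row for row in rows if not all(is_blank(cell) for cell in row)]


def read_table(rows):
    """Use the first non-empty row as the header and the rest as records."""
    data = drop_blank_rows(rows)
    if not data:
        return []
    header, body = data[0], data[1:]
    # Columns with an empty header cell are ignored.
    keep = [i for i, name in enumerate(header) if not is_blank(name)]
    return [
        {header[i]: (row[i] if i < len(row) else None) for i in keep}
        for row in body
    ]


# Hypothetical test file: 2 empty rows, header on the 3rd row, then data.
sheet = [
    [None, None, None],
    ["", " ", None],
    ["id", "name", "city"],
    [1, "Asha", "Pune"],
    [None, None, None],
    [2, "Ravi", "Delhi"],
]

table = read_table(sheet)
```

Because the header is found by content rather than by a fixed row number, the same sketch works whether the file has one empty leading row or two.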

 

I have attached the pipeline and the test file, and I hope it helps. If you are having issues, please provide a sample input file so I can validate it and help you with it.

 

Thanks & Regards

Bikram_


@shiva 

Kindly confirm whether the solution helps in your use case. In case of any issues, please let me know and I will try to help you with it.


@shiva

Attached the pipeline for your reference.
