Question

Data copy mismatch for same s3 bucket

January 3, 2022
5 replies
105 views

mySSname
Fan

Hi,

I am trying to copy all the files from one folder to another folder within the same S3 bucket using a StreamSets job, but I am seeing more files in the destination folder than in the source folder (for example, the source folder has 7 files, but the destination folder ends up with 8, 10, or 12). This only happens on the first run of the day. If I run the same job again that day, the record counts match between source and destination. Can anyone help me with this issue?

Thanks

Murali
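
To narrow down a mismatch like the one described above, it can help to see exactly which object names differ between the two folders, not just the counts. The following is a minimal sketch, not part of the original thread: `compare_folders` and the example keys are hypothetical, and the key lists are assumed to have been collected separately (for instance with boto3's `list_objects_v2` paginator).

```python
from typing import Dict, Iterable, List, Set


def relative_names(keys: Iterable[str], prefix: str) -> Set[str]:
    """Return object names under `prefix`, with the prefix stripped."""
    return {
        key[len(prefix):]
        for key in keys
        # Skip keys outside the folder and the zero-byte folder marker itself.
        if key.startswith(prefix) and key != prefix
    }


def compare_folders(
    source_keys: Iterable[str],
    dest_keys: Iterable[str],
    source_prefix: str,
    dest_prefix: str,
) -> Dict[str, List[str]]:
    """Report names present in only one of the two folders."""
    src = relative_names(source_keys, source_prefix)
    dst = relative_names(dest_keys, dest_prefix)
    return {
        "missing_in_dest": sorted(src - dst),
        "extra_in_dest": sorted(dst - src),
    }


# Hypothetical key listings for illustration only.
source = ["in/", "in/a.csv", "in/b.csv"]
dest = ["out/a.csv", "out/b.csv", "out/part-0001"]
report = compare_folders(source, dest, "in/", "out/")
print(report)
```

If the extra destination objects have names that never appear in the source, that suggests the pipeline is writing new files from records and not copying the originals one-to-one.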

bob
StreamSets Employee
January 3, 2022

Hi Murali,

Please verify that every time you’re copying the files, the Data Format is set to Whole File mode.

Thanks,

bob

mySSname
Author
Fan
January 3, 2022

Hi Bob,

Thanks for the reply. I will try Whole File mode and get back to you.

Thanks

Murali.

mySSname
Author
Fan
January 3, 2022

Hi Bob,

When I run the job with the Whole File data format, I get the error below.

Error happened when processing record
com.streamsets.pipeline.stage.lib.hive.exceptions.HiveStageCheckedException: HIVE_19 - Unsupported Type: FILE_REF

Thanks

Murali.

mySSname
Author
Fan
January 4, 2022

Hi,

When I run the job with the Whole File data format, I get the error below.

Error happened when processing record
com.streamsets.pipeline.stage.lib.hive.exceptions.HiveStageCheckedException: HIVE_19 - Unsupported Type: FILE_REF

Can anyone help me with this issue?

Thanks

Murali.

bob
StreamSets Employee
January 5, 2022

Hi Murali,

From the error you pasted, it looks like the destination you’re using is Hive? The initial problem description suggested you were copying from S3 to S3. Hive is record-based and does not support Whole File. To copy complete files without processing them in the pipeline, you might want to use HDFS as the destination.

If your organization has an Enterprise support contract, please open a support ticket.

Thanks,

bob
