Skip to main content
Solved

How to configure kinesis shard


Hi,

I use the kinesis consumer on DC 5.3. Configuration shows no errors. When I preview the pipeline everything works but no records are read. This seems to be due to that only one shard contains the actual data. How can I configure the shard id to read from.

 

Thanks,

 

Marco

Best answer by AkshayJadhav

@mstoffel - Thank you for testing. It seems fetching data from particular shard is not possible with Kinesis origin at the moment. However I will check with the engineering team to raise a feature request.

Thank you.

View original
Did this topic help you find an answer to your question?

10 replies

AkshayJadhav
StreamSets Employee
Forum|alt.badge.img
  • StreamSets Employee
  • 101 replies
  • March 28, 2023

Hello @mstoffel - Thank you for reaching out to the StreamSets Community. As per my understanding, I do not see any configuration that will fetch the data from the particular shard however would you please try with the “maxConnections” to 2 and let us know if that works?

 


AkshayJadhav
StreamSets Employee
Forum|alt.badge.img
  • StreamSets Employee
  • 101 replies
  • March 28, 2023

Please keep the connection = total shards. 


  • Author
  • Roadie
  • 9 replies
  • March 29, 2023

Hi,

tried that, I still get nothing.

 

Kind Regards,

 

Marco


AkshayJadhav
StreamSets Employee
Forum|alt.badge.img
  • StreamSets Employee
  • 101 replies
  • March 29, 2023

Hello mstoffel - Thank you for the update. For a testing purpose, I have created a 3 shard stream and inserted 1 records and was able to consume without any issue. I had to change the initial position to TRIM_HORIZON which means read message from the beginning of the stream.

Could you give it a try with the above configuration and see if that works? 


  • Author
  • Roadie
  • 9 replies
  • March 29, 2023

I use that exact setting. I also use TimeHorizon when I’m testing within aws kinesis eventViewer. It’s the only way to get non live events. In StreamSets i get just nothing. And I think the connection is correct, sice I don’t get any error on pipeline preview.There are 4 (0-3) Shards in my stream. Data is send on Shard1.

 

kind regards,

 

Marco 


AkshayJadhav
StreamSets Employee
Forum|alt.badge.img
  • StreamSets Employee
  • 101 replies
  • March 31, 2023

Hello @mstoffel  - What is the strategy that you are using ExplicitHashKey or PartitionKey while sending data to the Kinesis?


  • Author
  • Roadie
  • 9 replies
  • March 31, 2023

Hi, this is a received payload:

 

{

"Records": [

{

"kinesis": {

"kinesisSchemaVersion": "1.0",

"partitionKey": "/f8mlm56bg6/mkhtcm5781",

"sequenceNumber": "49639146678182334589956014474114599020442513762966044690",

"data": "eyJ0aW1lc3RhbXAiOiIyMDIzLTAzLTMxIDEwOjI3OjMwLjk1NSIsImV2ZW50SWQiOiI1NGYzNWNmNGRkZWUtMjQxIiwidmVyc2lvbiI6IjEuMCIsInByb2plY3REaXNwbGF5TmFtZSI6ImN1bXVsb2NpdHktcmVtb3RlLW1vbml0b3JpbmciLCJzaXRlRGlzcGxheU5hbWUiOiJTaXRlIDEiLCJhc3NldERpc3BsYXlOYW1lIjoiV2luZG1pbGwiLCJzZW5zb==",

"approximateArrivalTimestamp": 1680258458.496

},

"eventSource": "aws:kinesis",

"eventVersion": "1.0",

"eventID": "shardId-000000000001:49639146678182334589956014474114599020442513762966044694",

"eventName": "aws:kinesis:record",

"invokeIdentityArn": "arn:aws:iam::572053383:role/service-role/sendMeasurementToC8y",

"awsRegion": "us-east-1",

"eventSourceARN": "arn:aws:kinesis:us-east-1:572086383:stream/monitorn-data"

}

]

}

Looks like partitionKey


AkshayJadhav
StreamSets Employee
Forum|alt.badge.img
  • StreamSets Employee
  • 101 replies
  • April 1, 2023

@mstoffel - Thank you for sharing the information. We actually just implement a KCL interface provided by AWS in the streamSet framework. As per my findings, it is not possible from StreamSets side however could you please try adding configuration partitionKey as below and start the pipeline with rest origin and start option?

/f8mlm56bg6/mkhtcm5781

  • Author
  • Roadie
  • 9 replies
  • April 1, 2023

This doesn seem to be a valid configuration:

 

 


AkshayJadhav
StreamSets Employee
Forum|alt.badge.img
  • StreamSets Employee
  • 101 replies
  • Answer
  • April 2, 2023

@mstoffel - Thank you for testing. It seems fetching data from particular shard is not possible with Kinesis origin at the moment. However I will check with the engineering team to raise a feature request.

Thank you.


Reply