Solving a Fun Data Problem with StreamSets Transformer

  • 16 February 2022

  • StreamSets Employee

In this pipeline we solve a fun data problem with the help of StreamSets Transformer. The Transformer execution engine runs data pipelines on Apache Spark, so this pipeline can run on any Spark cluster type.

 

Problem Statement: Find the average number of friends for each age and sort the results in ascending order.

We are given a fake friends dataset from a social networking platform. It is in CSV format and stored in Google Cloud Storage:

id, Name, Age, Number of Friends
0,Will,33,385
1,Jean-Luc,26,2
2,Hugh,55,221
3,Deanna,40,465
4,Quark,68,21
5,Weyoun,59,318
6,Gowron,37,220
7,Will,54,307
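
To make the expected result concrete, here is a minimal sketch in plain Python that computes the same aggregate on the sample rows above. It is not the Transformer pipeline itself and does not use Spark. The function name `average_friends_by_age` is hypothetical, and the sketch assumes "ascending order" means sorting by age.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the post; the real file lives in Google Cloud Storage.
SAMPLE = """id, Name, Age, Number of Friends
0,Will,33,385
1,Jean-Luc,26,2
2,Hugh,55,221
3,Deanna,40,465
4,Quark,68,21
5,Weyoun,59,318
6,Gowron,37,220
7,Will,54,307
"""


def average_friends_by_age(csv_text):
    """Return (age, average friends) pairs sorted by age, ascending."""
    # age -> [sum of friends, number of people]
    totals = defaultdict(lambda: [0, 0])
    # skipinitialspace strips the space after each comma in the header row
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        age = int(row["Age"])
        totals[age][0] += int(row["Number of Friends"])
        totals[age][1] += 1
    # Assumption: results are ordered by age, not by the average itself
    return sorted((age, total / count) for age, (total, count) in totals.items())


result = average_friends_by_age(SAMPLE)
for age, avg in result:
    print(age, avg)
```

In a Spark-based pipeline the same logic corresponds to grouping by the age column, averaging the friends column, and sorting the grouped result.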

