Solved

Handle sensitive data through Data Collector pipelines

  • 25 October 2021


I have a use case: my org is using Data Collector 3.21.x to ingest non-sensitive data into an S3 bucket.

 

We now have a requirement to bring sensitive data through Data Collector into an S3 bucket dedicated to sensitive data.

 

My questions:

  1. I do not want every user who has access to StreamSets Data Collector to be able to view the pipeline that brings in the sensitive data. Can I do that with the open source version (3.21.x) of Data Collector? If yes, please advise how.
  2. Is it possible to restrict which users can view a given set of pipelines in StreamSets Data Collector (open source version)? Example: I want ONLY members of the Finance team to be able to view Finance pipelines in Data Collector.

 

Note: 

I know I can spin up a separate instance of Data Collector just to handle sensitive data, but I do not want to go down that path.

 

Thanks,

Srini

 

 

 


Best answer by Srinivasan Sankar 28 October 2021, 16:46


3 replies


Hi @Srinivasan Sankar,

My recommendation would be to try out our latest offering: with the DataOps Platform you will get better user management.

Apart from that, if you want to stick with the open source version, you can check our docs here.

 

Thanks
 


Thanks Alex.

Our org is planning to get the DataOps Platform and is in discussions.

 

In the meantime, I managed to meet my requirements using pipeline ACLs. I set the pipeline.access.control.enabled property to true in the sdc.properties file.

 

I created AD groups for internal and sensitive data and mapped those AD groups to roles in the http.authentication.ldap.role.mapping property in the sdc.properties file.
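
For anyone wanting to reproduce the two sdc.properties settings, here is a minimal Python sketch that renders them. The AD group names (`sdc-internal-data`, `sdc-sensitive-data`) and the roles assigned to them are hypothetical, and the mapping format (groups separated by semicolons, roles by commas) is my understanding of the property and should be checked against the Data Collector docs for your version.

```python
# Hypothetical AD group names and role assignments; replace with your own.
ROLE_MAPPING = {
    "sdc-internal-data": ["creator", "manager"],
    "sdc-sensitive-data": ["creator", "manager"],
}

# Data Collector role names (assumed to be the full set for 3.21.x).
VALID_ROLES = {"admin", "manager", "creator", "guest"}


def render_role_mapping(mapping):
    """Build '<group>:<role>,<role>;<group>:<role>' (assumed value format)."""
    parts = []
    for group, roles in mapping.items():
        unknown = set(roles) - VALID_ROLES
        if unknown:
            raise ValueError(
                f"unknown Data Collector role(s) for {group}: {sorted(unknown)}"
            )
        parts.append(f"{group}:{','.join(roles)}")
    return ";".join(parts)


def render_properties(mapping):
    """Return the two sdc.properties lines discussed in this post."""
    return "\n".join(
        [
            "pipeline.access.control.enabled=true",
            f"http.authentication.ldap.role.mapping={render_role_mapping(mapping)}",
        ]
    )


if __name__ == "__main__":
    print(render_properties(ROLE_MAPPING))
```

The output is meant to be pasted into (or diffed against) sdc.properties. Data Collector typically needs a restart before changes to that file take effect.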

 

Pipeline developers can now set granular permissions (read, write, and execute) on their pipelines and share them with the relevant AD groups. We have introduced a process to categorize pipelines as handling internal or sensitive data and to define which permissions to set for each AD group a pipeline is shared with.

 

We plan to automate setting permissions using the REST API and integrate with our CI/CD process.
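
A rough sketch of what that automation could look like in Python, using only the standard library. Several things here are assumptions rather than confirmed details: the endpoint path `/rest/v1/acl/<pipelineId>`, the shape of the ACL JSON body, the action names, and the use of HTTP basic authentication. Please verify them against the REST API reference bundled with your Data Collector (Help > RESTful API) before relying on this. The function names, pipeline ID, owner, and group name are made up for illustration.

```python
import base64
import json
import urllib.request


def build_acl(pipeline_id, owner, group_actions):
    """Build an ACL document (assumed shape) granting actions to groups.

    group_actions maps a group name to a collection of actions such as
    {"READ", "WRITE", "EXECUTE"} (assumed action names).
    """
    permissions = [
        {"subjectId": group, "subjectType": "GROUP", "actions": sorted(actions)}
        for group, actions in group_actions.items()
    ]
    return {
        "resourceId": pipeline_id,
        "resourceOwner": owner,
        "resourceType": "PIPELINE",
        "permissions": permissions,
    }


def post_acl(base_url, acl, user, password):
    """POST the ACL to Data Collector and return the HTTP status code.

    Assumes basic authentication and the /rest/v1/acl/<pipelineId> path.
    """
    url = f"{base_url}/rest/v1/acl/{acl['resourceId']}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        data=json.dumps(acl).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Requested-By": "sdc",  # SDC expects this header on write calls
            "Authorization": f"Basic {token}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical pipeline ID, owner, and AD group.
    acl = build_acl("financePipeline01", "srini", {"sdc-sensitive-data": {"READ"}})
    print(json.dumps(acl, indent=2))
```

Keeping the payload builder separate from the HTTP call means a CI/CD job can review or unit-test the intended permissions before anything is sent to Data Collector.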

 

Hope this info is useful to others in the community.

 

Cheers,

Srini


This is awesome, thank you for the level of detail! Hopefully other community members will benefit from this.
