
Hi

We are using StreamSets provisioned from the Google Cloud Marketplace.

We are trying to create a data pipeline with a Kafka topic as the origin and Delta Lake as the destination.

While setting it up, we observed that the “Staging Location” requires AWS S3 or Azure Storage; Google Cloud Storage and other alternatives are not offered.

We do not have an AWS or Azure account.

Is AWS or Azure storage mandatory for any Delta Lake ingestion, even though our application may run on neither of them?

Hi @onlinepk ,

 

Unfortunately, our Delta Lake connection works in batches (to increase performance and reduce costs): we push a batch of data to the staging area (in Azure or AWS) and from there copy it directly into the destination table.
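For illustration, the staged-batch pattern described above can be sketched in Python. This is a minimal, self-contained mock: local files stand in for the S3/ADLS staging area, a plain list stands in for the Delta table, and all names (`stage_batch`, `copy_into_table`, `run_pipeline`, `BATCH_SIZE`) are hypothetical, not the actual StreamSets implementation:

```python
# Sketch of a staged-batch load: buffer records into batches, write each
# batch to a staging area, then bulk-copy the staged file into the
# destination table in one step (Delta Lake would run COPY INTO against
# the staged object; here a Python list stands in for the table).
import json
import tempfile
from pathlib import Path

BATCH_SIZE = 3  # illustrative batch size

def stage_batch(records, staging_dir: Path, batch_id: int) -> Path:
    """Write one batch to the staging area as a single file."""
    path = staging_dir / f"batch-{batch_id}.json"
    path.write_text(json.dumps(records))
    return path

def copy_into_table(staged_file: Path, table: list) -> None:
    """Bulk-load a staged file into the destination table."""
    table.extend(json.loads(staged_file.read_text()))

def run_pipeline(stream, staging_dir: Path, table: list) -> None:
    batch, batch_id = [], 0
    for record in stream:
        batch.append(record)
        if len(batch) == BATCH_SIZE:  # flush when the batch is full
            copy_into_table(stage_batch(batch, staging_dir, batch_id), table)
            batch, batch_id = [], batch_id + 1
    if batch:  # flush the final partial batch
        copy_into_table(stage_batch(batch, staging_dir, batch_id), table)

staging = Path(tempfile.mkdtemp())
destination = []
run_pipeline(({"id": i} for i in range(7)), staging, destination)
print(len(destination))  # 7 records loaded via 3 staged batches
```

The point of the two-step design is that one bulk copy per staged file is far cheaper than issuing a write per record against the destination table.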

If you are a customer, please consider opening a feature request so that we can consider GCP as an option.

Thanks


Thanks, Alex, for the prompt response.

Are there any alternatives or workarounds for this?

Additionally, we have subscribed to StreamSets through the Google Cloud Marketplace.

So I am hoping that makes us a “customer” who can raise a feature request.


Hi @onlinepk,

 

Unfortunately, there is no way to replace the staging location with alternative functionality.

Please reach out to support to get that feature request created.

