Run Pipeline for a specific amount of time

  • 6 March 2023
  • 2 replies

I have numerous Kafka topics that I’m moving to Databricks, but I don’t want the pipelines to run continuously. Is there a way to schedule a pipeline to run for a certain amount of time, or to trigger it to stop after a certain amount of time, say an hour or two?


Best answer by saleempothiwala 6 March 2023, 22:35



If the source is Kafka and you want to stop after consuming the messages, then you need to stop the pipeline from code.

We don’t have any event that stops the pipeline.


You can also try one thing: stop on the event by selecting the Databricks option, and check whether it helps.

Below is a snippet for your reference.


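Init Script:

    sdc.state['first_batch'] = "true"

Script:

    # Stop once an empty batch arrives after at least one batch with data.
    if sdc.state['first_batch'] == "false" and len(sdc.records) == 0:
        # Assumed: the logging call was garbled in the original post.
        sdc.log.info("No more Kafka messages to consume. Stopping pipeline. See ya!")
        sdc.toEvent(sdc.createEvent("no-more-messages", 0))

    for record in sdc.records:
        try:
            # Assumed: the try body was lost in the original post; pass the record through.
            sdc.output.write(record)
        except Exception as e:
            # Send record to error
            sdc.error.write(record, str(e))

    if sdc.state['first_batch'] == "true" and len(sdc.records) > 0:
        sdc.state['first_batch'] = "false"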
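
If you want to check the stop condition before putting it in the pipeline, here is a minimal sketch in plain Python that mirrors the first_batch logic above. The function name `should_stop` and the simulated batch sizes are made up for illustration; nothing here calls the Data Collector API.

```python
def should_stop(state, batch_size):
    """Return True once an empty batch follows at least one non-empty batch."""
    if state['first_batch'] == "false" and batch_size == 0:
        return True
    if state['first_batch'] == "true" and batch_size > 0:
        # Data has been seen, so the next empty batch means the topic is drained.
        state['first_batch'] = "false"
    return False


# Simulated batch sizes: one idle batch, two batches with data, then an empty one.
state = {'first_batch': "true"}
decisions = [should_stop(state, n) for n in [0, 5, 3, 0]]
```

The empty batch at the start does not stop anything because no data has been seen yet; only the empty batch that comes after data does.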


Please let me know if I can help further with this issue.


Thanks & Regards



@pkandra you can create two scheduled tasks:

  1. A scheduled task with Action = Start at, say, 1pm
  2. A scheduled task with Action = Stop at, say, 3pm
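
If a scheduler is not an option, a fixed run window could also be approximated from a script by tracking elapsed time. The following is only a sketch in plain Python: the `RunWindow` class is hypothetical, and what to do once the window expires (for example, sending a stop event from a script) is left as an assumption.

```python
import time


class RunWindow(object):
    """Tracks whether a fixed time budget has been used up."""

    def __init__(self, max_seconds, clock=time.time):
        self.max_seconds = max_seconds
        self.clock = clock
        self.started_at = clock()

    def expired(self):
        # True once at least max_seconds have passed since the window was created.
        return self.clock() - self.started_at >= self.max_seconds


# Example with a fake clock so the check is deterministic.
fake_now = [0.0]
window = RunWindow(max_seconds=2 * 60 * 60, clock=lambda: fake_now[0])
before = window.expired()   # checked at the start of the two-hour window
fake_now[0] = 7200.0
after = window.expired()    # checked exactly two hours later
```

The clock is passed in so the check can be tested without waiting; in real use the default `time.time` would apply.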