Attempting to perform an SDC instance "lookup" by Control Hub label fails in the Python SDK.



Issue:

When attempting to get a Data Collector instance from Control Hub using the Python SDK, the following error occurs:

Traceback (most recent call last): 
2020-01-21T23:07:45.0864998Z File "/usr/local/lib/python3.6/site-packages/streamsets/sdk/utils.py", line 352, in get 
2020-01-21T23:07:45.0865236Z return next(i for i in self if all(getattr(i, k) == v for k, v in kwargs.items())) 
2020-01-21T23:07:45.0865379Z StopIteration 
2020-01-21T23:07:45.0865427Z 
2020-01-21T23:07:45.0865555Z During handling of the above exception, another exception occurred: 
2020-01-21T23:07:45.0865619Z 
2020-01-21T23:07:45.0865690Z Traceback (most recent call last): 
2020-01-21T23:07:45.0866101Z File "/azp/agent/_work/_temp/f70b5a08-7539-4721-9406-1806e48c6acb.py", line 38, in <module> 
2020-01-21T23:07:45.0866438Z dc = sch.data_collectors.get(labels=['agent']) 
2020-01-21T23:07:45.0866788Z File "/usr/local/lib/python3.6/site-packages/streamsets/sdk/utils.py", line 355, in get 
2020-01-21T23:07:45.0866947Z for k, v in kwargs.items())))

However, the labels are clearly visible in the Control Hub UI for the Data Collector in question.

Solution:

Control Hub distinguishes two kinds of labels for the SDC instances registered with it: labels and reported labels. "Labels" are values assigned manually to an SDC instance after it has been registered with Control Hub. "Reported labels", on the other hand, are supplied to Control Hub while the SDC instance is being registered, as is the case for SDC instances created by the Control Hub Provisioning Agent via a Deployment on Kubernetes. A lookup that filters on "labels" will therefore not find an instance whose matching label is a reported label.

To see the same separation in Control Hub itself, you can issue the following REST API calls (both GET requests), where '{sdcId}' is the UUID of the SDC instance in question:
1. https://your-sch-host.com:port/jobrunner/rest/v1/sdc/{sdcId}/labels
2. https://your-sch-host.com:port/jobrunner/rest/v1/sdc/{sdcId}/reportedLabels
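
The two calls above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library. The function names label_urls and fetch_labels are hypothetical, and the auth_headers parameter is an assumption: Control Hub requires an authenticated session, and the exact headers depend on your setup, so they are left to the caller. The sketch also assumes the endpoints return JSON.

```python
import json
import urllib.request


def label_urls(base_url, sdc_id):
    """Build the two label endpoints for one SDC instance (sdc_id is its UUID)."""
    root = "{}/jobrunner/rest/v1/sdc/{}".format(base_url.rstrip("/"), sdc_id)
    return {
        "labels": root + "/labels",
        "reported_labels": root + "/reportedLabels",
    }


def fetch_labels(base_url, sdc_id, auth_headers):
    """GET both endpoints and return the decoded bodies, keyed by label kind.

    auth_headers is an assumption: pass whatever authentication headers
    your Control Hub session requires. A JSON response body is also assumed.
    """
    result = {}
    for kind, url in label_urls(base_url, sdc_id).items():
        request = urllib.request.Request(url, headers=auth_headers, method="GET")
        with urllib.request.urlopen(request) as response:
            result[kind] = json.loads(response.read().decode("utf-8"))
    return result
```

Comparing the two entries in the returned dictionary shows which kind of label a given value belongs to.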

The Python SDK exposes the same distinction:

from streamsets.sdk.sch import ControlHub

# Connect to Control Hub (replace the URL and credentials with your own)
sch = ControlHub('https://your-sch-host.com:port', username="user", password="password")

# A list, as in the failing call above; get() compares each attribute with ==, so type and contents must match
sdc_labels = ['label1', 'label2']

# To get an SDC instance by its manually-assigned labels
sdc_instance = sch.data_collectors.get(labels=sdc_labels)
# To get an SDC instance by its reported labels
sdc_instance = sch.data_collectors.get(reported_labels=sdc_labels)

 

In summary, the Python SDK can identify an SDC instance by the labels assigned to it. To look up an SDC instance by label, first determine whether the label appears in the instance's "labels" or its "reported_labels", then pass the corresponding keyword argument to the 'get' call.
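
When it is not known in advance which of the two properties holds a label, a small helper can check both. The sketch below relies only on what the traceback shows: the collection is iterable and each item exposes its fields as attributes. The name find_by_any_label is hypothetical and not part of the SDK, and the SimpleNamespace objects are stand-ins so the example runs without a Control Hub connection. Unlike get(), which the traceback shows comparing values with ==, the helper treats the requested labels as a subset to be found.

```python
from types import SimpleNamespace


def find_by_any_label(collectors, wanted):
    """Return items whose labels and reported_labels together contain every wanted label."""
    wanted = set(wanted)
    if not wanted:
        return []
    matches = []
    for collector in collectors:
        # Either attribute may be missing or None on a given item.
        assigned = set(getattr(collector, "labels", None) or [])
        reported = set(getattr(collector, "reported_labels", None) or [])
        if wanted <= (assigned | reported):
            matches.append(collector)
    return matches


# Stand-in objects in place of real Data Collector instances.
manual_only = SimpleNamespace(labels=["agent"], reported_labels=[])
provisioned = SimpleNamespace(labels=["team-a"], reported_labels=["agent", "k8s"])
unlabeled = SimpleNamespace(labels=None, reported_labels=None)

found = find_by_any_label([manual_only, provisioned, unlabeled], ["agent"])
```

With a live connection, the same helper would presumably be called as find_by_any_label(sch.data_collectors, ['agent']).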

