At the 24:00 mark you mentioned the has_more setting. Can you explain more how you would use offset to grab the next batch of records using the API? In the query history API there is a next page token that can be used, but with the Jobs API, what is the equivalent?
@NextLevel-LearnWithSubhasis (2 years ago)
---- 1st question: Can you explain more how you would use offset to grab the next batch of records using the API? ----

    # job_runs and offset come from a previous list_runs call.
    page_size = 500
    limit = page_size  # max records to fetch per call
    if 'has_more' in job_runs and job_runs['has_more'] is True:
        next_page_exists = True
        offset = offset + page_size  # this is how you increase the offset
        job_runs = db.jobs.list_runs(
            job_id=None,
            active_only=None,
            completed_only=None,
            offset=offset,
            limit=limit,
            headers=None,
            version=None,
        )

---- 2nd question: In the query history API there is a next page token that can be used, but with the Jobs API, what is the equivalent? ----

Signature of the list_jobs() API:

    def list_jobs(self, job_type=None, expand_tasks=None, offset=None,
                  limit=None, headers=None, version=None):

I made a sample call like this (please note version='2.1' is needed, as pagination is available only from version 2.1):

    jobs_list = jobs_api.list_jobs(job_type=None, offset=0, limit=1, version='2.1')

Response:

    {'has_more': True,
     'jobs': [{'created_time': 1649077058434,
               'creator_user_name': 'user1@email.com',
               'job_id': 1061722925895936,
               'settings': {'email_notifications': {'no_alert_for_skipped_runs': False},
                            'format': 'MULTI_TASK',
                            'max_concurrent_runs': 1,
                            'name': 'SampleJobName'}}]}

So you get the same 'has_more' field to check whether more records are available.
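For completeness, a minimal sketch of a full pagination loop over list_jobs() using has_more, building on the call above (the page size of 25 and the all_jobs list are my own illustrative choices, not from the video):

    # Sketch: page through all jobs with offset + has_more (Jobs API 2.1).
    all_jobs = []
    offset = 0
    page_size = 25  # illustrative page size
    while True:
        resp = jobs_api.list_jobs(job_type=None, offset=offset,
                                  limit=page_size, version='2.1')
        all_jobs.extend(resp.get('jobs', []))
        if not resp.get('has_more', False):
            break  # no more pages
        offset += page_size  # advance to the next page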
@gregorythompson3534 (2 years ago)
Thanks! Follow-up: where is this page_size value coming from? I'm not seeing it in my response body anywhere, so I'm still unclear on how much to add to the offset.
@NextLevel-LearnWithSubhasis (2 years ago)
limit = the maximum number of records you want to fetch in each batch/call. I kept it set to a constant 500, so every call I make to the API fetches at most 500 records, if available. You increase the offset value on every call; in this case I used another variable (page_size = 500) to update the offset. Imagine you are reading thousands of records from a 1D array: you start at offset 0 and read 500 records; on the next call the offset becomes (0+500), on the next (0+500+500), and so on. A full loop is sketched below. github.com/SubhasisAndSharath/Cloud-Infra-Cost/blob/main/databricks/ch5/databricks-jobs-run.py
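Putting that together, the whole read could look roughly like this (a sketch reusing the db.jobs.list_runs client from the earlier snippet; the exact loop in the linked repo may differ):

    # Sketch: fetch job runs page by page until has_more is False.
    page_size = 500
    offset = 0
    all_runs = []
    while True:
        job_runs = db.jobs.list_runs(job_id=None, active_only=None,
                                     completed_only=None, offset=offset,
                                     limit=page_size, headers=None,
                                     version='2.1')
        all_runs.extend(job_runs.get('runs', []))
        if not job_runs.get('has_more', False):
            break  # last page read
        offset += page_size  # next call starts 500 records further in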
@NextLevel-LearnWithSubhasis (2 years ago)
page_size is a variable that I defined myself. By the way, thanks for trying this out and sharing your queries.
@Jaden-lz6pb (a year ago)
Thanks
@NextLevel-LearnWithSubhasis (a year ago)
Welcome
@sohilsundaram5609 (a year ago)
Hi Sir, I want to save the job name, workflow status (success and fail), and error message in a table. How can I do this? I am using Azure Databricks. I have tried many things but am not able to get these.
@anotheremail9257 (2 years ago)
I have to migrate multiple jobs across workspaces in different regions. I have the list of all jobs in a JSON file. Can you share something I can use with that JSON to import/create the same jobs in the new workspace?
@NextLevel-LearnWithSubhasis (2 years ago)
I will give that a try and come back.
@NextLevel-LearnWithSubhasis (2 years ago)
Hey, will you be able to share a sample JSON file?
@NextLevel-LearnWithSubhasis (2 years ago)
Please use the job create API: docs.databricks.com/dev-tools/api/2.0/jobs.html#create. Let me know how it goes.
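If it helps, here is a rough sketch of a single create call against that endpoint (the host, token, and settings values are placeholders, not anything from your workspace):

    # Sketch: create one job from a settings dict via the Jobs create API.
    import requests

    host = "https://<workspace-url>"    # placeholder
    token = "<personal-access-token>"   # placeholder
    job_settings = {"name": "SampleJobName",  # example settings only
                    "max_concurrent_runs": 1}

    resp = requests.post(f"{host}/api/2.0/jobs/create",
                         headers={"Authorization": f"Bearer {token}"},
                         json=job_settings)
    resp.raise_for_status()
    print(resp.json())  # e.g. {'job_id': ...}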
@anotheremail9257 (2 years ago)
Thanks! But there is one problem here: the JSON file I created has 40+ job config entries, so I would have to call the create API that many times! Do you have any sample Python code to help with this?
@NextLevel-LearnWithSubhasis (2 years ago)
Hey, no - I think calling the API for each individual job is the only approach.
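That said, the individual calls are easy to script. A rough sketch, assuming your file is a JSON array of job objects like the list_jobs output (the file name, host, and token are placeholders):

    # Sketch: create one job per entry in an exported JSON list.
    import json
    import requests

    host = "https://<new-workspace-url>"   # placeholder
    token = "<personal-access-token>"      # placeholder

    with open("jobs_export.json") as f:    # placeholder file name
        job_configs = json.load(f)         # assumed: a list of job dicts

    for job in job_configs:
        # list_jobs output nests the config under 'settings'; unwrap if present.
        payload = job.get("settings", job)
        resp = requests.post(f"{host}/api/2.0/jobs/create",
                             headers={"Authorization": f"Bearer {token}"},
                             json=payload)
        resp.raise_for_status()
        print("Created job_id", resp.json()["job_id"])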