Part1: DataBricks - APIs (Clusters, Jobs, Job Runs)

  12,040 views

Next Level

1 day ago

Comments: 21
@future-outlier · 1 year ago
valuable content, thanks for sharing
@NextLevel-LearnWithSubhasis · 1 year ago
Glad you liked it!
@NextLevel-LearnWithSubhasis · 1 year ago
Glad it was helpful.
@ashutoshrai5342 · 2 years ago
Thanks for sharing
@NextLevel-LearnWithSubhasis · 1 year ago
Thanks Ashutosh ❤
@skmn07 · 2 years ago
Super
@NextLevel-LearnWithSubhasis · 1 year ago
Thanks Sharath ! ❤
@gregorythompson3534 · 2 years ago
At the 24:00 minute mark you mentioned the has_more setting. Can you explain more how you would use offset to grab the next batch of records using the API? In the query history API there is a next-page token that can be used, but with the Jobs API, what is the equivalent?
@NextLevel-LearnWithSubhasis · 2 years ago
1st question: Can you explain more how you would use offset to grab the next batch of records using the API?

    page_size = 500
    if 'has_more' in job_runs.keys() and job_runs['has_more'] is True:
        next_page_exists = True
        offset = offset + page_size  # This is how you increase the offset
        job_runs = db.jobs.list_runs(
            job_id=None,
            active_only=None,
            completed_only=None,
            offset=offset,
            limit=limit,
            headers=None,
            version=None,
        )

2nd question: In the query history API there is a next-page token that can be used, but with the Jobs API, what is the equivalent?

Signature of the list_jobs() API:

    def list_jobs(self, job_type=None, expand_tasks=None, offset=None, limit=None, headers=None, version=None):

I made a sample call like this (please note version='2.1' is needed, as pagination is available only from version 2.1):

    jobs_list = jobs_api.list_jobs(job_type=None, offset=0, limit=1, version='2.1')

    {'has_more': True,
     'jobs': [{'created_time': 1649077058434,
               'creator_user_name': 'user1@email.com',
               'job_id': 1061722925895936,
               'settings': {'email_notifications': {'no_alert_for_skipped_runs': False},
                            'format': 'MULTI_TASK',
                            'max_concurrent_runs': 1,
                            'name': 'SampleJobName'}}]}

So you get the same 'has_more' field here as well, to check whether more records are available.
@gregorythompson3534 · 2 years ago
Thanks! Follow-up: where is this page_size value coming from? I am not seeing it anywhere in my response body, so I am still unclear on how much to add to the offset.
@NextLevel-LearnWithSubhasis · 2 years ago
limit = the maximum number of records you want to fetch in each batch/call. I kept it set to a constant 500, so every call to the API fetches at most 500 records, if available. You increase the offset value on every call; in this case I used a separate variable (page_size = 500) to update the offset. Imagine reading thousands of records from a 1D array: you start at offset 0 and read 500 records. On the next call the offset becomes (0+500), on the next (0+500+500), and so on. github.com/SubhasisAndSharath/Cloud-Infra-Cost/blob/main/databricks/ch5/databricks-jobs-run.py
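The full pagination loop described above can be sketched as follows. Here `fake_list_runs` stands in for `db.jobs.list_runs()` so the sketch is self-contained; the `'has_more'`/`'runs'` response shape follows the reply above, but treat this as an illustration rather than the exact databricks-cli API.

```python
PAGE_SIZE = 500
ALL_RUNS = list(range(1200))  # pretend the workspace has 1200 job runs

def fake_list_runs(offset, limit):
    # Stand-in for db.jobs.list_runs(offset=..., limit=...)
    batch = ALL_RUNS[offset:offset + limit]
    return {"runs": batch, "has_more": offset + limit < len(ALL_RUNS)}

def fetch_all_runs():
    offset = 0
    runs = []
    while True:
        page = fake_list_runs(offset=offset, limit=PAGE_SIZE)
        runs.extend(page["runs"])
        if not page.get("has_more"):
            break
        offset += PAGE_SIZE  # same increment as the page_size variable above
    return runs

print(len(fetch_all_runs()))  # 1200
```

Three calls are made here (offsets 0, 500, 1000); the loop stops as soon as a page comes back with `has_more` false.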
@NextLevel-LearnWithSubhasis · 2 years ago
page_size is a variable that is defined by me. By the way, thanks for trying this out and sharing your queries.
@Jaden-lz6pb · 1 year ago
Thanks
@NextLevel-LearnWithSubhasis · 1 year ago
Welcome
@sohilsundaram5609 · 1 year ago
Hi Sir, I want to save the job name, workflow status (success/fail), and error message in a table. How can I do this? I am using Azure Databricks. I have tried many things but am not able to get these.
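A hedged sketch for the question above: the runs returned by the Jobs API already carry the name, result state, and error message, so one approach is to flatten each run into a row and persist the rows. The field names (`run_name`, `state.result_state`, `state.state_message`) follow the Jobs 2.1 runs/list response; the Spark write at the end is commented out and assumes an Azure Databricks notebook session.

```python
def run_to_row(run):
    # Flatten one Jobs API run object into a table row.
    state = run.get("state", {})
    return {
        "job_name": run.get("run_name"),
        "status": state.get("result_state"),        # e.g. SUCCESS / FAILED
        "error_message": state.get("state_message", ""),
    }

# Two hypothetical run objects in the runs/list response shape:
sample_runs = [
    {"run_name": "daily_etl", "state": {"result_state": "SUCCESS",
                                        "state_message": ""}},
    {"run_name": "ml_train", "state": {"result_state": "FAILED",
                                       "state_message": "Cluster terminated"}},
]

rows = [run_to_row(r) for r in sample_runs]
# In a Databricks notebook you could then persist the rows, e.g.:
# spark.createDataFrame(rows).write.mode("append").saveAsTable("job_audit")
print(rows)
```

The table name `job_audit` is a placeholder; swap in whatever catalog/schema you use.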
@anotheremail9257 · 2 years ago
I have to migrate multiple jobs between cross-region workspaces. I have the list of all jobs in a JSON file. Can you share something I can use to import/create the same jobs in the new workspace from that JSON?
@NextLevel-LearnWithSubhasis · 2 years ago
I will give that a try and come back..
@NextLevel-LearnWithSubhasis · 2 years ago
Hey, will you be able to share a sample JSON file?
@NextLevel-LearnWithSubhasis · 2 years ago
Please use the job create API: docs.databricks.com/dev-tools/api/2.0/jobs.html#create. Let me know how it goes.
@anotheremail9257 · 2 years ago
Thanks! But there is one problem here: the JSON file I created contains 40+ job config JSONs, so I would have to call the create API that many times! Do you have any sample code to help with this through Python?
@NextLevel-LearnWithSubhasis · 2 years ago
Hey, no - I think calling the API once per job is the only approach.
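Since each job config must go through the create API individually (as noted above), a simple loop over the exported JSON does the migration. A sketch, with the network call kept separate from the loop so it can be swapped out; the workspace URL, token, and file name are placeholders, and the endpoint path follows the Jobs 2.1 REST API (POST /api/2.1/jobs/create).

```python
import json
import urllib.request

def create_jobs(job_configs, send):
    """Call `send` (a function that posts one job config) for every config."""
    created = []
    for config in job_configs:
        created.append(send(config))
    return created

def make_sender(host, token):
    # Returns a function that POSTs one job config to /api/2.1/jobs/create.
    def send(config):
        req = urllib.request.Request(
            f"{host}/api/2.1/jobs/create",
            data=json.dumps(config).encode(),
            headers={"Authorization": f"Bearer {token}",
                     "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)  # response carries the new job_id
    return send

# Usage (placeholders -- fill in your own workspace URL, token, and file):
# with open("jobs_export.json") as f:
#     configs = [j["settings"] for j in json.load(f)["jobs"]]
# ids = create_jobs(configs, make_sender("https://<workspace-url>", "<token>"))
```

Splitting `create_jobs` from `make_sender` also makes it easy to add retries or rate limiting around the single `send` call later.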