AWS Tutorials - Single AWS Glue Job & Multiple Transformations

7,421 views

AWS Tutorials

A day ago

Comments: 16
@tcsanimesh 2 years ago
Awesome!! Best in the entire YouTube inventory. Please don't stop making these types of videos.
@simij851 2 years ago
Thank you, awesome video. Using the same concept but without Step Functions, would I be able to read the tables sequentially? I have 150 tables to read, and creating parallel tasks in Step Functions might be tedious, so I was wondering whether the job could read them in a loop instead.
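The loop idea in the comment above could look roughly like this in plain Python. This is only a sketch: `run_for_tables` and `process_table` are hypothetical names, and `process_table` stands in for whatever per-table read and write logic the Glue job script would run. Failures are collected so that one bad table does not stop the remaining ones.

```python
def run_for_tables(table_names, process_table):
    """Process tables one after another and report which ones failed.

    process_table is a hypothetical callable holding the per-table logic.
    """
    succeeded = []
    failed = {}
    for name in table_names:
        try:
            process_table(name)
            succeeded.append(name)
        except Exception as exc:  # keep going so one bad table does not stop the rest
            failed[name] = str(exc)
    return succeeded, failed
```

In a Glue job the table list would have to come from somewhere, for example a job argument or a config file; that part is left out here.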
@afjalahamad2465 2 years ago
Please make videos on AWS Glue Schema Registry.
@gunjanagrawal7014 1 year ago
Hi, that was a really nice explanation. Question: we have multiple in-house JSON source data files that come with a header and a footer, arrive at different times, and come from different sources. What do we need? We want to land these files in S3 and then run a Glue job to write the data to different Aurora PostgreSQL databases. We have 20 sources, so we are looking for a parameterized solution. Please guide us or share a code snippet if you have one.
@AWSTutorialsOnline 1 year ago
Unless there is a common pattern across these files that can be parameterized, I would recommend you create a separate job for each file.
@faingtoku 1 year ago
Is it possible to do something similar while streaming different JSON documents with Kinesis and storing them in a database?
@AWSTutorialsOnline 1 year ago
It might not be possible to do this with streaming data, because it works with a fixed schema for the data coming in from Kinesis.
@faingtoku 1 year ago
@AWSTutorialsOnline Thank you for your response! Then how could I stream different JSON documents from multiple sources to Kinesis and write them to different database tables with PySpark/Glue? Should I add a special key to each JSON document so I can detect which transformation to use?
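The "special key" idea in the question above can be sketched in plain Python as a lookup from a routing key to a transformation and a target table. Everything named here is an assumption made for illustration: the `source_type` and `payload` fields, the two transform functions, and the table names. It shows only the dispatch logic, not a Glue streaming job, and it does not settle the fixed-schema concern raised in the reply.

```python
import json


def transform_orders(payload):
    # Hypothetical transformation for records tagged "orders".
    return {"order_id": payload["id"], "amount": float(payload["amount"])}


def transform_customers(payload):
    # Hypothetical transformation for records tagged "customers".
    return {"customer_id": payload["id"], "name": payload["name"].strip().title()}


# Routing key -> (target table, transformation). All names are illustrative.
ROUTES = {
    "orders": ("orders_table", transform_orders),
    "customers": ("customers_table", transform_customers),
}


def route(raw_json):
    """Pick the transformation for one JSON record based on its routing key."""
    record = json.loads(raw_json)
    source_type = record.get("source_type")
    if source_type not in ROUTES:
        raise ValueError(f"no transformation registered for {source_type!r}")
    table, transform = ROUTES[source_type]
    return table, transform(record["payload"])
```

Adding a new source then means adding one entry to `ROUTES` rather than changing the routing code.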
@peterpan9361 2 years ago
Can you make a video on how to move SharePoint data to AWS S3? This is a common requirement for many big companies, but I could not find an automated solution. I believe we can do it using AWS Lambda, making API calls to SharePoint, but I'm not sure how. Can you please assist? :)
@sriadityab4794 2 years ago
Can we set Spark properties like driver and executor memory for a Glue job?
@AWSTutorialsOnline 2 years ago
You cannot set either, because Glue is an AWS managed service. However, you can select WorkerType and NumberOfWorkers as parameters, which decide the overall vCPU, memory, and disk space allocated to the job.
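As a minimal sketch of the reply above, `WorkerType` and `NumberOfWorkers` can be passed when starting a job run through boto3's Glue `start_job_run` call. The helper names, the job name, and the example values (`G.1X`, 10 workers) are assumptions for illustration. The client is passed in as an argument so the request-building logic can be checked without AWS access.

```python
def build_job_run_request(job_name, worker_type, number_of_workers):
    """Build the keyword arguments for Glue's start_job_run call."""
    if number_of_workers < 1:
        raise ValueError("number_of_workers must be a positive integer")
    return {
        "JobName": job_name,
        "WorkerType": worker_type,
        "NumberOfWorkers": number_of_workers,
    }


def start_run(glue_client, job_name, worker_type="G.1X", number_of_workers=10):
    """Start a job run with the chosen capacity and return its run ID."""
    request = build_job_run_request(job_name, worker_type, number_of_workers)
    response = glue_client.start_job_run(**request)
    return response["JobRunId"]


# Usage (needs boto3 and AWS credentials, so it is left commented out;
# "my-glue-job" is a placeholder name):
# import boto3
# run_id = start_run(boto3.client("glue"), "my-glue-job", "G.2X", 5)
```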
@tamara28899hi 1 year ago
How would you manage version control for the transformation code stored in S3?
@arvindsinha1566 1 year ago
I have chart CSV files containing 1-minute OHLC (open, high, low, close) data. I want to generate 5-minute, 30-minute, and 1-hour OHLC data. How can I achieve this using Glue? I may have multiple CSV files.
@Bee-ib1pb 2 years ago
j
@simij851 2 years ago
Thank you for doing this. I tried it and it was super helpful. But randomly I would get this error: An error occurred while calling z:com.amazonaws.services.glue.util.Job.commit. Continuation update failed due to version mismatch. Expected version 103 but found version 105. The reason is that, with concurrency and bookmarks enabled, Glue gets confused when parallel job runs complete and call job.commit(). If you know how you've handled this situation, that would be awesome.
@simij851 2 years ago
Removing bookmarks resolves the error, but I need bookmarks enabled for all the tables that I'm running concurrently. I'm wondering whether I should try changing the Glue job script to job.init(args["JOB_NAME"] + args["ctbl"], args) and, in the step function where I specify the job name, give "JobName": "JOBNAME+ctbl".