AWS Tutorials - Using External Libraries in AWS Glue Job

17,131 views

AWS Tutorials


Comments: 29
@nishantkumar-lw6ce 3 years ago
You’re amazing as always Brajendra 😃
@AWSTutorialsOnline 3 years ago
Thank you so much 😀 Hope you are doing great
@mohammedzakirhussain 2 years ago
Hi, how can I use external libraries such as jira in an AWS Glue job?
@AWSTutorialsOnline 2 years ago
Use the command "pip install jira-module-name -t /path" to create a local package for the jira module. Zip the local package and upload it to an S3 bucket. Finally, reference the zip file's S3 location as an external library in the Glue job. The Glue job role must have access to the S3 bucket where the module package is uploaded. Hope it helps.
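The packaging step in the reply above can be sketched in plain Python. This is a minimal sketch, not the author's script: the function name `zip_package_dir` and the paths are hypothetical, and it assumes `package_dir` is the folder that "pip install ... -t" wrote into. It stores every file relative to that folder so the modules sit at the root of the archive.

```python
import os
import zipfile
from typing import List


def zip_package_dir(package_dir: str, zip_path: str) -> List[str]:
    """Zip the contents of package_dir so they sit at the root of the archive.

    Hypothetical helper. zip_path should live outside package_dir,
    otherwise the archive would try to include itself.
    """
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for folder, _subdirs, files in os.walk(package_dir):
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                # Archive name is relative to package_dir, so package_dir
                # itself does not appear as a wrapping folder in the zip.
                arcname = os.path.relpath(full_path, package_dir).replace(os.sep, "/")
                zf.write(full_path, arcname)
                written.append(arcname)
    return sorted(written)
```

After uploading the resulting zip to S3, its S3 URI is what goes into the job's Python library path setting (the `--extra-py-files` job parameter).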
@messaoudbaheeddineberbache1163 1 year ago
@AWSTutorialsOnline What a useful reply. I was searching for a simple way to import a specific library, "redshift_connector", into an AWS Glue job, and your reply gave me the hint. I installed it locally, zipped all the dependencies not already available in Glue 3.0, and it worked.
@katsouranis6 2 years ago
I tried to follow this tutorial and placed a zip file in an S3 bucket, but it always gives a "ModuleNotFoundError: No module named" error...
@AWSTutorialsOnline 2 years ago
I see this error when the module is not packaged properly. Your module files should be in the root of the zip package. Hope it helps.
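One way to check the layout before uploading is to inspect the archive locally. The sketch below uses only the standard `zipfile` module; the helper names `top_level_entries` and `module_at_root` are hypothetical, chosen for illustration.

```python
import zipfile
from typing import List


def top_level_entries(zip_path: str) -> List[str]:
    """Return the sorted, de-duplicated top-level names inside the zip."""
    with zipfile.ZipFile(zip_path) as zf:
        return sorted({name.split("/", 1)[0] for name in zf.namelist()})


def module_at_root(zip_path: str, module_name: str) -> bool:
    """True if module_name.py or a module_name/ package sits at the zip root."""
    with zipfile.ZipFile(zip_path) as zf:
        names = set(zf.namelist())
    return (
        f"{module_name}.py" in names
        or f"{module_name}/__init__.py" in names
    )
```

If `top_level_entries` returns a single wrapping folder instead of the module names themselves, the zip was probably created from the parent directory rather than from inside the package folder.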
@rahulkrishnan6863 3 years ago
Could you please share a link or any reference on how you created the Python zip file from the two Python programs you created? Thanks for your help.
@AWSTutorialsOnline 3 years ago
Hope this link helps. docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html
@rajeevranjanpathak4297 8 months ago
Can you show an example of how to achieve the same in a Glue Python Shell job?
@apoorvaalshi9762 2 years ago
Getting the error: Error downloading from S3 for bucket.Access Denied
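One common cause of this error, consistent with the earlier note that the Glue job role needs access to the bucket, is a role without read permission on the library object. The sketch below builds a minimal IAM policy document as a Python dict. The bucket name and prefix are hypothetical placeholders, and other causes (bucket policies, KMS encryption) are not covered.

```python
import json


def read_only_s3_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy document allowing s3:GetObject under a prefix."""
    prefix = prefix.strip("/")
    key_pattern = f"{prefix}/*" if prefix else "*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{key_pattern}"],
            }
        ],
    }


# Hypothetical bucket and prefix, for illustration only.
print(json.dumps(read_only_s3_policy("my-glue-libs-bucket", "libs"), indent=2))
```

The printed JSON could be attached to the job's role as an inline policy, adjusted to the real bucket and prefix.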
@tiktok4372 2 years ago
My library contains only a myscript.py file. I uploaded myscript.py to an S3 bucket, and when creating the Dev endpoint I referenced S3bucket/Prefix/myscript.py in the "Python library path" option. But in my notebook, "import myscript" still yields the error "ImportError: No module named myscript". I also tried placing myscript.py in a folder called "customerlibs" and zipping this folder into customerlibs.zip, but that didn't work either. Do you have any recommendations? Thanks
@AWSTutorialsOnline 2 years ago
I think the .py file should be in the root of the zip file, and then you can use "import myscript". If the .py file is in the customerlibs folder, then it should be "import customerlibs.myscript". Here is the documentation from AWS - docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html#aws-glue-programming-python-libraries-dev-endpoint
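The two import forms can be tried locally, because Python can import directly from a zip file placed on `sys.path`. This is a minimal local sketch, not a Glue run. It assumes the customerlibs folder contains an `__init__.py`, which the original comment does not mention, so that it is a regular package; the `where` function is a made-up marker.

```python
import importlib
import sys
import zipfile


def build_demo_zip(zip_path: str) -> None:
    """Create a zip with myscript.py at the root and inside customerlibs/."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("myscript.py", "def where():\n    return 'root'\n")
        # Assumption: an __init__.py makes customerlibs a regular package.
        zf.writestr("customerlibs/__init__.py", "")
        zf.writestr(
            "customerlibs/myscript.py",
            "def where():\n    return 'customerlibs'\n",
        )


def import_from_zip(zip_path: str):
    """Put the zip on sys.path and import both copies of the module."""
    sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    root_module = importlib.import_module("myscript")
    nested_module = importlib.import_module("customerlibs.myscript")
    return root_module.where(), nested_module.where()
```

Calling `build_demo_zip` and then `import_from_zip` on the same path returns `('root', 'customerlibs')`, showing that the import path mirrors the folder structure inside the zip.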
@12345deepaksharma 3 years ago
Can you please give a demo on how to connect to a Hadoop/Hive database using AWS Glue?
@AWSTutorialsOnline 3 years ago
Glue Catalog is Hive-based. Please check my video where I talked about using PySpark to talk to Glue Catalog. Hope that helps.
@12345deepaksharma 3 years ago
@AWSTutorialsOnline Which video? Please specify the name, thanks.
@StephenRayner 2 years ago
Doing all of this with infrastructure as code would be amazing, for example in Terraform, with a pipeline for deploying the zip.
@AWSTutorialsOnline 2 years ago
You can use Terraform, CloudFormation, or CDK - whichever you want for infrastructure as code. Then use AWS developer tools to build the pipeline.
@thejohnfranco 2 years ago
Is it possible to call a Lambda from this script?
@StephenRayner 2 years ago
What about getting this to run locally?
@AWSTutorialsOnline 2 years ago
Well, you can develop and run locally (link below). But then you cannot use features like serverless runs, scheduling, and running within a workflow. docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-libraries.html
@ganeshnisad8732 3 years ago
While importing, I am getting a "no module" error.
@AWSTutorialsOnline 3 years ago
Do you get the error for the "import" statement?
@hsz7338 3 years ago
Thank you so much for the tutorial. I agree that using "external" libraries is a good practice for managing and maintaining a codebase at scale. A quick question on the parallel file writing to S3 from the Glue ETL job (time 22.46): is it possible to configure the file size or number of files in the Glue job, to avoid having a massive number of small objects in the S3 data lake?
@AWSTutorialsOnline 3 years ago
Hi, yes it is possible. Please check this link - survey.fieldsense.whs.amazon.dev/survey/3553ba63-b201-47fd-8f3e-46bfcc648192
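The linked page is not reproduced here. As a general PySpark approach (an assumption, not necessarily what the link describes), the number of output files roughly follows the number of partitions at write time, so one option is to estimate a partition count from the data size and pass it to `repartition` or `coalesce` before writing. The helper below is a hypothetical sketch of that estimate; the 128 MiB default target is an illustrative choice, not a Glue setting.

```python
import math


def target_partition_count(
    total_bytes: int,
    target_file_bytes: int = 128 * 1024 * 1024,
) -> int:
    """Estimate how many partitions give files of roughly target_file_bytes."""
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be non-negative and the target must be positive")
    # Always write at least one file, even for empty or tiny inputs.
    return max(1, math.ceil(total_bytes / target_file_bytes))


# Usage inside a Spark job (not executed here; df is an existing DataFrame
# and the S3 path is a placeholder):
#   n = target_partition_count(estimated_input_bytes)
#   df.repartition(n).write.parquet("s3://your-bucket/your-prefix/")
```

Actual file sizes will differ from the estimate because of compression and file format, so the target is best treated as a starting point to tune.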
@hsz7338 3 years ago
@AWSTutorialsOnline Thank you. I'll check it out.
@nomeshpalakaluri5927 2 years ago
Hi! Great tutorial! I am working in the Spark shell. Are the packages in the .zip file still available for use there, or does Glue provide some basic packages? Thanks in advance for the help.
@AWSTutorialsOnline 2 years ago
.zip is good enough.
@CHANTI8947 2 years ago
Great video, thank you!