Hi, how can I use external libraries such as jira in an AWS Glue job?
@AWSTutorialsOnline 2 years ago
Use the command "pip install jira-module-name -t /path" to create a local package for the jira module. Zip the local package and upload it to an S3 bucket. Finally, reference the zip file's S3 location as an external library in the Glue job. The Glue job role should have access to the S3 bucket where the module package is uploaded. Hope it helps.
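For example, here is a rough sketch of wiring the uploaded zip into a Glue job with boto3; the bucket, keys, role, and job name below are placeholders, not values from the video:

```python
import boto3

glue = boto3.client("glue")

# Placeholder S3 locations - replace with your own bucket and keys.
SCRIPT_LOCATION = "s3://my-glue-assets/scripts/jira_job.py"
EXTRA_PY_FILES = "s3://my-glue-assets/libs/jira_package.zip"

glue.create_job(
    Name="jira-example-job",
    Role="MyGlueJobRole",  # must be allowed to read the assets bucket
    GlueVersion="3.0",
    Command={
        "Name": "glueetl",
        "PythonVersion": "3",
        "ScriptLocation": SCRIPT_LOCATION,
    },
    DefaultArguments={
        # Glue distributes this zip to the workers, so its contents become
        # importable inside the job script (e.g. "from jira import JIRA").
        "--extra-py-files": EXTRA_PY_FILES,
    },
)
```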
@messaoudbaheeddineberbache1163 1 year ago
@@AWSTutorialsOnline What a useful reply... you really helped me. I was searching for a simple way to import a specific library, "redshift_connector", into an AWS Glue job, and your reply gave me the hint to do it. I installed it locally, zipped all the dependencies not already available in Glue 3.0, and it worked.
@katsouranis6 2 years ago
I tried to follow this tutorial and placed a zip file into an S3 bucket, but it always gives a "ModuleNotFoundError: No module named" error...
@AWSTutorialsOnline 2 years ago
I see this error when the module is not packaged properly. Your module files should be in the root of the zip package. Hope it helps.
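To make "root of the zip package" concrete, here is a small sketch of building such a zip; the source folder name and output file are assumptions for the example:

```python
import os
import zipfile

# Assumed layout: ./site-packages/ holds the folders produced by
# "pip install <module-name> -t ./site-packages" (e.g. jira/, requests/, ...).
SRC_DIR = "site-packages"

with zipfile.ZipFile("external_libs.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    for root, _, files in os.walk(SRC_DIR):
        for name in files:
            full_path = os.path.join(root, name)
            # arcname strips the leading "site-packages/" prefix so the
            # packages land at the root of the zip rather than in a subfolder.
            arcname = os.path.relpath(full_path, SRC_DIR)
            zf.write(full_path, arcname)
```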
@rahulkrishnan6863 3 years ago
Could you please share a link or any reference on how you created the Python zip file from the two Python programs you created? Thanks for your help.
@AWSTutorialsOnline 3 years ago
Hope this link helps. docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html
@rajeevranjanpathak4297 8 months ago
Can you show an example of how to achieve the same in a Glue Python Shell job?
@apoorvaalshi9762 2 years ago
Getting the error: "Error downloading from S3 for bucket. Access Denied"
@tiktok4372 2 years ago
My library contains only a myscript.py file. I upload myscript.py to an S3 bucket, and then when creating the Dev endpoint, I reference S3bucket/Prefix/myscript.py in the "Python library path" option. But in my notebook, "import myscript" still yields the error "ImportError: No module named myscript". I also tried placing the myscript.py file in a folder called "customerlibs" and zipping this folder into customerlibs.zip, but it didn't work either. Do you have any recommendations? Thanks
@AWSTutorialsOnline 2 years ago
I think the .py file should be in the root of the zip file, and then you can use "import myscript". If the .py file exists inside a customerlibs folder, then it should be "import customerlibs.myscript". Here is the documentation from AWS - docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html#aws-glue-programming-python-libraries-dev-endpoint
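To make the two layouts concrete, a short sketch following your file names (the __init__.py in the second layout is an assumption, added to mark the folder as a package):

```python
# Layout 1: myscript.py at the root of the zip
#   myscript.zip
#   └── myscript.py
import myscript

# Layout 2: myscript.py inside a customerlibs/ folder in the zip
#   customerlibs.zip
#   └── customerlibs/
#       ├── __init__.py   # assumption: marks customerlibs as a package
#       └── myscript.py
import customerlibs.myscript
```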
@12345deepaksharma 3 years ago
Can you please give a demo on how to connect to a Hadoop/Hive database using AWS Glue?
@AWSTutorialsOnline 3 years ago
The Glue Catalog is Hive-based. Please check my video where I talked about using PySpark to talk to the Glue Catalog. Hope that helps.
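As a rough sketch of what that looks like inside a Glue job (the database and table names are placeholders):

```python
from pyspark.context import SparkContext
from awsglue.context import GlueContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# The Glue Data Catalog acts as the Hive metastore, so catalog tables can be
# read as DynamicFrames and then queried with Spark SQL.
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="my_database",   # placeholder catalog database
    table_name="my_table",    # placeholder catalog table
)
dyf.toDF().createOrReplaceTempView("my_table")
spark.sql("SELECT COUNT(*) AS row_count FROM my_table").show()
```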
@12345deepaksharma 3 years ago
@@AWSTutorialsOnline Which video? Please specify the name, thanks.
@StephenRayner 2 years ago
All of this with infrastructure as code would be amazing: this in Terraform, with a pipeline for deploying the zip.
@AWSTutorialsOnline 2 years ago
You can use Terraform, CloudFormation, or CDK - whichever you want for infrastructure as code. Then use the AWS developer tools to build the pipeline.
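As one illustration, a rough CDK (Python) sketch that defines the Glue job with the zip as an extra library; the construct names, bucket, and paths are assumptions, and Terraform or CloudFormation would set the same job properties:

```python
from aws_cdk import Stack, aws_glue as glue, aws_iam as iam
from constructs import Construct

class GlueJobStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Assumed role; it needs read access to the assets bucket.
        role = iam.Role(
            self, "GlueJobRole",
            assumed_by=iam.ServicePrincipal("glue.amazonaws.com"),
        )

        glue.CfnJob(
            self, "JiraJob",
            name="jira-example-job",
            role=role.role_arn,
            glue_version="3.0",
            command=glue.CfnJob.JobCommandProperty(
                name="glueetl",
                python_version="3",
                script_location="s3://my-glue-assets/scripts/jira_job.py",
            ),
            default_arguments={
                "--extra-py-files": "s3://my-glue-assets/libs/jira_package.zip",
            },
        )
```

A deployment pipeline then only needs to rebuild the zip, upload it to the same S3 key, and redeploy the stack.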
@thejohnfranco 2 years ago
Is it possible to call a Lambda from this script?
@StephenRayner 2 years ago
What about getting this to run locally?
@AWSTutorialsOnline 2 years ago
Well, you can do development and run the job locally (link below). But then you cannot use features like serverless runs, scheduling, and running within a workflow. docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-libraries.html
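For a sense of what running locally looks like, here is a minimal job script sketch that works both locally and on Glue, assuming the aws-glue-libs local setup (or its Docker image) is installed so that awsglue can be imported outside of AWS:

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Locally you pass --JOB_NAME yourself (e.g. via the gluesparksubmit script
# that ships with aws-glue-libs); on Glue the service supplies it.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])

glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Trivial placeholder work so the script runs without any AWS resources.
df = glue_context.spark_session.createDataFrame(
    [("a", 1), ("b", 2)], ["key", "value"]
)
df.show()

job.commit()
```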
@ganeshnisad8732 3 years ago
While importing, I am getting a "no module" error.
@AWSTutorialsOnline 3 years ago
You get the error for the "import" statement?
@hsz7338 3 years ago
Thank you so much for the tutorial. I agree that the use of "external" libraries is a good practice for managing and maintaining a codebase at scale. A quick question on the parallel file writing to S3 from the Glue ETL job (time 22:46): is it possible to configure the file size or file count in the Glue job to avoid having a massive number of small objects in the S3 data lake?
@AWSTutorialsOnline 3 years ago
Hi, yes it is possible. Please check this link - survey.fieldsense.whs.amazon.dev/survey/3553ba63-b201-47fd-8f3e-46bfcc648192
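For reference, the usual approach is to reduce the number of output partitions before writing; a rough sketch below, where the DynamicFrame, output path, and partition count are placeholders:

```python
# Assumes an existing GlueContext `glue_context` and a DynamicFrame `dyf`
# produced earlier in the job.

# Option 1: repartition the DynamicFrame to a fixed partition count,
# which bounds the number of objects written to S3.
dyf_small = dyf.repartition(10)

glue_context.write_dynamic_frame.from_options(
    frame=dyf_small,
    connection_type="s3",
    connection_options={"path": "s3://my-data-lake/output/"},  # placeholder path
    format="parquet",
)

# Option 2: drop to a DataFrame and coalesce, which avoids a full shuffle.
dyf.toDF().coalesce(10).write.mode("overwrite").parquet(
    "s3://my-data-lake/output/"
)
```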
@hsz7338 3 years ago
@@AWSTutorialsOnline thank you. I’ll check it out.
@nomeshpalakaluri5927 2 years ago
Hi! Great tutorial! I am working in the Spark shell; are the packages in the .zip file still available for use there, or does Glue only provide some basic packages? Thanks in advance for the help.