What if I have a multi-tenant SaaS with similar models per tenant (for example, a model that performs client segmentation for each tenant) and I want to do Continuous Training and Deployment for these models? How can this be achieved, since at 16:47 you state that this can't be done with SageMaker Studio?
@SambitTripathy 3 years ago
Great content Julien!
@juliensimonfr 3 years ago
Thank you, glad you like it!
@fatihbicer7353 8 months ago
Thank you Julien.
@juliensimonfr 8 months ago
You're welcome!
@joaosalero9797 2 years ago
Thanks for sharing it!
@juliensimonfr 2 years ago
My pleasure!
@gaboceron100 1 year ago
Thank you! Very instructive.
@juliensimonfr 1 year ago
Glad you enjoyed it!
@herleyshaori 1 year ago
Thank you for the video.
@juliensimonfr 1 year ago
You're welcome!
@poojankothari2440 3 years ago
Julien, thank you for providing good content. It would be very helpful if you could provide some insights on model registration and on linking the project with custom git repos. Kudos!
@juliensimonfr 3 years ago
Thanks Poojan. You can build your own custom templates with your own repos. See docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-templates.html
@abhijeetkabra8525 1 year ago
Hello Julien, great video. I followed your steps to create a project, pipelines, and an endpoint. Could you please answer some questions? 1. We already have a trained model that we want to use in SageMaker Pipelines and then deploy to create endpoints; how do we do that? 2. Are there IAM roles and policies involved in working with SageMaker Pipelines? 3. We have a notebook containing the training code, but when a new user or team member comes in, they aren't able to see the whole code and have to download it offline and upload it back to the notebook. Is there a way to collaborate like we do with a Git or Azure DevOps repo?
@Pr06lemChiId 3 years ago
Interesting to see how TFX will integrate with this.
@Ramyavenkat-r3y 1 year ago
How do I retrieve the built-in SageMaker image URI? Could you help me with the command?
@AnkitSingh-rv2dq 2 years ago
Hi Julien, I got an error in the preprocessing script. Can you please confirm that the script is correct?
@denzilstudios7072 2 years ago
Thank you Julien, I also love your Packt book a lot. Question: for our startup, we want to set up this SageMaker pipeline for dev and prod in separate accounts. Where can I find guidelines on how to set this up?
@juliensimonfr 2 years ago
Thanks Denzil! Here's a nice multi-account example: aws.amazon.com/blogs/machine-learning/multi-account-model-deployment-with-amazon-sagemaker-pipelines/
@carbita1 4 years ago
Hi Julien, as a data engineer, it's difficult to test PySpark workflows without a Jupyter notebook. Is there any way to "replace" the usual AWS Glue workflows by calling Jupyter notebooks? Thanks in advance.
@priteshjain0310 4 years ago
Hi Julien, it's not clear how I should do inference on this. I have a custom processing container, and then I train a TF model. Is it possible to club these two together for inference? I want to give the S3 location of raw data at inference time, have it go through processing, and then predict on it. Can you please let me know if this is possible and how to go about it?
@vinayakdhruv6457 2 years ago
How can I trigger this complete pipeline using Lambdas or cron jobs? Is there any such option?
Hi Julien, really appreciate the explanation. Could you do a video, or point to some demo, showing how to use SageMaker Pipelines for scheduled batch jobs? Say I have a 10 GB dataset loaded into S3 every day; how can I schedule a pipeline to transform and run inference on it?
@juliensimonfr 3 years ago
You can easily run batch transform in your pipelines, see docs.aws.amazon.com/sagemaker/latest/dg/build-and-manage-steps.html#step-type-transform. You can also schedule executions with a Lambda function firing up your pipeline, see docs.aws.amazon.com/AmazonCloudWatch/latest/events/RunLambdaSchedule.html
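For readers who want to try the Lambda-based trigger, here is a minimal sketch using the boto3 `start_pipeline_execution` API. The pipeline name, parameter names, and S3 path are placeholders, not values from the video:

```python
def build_pipeline_params(params):
    """Convert a plain dict into the PipelineParameters list format
    expected by start_pipeline_execution."""
    return [{"Name": k, "Value": str(v)} for k, v in params.items()]

def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be exercised
    # without the AWS SDK or credentials available.
    import boto3

    sm = boto3.client("sagemaker")
    # 'MyPipeline' and 'InputDataUrl' are hypothetical; use the names
    # defined in your own pipeline.
    response = sm.start_pipeline_execution(
        PipelineName="MyPipeline",
        PipelineParameters=build_pipeline_params(
            {"InputDataUrl": event.get("input_s3_uri", "s3://my-bucket/data/")}
        ),
    )
    return response["PipelineExecutionArn"]
```

Wiring an EventBridge/CloudWatch Events schedule rule to this Lambda gives you the cron-style trigger asked about above.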
@samuelmathias794 2 years ago
Hello Julien, I would like to ask a question. I'm a bit new to SageMaker and its functionality. How would one go about creating their own project template, assuming I want to start a new project? Or do I modify the existing abalone template to suit my taste?
@juliensimonfr 2 years ago
Here's an example: aws.amazon.com/fr/blogs/machine-learning/build-mlops-workflows-with-amazon-sagemaker-projects-gitlab-and-gitlab-pipelines/
@samuelmathias794 2 years ago
@@juliensimonfr Thank you
@juliocardenas4485 3 years ago
My Studio layout is quite different; the launcher only shows notebooks, and I don't get the add-on with the triangle on the left.
@juliensimonfr 3 years ago
The Studio UI frequently changes, sometimes for the better ;)
@juliocardenas4485 3 years ago
@@juliensimonfr Thank you. I figured out what I was doing wrong 😑. I needed to launch the app rather than just the notebooks.
@dampeel2000 3 years ago
Bonjour Julien, thanks for the video! I would be interested in the final step, where you actually process an inference on the endpoint. I don't see this in the demo. In particular, I'm curious to know how you can propagate the fitted preprocessing "model" (for instance the one-hot encoder) to the model hosted on the endpoint. Thank you very much for any information on this step! Have a great day.
@juliensimonfr 3 years ago
Hi Damien, regarding preprocessing, you would have to apply it to the data sent to the endpoint. A clever way to do this is to use an Inference Pipeline, i.e. a sequence of models invoked as a single unit. Here's an example: github.com/aws/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_inference_pipeline/Inference%20Pipeline%20with%20Scikit-learn%20and%20Linear%20Learner.ipynb
@dampeel2000 3 years ago
@@juliensimonfr Thanks Julien!
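The Inference Pipeline approach mentioned above can be sketched with the SageMaker Python SDK's `PipelineModel`. This is a hypothetical outline, not the notebook from the link: the model artifacts, entry-point scripts, and framework version are placeholders, and imports are kept inside the function so the sketch reads standalone:

```python
def deploy_inference_pipeline(role, preproc_model_data, model_data):
    """Chain a fitted preprocessing model (e.g. a one-hot encoder) and
    a trained model into a single endpoint. Requests hit the
    preprocessing container first, then the model."""
    from sagemaker.pipeline import PipelineModel
    from sagemaker.sklearn.model import SKLearnModel

    preproc = SKLearnModel(
        model_data=preproc_model_data,   # fitted preprocessor saved to S3
        role=role,
        entry_point="preprocessing.py",  # placeholder inference script
        framework_version="1.2-1",
    )
    model = SKLearnModel(
        model_data=model_data,
        role=role,
        entry_point="inference.py",      # placeholder inference script
        framework_version="1.2-1",
    )
    pipeline = PipelineModel(name="preproc-then-predict", role=role,
                             models=[preproc, model])
    return pipeline.deploy(initial_instance_count=1,
                           instance_type="ml.m5.large")
```

The key point is that the fitted preprocessor travels as a model artifact, so the exact transformation learned at training time is applied to every inference request.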
@narijami 3 years ago
I have another request: could you please make a video on how to join several tables, including some aggregation functions, in SQL? I want to join 3 different tables which are in 2 different schemas in Redshift. The output of the two joined tables will have some aggregation functions in its SQL query. Since the schemas of the two tables are different, I can't write the SQL query directly in Data Wrangler. It would be great if you could help.
@Flopyboy 2 years ago
Hi Julien, I'm trying to create a pipeline and I'm experiencing significant overhead for each individual step (~10 min). Is there any way to test individual steps without running the entire pipeline and having to wait for earlier steps to finish?
@juliensimonfr 2 years ago
Not that I know of. I guess you could test each step in its own mini-pipeline if you have all the intermediate artifacts, and then put them together?
@Flopyboy 2 years ago
@@juliensimonfr Thanks for responding! SageMaker recently made it possible to execute pipelines in local mode, which almost eliminates the overhead I was experiencing :)
@anubhabjoardar9321 2 years ago
Hello Julien, thank you for the video and the channel! It makes understanding AWS SageMaker easier for newbies like me :) I wanted to ask if there is a way to list all resources/components (models, endpoints, training jobs, processing jobs, etc.) associated with (or created in) an AWS SageMaker notebook/Studio project? Thanks a lot for any information on this task!
@juliensimonfr 2 years ago
Thank you! This is a really good question, and the answer is "kind of". You can track model lineage and see all artifacts that led to a particular model, see docs.aws.amazon.com/sagemaker/latest/dg/lineage-tracking.html, but th
@11eagleye 4 years ago
Hi Julien, where do the models get deployed? I don't see any container or Docker details in the demo.
@juliensimonfr 4 years ago
They're deployed on SageMaker endpoints, as usual. It all takes place in the CloudFormation template stored in the 'model-deploy' repository.
@bhujithmadav1481 1 year ago
Hello Julien, thanks for your videos, they were helpful. I have a requirement: I want to create training jobs within a SageMaker pipeline. How do I achieve this?
@juliensimonfr 1 year ago
sagemaker.workflow.steps.TrainingStep ?
@bhujithmadav1481 1 year ago
@@juliensimonfr Thanks for the reply. In the video you used a scikit-learn estimator to train. I will have to create a training job. My doubt is how to integrate training jobs within the pipeline. Please guide.
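For readers with the same doubt: a `TrainingStep` wraps a regular estimator, and each pipeline execution then launches a standard SageMaker training job from it. A minimal sketch (script name, bucket, and pipeline name are placeholders, and imports are kept inside the function so the sketch reads standalone):

```python
def build_training_pipeline(role, bucket):
    """Wrap an estimator in a TrainingStep and return a Pipeline.
    Running the pipeline creates a normal SageMaker training job."""
    from sagemaker.sklearn.estimator import SKLearn
    from sagemaker.inputs import TrainingInput
    from sagemaker.workflow.steps import TrainingStep
    from sagemaker.workflow.pipeline import Pipeline

    estimator = SKLearn(
        entry_point="train.py",          # your existing training script
        framework_version="1.2-1",
        instance_type="ml.m5.xlarge",
        role=role,
    )
    step_train = TrainingStep(
        name="TrainModel",
        estimator=estimator,
        inputs={"train": TrainingInput(f"s3://{bucket}/train/")},
    )
    return Pipeline(name="my-training-pipeline", steps=[step_train])

# Typical usage (requires AWS credentials):
# pipeline = build_training_pipeline(role, bucket)
# pipeline.upsert(role_arn=role)  # create or update the pipeline
# pipeline.start()                # kick off an execution
```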
@sivaprasanth5961 3 years ago
It's a really awesome session, Julien. I have one doubt: if I want to send 70% of requests to Version 1 and 30% to Version 2, how can I do that?
@juliensimonfr 2 years ago
You can deploy multiple variants on the same endpoint: docs.aws.amazon.com/sagemaker/latest/dg/model-ab-testing.html
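With production variants, traffic is split in proportion to each variant's weight, so 0.7/0.3 gives the 70%/30% split asked about. A minimal boto3 sketch (model and endpoint names are placeholders):

```python
def make_variants(model_v1, model_v2):
    """Two production variants on one endpoint; traffic splits in
    proportion to InitialVariantWeight (0.7 / 0.3 = 70% / 30%)."""
    return [
        {"VariantName": "version-1", "ModelName": model_v1,
         "InstanceType": "ml.m5.large", "InitialInstanceCount": 1,
         "InitialVariantWeight": 0.7},
        {"VariantName": "version-2", "ModelName": model_v2,
         "InstanceType": "ml.m5.large", "InitialInstanceCount": 1,
         "InitialVariantWeight": 0.3},
    ]

def create_ab_endpoint(endpoint_name, model_v1, model_v2):
    # boto3 is imported here so make_variants stays testable offline.
    import boto3
    sm = boto3.client("sagemaker")
    sm.create_endpoint_config(
        EndpointConfigName=endpoint_name + "-config",
        ProductionVariants=make_variants(model_v1, model_v2),
    )
    sm.create_endpoint(EndpointName=endpoint_name,
                       EndpointConfigName=endpoint_name + "-config")
```

The weights can also be updated on a live endpoint to shift traffic gradually.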
@sumeshmr9130 4 years ago
Hello Julien, can we use the AWS Step Functions Data Science SDK along with Pipelines? Or are these two different things?
@juliensimonfr 4 years ago
Hi Sumesh, SageMaker Pipelines is two-sided: 1) a Python SDK to build ML workflows (similar to the Data Science SDK), and 2) an MLOps capability based on CodePipeline. I think the integration with SageMaker Studio is really interesting, and a more productive option than the Data Science SDK.
@sumeshmr9130 4 years ago
@@juliensimonfr Is there a document/link with the details to create a custom project template (organization template)? In case I wanted to call a Lambda function or a Glue job as a workflow step in the pipeline, do you think I would be able to customize it this way?
@Koningbob 4 years ago
Hi Julien, thanks for this clear demo. My team uses GitLab for CI/CD; would this be possible instead of CodePipeline? Thx
@narijami 4 years ago
Hi Julien, thanks for the explanation. I work at a company in Germany where we use AWS tools. My question: I have to run millions of rows daily through SQL queries from Redshift, but in SageMaker I have a memory limitation. Is it possible to make this easier with SageMaker Pipelines?
@juliensimonfr 4 years ago
SageMaker Processing is probably what you're looking for. It's easy to automate and you can pick very large instances. Of course, you only pay for the duration of the job.
@narijami 4 years ago
@@juliensimonfr Thanks for your reply. I'm currently working on a normal SageMaker instance. I'm running a SQL query with some joins and aggregation functions, reading some very large tables from Redshift. The query takes very long if I fetch data for a period longer than 6 days. I heard that Data Wrangler can speed up importing tables. Would that be the case for joined tables like mine? Thanks in advance.
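For reference, a minimal sketch of the SageMaker Processing job suggested above. The script name, instance size, and S3 paths are placeholders; imports are kept inside the function so the sketch reads standalone:

```python
def run_processing_job(role, bucket):
    """Offload a heavy transformation to a large, short-lived instance
    with SageMaker Processing; you pay only for the job's duration."""
    from sagemaker.sklearn.processing import SKLearnProcessor
    from sagemaker.processing import ProcessingInput, ProcessingOutput

    processor = SKLearnProcessor(
        framework_version="1.2-1",
        role=role,
        instance_type="ml.m5.4xlarge",   # pick as large as the data needs
        instance_count=1,
    )
    processor.run(
        code="transform.py",             # your query/transform logic
        inputs=[ProcessingInput(source=f"s3://{bucket}/raw/",
                                destination="/opt/ml/processing/input")],
        outputs=[ProcessingOutput(source="/opt/ml/processing/output",
                                  destination=f"s3://{bucket}/processed/")],
    )
```

The same processor can be wrapped in a ProcessingStep to run inside a pipeline on a schedule.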
@vickyshrestha 3 years ago
Hi Julien, can we manage SageMaker Pipelines with Terraform?
@juliensimonfr 3 years ago
Hi Vicky, according to registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sagemaker_model, it's not supported yet.
@fantasyapart787 6 months ago
I can see that this video was uploaded 3 years ago. Is it still valid, I mean the features and navigation?
@juliensimonfr 6 months ago
The UI has changed; the SDK is probably still very, very similar.
@MrChristian331 4 years ago
Where does the model get deployed to when you approve it?
@juliensimonfr 4 years ago
A SageMaker endpoint, configured in the template.
@MrChristian331 4 years ago
@@juliensimonfr Sorry, it's been a while; what is the endpoint? Just part of the container?
@kanishkmair2920 3 years ago
@@MrChristian331 The model endpoint creates a URL for inference once the model is trained.
@lucieackley7432 3 years ago
Thanks for the video, Julien, it gave a very good overview. I'm wondering if there is a good way to learn more about the deploy step. Additionally, I have a model that I want to retrain daily as we get new data daily. What is the best pattern for this?
@juliensimonfr 3 years ago
Thanks Lucie. You can deploy the "usual" way by grabbing the model in S3 and creating an endpoint. For full automation, you can use MLOps as described in docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-walkthrough.html, but this requires diving a bit into CloudFormation. We're covering this topic in SageMaker Fridays S03E04, so make sure to catch that episode at amazonsagemakerfridays.splashthat.com/ :)
@elmirach4706 4 years ago
Hello, how do I customize the abalone pipeline into a custom model pipeline?
Salut Julien! I see the Pipeline templates are not available for us-east-1 in SageMaker Studio (only us-west-1). Is there a reason for that? Any chance they could be available for N. Virginia? Thanks for the tutorial :-) It came in handy with a project delivery.
@juliensimonfr 4 years ago
They should be available there. Please make sure that your Studio user has the appropriate permissions. There's a slider setting in the user details ("Enable SageMaker Projects").
@dasgupta0885 4 years ago
Can you please provide the GitHub link to the Python notebook? Thanks!
@juliensimonfr 4 years ago
If you're only interested in the Python SDK, this one is very close: github.com/aws/amazon-sagemaker-examples/tree/master/sagemaker-pipelines. If you're interested in the full example with MLOps support, it's part of the repos I clone in the video.
@dasgupta0885 4 years ago
@@juliensimonfr Yes, I'm interested in the SDK, so this is perfect. Thanks a bunch!!