Please note we have deprecated the dvcorg/cml-py3 container image. You can get the same results with:
- container: docker://dvcorg/cml-py3:latest
+ steps:
+   - uses: actions/checkout@v3
+   - uses: iterative/setup-tools@v1
@nagarjunavarikoti8160 4 years ago
You made a complex topic sound very simple with your easy walkthrough steps! Please keep up the good work.
@dvcorg8370 4 years ago
Really appreciate it, Nagarjuna! Always feel free to let us know if there's a topic you'd like to see :)
@sakshamgulati1578 3 years ago
@@dvcorg8370 could you please make a video on how to make unit tests for models in MLOps?
@malcolmdecuire7529 3 years ago
Starting in ML from a non-CS background was already hard enough, but Elle came thru and just made me smile and feel better about this complex subject. I'm rewatching this entire series again. After looking at udemy, coursera, and even a few other websites there isn't someone talking about how to go from making ML projects on ur laptop to production environment. Honestly, I'm grateful for the inspiration and I'm more committed to this self-learning route.
@BudiArsana 1 year ago
That diff report in the pull request is awesome, thank you for sharing. I will try to use this technique in the future.
@phanikirans4728 3 years ago
I doff my hat to you, Elle... for a very crisp, easy-to-understand, and uncluttered explanation of MLOps.
@sayakpaul3152 4 years ago
Excellent walkthrough! Would be cool to incorporate experiment tracking tools like Weights and Biases to automatically report metrics. But for starters, this is really a job well done!
@091carl 3 years ago
Wow, incredible clarity in your presentations. Thanks for all the great work, Elle!
@DaredevilGotU 4 years ago
This is so cool. I loved it. We can use this for writing test cases in PRs. Thank you.
@MLOps 4 years ago
So cool to see this, Elle! Thank you for sharing and teaching us a thing or two in the community!
@t.ganesh1692 4 years ago
Thank you for the excellent tutorial Elle and @DVCorg!
@itsravimalhotra3 3 years ago
Wow. This was soo good. She made it so easy to understand.
@יהונתןאיזנשטיין 2 years ago
Great tutorial. Thank you!
@dvcorg8370 1 year ago
Glad it was helpful!
@mayurlohana 3 years ago
You explain things in the right manner, and they are easy to understand. AMAZING 🤩
@dvcorg8370 3 years ago
Thanks so much, Mayur! The kind words are really appreciated :)
@AleksandrBlekh 4 years ago
Excellent tutorial. Keep it up!
@dvcorg8370 4 years ago
Thanks Aleksandr! Much appreciated :)
@AleksandrBlekh 4 years ago
@@dvcorg8370 It's my pleasure! :-)
@stopznak86 8 months ago
Great stuff, I'm learning
@johannesallgaier5722 3 years ago
Great video! Such precise and clear explanations! Thank you for sharing.
@Kommalapatin 3 years ago
Pretty good at explaining the topics around MLOps. Keep it up. Good work, Elle.
@bhagwatchate7511 3 years ago
Great explanation
@dvcorg8370 3 years ago
Glad you think so!
@danielbaena4691 3 years ago
Thank you so much for this video and all your work, it is just amazing!
@dvcorg8370 2 years ago
You're very welcome!
@DataScienceGarage 3 years ago
Hi! That's the tutorial I was searching for. Thanks a lot!
@ris2043 1 year ago
Excellent
@dvcorg8370 1 year ago
Thank you! Cheers!
@hyattBaker 3 years ago
Thank you, that was very helpful!
@regularSenseAppeal 4 years ago
Very good, thank you. Superbly explained.
@philiperiskallaleal6010 4 years ago
Awesome presentation. Thank you for your great work
@dvcorg8370 3 years ago
Thanks Philipe!
@shroukmansour7642 3 years ago
What is special about GitHub Actions and CML that I should use them instead of something like Jenkins, for example?
@gdibble 2 years ago
🔥🔥🔥
@philiperiskallaleal6010 4 years ago
Dear Elle, what changes would be required to implement CML in GitLab? Does GitLab have some type of "GitHub Actions" functionality? If so, where can I check for it?
@dvcorg8370 3 years ago
Good q- GitLab has something called GitLab CI, which is extremely similar and gives you most of the same functionality! There are a few subtle differences in how you set up things like environmental variables/secrets, but it's not too bad. We have some docs here: dvc.org/doc/cml/start-gitlab
@MohammedBakheet 4 years ago
Very nice explanation indeed, thank you so much, keep it up
@tanim980 1 year ago
you are just amusing!
@Chevignay 1 year ago
Really great video thank you
@jackbauer322 4 years ago
What's the main difference from DVC? How do they fit together, or don't they? Thanks again!
@dmitrypetrov3542 4 years ago
DVC and CML complement each other. CML was created by the DVC team; see cml.dev. A few more technical details: DVC is usually used to transfer data to CI/CD (CML) runners.
@jackbauer322 4 years ago
@@dmitrypetrov3542 OK! So from my understanding, DVC is for experiment tracking and CML is more for CI/CD MLOps?
@dmitrypetrov3542 4 years ago
@@jackbauer322 exactly. DVC - data & ML experiments. CML - team collaboration & ML training.
@soumantadas8564 4 years ago
This is extremely helpful Elle and DVCorg. Had a follow-up question - if I wanted to generate multiple metric files and residual plots from the train.py script (say because I am running a loop varying max_depth over [5,10,15] or varying some other hyperparameters), what would be the best way to modify the workflow so that I can see all the data and viz in one commit? A crude way could be to store the metrics and plots with different names in train.py and, in the cml.yml file, add them separately to report.md. However, as the number of loops increases, this wouldn't be a scalable method.
@dvcorg8370 4 years ago
So what if you were to write out your metrics in one file using long form? So for example...
max_depth | accuracy
5 | 87
10 | 90
15 | 92
And likewise, put all your plots on one axis, so many lines of different colors, using your favorite plotting library. Then you'd be able to print your table and your summary plot in your CML report with only one line of code each, no matter how long your loop is.
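A minimal sketch of the long-form idea above. It assumes a hypothetical `evaluate(max_depth)` helper standing in for the training loop in train.py, and the accuracy values are just the illustrative ones from this reply, not real results. The file name `metrics.md` is also an assumption.

```python
def evaluate(max_depth):
    """Hypothetical stand-in for training and scoring a model in train.py."""
    illustrative_scores = {5: 87, 10: 90, 15: 92}
    return illustrative_scores[max_depth]


def metrics_table(rows):
    """Render (max_depth, accuracy) pairs as a single Markdown table."""
    lines = ["| max_depth | accuracy |", "| --- | --- |"]
    for max_depth, accuracy in rows:
        lines.append(f"| {max_depth} | {accuracy} |")
    return "\n".join(lines) + "\n"


# One row per loop iteration, all collected in the same table.
rows = [(depth, evaluate(depth)) for depth in [5, 10, 15]]
table = metrics_table(rows)

# A single file holds every run, so the workflow only needs to append
# this one file to the report, however long the loop gets.
with open("metrics.md", "w") as f:
    f.write(table)
```

Because every run lands in the same file, the workflow step that builds report.md stays the same size whether the loop has three iterations or thirty.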
@soumantadas8564 4 years ago
@@dvcorg8370 Ahh yes, a very nice workaround. Thanks.
@OmarHisham1 2 years ago
15:08 - "I made an amazing model." Cat in the background: "Yaaa"
@dvcorg8370 1 year ago
Congrats!
@jjpp3301 3 years ago
this is great! thanks for sharing
@rostyslavbryiovskyi4591 3 years ago
Hi, thanks for the comprehensive explanation! But I have one more question: can I use CML with Azure TFS?
@dvcorg8370 3 years ago
Yes you can! See these docs: cml.dev/doc/cml-with-dvc. And please join us in our Discord server if you have more questions! discord.gg/rpgRdvfyAf
@iPondrio 7 months ago
Do you have any video showing how to configure the token? I'm having a hard time with that config.
@anikethdeshpande8336 4 years ago
Awesome tutorial!
@dvcorg8370 4 years ago
Thanks Aniketh!
@IrtizaKaleem 3 years ago
Hi Elle, can you shed some light if I can do the same, but with a different docker image, such as continuum/anaconda3, so I can do the same for a conda environment? Other than the docker image link, what else would I need to change?
@shaunirwin2016 4 years ago
Very nice tutorial! I really like this concept of integrating into the normal software stack. How would one handle the situation of adding new metrics over time? E.g. If you begin a project only displaying F1 score, but as you train more models you realise you are also interested in seeing and comparing the precision. Could this be catered for using CML?
@dmitrypetrov3542 4 years ago
Yep, using the existing software stack for ML is one of the ideas behind CML. That's a really good question. The flow relies on Git a lot. So, if the scores were stored/committed, then you can derive F1 as well as precision. However, if the scores were not stored/committed, you might need to go back and create another experiment just to get the right scores to compare. How do you do that with other tools or approaches? One relevant discussion - github.com/iterative/dvc/issues/4210
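To make the "store the scores, derive the metric later" point concrete, here is a minimal sketch. It assumes you committed raw confusion-matrix counts with an earlier experiment rather than only a final F1 value. The `counts` dictionary, its keys, and its numbers are hypothetical and are not something CML or DVC writes for you.

```python
import json

# Hypothetical counts that were committed alongside an earlier experiment.
counts = json.loads('{"tp": 80, "fp": 20, "fn": 10}')


def precision(c):
    """Share of predicted positives that were correct."""
    return c["tp"] / (c["tp"] + c["fp"])


def recall(c):
    """Share of actual positives that were found."""
    return c["tp"] / (c["tp"] + c["fn"])


def f1(c):
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r)


print(f"precision={precision(counts):.3f} f1={f1(counts):.3f}")
```

If only the final F1 number had been committed, precision could not be recovered this way, which is the case where retraining or rescoring becomes necessary.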
@shaunirwin2016 4 years ago
@@dmitrypetrov3542 thanks very much for the reply! Yes, I thought the solution might be something along those lines. For database approaches such as MLFlow one can log metrics later on to previous experiments/runs. I suppose with a git-based system of storing metrics one could manually add an extra commit with the new scores? Or of course rerun the experiment in the normal way with the new scores included, as you suggest. Although for long training times that could be a problem, if you are actually just wanting to do scoring, not training.
@dmitrypetrov3542 4 years ago
@@shaunirwin2016 yes, an additional commit is one of the solutions. Re long-running experiments - you are right, but the same happens with logging tools like mlflow - you need to retrain to get the metrics. The only difference, the commit is not needed.
@mehrdat 8 months ago
Thank you very much. But why do I get errors? I couldn't run it after the first commit. I tried nearly everything. It comes from the line with the importance plot. What could it be?
@philiperiskallaleal6010 3 years ago
Dear Elle, would you be so kind as to show/describe how one can implement a dvc pull request that is meant to be run by a .github/workflows YAML file, so that it only runs on the Git remote repository? An approach that would make it possible to "gitignore" the DVC data while allowing the Git remote temporary access to the data to properly test the committed CML workflow. Perhaps using some kind of data cache in the Git remote repository, with automatic deletion of this cached data later?
@dvcorg8370 3 years ago
One approach is using a local DVC config file, which lets you have a different data remote/different credentials for when you're working locally than what's in your CI/CD system. That means you can still have a DVC config file that gets pushed to your Git repo, but you'll have a local version that gets used when you're developing in your workspace. Docs here: dvc.org/doc/command-reference/remote#example-add-a-default-local-remote Another thought that comes to mind is that you could make the credentials to pull from the DVC remote only available to the runner (via secrets). You might then write a control flow statement... if those environmental variables are present, then run dvc pull; else, don't. If you want to discuss this in more detail, stop by the CML channel on our Discord: discordapp.com/invite/dvwXA2N
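The control-flow idea in this reply could look roughly like the sketch below. The variable names in `REQUIRED_VARS` are an assumption (they fit an S3 remote); substitute whatever credentials your DVC remote actually needs. The helper names are hypothetical too.

```python
import os
import subprocess

# Assumed secret names; replace with the ones your DVC remote requires.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def has_credentials(env, required=REQUIRED_VARS):
    """True only if every required variable is present and non-empty."""
    return all(env.get(name) for name in required)


def pull_if_possible(env=None):
    """Run `dvc pull` only when the runner has the remote credentials."""
    env = os.environ if env is None else env
    if has_credentials(env):
        subprocess.run(["dvc", "pull"], check=True)
        return True
    print("No DVC remote credentials found; skipping dvc pull.")
    return False
```

On a runner where the secrets are exposed as environment variables, `pull_if_possible()` fetches the data; in a local workspace without them, it simply skips the pull.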
@toilinginobscurity3091 2 years ago
Let's say we have a couple of commits in the experiment branch and we want to merge the branch with the squash option. What would happen then? Would all the reports be combined?
@muhammadfarjadaliraza4546 3 years ago
Awesome video! I'd like to know how to use a TPU and GPU.
@SheeceGardazi 3 years ago
thanks for sharing the talk
@fabianpena2776 4 years ago
Thx. The tutorial is amazing. In comments, I am not able to see the PNG files, only the links. Do I need to configure something more?
@dvcorg8370 4 years ago
Hm, that sounds like you might be missing a flag in your cml-publish function. Do you have `cml-publish --show-md >> report.md`? If you don't have the `--show-md` flag, you'll get a link to your image instead of an embedded picture.
@fabianpena2776 4 years ago
@@dvcorg8370 Thank you again! Now, it works for me :)
@vishal-rana 4 years ago
Beautiful.
@derekcorcoran5129 4 years ago
Hello Elle, this looks great, it seems that it works for Python only? I develop Machine Learning tools in R, and I would love to help integrate this if possible
@dvcorg8370 4 years ago
The tools we're using here (GitHub Actions and CML) work with any language! Here's a blog about a project using R: mribeirodantas.xyz/blog/index.php/2020/08/10/continuous-machine-learning/ There's a GitHub Action for getting R on your runner, too: github.com/r-lib/actions
@derekcorcoran5129 4 years ago
DVCorg thanks, you are doing an amazing job
@sayakpaul3152 4 years ago
One thing I noticed is that the actions do not always trigger upon a new commit to a branch. Is there a way to prevent that?
@dmitrypetrov3542 4 years ago
They trigger on push events. For several local commits and a single push, it will run only on the last one. So, you need to push after each of the commits.
@mirmohammadjaber2676 4 years ago
Have you deleted the experiment branch from the repository?
@dvcorg8370 4 years ago
Yes, but you can see the closed PR and browse the branches at previous points in time: github.com/andronovhopf/wine/pull/2
@carloslopez7204 4 years ago
How can I set a secret token in GitHub Actions? My program calls an API, so I need to provide the secret token, but I don't know if it's correct to write it in cml.yaml because it's going to be public.
@dvcorg8370 4 years ago
You can add the secret to your GitHub repository, which will give the runner access to it via an environmental variable. You can set it so the variable will be hidden even in logs- check out their docs! docs.github.com/en/actions/reference/encrypted-secrets
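On the Python side, reading that secret could look like the sketch below. The variable name `API_TOKEN` is hypothetical; it assumes the workflow maps a repository secret to an environment variable of that name, so the token never appears in cml.yaml or in the code.

```python
import os


def get_api_token(env=None, name="API_TOKEN"):
    """Read the token from the environment instead of hard-coding it."""
    env = os.environ if env is None else env
    token = env.get(name)
    if not token:
        # Fail early with a clear message rather than calling the API unauthenticated.
        raise RuntimeError(f"{name} is not set; add it as a repository secret.")
    return token
```

The script then calls `get_api_token()` wherever it needs to authenticate, and the same code works locally if you export the variable in your shell.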
@jordieclive 4 years ago
What can this CML tool do that CircleCI continuous integration can't do?
@dvcorg8370 4 years ago
To be clear, CML isn't a competitor to Circle CI. Circle CI is more analogous to GitHub Actions or GitLab CI; it's a continuous integration system. CML is a toolkit that works with a continuous integration system to 1) provide big data management (via DVC & cloud storage), 2) help you write model metrics and data viz to comments in GitHub/Lab, and 3) orchestrate cloud resources for model training and testing. Currently, CML is only available for GitHub Actions and GitLab CI. But it could in the future integrate with Circle CI (i.e., as an Orb).
@jordieclive 4 years ago
@@dvcorg8370 thanks for the detailed reply. I've got it clear in my head now 😃, I watched the other vids in the series and you explain very clearly. I look forward to videos on setting up cloud workflows with CML and versioning with S3, GCP. I'm not sure if you are planning to do DL content. As a suggestion, I would love to see PyTorch workflows on cloud with, say, multiple GPUs, and basic training tests in a CML workflow, like sanity checks: fitting/evaluation on a single batch, etc. Please keep up the tutorials!
@dvcorg8370 4 years ago
@@jordieclive No problem! Let us know any other questions you have :)
@hamdikhaled6955 4 years ago
Thanks a lot
@jwc7663 4 years ago
Scenario: I need an NN model and want to test it using a GPU. Is that possible as well?
@dvcorg8370 4 years ago
Yes! We'll be covering that use case in a video soon. For now we have an example project to browse: github.com/iterative/cml_cloud_case
@jwc7663 4 years ago
@@dvcorg8370 That looks good. Will it support a local machine (not cloud) as well?
@dvcorg8370 4 years ago
@@jwc7663 Yes- you can set GitHub Actions (& GitLab CI, too) to use self-hosted runners, which can be a local machine. Check out the docs here: docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners
@efels_com 4 years ago
@@dvcorg8370 I would love to see the self hosted GPU flow with the ability to compare the results from the model that is in the master branch repo. And using dvc to roll the data set back to the data set that was used to train the model in master branch. So we could compare both models, on new and old data.
@dvcorg8370 4 years ago
@@efels_com We can do this! Adding this to the list of to-dos.
@leilainigodelacruz3648 4 years ago
Hi, thanks for your very useful video. I have a question, because I was trying to replicate this example in my own repo and it failed in this part of the cml.yaml: `steps: - uses: actions/checkout@v2 - name: train_model env: repo_token: ${{ secrets.GITHUB_TOKEN }}` By GITHUB_TOKEN, do you mean a secret key that I assign in the Settings/Secrets tab of the repo, which is a private key? If so, I don't know why it doesn't work if I put my own private key name :(
@dvcorg8370 4 years ago
Hi Leila! You don't have to assign any value to GITHUB_TOKEN- it is assigned by default in a GitHub repository. Please delete any secrets you might have added and try again. If it doesn't work, stop by our Discord channel where we can do more hands-on troubleshooting :) discord.gg/bzA6uY7
@leilainigodelacruz3648 4 years ago
@@dvcorg8370 Thanks! It did work!
@davidbalakirev5963 3 years ago
Hands up if you also had an espresso while watching this.
@jalaj1 4 years ago
Hi, can you make a video on mlcertific.com? It is providing free certification on MLOps.
@drm8164 1 year ago
i love u
@dvcorg8370 1 year ago
🦉 We love you too!
@jackbauer322 4 years ago
How would mlflow come in here?
@dvcorg8370 4 years ago
Good question- you can integrate lots of tools with CML. For example, you can use it with Tensorboard to get a link to your Tensorboard in a PR whenever the model trains. Check out this use case: github.com/iterative/cml_tensorboard_case/pull/3 We haven't tried with MLFlow in particular yet, but expect there could be a similar approach.
@jackbauer322 4 years ago
@@dvcorg8370 Thanks ! Can't wait for the next videos :)
@jeremykusnadi5148 4 months ago
how do you get around the " `GLIBC_2.28' not found " error?
@dvcorg8370 3 months ago
This error typically occurs when trying to run a program that was compiled with a newer version of the GNU C Library (GLIBC) than what's installed on your system. Check that version requirements match up and you should be all set!
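A quick way to check the version mismatch described above is to ask Python which C library it is linked against. This is only a sketch: `platform.libc_ver()` is part of the standard library, while the `glibc_at_least` helper is hypothetical, and the 2.28 threshold is taken from the error message in the question.

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.28' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def glibc_at_least(installed, required="2.28"):
    """True if the installed glibc version meets the required one."""
    return parse_version(installed) >= parse_version(required)


# libc_ver() returns a (library, version) pair; both are empty strings
# when the C library cannot be detected (for example on non-Linux systems).
lib, version = platform.libc_ver()
if lib == "glibc" and version:
    print(f"glibc {version}; meets 2.28: {glibc_at_least(version)}")
else:
    print("Could not detect glibc on this system.")
```

If the check reports an older glibc, the usual options are running the job on a newer base image or runner, or using a build of the tool made for the older library.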