MLOps Tutorial #5: Automated Testing for Machine Learning

Views: 25,553

Comments: 48

@dvcorg8370 2 years ago

Please note we have deprecated the dvcorg/cml-py3 container image. You can get the same results with:

    - container: docker://dvcorg/cml-py3:latest
    + steps:
    +   - uses: actions/checkout@v3
    +   - uses: iterative/setup-tools@v1

@essamgouda1609 3 years ago

Your simplicity in explanation allows me to see how these could be tailored to personal use cases, and this is what I like most about this playlist! Thank you!!!

@essamgouda1609 3 years ago

This playlist is a gem, thank you so much for sharing your valuable knowledge with everyone. Much appreciated!

@deepakdevellore 4 years ago

Absolutely love Dr. Elle O'Brien's explanations of these topics! Expecting more in future videos. Keep up the good work! :)

@jackbauer322 4 years ago

I would be thrilled by a video series on CML / DVC from zero to 100 for total beginners! But with coding along, so that slightly more advanced people could follow along too! Thanks again for your energy and positivity!

@dvcorg8370 4 years ago

Yes! We're thinking about a "DVC Basics" playlist soon. It would be for total beginners to do some sample projects, hopefully hitting enough high-level values that advanced folks could enjoy it too (or at least watch at 1.5x speed :D)

@jackbauer322 4 years ago

@dvcorg8370

@gogreen0417 2 years ago

Thank you so much for showing this. Nice explanation. I wish you would make more videos on productionizing ML models.

@jayanthkumar7964 4 years ago

How have I just found this channel? Such practical and great knowledge. Thanks, Dr. O'Brien.

@dvcorg8370 4 years ago

Thanks for the kind words, Jayanth! Really appreciate it :)

@TheBenyos 4 years ago

Would love to see some future episode on best practices to convert a PyTorch/TensorFlow model to some optimized format (e.g. ONNX) and deploying that model to some ML cloud platform

@dvcorg8370 4 years ago

Yes, deploying models is on the roadmap! Good idea about converting formats, too.

@TheBenyos 4 years ago

Awesome!

@RedShipsofSpainAgain 4 years ago

Oooo, +1 for this suggestion

@vgtgoat 2 years ago

Thanks doc! This has introduced me to a lot

@GAURAVSINGH-qy4cj 1 year ago

I like your colorful parrot ..😅

@dvcorg8370 1 year ago

Thank you! DeeVee says "Hi!" 👋🏼

@GAURAVSINGH-qy4cj 1 year ago

@dvcorg8370 Hi DeeVee...👋

@snehotoshbanerjee1938 4 years ago

Great episodes! I would like to see an episode on model drift and how to automate the whole MLOps pipeline. Thanks!

@rajangupta5432 2 years ago

Such a cute way of explaining things.. :)

@Joseph20077 2 years ago

I loved it. Thanks, Dr. O'Brien.

@jackbauer322 4 years ago

Unit testing and Integration Testing :)

@anelm.5127 3 years ago

Thank you very much for this series. Let's say I want to perform unit tests using e.g. pytest before my train_model.py is executed in the dvc pipeline. How would I include the tests in the dvc pipeline? If I ran pytest before the model training script and some test cases failed (but didn't throw an error), the training step would still be executed. However, I only want the training to happen if all tests pass. The only solution I see is to run an additional script that checks in the pytest report whether all tests passed and then runs the train script based on the result. There has to be a better solution...

@dvcorg8370 3 years ago

Thanks for the question Anel! Invoking pytest --exitfirst (or -x for people who prefer short options) will cause pytest to exit with a non-zero status as soon as any test case fails. On most CI/CD platforms, the workflow will fail and stop as soon as any command returns a non-zero exit status. Let's say you have something like this:

    steps:
      - run: pytest -x
      - run: dvc repro

The second command won't run if the first one fails. If you're using some kind of custom script outside CI/CD, you can use set -e near the top of the file to achieve a similar effect (see ss64.com/bash/set.html).

Join our community in Discord to get all your DVC and CML questions answered! 👉🏽 discord.gg/rpgRdvfyAf
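
For a custom script outside CI/CD, the same stop-on-first-failure behaviour can also be sketched in Python. This is only an illustration under stated assumptions, not part of DVC or CML: `run_steps` is a hypothetical helper name, and the commented usage assumes `pytest` and `dvc` are installed and on the PATH.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order and stop at the first non-zero exit status.

    `steps` is a list of commands, each given as a list of strings.
    Returns the exit status of the failing step, or 0 if every step passed.
    """
    for command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Report the failure and skip all remaining steps.
            print(
                f"step failed with status {result.returncode}: {' '.join(command)}",
                file=sys.stderr,
            )
            return result.returncode
    return 0


# Hypothetical usage: training only runs if the tests pass.
#   sys.exit(run_steps([["pytest", "-x"], ["dvc", "repro"]]))
```

Returning the failing status instead of raising lets the calling script pass it on as its own exit code, which is what a CI runner or a shell using set -e would react to.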

@anelm.5127 3 years ago

@dvcorg8370 Wonderful! I managed to get my ML CI/CD workflow running. I will write a summary on Medium and of course cite DVC and this cool series.

@ThankYouESM 2 years ago

If I created an ML algorithm of my own (in Toronto)... how can I sell that if it is far more efficient than the 2nd best version?

@jackbauer322 4 years ago

My new episode! Joy!

@audigierbenoit8577 4 years ago

Great video, thank you. What would happen if the test dataset is version controlled with a dvc remote repository? Is it possible to access the test dataset from the CI pipeline?

@dvcorg8370 4 years ago

Great question! Yes, you absolutely can do that. To set it up:
1) Set up your DVC repo and make sure to push your .dvc and .dvc/config files to the GH repo.
2) Use GitHub Actions + CML to provide credentials for accessing your remote to the runner (for example, see this workflow: github.com/iterative/cml_dvc_case/blob/master/.github/workflows/cml.yaml).
3) Use `dvc pull` in the workflow to get your test dataset.

@TheBenyos 4 years ago

Love your tutorials, thanks :-)

@airsmokker 2 years ago

Please talk about live detection using CI and CD.

@mohitsinghpawar9387 4 years ago

Thanks for this video. I have one question: if we include pytest my_testing_file.py -v in the DVC pipeline, it works on my local system when running dvc repro, but it gives an error when I use it with Azure Blob Storage, saying that the dvc.yaml file is not reproducible. Kindly let me know what can be done?

@mohitsinghpawar9387 4 years ago

Actually, the pytest part is working now, but when we run it on GitHub, how can we check whether the tests have passed? I can see that on the command line when running on my local system. Thanks

@dvcorg8370 4 years ago

@mohitsinghpawar9387 Hi Mohit, we might be better able to help you on our Discord server, where we can ask some questions and see your error messages: dvc.org/chat

From what I understand, if your GitHub Actions workflow is running completely (you get a green check, not a red X in your pull request, as in the video), you could look in the logs from the runner (Actions -> Details) to see any output from pytest. You could also pipe the output of pytest into a .md file and then use CML to return the results in a report. There are probably a few approaches here!

@mohitsinghpawar9387 4 years ago

@dvcorg8370 Yes, the thing you said in the last part is the one I am looking for. Is there an example repo or anything else that pipes the output from pytest and shows it in a report? I guess that would help. Thanks

@dvcorg8370 4 years ago

@mohitsinghpawar9387 If pytest prints to console, you could do:

`pytest > report.md`
`cml-send-comment report.md`

We don't have any pytest project repos currently, but the basics of piping text outputs can be seen in github.com/iterative/cml_base_case. In this workflow, we have a Python script `train.py` that writes text output to `metrics.txt`. Then we use bash scripting to add the content of `metrics.txt` to our CML report.
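
If the raw console output is hard to read as a comment, one option is to wrap it in a small Markdown report before sending it. The sketch below is an assumption-laden illustration rather than anything CML provides: `build_report` is a hypothetical helper, and the heading text and layout are arbitrary choices.

```python
def build_report(pytest_output: str, exit_code: int) -> str:
    """Wrap captured pytest console output in a small Markdown report.

    `exit_code` is the exit status pytest returned; 0 is treated as a pass,
    anything else as a failure.
    """
    status = "passed" if exit_code == 0 else "failed"
    fence = "`" * 3  # Markdown code fence, built here to keep this example readable
    lines = [
        f"## Test results: {status}",
        "",
        fence,
        pytest_output.strip(),
        fence,
        "",
    ]
    return "\n".join(lines)


# Hypothetical usage (assumes pytest is installed and on PATH):
#   import subprocess
#   from pathlib import Path
#   proc = subprocess.run(["pytest"], capture_output=True, text=True)
#   Path("report.md").write_text(build_report(proc.stdout, proc.returncode))
```

The resulting `report.md` could then be passed to `cml-send-comment report.md` as in the reply above.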

@mohitsinghpawar9387 4 years ago

@dvcorg8370 Thanks. Will have a look at it. 🙏

@srinivasv3268 3 years ago

How do I create Python OOP for an ML model?

@Shankarpandala 4 years ago

Hello Dr, when will we be getting the next videos?

@dvcorg8370 4 years ago

This week!!! Hang on ;)

@mr_tpk 3 years ago

Awesome 🔥🔥🔥

4 years ago

Is this only for GitHub? I mean, we are using CodeCommit. Is it possible to do the same, or almost the same, with CodeCommit?

@dvcorg8370 4 years ago

Good question! Currently, we support GitHub Actions & GitLab CI, although we're hoping to expand to a couple of other CI providers. What matters isn't your Git repository but your CI system. It looks like the CI system that goes with CodeCommit is AWS CodePipeline, which we don't currently support. So if there's a way to use your Git repo with GitHub Actions or GitLab CI, then you can use CML right now! If not, we'd love it if you opened a feature request on our repo to let us know which CI system you'd like to have supported: github.com/iterative/cml/issues