Quick tour of PyCaret (a low-code machine learning library in Python)

34,873 views

Comments

@jorge1869 4 years ago

PyCaret comes to fix all the mess from the scikit-learn documentation. PyCaret is a very strong competitor, and I have no doubt that it will become one of the most used libraries, like pandas and TensorFlow. Regards

@DataProfessor 4 years ago

PyCaret is certainly a user-friendly library that helps to streamline many of the common tasks in ML and gets us started in no time.

@mohammeddanishreza4902 2 years ago

@DataProfessor Are companies actually using this package for real-time projects?

@desmondwong5851 2 years ago

Thanks for the great tutorial! It got me started with ML at work. (The only issue I ran into was the session crashing at classification "after using all available RAM".) I appreciate the sharing and look forward to more videos on machine learning!

@dunggeonplays 4 years ago

Hope PyCaret will also add some time series algorithms

@DataProfessor 4 years ago

Thanks for the comment. Yes, it seems time series support is under development and should be out in the next release. Info from towardsdatascience.com/announcing-pycaret-an-open-source-low-code-machine-learning-library-in-python-4a1f1aad8d46

@shivaprasadbk2027 2 years ago

Thank you for sharing this quick tutorial. Very helpful.

@ankitaharwal5886 4 years ago

Can we automatically tune all models during the initial fit and choose the best-performing tuned model?

@shwetaredkar734 4 years ago

The package is great and very well coded. However, there is a problem with the CatBoost algorithm: the plot_model function doesn't work with it and throws an exception. If you get CatBoost as the best-performing algorithm, you won't be able to see plots. Otherwise, the overall package is good to use.

@DataProfessor 4 years ago

Thanks for pointing that out. I also got an error; it seems to be an incompatibility issue between catboost and plot_model. An exception has occurred, use %tb to see the full traceback. SystemExit: (Estimator Error): CatBoost estimator is not compatible with plot_model function, try using Catboost with interpret_model instead.

@shwetaredkar734 4 years ago

@DataProfessor Yeah, I got the same error. Yellowbrick doesn't support CatBoost; that's what the author says, suggesting interpret_model instead. Secondly, even if you change the number of folds used to train the model, tuning still runs 10-fold cross-validation by default rather than the number of folds you set.

@dawg7562 4 years ago

Thanks for this, simplifies a lot

@DataProfessor 4 years ago

Thanks Adhith!

@thiagogpinto 4 years ago

Cool walkthrough man... nice work!

@DataProfessor 4 years ago

Thanks for watching and for the encouragement 😃

@upendram2820 2 years ago

Thanks a lot, sir, for such an informative tutorial

@jhuiop6552 4 years ago

Hi 👋 I cannot run the setup for the summary of the different classification models; my Google Colab runs too slowly, even with iris.

@shwetaredkar734 4 years ago

PyCaret sounds interesting. Guess they are missing one more aspect, that is repeated trials of cross-validation. Also, thanks for making this tutorial and bringing it to us and clarifying it. Sometimes one is lost when you explore such packages.

@DataProfessor 4 years ago

Thanks Shweta for the comment. That would be a great feature to have!
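
For readers wondering what "repeated trials of cross-validation" means, here is a minimal sketch in plain Python. It is not PyCaret code; the function name `repeated_kfold_indices` and its parameters are hypothetical and exist only for this illustration. The idea is to reshuffle the data before each round of k-fold splitting so that every repeat produces a different set of folds.

```python
import random


def repeated_kfold_indices(n_samples, n_splits=5, n_repeats=3, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper for illustration; not part of PyCaret.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh shuffle per repeat gives different folds
        for fold in range(n_splits):
            # Take every n_splits-th shuffled index as this fold's test set.
            test_idx = indices[fold::n_splits]
            test_set = set(test_idx)
            train_idx = [i for i in indices if i not in test_set]
            yield repeat, fold, train_idx, test_idx


splits = list(repeated_kfold_indices(n_samples=10, n_splits=5, n_repeats=2))
print(len(splits))  # 5 folds x 2 repeats
```

Averaging a metric over all repeats gives a less noisy estimate than a single k-fold run. In practice, scikit-learn's `RepeatedKFold` and `RepeatedStratifiedKFold` classes cover the same idea.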

@raymondklutse 4 years ago

Very insightful video. Thank you

@DataProfessor 4 years ago

Glad you enjoyed it!

@ignaciogonzalez6179 4 years ago

Nice vid!!! Hi, my Google Colab runs very slowly when I'm trying to do the setup, like hours and days. Is that normal?

@DataProfessor 4 years ago

Hi, no, that's not normal; PyCaret installation should take a few minutes.

@ignaciogonzalez6179 4 years ago

@DataProfessor No, I mean it runs very slowly when you are doing the setup with the dataset and the different ML models.

@jhuiop6552 4 years ago

Yes, the same happens to me when I'm doing the setup, waiting for the different models with the AUC and recall, even using the same iris dataset. HELP! I LOVE YOUR VIDEOS 💪👌

@jasonjefferson6596 1 year ago

Incredibly useful thank you 😊

@harshjaiswal5089 1 year ago

Can we use this library for an ML project to showcase in an interview?

@jamespaz4333 3 years ago

I have just started learning Python. Would you recommend that I start out with PyCaret or scikit-learn?

@premsinghanant6139 4 years ago

Thank you so much sir. I learned a lot

@DataProfessor 4 years ago

Thanks for watching, glad it was helpful 😁

@yunarrs.3129 2 years ago

Oh, are you Thai? Glad to know, I only just found out. The clip is very useful.

@DataProfessor 2 years ago

Thank you. Yes, I'm Thai 😊

@rockroll28 4 years ago

Thank you so much ❤❤

@DataProfessor 4 years ago

Thank you for watching

@salikmalik7631 4 years ago

Sir, now that PyCaret exists, do we still need to learn machine learning with Python (scikit-learn, TensorFlow) for a data science career?

@DataProfessor 4 years ago

PyCaret is a good library, but it is an AutoML tool that essentially packages a typical machine learning workflow into an easy-to-use template. This may be friendly for a beginner; however, if the template does not suit your needs, you may need to outgrow PyCaret and develop your own solutions using conventional libraries.

@shwetaredkar734 4 years ago

@DataProfessor Yeah, true that

@SA-Aries 10 months ago

Is it true that PyCaret does not provide the R2 value?

@veronese01 4 years ago

How do I use the PyCaret library with imbalanced data? How do I apply undersampling and oversampling to the imbalanced data being analyzed?

@DataProfessor 4 years ago

Hi, this can be applied using the 'fix_imbalance = True' argument in the setup() function of PyCaret, pycaret.org/fix-imbalance/
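
To make the idea of oversampling concrete, here is a minimal sketch of random oversampling in plain Python. It is only an illustration and makes no claim about what PyCaret does internally when 'fix_imbalance = True' is set; the function name `random_oversample` and the toy data are hypothetical. Minority-class rows are duplicated at random until every class is as large as the biggest one.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows until all classes have equal counts.

    Hypothetical helper for illustration; not part of PyCaret.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))  # resample with replacement
            out_labels.append(cls)
    return out_rows, out_labels


rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["a", "a", "a", "a", "b"]
new_rows, new_labels = random_oversample(rows, labels)
print(Counter(new_labels))
```

Resampling should be applied to the training split only, so that the test data keeps its original class distribution. Techniques such as SMOTE go one step further and synthesize new minority points instead of duplicating existing ones.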

@mattmatt245 4 years ago

Can it handle unequal misclassification costs? Can you assign weights to classes so that it plots cost curves in ROC space, just like Orange does? Thank you.

@DataProfessor 4 years ago

Thanks Matt for the question. I think so; you can have a look at the optimize_threshold function, which allows adjusting the probability threshold by defining the cost of TP, TN, FP and FN.

@mattmatt245 4 years ago

@DataProfessor I checked out the optimize_threshold description and I'm wondering why we would ever need a value for TN. I thought that all that's necessary here is the FP/FN ratio.

@DataProfessor 4 years ago

@mattmatt245 I would think that different users may have different uses for the optimize_threshold function, and the developer probably wants the user to have full control over the parameters. Also, TN is used to compute the specificity, so you can fine-tune this. Hope this helps.
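
As a rough illustration of cost-based threshold selection, here is a minimal sketch in plain Python. It does not call PyCaret and does not reproduce how optimize_threshold works; the function names, the cost values, and the toy data are all assumptions made for this example. It assigns zero cost to TP and TN, in which case only the FP and FN costs influence the chosen threshold. If correct predictions carry their own reward or cost, those values shift the optimum as well.

```python
def total_cost(y_true, y_prob, threshold, cost_fp, cost_fn):
    """Sum the misclassification cost at a given probability threshold."""
    cost = 0.0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 0:
            cost += cost_fp  # false positive
        elif pred == 0 and y == 1:
            cost += cost_fn  # false negative
    return cost


def best_threshold(y_true, y_prob, cost_fp=1.0, cost_fn=5.0):
    """Scan thresholds 0.01..0.99 and return the first one with minimal cost."""
    candidates = [i / 100 for i in range(1, 100)]
    return min(
        candidates,
        key=lambda t: total_cost(y_true, y_prob, t, cost_fp, cost_fn),
    )


# Toy labels and predicted probabilities (hypothetical data).
y_true = [0, 0, 0, 1, 1, 0, 1, 1]
y_prob = [0.10, 0.30, 0.45, 0.40, 0.60, 0.55, 0.80, 0.35]

t = best_threshold(y_true, y_prob)
print(t, total_cost(y_true, y_prob, t, cost_fp=1.0, cost_fn=5.0))
```

With a false negative costing five times as much as a false positive, the scan settles on a threshold well below 0.5, accepting a couple of false positives to avoid missing any positive case.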

@mattmatt245 4 years ago

@DataProfessor So, you can either minimize cost or maximize profit? Also, did you try to handle imbalanced data sets with this? Does it allow over/undersampling or SMOTE? Thanks

@salikmalik7631 4 years ago

Kindly make videos on NumPy and SciPy, and the difference between them.

@DataProfessor 4 years ago

Thanks Salik, I'll look into that

@rashawnhoward564 4 years ago

So this is like the caret library in R

@DataProfessor 4 years ago

The only similarity is the name. Under the hood, PyCaret is based on scikit-learn. Other libraries include: "pandas", "numpy", "seaborn", "matplotlib", "IPython", "joblib", "scikit-learn==0.22", "shap==0.32.1", "ipywidgets", "yellowbrick==1.0.1", "xgboost==0.90", "wordcloud", "textblob", "plotly==4.4.1", "cufflinks==0.17.0", "umap-learn", "lightgbm==2.3.1", "pyLDAvis", "gensim", "spacy", "nltk", "mlxtend", "pyod", "catboost==0.20.2", "pandas-profiling==2.3.0", "kmodes==0.10.1", "datefinder==0.7.0", "datetime", "DateTime==4.3", "awscli

@user-or7ji5hv8y 3 years ago

Is it similar to AutoML?

@TechnicalGuruji_Satish 4 years ago

Hello, how can we specify our own dataset? Could you please help?

@DataProfessor 4 years ago

Hi Satish, to use your own dataset you will have to read it into a dataframe. You can use the pd.read_csv() function and assign it to a variable and then use that dataframe as input argument instead of the example dataset dataframe (e.g. iris)
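
A minimal sketch of that workflow, assuming pandas is installed. The CSV content, the column names, and the helper `run_setup` are made up for this illustration; an in-memory string stands in for a real file so the snippet runs anywhere. The PyCaret call is wrapped in a function that is not invoked here, and its exact arguments may differ between PyCaret versions.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for your own file.
csv_text = (
    "sepal_length,sepal_width,species\n"
    "5.1,3.5,setosa\n"
    "6.2,2.9,versicolor\n"
)

# With a real file you would write: df = pd.read_csv("your_file.csv")
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)  # (2, 3)


def run_setup(data, target):
    """Pass your own dataframe to PyCaret instead of the example dataset.

    Hypothetical wrapper; requires PyCaret to be installed when called.
    """
    from pycaret.classification import setup

    return setup(data=data, target=target)
```

Calling `run_setup(df, "species")` on a realistically sized dataframe would then take the place of the iris example.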

@TechnicalGuruji_Satish 4 years ago

Thanks for your reply, sir. I am using the same pd.read_csv() with the git path of the file I want to load, but it gives me an error in Colab. Br, Satish.

@jorge1869 4 years ago

@TechnicalGuruji_Satish Did you import the pandas library? 'import pandas as pd'