⭕ Check out other videos in this Weka tutorial series:
✅ How to Build Regression Models (Weka Tutorial #1) kzbin.info/www/bejne/qmbQqp99faplfMk
✅ How to Build Classification Models (Weka Tutorial #2) kzbin.info/www/bejne/n3rIapekq71reKc
⭕ Links for this video:
✅ Weka 3 website: www.cs.waikato.ac.nz/ml/weka/
✅ Buy the Official Weka 3 Book: amzn.to/34MY6LC
👉 Watch this video next (How to learn data science in 2021) kzbin.info/www/bejne/pYOZaGOKrdybbpo
Support this Channel 👇👇👇
🌟 Buy me a coffee www.buymeacoffee.com/dataprofessor
🌟 Download Kite for FREE www.kite.com/get-kite/?
👉 Subscribe to this KZbin channel kzbin.info
👉 Join the Newsletter of Data Professor newsletter.dataprofessor.org
@Mario-ox5dm (4 years ago)
Loving the WEKA tutorials! Working on one of your Streamlit tutorials now as a refresher for me then I'm going to explore these!
@DataProfessor (4 years ago)
Thanks Mario 😊
@KenJee_ds (4 years ago)
I love Penguins!!!
@DataProfessor (4 years ago)
Haha, Go Penguins! 😆
@ioannischrysochos7737 (1 year ago)
I used the ArffViewer tool and loaded the CSV. When I saved the file as ARFF, the data were not clustered. But when I imported the file into Weka, I saw the data were clustered automatically. Thank you very much for your videos. They are very helpful.
@idanmorad4769 (4 years ago)
Hey, thanks for the video. A few remarks:
1. You can just let Weka read the text file as CSV; Weka then adjusts automatically, and you can save it as an ARFF file.
2. Can you talk about the drawbacks of Weka? For example, you cannot split into train and test sets at the preprocess stage, so every filter you apply takes all the data into account, which opens up the possibility of overfitting.
3. Weka can also be used in code, as its JARs are in the Maven repository. Maybe that can be a future video.
@DataProfessor (4 years ago)
Thanks, Idan, for the pointers. It's been a long time since I last used Weka, and I didn't know CSV could now be read in directly, so thanks for that. Yes, Weka can be used from other languages such as R and Python, as well as from the command line, e.g. java -jar weka.jar -i input.arff -o output.txt (correct me if I'm wrong, it's something along these lines). Yes, I can make a future video about this. As for the data split, we can do an initial split into separate files and use the 80% split for model building.
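The initial split into separate sets could be sketched in Python roughly as follows. This is a minimal illustration using only the standard library; the function name `split_rows`, the fixed seed, and the toy rows are assumptions for the example, not anything shown in the video.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)                      # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)      # fixed seed so the split is repeatable
    cut = int(len(shuffled) * train_fraction)  # index where the training part ends
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data rows: [feature, class label]
rows = [[i, i % 3] for i in range(10)]
train, test = split_rows(rows)
print(len(train), len(test))
```

Each part can then be written to its own file (for example with the `csv` module) and loaded into Weka separately, so that filters and models are built on the training file only.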
@idanmorad4769 (4 years ago)
@@DataProfessor Thanks for the reply :-) For the code part, I meant that you can import Weka's Java code into a Java project in NetBeans or IntelliJ, just as you import a package into a Python or R project. As for splitting the sets: in Python, sklearn transformers like SimpleImputer or TfidfVectorizer have a fit method, which learns from the data, and a transform method, which applies what was learned to the data. So you can learn the tf-idf values from the train set and only apply them to the test set, without learning anything from the test set. In Weka, as far as I know, applying a filter at the preprocess stage is equivalent to fit_transform in sklearn. So if you want to impute missing values with the mean, the mean will be calculated from all the data and not just from the train segment. Likewise, the tf-idf values will be calculated from all the data. Maybe I'm mistaken, or maybe you can show how, in Weka, the same filter can be applied to train and test files so that it only learns from the train file.
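The fit/transform distinction described above can be illustrated with a small mean-imputation sketch in plain Python. This is not sklearn's or Weka's implementation; `fit_mean`, `transform`, and the toy values are hypothetical names and data chosen for the example.

```python
def fit_mean(values):
    """'fit': learn the fill value from the observed (non-missing) entries."""
    observed = [v for v in values if v is not None]
    return sum(observed) / len(observed)


def transform(values, fill):
    """'transform': replace missing entries with a previously learned value."""
    return [fill if v is None else v for v in values]


# Hypothetical numeric column, already split; None marks a missing value.
train = [1.0, None, 3.0]
test = [None, 10.0]

fill = fit_mean(train)                # learned from the training part only
train_filled = transform(train, fill)
test_filled = transform(test, fill)   # test data is transformed, never learned from

leaky_fill = fit_mean(train + test)   # what fitting on all the data would give
print(fill, leaky_fill)
```

The two fill values differ, which is the leakage being described: fitting on everything lets the test value 10.0 influence the preprocessing. In Weka, wrapping the filter and classifier in the FilteredClassifier meta-classifier is, as far as I know, the usual way to have the filter learn from the training data only.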
@alvynchin1160 (3 years ago)
What do we do when our dataset has numeric, categorical and nominal values and we are trying to predict a binary attribute? Do we remove the numeric and categorical data to make it all nominal?
@amnar790 (1 year ago)
If the class attribute in my data is imbalanced, what can I do to get better performance?
@fahimhamdan1569 (4 years ago)
Comparing the breast cancer and penguin datasets: will equal class sizes lead to better model accuracy for all algorithms?
@DataProfessor (4 years ago)
Class balancing can indirectly lead to better model accuracy, because the model is less biased when the classes are equal in number. Otherwise, the class imbalance may bias the learning process toward the majority class. Balanced classes also make the results more reliable.
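One way to see why imbalance can make accuracy misleading is to compute the accuracy of a trivial model that always predicts the majority class. The sketch below uses made-up class counts, not the actual datasets from the video, and `majority_baseline_accuracy` is a hypothetical helper name.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Hypothetical label columns (illustrative counts only).
imbalanced = ["benign"] * 90 + ["malignant"] * 10
balanced = ["a"] * 50 + ["b"] * 50

print(majority_baseline_accuracy(imbalanced))
print(majority_baseline_accuracy(balanced))
```

With imbalanced classes, a model can score well above chance while ignoring the minority class entirely, so a reported accuracy is worth comparing against this baseline.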
@fahimhamdan1569 (4 years ago)
@@DataProfessor well explained 👍
@plica06 (1 year ago)
This is very helpful, but I wish you would explain: 1. Why are you selecting each algorithm? 2. Why do they perform differently? 3. When I make that Excel summary at the end, what would I say about it?
@Sawaedo (4 years ago)
Do you know how I can download the trained model? I don't see the option anywhere.
@DataProfessor (4 years ago)
Hi Jose, unfortunately, Weka doesn't allow the export of trained model
@aliakbarjamali9472 (2 years ago)
Why do you usually convert CSV files into ARFF files while you can use CSV format in WEKA as well?
@DataProfessor (2 years ago)
Good question. In the old days, ARFF was the only supported native file format; CSV support was added over time.
@aliakbarjamali9472 (2 years ago)
@@DataProfessor Fair enough. Thanks.
@edoedo8058 (8 months ago)
Wonderful. I learned so much. Thanks :)))
@username42 (4 years ago)
What if you have thousands of variables in a dataset? How do you rename them for use in Weka? It is easy for such small tabular datasets, but scientific sensor datasets have hundreds or thousands of variables that need to be renamed as numeric and latent :/
@DataProfessor (4 years ago)
That's a great question. Then we'll need to write a Python script for that using a "for" loop to iterate through the list of variables.
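Such a renaming script might look roughly like the sketch below. The naming scheme (`var_0001`, ...), the function names, and the choice to declare every column as `numeric` in the ARFF header are assumptions for illustration; real data may need nominal or other attribute types.

```python
def rename_columns(header, prefix="var"):
    """Return simple, Weka-friendly names (var_1, var_2, ...) for each column."""
    width = len(str(len(header)))  # zero-pad so the names sort in column order
    renamed = []
    for i, _ in enumerate(header, start=1):
        renamed.append(f"{prefix}_{i:0{width}d}")
    return renamed


def arff_header(relation, names):
    """Build ARFF header lines, declaring every attribute as numeric (assumption)."""
    lines = [f"@relation {relation}"]
    for name in names:
        lines.append(f"@attribute {name} numeric")
    lines.append("@data")
    return lines


# Hypothetical original column names with spaces and symbols.
header = ["Sensor 1 (mV)", "temp %", "x,y"]
names = rename_columns(header)
mapping = dict(zip(names, header))  # keep a lookup back to the original names
print("\n".join(arff_header("sensors", names)))
```

Keeping the `mapping` dictionary makes it possible to translate Weka's output back to the original variable names afterwards.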
@username42 (4 years ago)
@@DataProfessor Yes, and then we'll continue in Python for modelling, training, etc. :D and leave Weka behind
@DataProfessor (4 years ago)
@@username42 Exactly! When the need arises to use Python, it's an early sign suggesting that we need to level up 😊
@MegaBoss1980 (4 years ago)
At 10:24 you used the training set as the test option. Isn't that the whole dataset you imported? Also, in Excel you have accuracy columns for training, CV, and test. I think the training accuracy is computed on the full dataset, which is why RF has 100% accuracy.
@DataProfessor (4 years ago)
Yes, that is correct 😊
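The gap between training accuracy and held-out accuracy can be shown with a toy sketch. It uses a 1-nearest-neighbour rule as a simple stand-in for a model that can memorise its training data (it is not a random forest), and the data points are made up for the illustration.

```python
def nearest_label(x, train_rows):
    """Predict the label of the training point closest to x (1-nearest neighbour)."""
    return min(train_rows, key=lambda row: abs(row[0] - x))[1]


def accuracy(eval_rows, train_rows):
    """Fraction of eval_rows whose label is predicted correctly."""
    hits = sum(nearest_label(x, train_rows) == y for x, y in eval_rows)
    return hits / len(eval_rows)


# Hypothetical one-feature dataset with alternating labels.
data = [(1.0, "A"), (2.0, "B"), (3.0, "A"), (4.0, "B"), (5.0, "A"), (6.0, "B")]

# Evaluating on the same rows used for training: every point finds itself.
train_acc = accuracy(data, data)

# Leave-one-out: each point is predicted from all the other points.
loo_hits = sum(
    nearest_label(x, data[:i] + data[i + 1:]) == y
    for i, (x, y) in enumerate(data)
)
loo_acc = loo_hits / len(data)

print(train_acc, loo_acc)
```

The training score is perfect only because the model has already seen every row it is asked about, which is why cross-validation or a separate test set gives the more honest estimate.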
@sembutininverse (4 years ago)
thank you.
@DataProfessor (4 years ago)
Thanks for watching!
@alhikmah_9 (4 years ago)
Thank you professor for the amazing job you are doing. I appreciate your work. I would also like to know how you can be reached via email.