Text Classification with Weka using a J48 Decision Tree

Views: 54,442

Comments: 30

@hornaburzum 10 years ago

Good explanation, and it was understandable, but some options were quite hard to read because of the video quality. It would be a good idea to upload it in better quality. Thanks for the help.

@mionel237 9 years ago

I have watched and followed many tutorials on data mining with Weka so far, and this is one of the best. You only need to increase the video quality. Best of luck.

@marengajames879 11 years ago

I have a dictionary of lexicon features--words and phrases--relating to a particular domain. I want to create a classifier that predicts whether a text document belongs to the domain. How can I model such a classifier?

@S0naris 11 years ago

@red wefdass: In the Classify tab, click on the options button. The "Classifier evaluation options" dialog pops up. Find the "Output predictions" option and choose, for example, "PlainText" to get a text representation of the predictions. Then re-evaluate the model on the testing data, and the result will contain the predictions for the testing data.

@myWorldDiscover 9 years ago

Very clear and simple explanation. But I have a question: how could we use lemmatization instead of stemming? Sometimes a word is similar in meaning to other words that were not included in the training stage. Any suggestions, please? Best regards, Mohammed

@ricardocampos7284 9 years ago

Hi there, good video! But something is wrong: these same steps don't work for me (v3.6.13). When I try to classify new text, it tells me the datasets are incompatible. Even after filtering the test dataset, the same error occurs, maybe because the test data contains words that are new to the training dataset. Any idea how to avoid this problem? For now, the only solution I can imagine (not yet tested) is training the classifier with the training data and the unclassified data together in the same ARFF file.

@alokpal3891 7 years ago

Awesome explanation. It helped me a lot. Maybe the best explanation on YouTube.

@aisulu9428 11 years ago

I don't have a combo box or anything else for PlainText; maybe I have another version of Weka. What can I do? Next question: can we see only the predictions, or the classification results (confusion matrix) too? Thanks a lot =)

@m.waqaschughtai9707 5 years ago

Where can I get the datasets? Any link?

@imroze8840 5 years ago

Excellent work, sir. You explained everything very clearly and precisely. This helped me a lot; it was exactly what I was looking for to solve my problem. Thank you very much, sir.

@ricardocampos7284 9 years ago

Include the ?,[text] lines in the dataset being filtered, so that the filtering takes all words into account. Then classify the ? lines to solve your problem. There is another way, which I forgot. :) I think it is the correct one. Ask me if you need it and I will check it for you.

@GiovanniBalestrieri 8 years ago

Hi S0naris and thank you for this tutorial. Is it possible to search for the best subset of attributes in my dataset? How can I add an automatic feature selection algorithm in a text classification task using Weka?
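One common way to rank attributes in a text task is by information gain: how much knowing whether a word occurs reduces uncertainty about the class. Weka ships attribute evaluators for this (for example `InfoGainAttributeEval` combined with the `Ranker` search method on the Select attributes tab). The Python sketch below only illustrates the idea and is not Weka code; the function names and the toy documents are made up for the example.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    result = 0.0
    for label in set(labels):
        p = labels.count(label) / total
        result -= p * math.log2(p)
    return result


def information_gain(documents, labels, word):
    """Reduction in class entropy from knowing whether `word` occurs."""
    with_word = [l for d, l in zip(documents, labels) if word in d]
    without_word = [l for d, l in zip(documents, labels) if word not in d]
    total = len(labels)
    remainder = (len(with_word) / total) * entropy(with_word) \
        + (len(without_word) / total) * entropy(without_word)
    return entropy(labels) - remainder


def rank_words(documents, labels):
    """Return (gain, word) pairs, highest gain first, ties in alphabetical order."""
    vocabulary = sorted({w for d in documents for w in d})
    scored = [(information_gain(documents, labels, w), w) for w in vocabulary]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical toy corpus: each document is the set of words it contains.
documents = [
    {"cheap", "pills", "online"},
    {"cheap", "offer", "online"},
    {"meeting", "agenda", "online"},
    {"meeting", "notes", "online"},
]
labels = ["spam", "spam", "ham", "ham"]

ranking = rank_words(documents, labels)
for gain, word in ranking:
    print(f"{word}: {gain:.3f}")
```

Words that split the classes cleanly ("cheap", "meeting") come out on top, while a word that appears in every document ("online") carries no information and ends up last. Keeping only the top-ranked words before training is the basic idea behind this kind of automatic feature selection.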

@holiveira11 9 years ago

I really appreciated your tutorial. It was very clear with an excellent explanation. It helped me so much! Thanks a lot!

@faroozee3239 11 years ago

Hi, I have one question. I have a test dataset consisting of 4 attributes:
1. virus = sample virus files, which are numeric
2. not a virus = clean files, which are numeric
3. behavior, with 2 options, normal or abnormal = whether the behaviour of the computer is normal or abnormal
4. output, with 2 options: clean file or infected file
Now I have a training dataset, and I want to classify file samples as either virus or not a virus. Let me know how to proceed.

@prasantjillella516 9 years ago

Hey guys, I am new to using J48 classification in programming and I want to learn it from scratch. Can anyone suggest where I can get the Weka software, what the inputs are, and how to get the outputs?

@arefeshghi 9 years ago

+prasant jillella Watch this video.

@takbirhossaintushar7290 7 years ago

Hello sir, this is really extraordinary, but I cannot find the video on converting text to an ARFF file. Please send the video link.

@chandragst6231 7 years ago

Thank you for this tutorial. It was easy to understand and really helpful for my task!

@ricardocampos7284 9 years ago

I've found the problem. If you are going to use the StringToWordVector filter, you have to filter a dataset that includes the non-classified (test) examples. If you filter one dataset, use it to train, and then test the classifier on another, separately filtered dataset, you will receive an error saying the two datasets are incompatible or, if you are lucky, the result will just be very bad. You have to filter the dataset including the non-classified examples, like this:

2, "Text to teach the classifier of category 2"
41, "Text to teach the classifier of category 41"
?, "Uncategorized text, important to be here so that the filtered dataset has one unique format"
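To see why separately filtered datasets come out incompatible, here is a minimal Python sketch of a bag-of-words conversion. It is not Weka code: `tokenize`, `build_vocabulary`, `vectorize`, and the sample texts are hypothetical stand-ins for what a string-to-word-vector filter does, under the assumption that each distinct word becomes one attribute.

```python
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_vocabulary(texts):
    """Return the sorted list of distinct words across all texts (one attribute per word)."""
    return sorted({word for text in texts for word in tokenize(text)})


def vectorize(text, vocabulary):
    """Map a text to word counts, one column per vocabulary word."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


# Hypothetical example texts.
train_texts = ["cheap pills online", "meeting agenda attached"]
test_texts = ["cheap meeting pills tomorrow"]

# Filtering each set on its own gives two different attribute lists.
train_only_vocab = build_vocabulary(train_texts)
test_only_vocab = build_vocabulary(test_texts)
datasets_compatible = train_only_vocab == test_only_vocab

# Filtering labeled and unlabeled texts together gives one shared format.
shared_vocab = build_vocabulary(train_texts + test_texts)
train_vectors = [vectorize(t, shared_vocab) for t in train_texts]
test_vectors = [vectorize(t, shared_vocab) for t in test_texts]

print("compatible when filtered separately:", datasets_compatible)
print("shared attributes:", shared_vocab)
print("test vector:", test_vectors[0])
```

With separate vocabularies the columns do not line up, which is the "incompatible datasets" situation. Building one vocabulary over both sets makes every row the same width. Weka also provides the `FilteredClassifier` meta-classifier, which learns the filter from the training data only and reuses it at test time; that may be the other approach mentioned in this thread.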

@arefeshghi 9 years ago

+Ricardo Campos I think I made the same mistake as you did, but I did not understand how you resolved the problem. What do you mean by "filter the dataset including the non-classified examples"?