Getting Started with Orange 16: Text Preprocessing

131,683 views

Orange Data Mining

1 day ago

Comments: 82
@abdielalejandromontanodiaz1035 3 years ago
It is amazing that this is not more popular.
@hwli1358 3 years ago
Thank you for your work. Orange is one of the most powerful data mining tools. Much appreciated.
@sonOfLiberty100 7 years ago
I love your framework and your channel so much
@Daveec 2 years ago
Hello! Amazing video! Is there a video explaining how to prepare that first "Grimm-tales-selected.tab" file?
@prattbanner 5 years ago
Finally, the text widget installs seamlessly with the latest version of Orange, 3.24.
@paulcostinett886 2 years ago
I do not see where the plain text editor is; is this a widget I need to load?
@OrangeDataMining 2 years ago
Orange does not include a plain text editor. You can edit your text in a notepad.
@anjalisurendran3684 1 year ago
Hello, I have been trying Orange for my work. However, I am not able to add the Preprocess Text widget. A dialog box says "file not found", even though I can use other widgets in the same workflow. How can I resolve this?
@gabrielapinto5306 2 years ago
Thanks for sharing. What should I do if while generating the cloud, Orange cuts off the last letter of every word? Any hints for a solution?
@abotag100 2 years ago
How can I see or save the data after applying the preprocessing widget? Is it only visible through the wordCount widget? If I apply the preprocessing widget to a Twitter or Corpus widget and then save the output or connect it to a Corpus Viewer, I only see the original data, not the preprocessed data.
@ia-sicinteligenciaartifici7298 3 years ago
Hello! This is great software, thanks. I need to tag paragraphs in a text according to their topic. For instance, in "I love twitter since it has only short texts", the tag would be "Social Networks", and the idea is that when I look for this tag, I get all the associated paragraphs. Is there any widget or way to do that? I watched all the tutorials but could not find anything like that. Thank you!
@centralstation6987 4 years ago
Hi, what widgets should I use if I would like to extract from a text the words that represent a color (red, blue, etc.)? I need to have them as a list.
@hanhantu5658 4 years ago
Is there any way to save data from Test and Score, Predictions and Confusion Matrix in .csv format? After I use the Bag of Words widget, the option to save data in .csv format is gone. Please help.
@mbogitechconpts 2 years ago
My Orange 3 can't install the add-on and keeps crashing. What could be the problem?
@JOÃOGABRIELTORRESDONASCIMENTO 11 months ago
I was having this problem where the Word Cloud widget didn't open because the gui didn't have the WebViewWidget attribute. I solved it by running "pip install pywebview[qt]" in my terminal, but my word cloud still does not appear. What should I do now?
@OrangeDataMining 11 months ago
Would you mind reporting the issue on our GitHub page, github.com/biolab/orange3-text? This way we can pinpoint the source of the problem.
@nilayozel6458 8 months ago
Now when I enter the API key, it gives an error: "Api error: 401 Unauthorized Unauthorized. Check if your API plan includes Tweets retrieval. The free plan doesn't allow Tweets retrieval anymore."
@prattbanner 5 years ago
When I try to install the text widget, I get the following error: "command failed: python python -m pip install Orange3-Text exited with non-zero status". Kindly help. All other widgets install perfectly, except for the bioinformatics widget, which also fails.
@OrangeDataMining 5 years ago
Please head to github.com/biolab/orange3-text and describe the error in as much detail as possible (screenshots or copy-paste the error). Thanks!
@fbogeo 2 years ago
Thanks for the information. How can I build a text corpus from my own files? Thanks in advance.
@OrangeDataMining 2 years ago
Orange accepts .docx and plain text files with Import Documents. Alternatively, you can enter each document in its own line in Excel.
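The second option in the reply (one document per row in a spreadsheet) can be sketched in Python for a folder of plain text files. This is only an illustration under assumptions: the function name `texts_to_csv`, the column names `name` and `text`, and the choice of CSV as the spreadsheet format are not from the thread.

```python
import csv
from pathlib import Path


def texts_to_csv(folder, out_path):
    """Write one CSV row per .txt file in `folder` (hypothetical helper).

    Each row holds the file name (without extension) and the full text.
    Returns the number of documents written.
    """
    rows = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Collapse line breaks and repeated spaces so that each
        # document stays in a single cell of its row.
        rows.append({"name": path.stem, "text": " ".join(text.split())})

    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "text"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling `texts_to_csv("my_texts", "corpus.csv")` would produce a table with one document per row, which can then be loaded as a spreadsheet.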
@zedoxyt133 4 years ago
Dear, I installed the Text add-on, but the Import Documents widget is not appearing; it is missing. What do I have to do now? Please help.
@OrangeDataMining 4 years ago
Update the Text add-on from Options - Add-ons.
@JWANJARI 3 years ago
Where do I get the data set or the CSV files from, @orange data mining?
@prattbanner 5 years ago
Hi, Orange is a big revelation for me. Thank you. I am facing only one issue: the text widget is not getting installed, even after I tried installing the latest version of Orange. Where should I route my issue?
@furqanpatel4000 5 years ago
Same problem here. Did you find a solution?
@prattbanner 5 years ago
@furqanpatel4000 Try downloading the latest version, 3.24. The problem is solved.
@pornchaijantaworn5851 3 years ago
@prattbanner I tried to install v3.24, but it does not run.
@felipebritom 6 years ago
Hey, I'm unable to download the Text add-on. I receive the following error: "Command failed: python python -m pip install Orange3-Text exited with non zero status." What should I do?
@mattmatt245 4 years ago
I have a question about the ROC Analysis widget. Should you always enter the prior target class probability when you're trying to find an optimal threshold for given class weights? Or does it only make sense after over/undersampling? The prior probability changes the slope of the iso-performance line.
@OrangeDataMining 4 years ago
Well, ROC only makes sense with prior class probabilities. We're highly suspicious of over/undersampling, but if you do it, you'd do it before ROC anyway. Prior probability is key for estimating performance.
@mattmatt245 4 years ago
@OrangeDataMining So, any time I work with an imbalanced data set, I'll have to specify it in order to get a correct ROC curve? The ROC Analysis widget won't be able to read the prior class probability from the data by itself, correct?
@mattmatt245 4 years ago
@OrangeDataMining Your documentation is very unclear. Nobody knows how and when to use this option.
@OrangeDataMining 4 years ago
@mattmatt245 No, ROC automatically detects the initial class distribution. We took care of that. ;)
@OrangeDataMining 4 years ago
@mattmatt245 Feel free to submit a PR with improved documentation.
@SisaMasa 5 years ago
Hello Orange! Thank you so much for these tutorial videos, they really help! The only problem I'm facing is that my word cloud detects numbers and years as words. How do I fix this?
@alexlozano1017 3 years ago
Hey Aiman, have you been able to solve this question?
@SisaMasa 3 years ago
@alexlozano1017 Hello Alex. Yes, I loaded a list of stopwords and included the numbers that I want to hide from the word cloud.
@theophilegaudin2329 4 years ago
Thanks for this tutorial. I have 112 text files and I want to preprocess them. However, when I input the text files by "import documents" and then try to preprocess the content, I get a "file not found" error. What should I do?
@OrangeDataMining 4 years ago
If I understand correctly, your data does not get read? Are you sure you are uploading a folder, not a single file? The other cause could be an incompatible format. It is probably best to open an issue (github.com/biolab/orange3-text/issues) so we can discuss it there.
@jayanthsurya2763 1 month ago
Where can I find your dataset?
@shivibhatia1613 6 years ago
Just a question: if I am working on customer comments, how would I include free-flowing text as well as the customer ID, and map who the most dissatisfied customers are? This only allows loading the text and not any other column.
@OrangeDataMining 6 years ago
You can use File to load extra data and then append them with Merge Data.
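As a rough illustration of what appending extra columns by a shared key means, here is a plain-Python sketch. It is not Orange's implementation; the function `merge_by_key` and the column names `customer_id`, `text` and `segment` are hypothetical.

```python
def merge_by_key(rows, extra_rows, key):
    """Append columns from `extra_rows` to `rows` where `key` matches.

    Rows without a match in `extra_rows` are kept unchanged,
    so no comment is lost.
    """
    # Index the extra table by the key column for quick lookup.
    lookup = {row[key]: row for row in extra_rows}
    merged = []
    for row in rows:
        extra = lookup.get(row[key], {})
        # Values from the original row win if a column name clashes.
        merged.append({**extra, **row})
    return merged


comments = [
    {"customer_id": "c1", "text": "late delivery"},
    {"customer_id": "c3", "text": "fine"},
]
customers = [{"customer_id": "c1", "segment": "retail"}]
merged = merge_by_key(comments, customers, "customer_id")
```

Here the comment from customer `c1` gains a `segment` column, while the comment from `c3`, which has no entry in the extra table, is left as it was.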
@izialber 11 months ago
How can I link a PDF to a line in my spreadsheet and check whether a list of words appears in this file?
@OrangeDataMining 11 months ago
You need a line in the spreadsheet containing the file name. This will enable you to link the two.
@izialber 11 months ago
@OrangeDataMining Do you have any video explaining that?
@ilkemcengiz7929 4 years ago
Hey guys, thanks for the awesome content! Is it possible to export word counts from the word cloud?
@rcalastro0 7 years ago
Do you have any tips/tricks to improve Orange's performance when working with big files? I am trying to work with a 100k-row file (about 30 columns) and it's been painful, because with every step I take and every operator I connect, Orange tries to recalculate everything. I am already using the sample operator and the rank/PCA operators, and I've disabled "send output" to minimize the waiting time, but I would appreciate any tips! I am currently using Orange 3.3 on OSX. Thanks!!
@OrangeDataMining 7 years ago
Two tips: 1) Disable the automatic propagation of data (next to Commit automatically button). Then tweak all the parameters and only once you are ready, run the analysis. 2) Upgrade to Anaconda on a 64-bit Python. Orange is not perfect, but we mostly use scikit for computing and should be as efficient as possible. The issue, I reckon, is in your RAM. Anaconda distribution with 64-bit Python should help, but there are 64-bit bundles for Win and OSX coming up soon.
@rcalastro0 7 years ago
Awesome! Thank you for sharing your knowledge =)
@guopingli627 6 years ago
Could you tell me how to set features like ATU topic?
@punithac3019 6 years ago
Hi, I am unable to retrieve the package list or install packages (WinError 10060). I am guessing the error might get resolved if I change the proxy. Could you please help me update the proxy in the tool?
@OrangeDataMining 6 years ago
Please see github.com/biolab/orange3/pull/2906. You can set it in the Settings dialog in the latest version.
@claudshob 4 years ago
How do you get the custom stopword list? I am lost after that. Fantastic video, by the way!
@OrangeDataMining 4 years ago
You provide a plain text file with a word per line and load it into the Filter section, where the folder icon is.
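A minimal sketch of preparing such a file in Python, one word per line as described in the reply. The helper names are hypothetical, and lowercasing the entries is an assumption that only makes sense if the text is lowercased during preprocessing. The optional year range covers the case raised elsewhere in the thread, where numbers and years should be hidden from the word cloud.

```python
from pathlib import Path


def build_stopword_list(words, first_year=None, last_year=None):
    """Return a sorted, de-duplicated list of stopwords (hypothetical helper).

    Optionally adds every year from `first_year` to `last_year` inclusive,
    so that those numbers can be filtered out as well.
    """
    # Assumption: tokens are lowercased before filtering, so the
    # stopwords are lowercased too. Empty entries are dropped.
    entries = {w.strip().lower() for w in words if w.strip()}
    if first_year is not None and last_year is not None:
        entries.update(str(year) for year in range(first_year, last_year + 1))
    return sorted(entries)


def save_stopwords(entries, out_path):
    """Write the stopwords to a plain text file, one word per line."""
    Path(out_path).write_text("\n".join(entries) + "\n", encoding="utf-8")


stopwords = build_stopword_list(["The", " said ", ""], 1999, 2001)
```

The resulting file, written with `save_stopwords(stopwords, "stopwords.txt")`, is the kind of plain text file the reply describes loading through the folder icon in the Filter section.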
@manuelillanes1635 5 years ago
In which format should the text be before importing it into Orange?
@OrangeDataMining 5 years ago
Plain text, .docx, .odt, .pdf or .xml. You can pass a Table to Corpus as well.
@manuelillanes1635 5 years ago
@OrangeDataMining I can't read .txt files :(
@ignatiusadriangilangp.1738 1 year ago
Thank you, sir. I'd like to learn this :)
@harrymccrindle6641 5 years ago
How do we get Sublime Text if we aren't using a Mac?
@OrangeDataMining 5 years ago
Download the version for Windows?
@yazminboag3502 5 years ago
Did you ever figure this out? The answer wasn't helpful.
@pratik314 4 years ago
@OrangeDataMining I am unable to find the Sublime option on Windows.
@LokoKolima 4 years ago
You don't need to use Sublime Text exactly; it can even be Notepad.
@NivedhithaEzhilarasan 6 years ago
I'm trying to install the Text add-on through the Anaconda installation, and I'm constantly getting an error. Could you please help me out?
@OrangeDataMining 6 years ago
Dear Nivedhitha, are you using the classic installer from our website?
@NivedhithaEzhilarasan 6 years ago
I solved it. Visual Studio was not added to the path, so it raised an error stating that "cl.exe" cannot be found. Once I added it, the problem was solved. But Orange3 hangs and crashes when I try importing the IMDb dataset, which has around 25,000 rows.
@OrangeDataMining 6 years ago
For Windows, we normally suggest using the Miniconda installer, which should work without compiling. I am not sure why it is crashing. I can easily load the 32,000-row adult data set. Is there an error you get that you could forward to us so we can inspect it? Thanks!
@JeremyFennema 6 years ago
@OrangeDataMining I'm having the same issue. I'm just using the method in the video and it fails. I can install the Textable add-on without issue, but not the "Text" add-on. I can forward the log and error if that's helpful; I'm just not sure where to send it.
@prattbanner 5 years ago
@JeremyFennema I am having a similar issue. Did your issue get solved? Are you getting the error "command failed: python python -m pip install Orange3-Text exited with non-zero status"?
@ashisparajuli5756 7 years ago
Can we connect Spark with this?
@OrangeDataMining 7 years ago
From what we know, no. There is a third-party Spark add-on, however, so you should check with its developers what state it is in.
@salvadorsarabiaorea105 4 years ago
Can this analyze texts in Spanish?
@ia-sicinteligenciaartifici7298 3 years ago
Yes, the language selector in the widgets has an option to choose Spanish.
@francois9496 2 years ago
Thank you
@mklmk-vjhbj 9 months ago
cringe
@manueladelpilargomezsuta3466 4 years ago
I have an error: "AttributeError: module 'Orange.widgets.gui' has no attribute 'WebviewWidget'". What can I do?
@OrangeDataMining 4 years ago
Install WebView.