I have been searching for the best NER algorithms for my use case for the last two days. Can't wait to see what you have here.
@python-programming 2 years ago
This won't focus on NER, but there is a few-shot NER from the same company called concise_concepts. I have tested it and found it good for some labels and bad for others.
@giantdutchviking 1 year ago
Thanks for making this vid, been learning Python for a bit and this stuff makes Python shine!
@nguyenngochai6245 2 years ago
Thank you very much for sharing! Love it. May I ask whether it would be possible to add more classes to the data? It would be even more awesome if it could be done for non-English language models.
@python-programming 2 years ago
Yes, it will be possible to add other classes, and you can use any language model on Hugging Face.
@nguyenngochai6245 2 years ago
@@python-programming Thank you for your instant reply! I have successfully tried it with the "ja_core_news_lg" model, but I could not get a satisfactory result out of the Japanese sentence-transformers model. Do you have any tips for choosing the appropriate models?
@python-programming 2 years ago
@@nguyenngochai6245 no problem! I will test it out today
@shahidmahmood7252 2 years ago
Good knowledge, shared wonderfully. Looks like a great module. Now thinking of all the applications in works of English literature. thanks!
@python-programming 2 years ago
Thanks!
@transflux-us 2 years ago
I was trying to identify "local indicators of climate change impacts" (the changes people observe in their environment, and not city people :D) in a database of scientific articles. The results are OK. It's hard, but it might be useful as a pre-scan.
@python-programming 2 years ago
That is really interesting!
@luiztauffer8513 2 years ago
This is gold material, thanks so much for putting this out in such a comprehensive way! @Python Tutorials for Digital Humanities In one of your videos you mentioned you do research in History, is that right? I'm curious to know how people are using text classification methods such as this in History research. Do you have any material you could point me to?
@python-programming 2 years ago
Thanks!! Yes, my background is a PhD in medieval history but I mostly work with archival material at Smithsonian and USHMM. A lot of the publications you can find in history with text classification deal with sentiment analysis. You can find articles on Digital Humanities Quarterly and the Oxford Digital Humanities journal.
@Hypothermia1337 2 years ago
Hello Dr. Mattingly, do you know if it's possible to fine-tune a pre-trained model? I'm really not familiar with that, but I need to tweak a model with a few exceptions. Yours sincerely
@python-programming 2 years ago
It is! If you want to fine-tune a language model, that can be done via Gensim or the Transformers library from Hugging Face. If you want to fine-tune NER, you will have some problems, namely catastrophic forgetting.
@victordeleon9988 2 years ago
Great video, thanks a lot. Do you recommend any models in Spanish besides those already available in spaCy?
@python-programming 2 years ago
No problem! It depends on what you are trying to do, there are some great BERT models for Spanish. You can find them on HuggingFace's website.
@victordeleon9988 2 years ago
@@python-programming Great, thanks a lot, your channel is awesome.
@python-programming 2 years ago
@@victordeleon9988 Thanks!!
@ezrakassa3472 2 years ago
Can't wait. Is it multi-class or binary classification though? I am hoping there will be a multi-class one, since you already did a detailed video on binary classification.
@python-programming 2 years ago
This will be binary, but it works for multi-class just as well. Remember that when you use few-shot classification, you are not doing traditional supervised learning. Instead, you are using the vectors of a support set (not a training set) to automatically identify sentences with similar vectors. The similarities are then scored so that you know how strongly something belongs to a certain category. The more classes you have, the more support samples you need. I recommend using it to get a quick sense of your data and to generate a starting data set quickly, which you can then use to train a new model via supervised learning. This video is meant to serve as my transition into multi-class classification on this channel =), so those videos should be coming out shortly. We will use spaCy (simpler) and Keras (more advanced). Multi-class text classification will also receive a whole chapter in my forthcoming book on spaCy ML.
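To make the support-set idea above concrete, here is a minimal sketch in plain Python. It is not how classy_classification is implemented: the bag-of-words `embed` function is a toy stand-in for real sentence vectors, and the `support` data, class names, and helper names are all hypothetical. It only illustrates scoring a new sentence by its similarity to each class's support examples.

```python
import math
from collections import Counter

# Hypothetical support set: a few example sentences per class (not a training set).
support = {
    "furniture": ["a wooden chair and a table", "the sofa is in the living room"],
    "kitchen": ["a sharp knife for cutting", "the pan is on the stove"],
}


def embed(text):
    """Toy stand-in for a sentence vector: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def class_vector(examples):
    """Sum the vectors of one class's support examples.

    Cosine similarity ignores vector length, so a sum behaves like a mean here.
    """
    total = Counter()
    for example in examples:
        total.update(embed(example))
    return total


def score(text, support_set):
    """Return one similarity score per class for the given text."""
    vector = embed(text)
    return {
        label: cosine(vector, class_vector(examples))
        for label, examples in support_set.items()
    }


scores = score("the chair is next to the table", support)
best = max(scores, key=scores.get)
print(scores, best)
```

With real sentence embeddings in place of `embed`, the same pattern extends to more classes, which is why each added class needs its own support samples.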
@DK-rl1sf 2 years ago
Thank you for this tutorial. I tried saving the trained model using nlp.to_disk('D:/ABC'). But when I load it back using spacy.load('D:/ABC') in a fresh Jupyter Notebook, I get the error "[E002] Can't find factory for 'text_categorizer' for language English (en). This usually happens when spaCy calls `nlp.create_pipe` with a custom component name that's not registered on the current language class. ...". I am still in the same conda environment so I can't be missing dependencies. What is causing this problem?
@Filipkasic 2 years ago
Is there a way to utilize this model without having to define what the keywords are but simply to provide a list of them without any definition?
@youTanod 2 years ago
Thank you very much for this useful video. This is exactly what I need. I tried it with real data, but I get this warning message, what should I do? UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.
@python-programming 2 years ago
Can you paste what your support data dictionary looks like?
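The wording of that warning matches scikit-learn's stratified cross-validation splitter, which complains when a class has fewer samples than the number of folds. A likely cause, then, is a class in the support data with only one example. Below is a small sketch for checking this before adding the pipe. It assumes the support data is a dictionary mapping class names to lists of example sentences; the function name and the sample data are hypothetical, and the threshold of 2 is taken from the `n_splits=2` in the warning.

```python
def underpopulated_classes(support, min_samples=2):
    """Return {class name: sample count} for classes below min_samples."""
    return {
        label: len(examples)
        for label, examples in support.items()
        if len(examples) < min_samples
    }


# Hypothetical support data with one class that has a single example.
support = {
    "positive": ["I loved it", "Great product"],
    "negative": ["Terrible experience"],
}

too_small = underpopulated_classes(support)
print(too_small)
```

Adding at least one more example sentence to every class listed in the result should give each class enough members for a two-fold split.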
Is this still relevant compared to using GPT for classification?
@python-programming 1 year ago
That is a great question. Yes, though GPT-4 is better at few-shot classification than this approach. I still think this is useful for getting a quick classifier up and running locally to help with annotation.
@gangs0846 1 year ago
@@python-programming thank you sir
@CoreyMalcom 2 years ago
This is a really good tutorial. Thank you! I have not been able to get it running so far. When I call "nlp.add_pipe( )" with the text_categorizer, the kernel crashes and restarts. Any clue as to why this would be happening? I have a fresh environment with spaCy and classy_classification newly installed.
@python-programming 2 years ago
Thanks! Hmmm that is odd. What is your OS? Mind DMing me on Twitter with some pics?
@CoreyMalcom 2 years ago
@@python-programming Sent. Thanks for looking at this. Will be really helpful.
@python-programming 2 years ago
@@CoreyMalcom no problem! I am in the middle of traveling. Will try to respond tomorrow.
@maxwellmandela 2 years ago
great stuff!
@python-programming 2 years ago
Thanks!
@szachynakubie4955 2 years ago
thank you
@trashyAIguy 1 year ago
Cool! I'll use it in my trashy AI to make it less trashy 🤣 and to make it understand intentions
@lisagilyarovskaya5593 2 years ago
Thank you very much for this video, I was looking for something exactly like this!! I was wondering if there is any way to save the model config to disk once the pipe with support samples has been added. Do you have any ideas on that?