Such a great lecture on NLP, wow. I wish I had found it when it was uploaded; it would have saved me two years.
@abdulbasitnisar3 ай бұрын
What did you do for two years? I mean, which course?
@Code4You12 жыл бұрын
Simple and straight to the point I love it!
@chiomaanyiam11382 жыл бұрын
Wow! Thank you for breaking this down in such an easy way.
@TheAdamSmithh4 жыл бұрын
Thank you so much! This is so informative, so quickly, in well structured lessons. I'm using a TensorFlow package for R and this helps me understand my project so much better!
@muhammadsamy40816 ай бұрын
Why are you using R instead of Python?
@chowadagod4 жыл бұрын
I've always been discouraged from learning NLP, but you've just made it a whole lot easier.
@laurencemoroney6554 жыл бұрын
It's a huge field, and I'm just scratching the surface. I hope it's useful! :)
@muhammadhananasghar43263 жыл бұрын
Best explanation ever. The best teacher I have ever listened to.
@asadanees7813 жыл бұрын
Thanks, Laurence Moroney, you are a blessing for us! Awesome information.
@rabadaba7 Жыл бұрын
I love your videos! They are very professional and concepts are very clearly explained.
@18lan2 жыл бұрын
If you are confused, like I was, about why "love" receives index 1, go to the end of that video, where it's explained: Machine Learning Foundations: Ep #8 - Tokenization for Natural Language Processing
@nishalk7814 жыл бұрын
Thanks for making it clear. Waiting for the next one.
@laurencemoroney6554 жыл бұрын
Thanks!
@Idontknowcode512 Жыл бұрын
Thanks, you made it so easy for me to understand NLP 🙏
@TensorFlow Жыл бұрын
We're happy to hear that the video was helpful. If you'd like to learn more about NLP, check out the NLP Zero to Hero playlist → goo.gle/nlp-z2h
@Idontknowcode512 Жыл бұрын
@@TensorFlow I have checked it. But I have one request: can we build a model like ChatGPT using TensorFlow? 🤔
@coded67993 жыл бұрын
This is a godsend. No other definition is possible.
@rishibhatia50568 ай бұрын
Thanks for making it clear.
@kelvinsmith48944 жыл бұрын
Lol, you explained this so well that it made me want to implement my own library for tokenization
@rishavpaudel75914 жыл бұрын
loooool I too had this feeling :D :D
@singhanubhav3 жыл бұрын
For NLP freshers - this video is more about encoding than being about tokenization itself. Read about both topics separately before going through this video to better understand it.
@srikrithibharadwaj67794 жыл бұрын
Thank u so much 🙏🏻 such a great information.
@LaurenceMoroney4 жыл бұрын
Welcome!
@akshayshah4834 жыл бұрын
Yeah. Zero to Hero is back.
@laurencemoroney6554 жыл бұрын
For 3 episodes, and I'm working on another 3 for text generation to come out in the not-too-distant future. I hope!
@biswanthpinnika7149 Жыл бұрын
we can also use tokenization for converting sentences to words
@mattymallz42074 жыл бұрын
Fantastic video! Very informative. Thank you for sharing TensorFlow!
@laurencemoroney6554 жыл бұрын
Thanks, Matty!
@mattymallz42074 жыл бұрын
Laurence Moroney, I have a specific TensorFlow question regarding Beautiful Soup, specifically gathering text from an HTML output. Is there any way we could start a dialogue?
@sharawyabdul62223 жыл бұрын
Thank you so much. This is very well explained.
@sunildingankar86573 ай бұрын
I have worked with Marathi (an Indian regional language) for the last 20-odd years. For the last 8 years I have been working as a writer-translator. If I learn NLP, will I be able to combine my Marathi linguistic skills and NLP skills in practical use? If yes, how, and where can I use them?
@quadraticlife83142 жыл бұрын
Incredibly amazing!
@ronnierendel95034 жыл бұрын
Amazingly well said
@harikrishnanb7273 Жыл бұрын
Tokenizer is deprecated now
@louiebeatty18724 ай бұрын
what's used now?
@arpanghoshal69103 жыл бұрын
He's a Tensorflow guru!
@balachkhan15784 жыл бұрын
It's great. Waiting for the next one.
@laurencemoroney6554 жыл бұрын
Glad you enjoyed!
@georgesteele4838 Жыл бұрын
Excellent presentation.
@819rajiv3 жыл бұрын
Thank you so much, sir, for the great videos.
@fahemhamou61702 жыл бұрын
Thank you very much
@benjaminkimmang1962 Жыл бұрын
quite informative. thanks.
@TensorFlow Жыл бұрын
Glad it was helpful!
@BeGreatttt4 жыл бұрын
Great explanation, thanks a lot!!!
@ashimkarki96524 жыл бұрын
The legend is back
@laurencemoroney6554 жыл бұрын
But you got me instead ;)
@sharjeelzubair41065 ай бұрын
sentences = [
    'كم سعر الراجحي',
    'ما هي قيمة الراجحي؟',
    'هل تعرف سعر أرامكو؟'
]
{'سعر': 1, 'كم': 2, 'الراجحي': 3, 'ما': 4, 'هي': 5, 'قيمة': 6, 'الراجحي؟': 7, 'هل': 8, 'تعرف': 9, 'أرامكو؟': 10}
It's putting الراجحي and الراجحي؟ as two tokens. Is that because of Arabic?
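The split in the output above most likely comes from the punctuation filter rather than from Arabic as such: the Keras Tokenizer's default `filters` string holds ASCII punctuation, which covers "?" but not the Arabic question mark "؟" (U+061F). The sketch below is a plain-Python stand-in for the filter-then-split step, not the Keras code itself; `build_word_index` is a hypothetical helper written for this illustration.

```python
from collections import Counter

# Default punctuation filter of the Keras Tokenizer (ASCII characters only).
DEFAULT_FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'
ARABIC_QUESTION_MARK = "\u061f"


def build_word_index(sentences, filters=DEFAULT_FILTERS):
    """Hypothetical helper: lowercase, blank out filtered characters,
    split on whitespace, then rank words by frequency (ties keep
    first-seen order)."""
    table = str.maketrans({ch: " " for ch in filters})
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().translate(table).split())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}


sentences = [
    "كم سعر الراجحي",
    "ما هي قيمة الراجحي؟",
    "هل تعرف سعر أرامكو؟",
]

# With the ASCII-only filter, the word with a trailing "؟" is a separate token.
default_index = build_word_index(sentences)
# Adding the Arabic question mark to the filter merges the two forms.
fixed_index = build_word_index(sentences, DEFAULT_FILTERS + ARABIC_QUESTION_MARK)

print(len(default_index), len(fixed_index))
```

With the real Tokenizer, the equivalent change would be to pass a `filters` string that also contains "؟".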
@yousefsharrab1093 Жыл бұрын
Great introduction
@muhammadyaqoob91292 ай бұрын
I need a little more help: can you please mention the books or research papers you have followed? Basically, I am asking for references so I can read them myself.
@rahulbhardwaj45684 жыл бұрын
Great, thanks for the info!
@Promptgeek2 Жыл бұрын
Better explanation [impossible].
@muskanjain12563 жыл бұрын
@lmoroney I have come across chatbot deployments recently. It is said that chatbots have a problem with continued conversation. But my question is: why can't we add an LSTM on top of an LSTM model? I mean, if we could provide memory across sentences as well as memory within a particular sentence, then it may be able to store the essentials of the previous conversation. Please help me with this query; I am new to NLP and very excited to learn more.
@eyasulencha51363 жыл бұрын
amazing presentation.thanks dear for the info
@narendrapratapsinghparmar919 ай бұрын
Thanks
@ouissemmouheb52832 жыл бұрын
Thank you so much!
@harmitchhabra9893 жыл бұрын
So, its like markov lempel compression?
@M_Zaroug4 жыл бұрын
🤩😍😍🤩 Very informative, waiting for the rest
@laurencemoroney6554 жыл бұрын
Thanks, Mohamed!
@oumelkheirofficial52163 жыл бұрын
What an amazing and simple way of explication, Thank you
@amaltej9372 Жыл бұрын
THANKS 😇
@fakrulislam31404 жыл бұрын
Amazing presentation
@cloudlover91862 ай бұрын
Good one. Sir, I want to know: with nlp(" I have III years of exp"), checking ,_.ISNUM is not working. Do we have any workaround for this? Will Roman numerals not be detected?
@ubaydullo_a7572 жыл бұрын
thank you, it was helpful :)
@yami64993 жыл бұрын
great video
@theobellash6440 Жыл бұрын
Nice video
@WassupCarlton3 ай бұрын
perhaps this is coming in a later video, but is there any rhyme/reason to the integers that get assigned to the words? or is it PURELY arbitrary?
@WassupCarlton3 ай бұрын
ope -- looks like the more frequent you are, the smaller your assigned integer. Correct?
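As a rough illustration of the ordering described above, here is a plain-Python sketch (a stand-in, not the Keras implementation) that ranks words by how often they occur, so the most frequent word gets index 1. Ties keep first-seen order, and index 0 is left unused, as in Keras, where it is reserved for padding. `frequency_ranked_index` is a hypothetical helper name.

```python
from collections import Counter


def frequency_ranked_index(sentences):
    """Hypothetical helper: map each word to its frequency rank (1 = most common)."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    # sorted() is stable, so equally frequent words keep first-seen order.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}


sentences = ["i love my dog", "i love my cat", "you love my dog"]
word_index = frequency_ranked_index(sentences)
print(word_index)
# {'love': 1, 'my': 2, 'i': 3, 'dog': 4, 'cat': 5, 'you': 6}
```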
@lencazero47126 ай бұрын
Awesome
@meg33333 Жыл бұрын
Hello everyone. I am new to the ML/NLP world and need some tips. My team is working on a project in which we want to convert text (especially Hindi or Sanskrit) to a set of specific images. Which algorithm or model should we go for, and where should we start? We have made the dataset, but now what?
@HuyNguyen-kd5vz Жыл бұрын
This is awesome
@Ricocase3 жыл бұрын
Excellence! How do I leverage kMeans clustering to find similarities or segment sentences from one another?
@_petrok4 жыл бұрын
Great introduction which is easy to understand. Can't wait for the next videos of this series! But is there any way to group words ignoring some grammar? Like: "He plays piano - I play piano" where "plays" != "play", but it is basically the same word and tense. The part about ignoring the "!" in "dog!" is fascinating.
@laurencemoroney6554 жыл бұрын
Yeah...that's a little more difficult in preprocessing text. I won't be covering that...sorry!
@NelsonYalta4 жыл бұрын
Those are sub-words, and different tools can be used for obtaining them, such as sentencepiece (github.com/google/sentencepiece). In this case the model searches for common sub-words such as play, and in the case of plays it tokenizes as . It is also possible to tokenize as the character and as a sub-word.
@mayankdewli10102 жыл бұрын
Yup, of course. You can lemmatize or stem these words.
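To make the stemming idea concrete, here is a deliberately crude sketch: a toy suffix stripper, not the Porter algorithm or a real lemmatizer (libraries such as NLTK provide those). `toy_stem` and `normalize` are hypothetical names, and the suffix list is an assumption chosen only for this example.

```python
def toy_stem(word):
    """Strip a few common English suffixes, keeping at least 3 letters.

    Deliberately crude: real stemmers and lemmatizers handle far more cases.
    """
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(sentence):
    """Lowercase, split on whitespace, and stem every word."""
    return [toy_stem(word) for word in sentence.lower().split()]


print(normalize("He plays piano"))  # ['he', 'play', 'piano']
print(normalize("I play piano"))    # ['i', 'play', 'piano']
```

Running the stemmer before building the word index would give "plays" and "play" the same integer, at the cost of losing the grammatical distinction.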
@samrasoli Жыл бұрын
useful
@ipekbar4 жыл бұрын
Thank you for the video. Sometimes an exclamation mark could be informative for tasks such as sentiment classification, but the tokenizer filters it out. Is there a way to prevent this?
@vishnurajyadav89174 жыл бұрын
Did you get an answer for this from any other source?
@ipekbar4 жыл бұрын
@@vishnurajyadav8917 Yes, we can control this by changing the filters parameter: www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text/Tokenizer
@deepakdakhore4 жыл бұрын
Very nice
@LaurenceMoroney4 жыл бұрын
Thanks!
@harsh.vision5 ай бұрын
print(Hashing == Tokenization ) whats the output??
@shivibhatia16134 жыл бұрын
Too good
@ishanghutake15664 жыл бұрын
Suppose you have 30 text files in one folder. How do you tokenize the words?
@mohithshivu54754 жыл бұрын
Sir, my name is Mohith. I am a final-year BE student. Can you help me with some doubts on NLP? I am working on data generalization and data sanitization. Our task is identifying whether a given text is sanitized or not, and generalized or not. How does this work in Python? Can you help, sir, please? It would be helpful to me.
@rameshsrivastavachandra4 жыл бұрын
This code has apache license, so can it be reused?
@actu_r4 жыл бұрын
What are the advantages of using the TF framework instead of other preprocessing methods, such as those spaCy or NLTK provide, for example? :) Thank you
@laurencemoroney6554 жыл бұрын
I can't compare with the others...but this way they're in a unified framework that makes it less code when I get around to training a NN with them (in episode 3)
@yunishuseynzade56303 жыл бұрын
Thank you so much. But I have a question: how can I use words in a language other than English? For example, building an NLP model in Azerbaijani.
@தமிழோன்3 жыл бұрын
You need to find and download an Azerbaijani corpus from the Internet. You can then prepare the word index using TensorFlow. The rest of the steps should be the same as the English example shown in the video. I don't know about the Azerbaijani language, but some languages, like Tamil, don't have separate grammatical words like English. You need to do heaps of preprocessing before you prepare the word index. This is something you need to be aware of. Also, if you can't find a corpus for your language, use something called the "hashing trick" (or "feature hashing") to hash the individual words in your language. Luckily, TensorFlow supports the hashing trick.
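The hashing trick mentioned above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not TensorFlow's own implementation: `hash_word` and `hash_encode` are hypothetical helpers, MD5 is picked only because it is stable across runs, and the bucket count of 1000 is arbitrary. Because words are encoded as UTF-8 bytes before hashing, the same code works for any language without a prebuilt vocabulary.

```python
import hashlib


def hash_word(word, num_buckets):
    """Map a word to a bucket in 1..num_buckets-1 (0 is left free for padding)."""
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    return int(digest, 16) % (num_buckets - 1) + 1


def hash_encode(sentence, num_buckets=1000):
    """Lowercase, split on whitespace, and hash every word to an integer."""
    return [hash_word(word, num_buckets) for word in sentence.lower().split()]


encoded = hash_encode("I love my dog")
print(encoded)
```

The trade-off is that two different words can land in the same bucket; a larger `num_buckets` makes such collisions rarer.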
@oliverli96304 жыл бұрын
Can this be called hacked? Or are there reasons that Keras doesn't include this? (Notice: "'you're", the left quote is still there, and it's got "'": 11 recognized as a word, and num_words=4 doesn't really limit the word count down to 4.)
from tensorflow.keras.preprocessing.text import Tokenizer
sentences = [
    'i love my dog',
    'I, love my cat',
    'You love my dog!',
    "Jack said, 'You're gonna love my cat!'"
]
tokenizer = Tokenizer(num_words = 4)
tokenizer.fit_on_texts(sentences)
word_index = tokenizer.word_index
print(word_index)
{'love': 1, 'my': 2, 'i': 3, 'dog': 4, 'cat': 5, 'you': 6, 'jack': 7, 'said': 8, "'you're": 9, 'gonna': 10, "'": 11}
@rajareivan2417 Жыл бұрын
I'm also wondering why this is the case, especially since when I set num_words to 1 or even 0 it still tokenizes all the provided words. Have you got the answer for this?
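A likely explanation for both observations: in the Keras Tokenizer, `word_index` always holds every word seen during fitting, and `num_words` is only applied later, when text is converted to sequences, where words whose index is `num_words` or higher are dropped. The apostrophe survives because it is not in the default `filters` string. The sketch below mimics that behaviour in plain Python; it is a simplified stand-in, and `split_words`, `fit_word_index` and `to_sequences` are hypothetical helpers.

```python
from collections import Counter

# Default Keras Tokenizer filters: note that the apostrophe is not in the list.
FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'
TABLE = str.maketrans({ch: " " for ch in FILTERS})


def split_words(text):
    """Lowercase, blank out filtered punctuation, split on whitespace."""
    return text.lower().translate(TABLE).split()


def fit_word_index(sentences):
    """Index every word by frequency rank; nothing is truncated here."""
    counts = Counter(w for s in sentences for w in split_words(s))
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {w: r for r, (w, _) in enumerate(ranked, start=1)}


def to_sequences(sentences, word_index, num_words=None):
    """Encode sentences, keeping only indices below num_words when it is set."""
    sequences = []
    for s in sentences:
        ids = [word_index[w] for w in split_words(s) if w in word_index]
        if num_words is not None:
            ids = [i for i in ids if i < num_words]
        sequences.append(ids)
    return sequences


sentences = [
    "i love my dog",
    "I, love my cat",
    "You love my dog!",
    "Jack said, 'You're gonna love my cat!'",
]
word_index = fit_word_index(sentences)
sequences = to_sequences(sentences, word_index, num_words=4)

print(len(word_index))  # 11: the full vocabulary is still there
print(sequences)        # [[3, 1, 2], [3, 1, 2], [1, 2], [1, 2]]
```

So the limit shows up in the encoded sequences, where only the three most frequent words remain, not in the printed dictionary.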
@sunanthakrishnan4 жыл бұрын
Could you help me find versions of TensorFlow and Keras that are compatible with Python 3.8.2?
@aravindravindranatha42604 жыл бұрын
I need your advice on finding text similarity.
@xuantungnguyen97194 жыл бұрын
Hi. What happens if I set num_words to 0? I tried and it still prints all the words.
@renderdreality4 жыл бұрын
Does NLP only process english? Could it do another language? My question is really if it could be used to learn a different language as basis and go from there.
@jinzo11713 жыл бұрын
It can be used for any language:)
@Acumen9284 жыл бұрын
Thanks. Where is the next episode?
@LaurenceMoroney4 жыл бұрын
next week!
@actu_r4 жыл бұрын
Should we keep only nouns when topic modelling? I am quite new to NLP and it seems there is no clear universal rule of thumb for extracting topic information. What would you advise?
@Aoitetsugakusha4 жыл бұрын
You can probably do decently well using just nouns, but you will probably also lose a lot of information if you filter out non-nouns at the Tokenization or pre-processing step. For example, if you only use nouns, you could very well pick up on a topic like "machine learning" in your dataset, but you might miss separate discussions of "deep learning," because "deep" is an adjective that would get filtered out and you would be left with just general "learning." An ultra-crude way you might augment this a bit is to instead do topic modeling on n-grams and keep only those n-grams that contain at least one noun, but I haven't tried this, so I can't assert it will actually work.
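The n-gram idea in the reply above could look roughly like this. It is a sketch only: the part-of-speech tags are written by hand in Penn Treebank style (an assumption; in practice they would come from a tagger), and `ngrams` and `noun_ngrams` are hypothetical helper names.

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive items from tokens."""
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]


def noun_ngrams(tagged_tokens, n=2):
    """Keep only the n-grams that contain at least one noun tag (NN, NNS, NNP, ...)."""
    kept = []
    for gram in ngrams(tagged_tokens, n):
        if any(tag.startswith("NN") for _, tag in gram):
            kept.append(" ".join(word for word, _ in gram))
    return kept


# Hand-tagged example sentence; the tags are assumed, not produced by a tagger.
tagged = [
    ("deep", "JJ"),
    ("learning", "NN"),
    ("is", "VBZ"),
    ("very", "RB"),
    ("popular", "JJ"),
]
print(noun_ngrams(tagged))  # ['deep learning', 'learning is']
```

Here "deep learning" survives as a unit even though "deep" alone would have been discarded by a nouns-only filter.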
@laurencemoroney6554 жыл бұрын
@@Aoitetsugakusha +1 Great answer
@kumarvikas_1344 жыл бұрын
Depends what your objective is. If the end result is only centered around identifying entities, then keeping NN/NNP may make sense (given your POS tagger is not making errors). It all depends upon the objective. For my use case, I remember I extracted chunks of SVO phrases (Subject-Verb-Object) and then performed topic modeling. That worked well for me, but I had made adjustments to my POS tagger to do this task well.
@TallRiderX4 жыл бұрын
The colab is labeled as Course 3 - Week 1 - Lesson 1.ipynb - where can I sign up for the full course? Thank you!
@laurencemoroney6554 жыл бұрын
The colab was adapted from one I wrote at Coursera, where it was course 3 in the TensorFlow: In Practice specialization. There's more there. Otherwise, this is a 3 part series, with part 2 now on the YT channel :)
@RS-vu5um4 жыл бұрын
Is the link to Part 2: Sequencing - Turning Sentences into Data available?
@laurencemoroney6554 жыл бұрын
Yep, came out yesterday, check yt.com/tf for details
@carlossegura4034 жыл бұрын
Love Tokenizers ❤️
@LaurenceMoroney4 жыл бұрын
:)
@danylobaibak3174 жыл бұрын
A question of ignoring the "!". It seems, the Tokenizer doesn't include "!" because it was filtered as punctuation. Let's assume, that we want to use punctuation and set `filters=''` for Tokenizer. In this case, Tokenizer is not smart enough to separate the token "dog" from the token "!" Here's the example in Colab colab.research.google.com/drive/1M6Nf-WQxorf_X9z2jFnCSJ_QjrY3i5BJ
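One possible workaround for the behaviour described above is to pad the punctuation marks with spaces before tokenizing, so that "dog" and "!" become separate tokens. The sketch below shows only that pre-processing step in plain Python; `split_keep_punctuation` is a hypothetical helper, and the set of marks is an assumption.

```python
import re


def split_keep_punctuation(text, marks="!?"):
    """Lowercase the text and split it so each mark becomes its own token."""
    pattern = "([" + re.escape(marks) + "])"
    spaced = re.sub(pattern, r" \1 ", text.lower())
    return spaced.split()


# Plain whitespace splitting leaves the mark glued to the word.
print("You love my dog!".lower().split())            # ['you', 'love', 'my', 'dog!']
# Padding the mark with spaces first separates it.
print(split_keep_punctuation("You love my dog!"))    # ['you', 'love', 'my', 'dog', '!']
```

The padded text could then be handed to a tokenizer whose filter list no longer contains those marks.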
@LearnWithMilind4 жыл бұрын
How many languages are supported? Or is only English supported?
@LaurenceMoroney4 жыл бұрын
I've only tried English, but this technique should work with most languages. Try the notebook linked, and change the language and see what happens?
@hajar26294 жыл бұрын
Thank you. How can I run the same example on my Raspberry Pi?
@abail70104 жыл бұрын
Medogb Medo Exactly the same way, once you've installed Python and TensorFlow.
@rupeshmalpani4 жыл бұрын
can 1679 15223 2 153692 be a word?
@VibhootiKishor4 жыл бұрын
Cool
@laurencemoroney6554 жыл бұрын
Thanks!
@sujeeshsvalath4 жыл бұрын
How can we detect the difference between "I love my dog" and "I love not my dog"?
@Metalocif4 жыл бұрын
Beyond the obvious "it has one more word", there are several approaches. One that is fairly easy is to have a list of all words in a language with their connotations (this can be found online), one possible connotation being negation. Then, you can write code that inverts the connotation of a word if there is a word that implies negation near it.
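The lexicon approach described above can be sketched as follows. Everything here is a toy assumption: the `POLARITY` and `NEGATIONS` word lists are made up for the example (a real project would load a published sentiment lexicon), and `sentence_polarity` is a hypothetical helper that flips a word's score when a negation word sits within one position of it.

```python
# Tiny hand-made lexicon, for illustration only.
POLARITY = {"love": 1, "like": 1, "hate": -1, "dislike": -1}
NEGATIONS = {"not", "never", "no"}


def sentence_polarity(sentence, window=1):
    """Sum word polarities, inverting any word that has a negation nearby."""
    words = sentence.lower().split()
    total = 0
    for i, word in enumerate(words):
        score = POLARITY.get(word, 0)
        if score == 0:
            continue
        # Words up to `window` positions before and after the current word.
        nearby = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
        if any(w in NEGATIONS for w in nearby):
            score = -score
        total += score
    return total


print(sentence_polarity("I love my dog"))      # 1
print(sentence_polarity("I love not my dog"))  # -1
```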
@LaurenceMoroney4 жыл бұрын
If you have lots of sentences that are similar except for the word 'not', and label them accordingly. Then train a classifier like we do here, the 'not' would become a really strong signal towards the negative. Give it a try, instead of using the sarcasm dataset. The code would be very similar to this video.
@vi.kran.t4 жыл бұрын
I want the TensorFlow track jacket that you are wearing.
@tcidude4 жыл бұрын
Шансон
@tingnews72734 жыл бұрын
Can anyone tell me what the "first principles" teaching method is?
@LaurenceMoroney4 жыл бұрын
"From first principles" means teaching with zero (or at least very few) assumptions
@tingnews72734 жыл бұрын
@@LaurenceMoroney Thank you, I hope I can figure it out.
@felixakwerh51894 жыл бұрын
‘From first principles’ could also mean from the smallest to the biggest, or from the known to the unknown. Basically, it's a way of breaking concepts down to their simplest form.
@douggale5962 Жыл бұрын
What? You end it when I was expecting you to at least say to put the 1's in the input layer. This is how you tokenize in general, nothing to do with AI.
@HealthyFoodBae_4 жыл бұрын
Yay😅
@alexanderpohl19494 жыл бұрын
04:15 is really misleading for anyone watching this as their entry to NLP. There are too many steps missing that need to be talked about in a 'Zero to Hero' tutorial series after this point, instead of jumping into sequencing. Even steps before this point. I see why these aren't included (because they are not included in TensorFlow). But at the same time, this is just setting an unrealistic standard. In machine learning terms, I'd say... this video is just mislabeled.
@laurencemoroney6554 жыл бұрын
...and what are these steps? With these videos and the codelabs, we'll have everything we need to build a simple text classifier, the beginnings of NLP.
@தமிழோன்3 жыл бұрын
@@laurencemoroney655 Maybe he's referring to the clean up required for the grammar (like someone pointed out: play vs plays)? However, Tensorflow cannot include that in the library as he's suggesting. Because Tensorflow is not an English-only library rather a more generic one.
@rawnakfreak353910 ай бұрын
Exam after 9 hours (TT)
@cr0wzzz Жыл бұрын
The answer was simple all along. It's just dog
@siddvideos4 жыл бұрын
Too late to the party Tensorflow!! It’s not 2010. Love the video though, thanks😎
@LaurenceMoroney4 жыл бұрын
Ha! I can only produce so many....
@masternobody18964 жыл бұрын
This is so complicated
@laurencemoroney6554 жыл бұрын
Even with these instructions? I've tried to make it as simple as possible and provided a colab to step through the code yourself. Don't know if it can be simplified any more.
@balachkhan15784 жыл бұрын
@@laurencemoroney655 It's great. When will we get the next tutorial?
@laurencemoroney6554 жыл бұрын
@@balachkhan1578 We're releasing them weekly
@balachkhan15784 жыл бұрын
@@laurencemoroney655 TensorFlow can't be installed with Python 3.8. Will the issue be solved, or should I switch to Python 3.7?
@LaurenceMoroney4 жыл бұрын
@@balachkhan1578 It's constantly being updated...so keep an eye on www.tensorflow.org/install. Right now it's up to 3.7 on there.
@Wanderlens1974 жыл бұрын
Very Difficult to learn
@laurencemoroney6554 жыл бұрын
Even with these instructions? I've tried to make it as simple as possible and provided a colab to step through the code yourself. Don't know if it can be simplified any more.
@tanvipurwar60483 жыл бұрын
@@laurencemoroney655 It is a bit complicated the first time. But taking up a small dataset/project for NLP and then revisiting the video makes everything a lot clearer. Plus you pick up on things that slipped your mind the first time :)