Artificial Intelligence For Everyone
1:00:35
AutoThemeGenerator: User Guide
8:57
Build Your Own AI Tutors with GPTs
57:18
LLM-powered Topic Modeling
1:25:56
8 months ago
Parameter Efficient Fine Tuning Methods
1:24:45
Segment Anything Model (SAM)
58:52
10 months ago
Comments
@ugedashienkumaaondo6283 26 days ago
This is educative.
@mohammadwahba3077 A month ago
❤ gooooooood
@mohammadwahba3077 A month ago
❤ Your channel is wonderful.
@mohammadwahba3077 A month ago
Gooooooood
@chillvibes1896 A month ago
You are not able to explain well.
@chillvibes1896 A month ago
This girl is very confused; you should learn properly and then teach.
@dafizzykings A month ago
What software did you use to record your screen, Dr. An? I like it so much.
@alishafii9141 A month ago
Thanks a lot. This video was useful for me. I think it is complete and covers everything I will need in ML. I enjoyed it.
@applyailikeapro7191 A month ago
The package can only be installed with pip versions older than 24.1; newer versions of pip will not work due to compatibility issues with textract. To downgrade to a version older than 24.1, run: pip install "pip<24.1"
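A quick way to check whether a downgrade is needed before installing, as a minimal sketch; the helper `pip_ok` is illustrative and not part of any library:

```python
# Check whether the installed pip is older than 24.1 (which textract's legacy
# metadata requires); if not, print the downgrade command from the comment above.
from importlib.metadata import version

def pip_ok(pip_version: str) -> bool:
    """True if this pip version is older than 24.1."""
    major, minor = (int(x) for x in pip_version.split(".")[:2])
    return (major, minor) < (24, 1)

current = version("pip")
if not pip_ok(current):
    print(f'pip {current} is too new; run: pip install "pip<24.1"')
```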
@FractAlkemist 2 months ago
I have been writing my own code for GA as I have been learning it, and am now moving on to bigger problems, so I need libraries and things like DEAP. Nice video intro, thanx! My question: is DEAP gonna be around for a while? I don't wanna learn something and then have it go obsolete like PyEvolve did.
@applyailikeapro7191 2 months ago
While no one can predict what will happen, I personally believe DEAP will be around in the foreseeable future. Fingers crossed 🤞
@romanonugia8180 2 months ago
Topic -1 is an outlier and should be ignored
@annakaliuzhna1205 2 months ago
Amazing combination of theory and practical examples. Thank you a lot for sharing this!
@applyailikeapro7191 2 months ago
You are most welcome!
@reshmag.d3849 3 months ago
k
@tarroma23 4 months ago
This is very helpful, thank you so much.
@applyailikeapro7191 3 months ago
You're very welcome!
@sevDiablo 4 months ago
I believe your code in the getTotalValue() function allows items to remain marked as selected without being counted in totalValue. E.g., if the loop has selected 5 items that fit, but the 6th can't fit by its weight, zeroOneList can still have the 6th element marked as selected while totalValue excludes it. Won't this false representation of items (after a given index) affect how the genetic algorithm converges and performs? Shouldn't this be punished by the fitness function?
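To illustrate the concern, here is a minimal sketch (not the video's actual code; function and variable names are assumptions) of a knapsack fitness that penalizes overweight chromosomes instead of silently skipping items:

```python
def fitness_with_penalty(zero_one_list, weights, values, capacity, penalty=10):
    """Knapsack fitness that punishes infeasible selections.

    Every gene set to 1 counts toward the total weight, so a chromosome that
    selects items beyond capacity is penalized rather than silently ignored.
    """
    total_weight = sum(w for g, w in zip(zero_one_list, weights) if g)
    total_value = sum(v for g, v in zip(zero_one_list, values) if g)
    if total_weight > capacity:
        # linear penalty per unit of excess weight, floored at zero
        return max(0, total_value - penalty * (total_weight - capacity))
    return total_value

# a feasible selection scores its full value; an infeasible one is punished
print(fitness_with_penalty([1, 0], [5, 10], [10, 20], capacity=12))  # 10
print(fitness_with_penalty([1, 1], [5, 10], [10, 20], capacity=12))  # 0
```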
@qingpingwan 4 months ago
Useful idea.
@FredBosire-dm8zk 4 months ago
Give me that job of training using CVAT. I'm a former CVAT worker.
@Enigma0694 5 months ago
How do I apply for this job?
@mfarooq28 5 months ago
Getting this error: FileURLRetrievalError upon DownloadUtil.download_word2vec(dest_dir = '.')
@xspydazx 5 months ago
Knowledge is a dangerous thing! Same as a university environment: the approaches taken here, and the understanding, lack the dynamic it requires; hence such accomplishments being made OUTSIDE! The technology made a super jump forward during Covid, and now simplicity is key. The older principles CAN be applied to these models, as the functionality that previously eluded us, and how it would have been implemented, is now redundant: the seq-to-seq network produces outputs based on examples, hence it can be fit to any task or methodology if the prompt is framed correctly with the data, which can be anything. The LLM gives the neural network a CHAT interface allowing for natural-language queries. So we should treat it in a different way: we should now be TEACHING the model methodologies, hence university is the right place to create such data. By using object-oriented methodologies a model could essentially produce any program to do any task; so teaching it object-oriented programming methodologies (i.e., for a task, deciding which principle to apply) will enable the model to perform the task correctly instead of guessing the correct "completion" (is it a completion now, or have some models moved past this and become advanced reasoners? An MSc question). Now, case-based reasoning methods should be installed into models, as a model which can perform based on its past history or session data is a case-based reasoner! Later, when we have enhanced memory recall and live updates, we will have a full internal RAG system, so the case base will extend based on past known experiences. Right now there is so much possibility to accomplish and so little time and resources! (Truthfully, the way large corporations are designing is a hit-and-miss method; but with my technique the model will be forced to think from its own internal generations. When it responds, depending on complexity, the speed of the answer will truly denote whether it is thinking or not, and then we will be able to see the thinking in natural language! So much to be wowed about. Go to Mistral models and begin your training experience now!)
@xspydazx 5 months ago
I have found that tuning is actually an art form. We need to remember that the last layer is replaced by this new PEFT layer, or even proportions of layer weights are replaced by the generated PEFT layer; i.e., with the LoRA config, various modules are targeted. With this understanding in hand we can even retrain a layer, or target specific layers in the model to work on (useful when you evaluate the output generated at various layers, as you can also target specific parts of the model which are "storing" unwanted data; "storing" is not quite right, in truth each layer holds these words and sequences as higher probabilities in a word-to-word matrix). So when training it's important to train for different targets, or you will essentially be overwriting the model and never making progress with fine-tuning sessions (which do need to be performed regularly), as data is probabilistic. Even repeated data is kind to the model: in life some data is heavily embedded, such as truisms; if a truism is sparse it will never surface and other, more negative falsehoods will rise higher, hence positive and negative samples to match.

A model can be trained for multiple purposes, and different models for different purposes can be merged to create a single model. They will need to be realigned to their training prompts and data (they will be quite off, but after a few steps will travel towards a steady line, i.e. not dropping unless the learning rate is changed, hence good convergence into the model). These can be considered Multi-Fit Models: they are fit to multiple tasks, from dictionary tasks ("what is a..."), to extracting the entities from a text, to generating a function in some language to produce a result. The same structured prompt is not used, i.e. NOT Instruct or Chat, as these are merely placeholders for multi-purpose task-based queries. So we craft customized prompts and give examples of how they should be completed, hence "completions": e.g. "Define cat", and the model would generate a description, a URL source, a picture, and a sound. Given the correct data, the formatted response would be generated by the model. In fact we are merely showing an input sequence to output sequence, as well as collecting a bag of words which pertain to a specific arrangement, i.e. the output. The placeholder we leave will be mask-filled or generated by the model; when the model recognizes the pattern "Define > Target" it will automatically remember the template it was trained with and generate the data from the training set, or the most probable answer. Hence these prompts can be considered tasks, and these tasks become hidden within the model. The lesson to be learned is that ALL structured data should be entered in this way so that the model can use the examples to make predictions in the future. E.g. a collection of time-over-distance calculations, with each variable as a plug-in value but the calculation structured correctly in the prompt: this will be used as a THOUGHT template to solve similarly shaped data. For full access to the task, a masked model should also be used which requests the missing variables from the user to plug into the calculation for the next-turn response or response chain (NOT a one-shot), as even a response chain can be a single completion. Hence, in training, a COMPLEX PROMPT is important, unless you're just dumping data into the model in the hope that it will be used later in some text-generation or summarization project.

So the important thing is randomizing or focusing the PEFT settings for different training tasks: this enables embedding of tasks within the model, as well as giving the model the tools to calculate or generate correct and formatted predictions; hence designing the chain of thoughts you wish the model to take. I have personally trained a model with these methods, and the model even self-corrects and discusses the response internally; hence the response may sometimes be quite slow, because each variable in the conversation has some form of mapping to a hidden function which may not be displayed without the explicit prompt template, i.e. thoughts. I have also framed many conversations with self inside this space; using a chat template does not mean it does not do these calculations UNSEEN. Now we can visualize our preprogrammed processes: LeroyDyer/Mixtral_AI_MasterTron on Hugging Face. It's important to have a model which will also accept training quickly, hence retraining on top and on top, a slice here and a slice there; then the models retain their training and recall, especially when merged with other models from the same training series. I only merge with other models I have trained, as I know I have specific layers in the model highly focused. I also always check, before training, how many parameters are available, and always expect it to be different from the first or previous run, or I will understand that I may be training previous parameters. Anyway, good lessons; I will keep a lookout!
@xspydazx 5 months ago
When using parameter-efficient fine-tuning, you DO select the target modules to be overwritten. The PEFT adapter is a copy of these modules which you will replace, so although the model weights are frozen, you're actually tuning the weights which will be merged into those frozen layers. Hence if you target all the hidden layers, the PEFT adapter will be a copy of those layers, so you will be fine-tuning the last known positions of those weights, as the PEFT creation COPIES the targeted layers. As you will note, generating 1B parameters takes a large amount of memory, so this process is reduced by simply copying the target layers to a new model, i.e. the PEFT model! This is what is trained on top of the frozen model, and afterwards merged into the parent; but it can even be used separately (with the base model).
@xspydazx 5 months ago
E.g., LoRA rank 7/14 = 18,350,000 parameters; 6/12 = 15,728,000 parameters (with all Q, K, V, O projection layers targeted). Hence for each session, change these settings to touch different tensors and biases. After you exhaust your best options (i.e., you should be training a few models for different tasks side by side), you can merge the collection, enabling re-targeting of the same selection for the next set of training stages. This is important to retain information and transfer skills, as well as to develop a model such as ChatGPT, or a RAG-less model! It's important to understand that we are training methodologies into the model (tasks), teaching it examples of how the tasks should be performed and how they should be calculated (chains of thought or chains of functions); hence the model should have these functions internally. E.g., for entity detection, after receiving the input, its chain of internal thoughts should be: 1. tokenize the text; 2. create a list of entities from the tokens using a set of entity lists (generated internally); then return the detected entities. It would need to generate a function to tokenize the text, as well as generate entity lists based on the topic of the text, then push the text through the functions to produce an output. If the model is given 1000k pretrained examples and overfit on the task (to 0.2 loss), then when testing on an unseen dataset you will see the model generating functions and attempting to utilize them internally to answer these questions. If the model makes mistakes (obviously), it will mean we need to give the model 100k examples (loosely fit), at a loss of 0.9 to 1.2, enabling the model to gather the examples and later recall them successfully using its known methods. Later, smaller datasets rephrased from the 100k should also be used in the prompt to train the model closer to the data (always use a dataset of ~1k to align the model), hence merge with another model and align!

This methodology is called task embedding (it works well), so when training you may find yourself with many 1k alignment datasets, used after merging (it's like a check to see whether the model lost past knowledge), hence giving it the same old examples to remember; even these become frequent memories. So we discovered we can now create FREQUENT MEMORIES with alignments and overfitting, hence the art form of merging methodologies! There are things you may ask ChatGPT to do where you would expect them to be performed internally, but they are not (the system is a scam): the front end manages intents discovered from the input query and produces an agent to perform the task; these tasks are basically plugins (extensions; a neural net needs training, and extensions are essentially a RAG of skills). All of it can be internal!
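As a rough sanity check on adapter sizes like the ones quoted above: a LoRA adapter for a linear layer of shape (d_in, d_out) at rank r adds r * (d_in + d_out) trainable weights (the A and B matrices). A minimal sketch, with illustrative dimensions rather than the exact figures from the comment:

```python
def lora_params(rank, layer_shapes, n_layers):
    """Total LoRA trainable parameters across n_layers identical blocks.

    Each targeted linear layer of shape (d_in, d_out) contributes the A
    matrix (rank x d_in) plus the B matrix (d_out x rank).
    """
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in layer_shapes)
    return per_layer * n_layers

# e.g. rank 8 on Q/K/V/O projections of a 4096-wide block, over 32 layers
shapes = [(4096, 4096)] * 4
print(lora_params(8, shapes, 32))  # 8388608
```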
@omarbadr9469 5 months ago
I really want to thank you for this great series. I had been struggling to find a good resource on text augmentation and text imbalance until I found your channel. Thank you for this amazing series.
@applyailikeapro7191 5 months ago
You're very welcome!
@ColabCorgi 5 months ago
Excellent content. Just what I was looking for! Any tips on how to optimize the topic modeling process using GPT models from OpenAI?
@giacomocassano1439 5 months ago
Hello! I'm a researcher at Politecnico di Milano and the University of South Australia. I'm trying to do the same thing; maybe we can have a chat!
@ColabCorgi 5 months ago
@giacomocassano1439 Sure, how can I reach you?
@joshed790 5 months ago
Could you give an example of how to merge this topic modeling output with our original dataset for further analysis and report creation?
@rolandabi2848 2 months ago
Hey Josh, did you find a way to do this?
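One way to do what the question above asks, as a minimal plain-Python sketch: it assumes a BERTopic-style `fit_transform` that returns one topic id per document in input order (the topic ids below are stand-ins, not real model output):

```python
from collections import Counter

docs = ["cats purr", "dogs bark", "stocks fell"]
# normally: topics, probs = topic_model.fit_transform(docs)
topics = [0, 0, 1]  # stand-in output for this sketch

# attach the topic id back onto each original record, preserving order
merged = [{"text": d, "topic": t} for d, t in zip(docs, topics)]

# a simple per-topic document count for reporting
report = Counter(topics)
print(merged[0])  # {'text': 'cats purr', 'topic': 0}
print(report[0])  # 2
```

The same zip-by-position idea works with a pandas DataFrame (`df["topic"] = topics`) for richer downstream analysis.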
@oceanchen9513 6 months ago
Thank you so much, Prof. An. This content is so useful for me; it helped me a lot in understanding the NLP process in Spark. This is such meaningful work. Huge support for you; please keep running this channel!
@54LZ 6 months ago
An interesting and great presentation. Thanks for sharing.
@applyailikeapro7191 5 months ago
Glad you enjoyed it! 😊
@stevenxu5747 7 months ago
Based on your accent, I'm guessing that you're from Taiwan. Am I right?
@yapayzeka5424 7 months ago
nice mate thx
@applyailikeapro7191 5 months ago
Glad it helped
@carthagely122 8 months ago
Thank you, doctor. I hope you make more interesting videos like this.
@frankshines-stroudfamily 10 months ago
Excellent AI working-and-sharing group from so many places around the world. Very interesting how DeepSpeed RLHF works; I liked the info on time and cost alternatives. I will deep-dive into this further. Thx!! Great work.
@shamukshi 10 months ago
Hello Wang, can you share your email with me? I need help regarding SAM, and I will pay you for it.
@trangquyen4307 10 months ago
Hello, I'm so stuck with my CSV file; it keeps returning the error: num_nodes argument must be larger than max ID in the data, but got 13000 and 13000000
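The error message above suggests num_nodes must exceed the largest node ID in the data. A minimal sketch (the edge list and variable names are hypothetical, not from the commenter's CSV) of deriving num_nodes from the edges before building the graph:

```python
# hypothetical (src, dst) pairs read from a CSV edge list
edges = [(0, 5), (3, 12999), (7, 13000)]

# with 0-based ids, num_nodes must be strictly larger than the max id
max_id = max(max(src, dst) for src, dst in edges)
num_nodes = max_id + 1
print(num_nodes)  # 13001
```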
@ceesh5311 11 months ago
thanks!
@santiagoajala5404 A year ago
I can't use categorical_embedder
@obibi A year ago
I like the simplicity in which you explain things and present the topics. Thank you for your contribution.
@chikopheidris394 A year ago
I do appreciate this tutorial
@tuttoshithole A year ago
Thank you for these valuable insights and case studies! Would you mind adding the sources and papers you used to your description box? I am writing an essay about this, and they could be really helpful. Thank you!
@caiyu538 A year ago
NLP data augmentation is more difficult than image data augmentation. Great lectures.
@caiyu538 A year ago
Great lectures. I plan to use this strategy to augment my data.
@enricodinardo6338 A year ago
Not a very interesting video: reading the notebook on my own would be exactly the same. It would have been nice to have more comments and explanations.
@elebs_d A year ago
Thanks for this video. It helped clarify a lot of confusing terminology.
@sidsquad01 A year ago
Thanks for the amazing help.
@picassoofai4061 A year ago
Subscribed and hit the like button in the first 5 seconds. Great channel and great content.
@lizhur4216 A year ago
Thanks for this good content! I have 2225 files, but texts = spark.sparkContext.wholeTextFiles(','.join(lst_filenames)) is not working. Do you have any idea?
@skoomaaddict8213 A year ago
Hi, I am trying to use Tesseract to extract some numbers from images, but I could not succeed. Is there a way to train it or improve recognition? I need to recognize only numbers. Thanks.
@alfredmilford8403 A year ago
I also need to recognize only license-plate numbers; did you find any performant tool for that?
@delaoma7977 A year ago
Great video Doc
@dan7582 2 years ago
Great video, keep up the good work.
@louwynan9274 2 years ago
This is so fun and interesting!
@davidsnow2653 2 years ago
Shouldn't AI be sentient by definition?
@applyailikeapro7191 2 years ago
AI, by definition, means enabling machines to do common human cognitive tasks without being explicitly programmed. However, it does not necessarily imply sentience; having sentience may not be a necessary condition for being "intelligent".
@louwynan9274 2 years ago
Nice video. Really helpful!
@applyailikeapro7191 2 years ago
Glad it was helpful!