Your videos are always unique and highly informative. Thank you
@CodeWithAarohi · 1 month ago
Thank you!
@layamahmoudi6002 · 15 days ago
Thank you for the amazing video, it's absolutely perfect!
@CodeWithAarohi · 14 days ago
I'm glad you found it helpful!
@TruthOnly_jayshreeRam · 1 day ago
Awesome, very nicely explained. Thanks, Ma'am.
@CodeWithAarohi · 11 hours ago
Most welcome 😊
@TruthOnly_jayshreeRam · 11 hours ago
@CodeWithAarohi Ma'am, can you please make a video on "memory-augmented zero-shot image captioning"?
@vcarvewood4545 · 1 month ago
You are an excellent teacher. I have been in love with your voice since the YOLOv8 tutorials. Attention to Aarohi is all we need.
@CodeWithAarohi · 1 month ago
Thank you for the compliment! I'm really glad the tutorials and my voice have made learning enjoyable for you.
@revanb278 · 12 days ago
Very impressive video, thank you.
@CodeWithAarohi · 2 days ago
You're welcome!
@arnavthakur5409 · 22 days ago
Excellent content ma'am
@CodeWithAarohi · 21 days ago
Glad you found it helpful!
@Sunil-ez1hx · 22 days ago
Such an informative video🙏🙏
@CodeWithAarohi · 21 days ago
Thanks, glad you found it helpful!
@eranfeit · 8 days ago
Thank you for a great explanation
@CodeWithAarohi · 2 days ago
You are welcome :)
@AsthaPatidar-w1t · 1 month ago
Please make a detailed video on going from convolutions to Vision Transformers. And thanks for this video.
@CodeWithAarohi · 1 month ago
Noted!
@pifordtechnologiespvtltd5698 · 1 month ago
Extremely Appreciated 👏👏👏
@CodeWithAarohi · 1 month ago
Thank you so much 😀
@bharatto2220 · 1 month ago
Thank you for explaining the video so elaborately and clearly. But in some places it was too basic (like RGB); I would appreciate a timeline so that I can skip to the required part.
@CodeWithAarohi · 1 month ago
Thank you, and I will add a timeline.
@aneerimmco · 1 month ago
Informative, thank you ma'am.
@CodeWithAarohi · 1 month ago
Most welcome 😊
@munimahmed9374 · 1 month ago
Can you please explain the DeiT model? This ViT explanation is the best video on ViT I have found on the internet. Thanks a lot.
@madhavanu6980 · 1 month ago
Ma'am, please explain the "Transformers for remote sensing classification" paper, because you explain things so well and in an easily understandable manner.
@Ishaheennabi · 1 month ago
It's a great video, thanks for it.
@CodeWithAarohi · 1 month ago
Glad you liked it!
@satvik4225 · 1 month ago
43:20 You said that we do element-wise addition of the patch representation and the position embedding, which means their dimensions must be the same. But the patch representation is of length 768x1, and you also said the length of the position embedding vector is 512. How can you do the element-wise addition? Did you mean the linearly projected vector of each patch, which has a dimension of 512? I learnt a lot of stuff, thanks.
@CodeWithAarohi · 1 month ago
@satvik4225 Every patch is of size 512x1 after the linear embedding layer, and we then add the position encoding to those projected patches.
@satvik4225 · 1 month ago
Okay, thanks. It was a bit confusing because you said there that the patch representation is the flattened patches. Thanks again.
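To make the exchange above concrete, here is a minimal PyTorch sketch (not the video's own code) of the patch-embedding step, assuming a 224x224 RGB input, 16x16 patches, and the 512-dimensional embedding size mentioned in the video; the class token is omitted for brevity.

import torch
import torch.nn as nn

img = torch.randn(1, 3, 224, 224)                  # one RGB image
patch_size, embed_dim = 16, 512                    # 512 as stated in the video

# Cut the image into 14x14 = 196 patches and flatten each to 16*16*3 = 768 values.
patches = img.unfold(2, patch_size, patch_size).unfold(3, patch_size, patch_size)
patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(1, -1, 3 * patch_size * patch_size)
print(patches.shape)                               # torch.Size([1, 196, 768])

# Linear embedding layer: each 768-long flattened patch becomes a 512-long vector.
proj = nn.Linear(3 * patch_size * patch_size, embed_dim)
tokens = proj(patches)                             # (1, 196, 512)

# The position embeddings have the same shape as the projected patches, so the
# element-wise addition happens only AFTER the linear projection.
pos_embed = nn.Parameter(torch.zeros(1, tokens.shape[1], embed_dim))
tokens = tokens + pos_embed                        # (1, 196, 512)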
@salmareang7458 · 1 month ago
I have some confusion: taking one input image, how are Q, K and V found?
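The question above is not answered in the thread, so here is a minimal, single-head PyTorch sketch of the standard way Q, K and V are obtained, assuming the 196 patch tokens of size 512 from the embedding step; multi-head attention repeats the same idea with smaller per-head dimensions.

import torch
import torch.nn as nn
import torch.nn.functional as F

tokens = torch.randn(1, 196, 512)                  # embedded patches of one input image

w_q = nn.Linear(512, 512)                          # learned query projection
w_k = nn.Linear(512, 512)                          # learned key projection
w_v = nn.Linear(512, 512)                          # learned value projection

q, k, v = w_q(tokens), w_k(tokens), w_v(tokens)    # each (1, 196, 512)

# Each patch's query is compared against every patch's key; the softmaxed
# scores then weight the values.
scores = q @ k.transpose(-2, -1) / (512 ** 0.5)    # (1, 196, 196)
attn = F.softmax(scores, dim=-1)
out = attn @ v                                     # (1, 196, 512)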
@jynpogger · 15 days ago
God Please Protect My Teacher at all costs
@CodeWithAarohi · 14 days ago
Thank you so much for your kind words and blessings! 😊🙏
@ramchandhablani9834 · 7 days ago
Ma'am, your video is very good. I have two questions. First, if there are 2 hidden layers, will there be three matrices, say W1, W2 and W3, for linear projection? Second, to train these weights and biases we need target vectors corresponding to each input vector; where will we get those target vectors from?
@CodeWithAarohi · 2 days ago
W1 projects the image patches into embeddings, W2 covers the query, key, and value projections in the self-attention layer, and W3 is for the feed-forward layer after attention. The target vectors come from the labeled dataset, where each input image has a corresponding label (for classification tasks).
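A minimal PyTorch sketch of how the labeled targets described in the reply above are used during training; tiny_vit is a hypothetical stand-in model, and in a real ViT its parameters would include the patch-embedding, attention, and feed-forward matrices (W1, W2, W3).

import torch
import torch.nn as nn

num_classes = 10
tiny_vit = nn.Sequential(                          # hypothetical stand-in for a ViT:
    nn.Flatten(),                                  # any module mapping images to
    nn.Linear(3 * 224 * 224, num_classes),         # class logits works for this demo
)

images = torch.randn(8, 3, 224, 224)               # a batch of 8 input images
labels = torch.randint(0, num_classes, (8,))       # class indices from the labeled dataset

logits = tiny_vit(images)                          # (8, num_classes)
loss = nn.CrossEntropyLoss()(logits, labels)       # the labels are the training targets
loss.backward()                                    # gradients flow to all weight matrices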
@soravsingla6574 · 15 days ago
Perfect
@CodeWithAarohi · 14 days ago
Thanks!
@Mulugeta-c5q · 1 month ago
Thank you for your good work, and can you make a video on the ViTPose code too?
@CodeWithAarohi · 1 month ago
@Mulugeta-c5q Sure
@aryarushipathak5039 · 1 month ago
Hello ma'am, can we use ViT and a CNN to identify emotions from the face? The CNN for feature extraction and MTCNN for emotion labeling?
@nursami7842 · 27 days ago
Hello ma'am, can you explain the transformer encoder in more detail, such as the normalization, multi-head attention, softmax and MLP? The video doesn't provide a detailed explanation of those; can you cover them in the next video?
@nursami7842 · 27 days ago
Or is there a reference that explains it in detail?
@CodeWithAarohi · 26 days ago
Will try to cover in another video.
@sreenalakhani4985 · 17 days ago
Please also make a video on the Video Vision Transformer.
@CodeWithAarohi · 17 days ago
Sure!
@satvik4225 · 1 month ago
Can you explain diffusion models next?
@CodeWithAarohi · 1 month ago
Noted!
@madhavanu6980 · 1 month ago
Ma'am, please do a video on TRS (remote sensing transformers); it's my humble request. Although I completely understand ViT, I still can't understand TRS. Please, ma'am.
@CollegeOnline · 22 days ago
Ma'am, please create a video on the Gated Vision Transformer, as I am trying to use it in my research paper but I am not able to find any literature regarding GVT. Ma'am, if you have any links to GVIT, then kindly share them, please.
@CodeWithAarohi · 21 days ago
Hi, I need time to make this video because I have never used this model before, and I have to read the paper and understand it in order to create a video.
@fatima-arbab · 1 month ago
Ma'am, please make a video on copy-move forgery detection in video using machine learning, using a YOLO model and the CASIA dataset.
@shivamsood4566 · 17 days ago
Ma'am, please also make one on Stable Diffusion, please ma'am 👏
@CodeWithAarohi · 16 days ago
Noted!
@adityanjsg99 · 1 month ago
Arohi ji, is it possible for you to build a model which is as good as GPT, though on limited data and scale?
Hello ma'am, I am still waiting for your video on video generation
@CodeWithAarohi · 1 month ago
Will upload soon
@danigunawan6807 · 1 month ago
Could you please share the code you explained? Thanks.
@chinnaiahkotadi4702 · 26 days ago
Thank you for your effort, ma'am, but: 1. you didn't explain how the position values are obtained; 2. you didn't explain how the neural network works with the help of ReLU activation functions; 3. when making relationships between query, key and value, what is the role of the key?
@CodeWithAarohi · 26 days ago
Noted! Will try to cover in another video.
@sanathspai3210 · 1 month ago
Hi Arohi, it was a good session. One suggestion: could you please make a video covering all the YOLO versions, from v1 to v11? Many are waiting for it and it would be very beneficial.
@CodeWithAarohi · 1 month ago
Hi, glad my video is helpful! Great suggestion, I will surely make a video on all the YOLO models.
@sanathspai3210 · 1 month ago
@ Please try to tag me once you are done. I'm very much in need of it and of the way you explain it.