2:33 Contrastive Learning - Dollar Drawings
3:09 Motivation of Self-Supervised Learning
4:48 Success with DeepMind Control Suite
6:00 MoCo Framework Overview
8:08 Dynamic Dictionary Look-up Problem
8:46 Data Augmentation
9:21 Key Dictionary should be Large and Consistent
10:42 Large Dictionaries
11:32 Dictionary Solutions in MoCo
13:37 Experiments
14:26 Ablations
16:10 MoCo v2 with SimCLR extensions
18:24 Training with Dynamic Targets
@abhishekyadav479 4 years ago
The queue encoding is FIFO, not LIFO; correct me if I'm wrong.
@dutchJSCOOP 4 years ago
You are not.
@kunai9809 4 years ago
I was confused by it too...
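For anyone else wondering: the paper's pseudocode enqueues the current mini-batch of keys and dequeues the earliest one, so the queue is indeed FIFO. A rough sketch of that update, assuming PyTorch tensors with the paper's C x K queue layout (the function name is just for illustration):

```python
import torch

def update_queue(queue, new_keys):
    """FIFO dictionary update: enqueue the newest mini-batch of keys, drop the oldest.

    queue:    C x K tensor of cached key features (oldest keys in the leftmost columns)
    new_keys: C x N tensor of keys from the current mini-batch, with N much smaller than K
    """
    queue = torch.cat([queue, new_keys], dim=1)   # append the newest keys on the right
    return queue[:, new_keys.shape[1]:]           # drop the N oldest columns on the left
```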
@kunai9809 4 years ago
6:15 In the denominator it is not all _other_ keys, but _ALL_ keys, including the positive one. From the paper, right under the equation: "The sum is over one positive and K negative samples."
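For reference, the InfoNCE loss as the MoCo paper writes it (Eq. 1); the denominator sums over the single positive key and the K negatives, K+1 terms in total:

```latex
\mathcal{L}_q = -\log \frac{\exp(q \cdot k_{+} / \tau)}{\sum_{i=0}^{K} \exp(q \cdot k_i / \tau)}
```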
@ShivaramKR 4 years ago
What is the problem with using the same encoder for both the key and the query? Why should they be different?
@timbrt9413 3 years ago
If I have understood the paper correctly, using the same encoder for keys and queries yields an oscillating loss, because the encoder changes too fast relative to the "older" keys. (See Section 3.2, momentum update, and Section 4.1, ablation: momentum, in the paper.)
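The paper's fix is to update the key encoder as a slow exponential moving average of the query encoder rather than sharing or copying its weights. A minimal sketch of that momentum update (Sec. 3.2, default m = 0.999), assuming PyTorch-style modules; the names encoder_q, encoder_k, and the helper are only for illustration:

```python
import torch

@torch.no_grad()
def momentum_update(encoder_q, encoder_k, m=0.999):
    """theta_k <- m * theta_k + (1 - m) * theta_q, so the key encoder drifts slowly."""
    for param_q, param_k in zip(encoder_q.parameters(), encoder_k.parameters()):
        param_k.data.mul_(m).add_(param_q.data, alpha=1.0 - m)
```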
@dilipyakkha9225 3 years ago
Speed.
@siarez 4 years ago
Thanks for the video. Why are the weights computed for the query encoder useful at all for learning the key encoder?
@connor-shorten 4 years ago
We are aiming for one representation space as the product of this task. The query and key encoders can't be too disentangled from each other, because then the query encoder could learn a trivial solution that maps queries to their positive keys. Good question; it's challenging to answer well, so please ask any follow-up questions or leave comments on this.
@safoorayousefi3814 4 years ago
@Patrik Vacek Because then you'll either have a small dictionary due to memory constraints, or, if you store past mini-batches, a dictionary whose keys are inconsistent because they are out of date by different amounts.
@vikrammurthy8337 2 years ago
Thanks for taking the time, Connor. I still couldn't figure out two mysteries from the paper. a) Why maintain a dictionary when we are NOT sampling from it? From the pseudocode in the paper, the only time the queue is used is while calculating the negative logits (which has an additional issue: if I'm taking all KEYS from the current batch, there will definitely be positive keys in the queue when I multiply the query into the queue, right? Most will be negative, but at least the positive pairs in the batch WILL result in positive keys). b) While calculating the loss, the paper uses an N-dimensional array of 0's. I understand it specifies index 0 as the target label, so I can take index 0 as 1 and the rest as 0's, BUT one would assume that only the positive logits need to be closer to index 0. Why are they making even the negative logits come closer to index 0? I'm quite confused.
@connor-shorten 2 years ago
Hey Vikram, I will try to get around to this. Please feel free to join the Weaviate Slack chat to ping me again about this in case I forget.
@vikrammurthy8337 2 years ago
@@connor-shorten Thanks much. I just re-read the paper and realized that the dictionary is nothing but a big sampler of ALL the negative keys. So my understanding is that, since the query encoder is being trained to learn the best possible representation of the images, it can only do so if it comes as close as possible to the positive key and goes as far AWAY as possible from all the negative keys in the dictionary. The more negative keys it can "escape" from, the better and crisper the image representation gets, enabling the encoder to produce richer image embeddings that can then be used on low-volume datasets via supervised learning (instead of using the small dataset to create an overfit model, or, theoretically, using ImageNet's supervised pre-trained models).
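To make question b) above concrete: in the paper's pseudocode the positive logit is placed in column 0 and the K queue logits fill the remaining columns, so a target label of 0 for every query just tells cross-entropy that column 0 is the correct class. The positive similarity is pushed up and every queue similarity is pushed down; the negatives are not pulled toward anything. A rough sketch under those assumptions (PyTorch, the paper's notation: q and k are N x C, queue is C x K, t is the temperature; the function name is illustrative):

```python
import torch
import torch.nn.functional as F

def moco_loss(q, k, queue, t=0.07):
    # Positive logits: each query against its own augmented key -> N x 1.
    l_pos = torch.einsum("nc,nc->n", q, k).unsqueeze(-1)
    # Negative logits: each query against every cached key in the queue -> N x K.
    l_neg = torch.einsum("nc,ck->nk", q, queue)
    # Column 0 is the positive; columns 1..K are the queue negatives.
    logits = torch.cat([l_pos, l_neg], dim=1) / t
    # Target 0 for every row: cross-entropy raises the positive column and lowers the rest.
    labels = torch.zeros(q.shape[0], dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)
```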
@TimScarfe 4 years ago
Great summary, Connor!
@ThibaultNeveu 4 years ago
Thank you!!!!!
@connor-shorten 4 years ago
Thank you for watching!
@farzadimanpoursardroudi45 2 years ago
Thank you, it was really helpful.
@egeres14 3 years ago
I love that someone is breaking down actually complex topics in AI with this much care and consistency, but it goes completely unnoticed while Siraj + Medium collect views with clickbaity content xd
@phuccoiinkorea3341 3 years ago
Thank you
@BlakeEdwards333 4 years ago
Thank you!
@2107mann 4 years ago
Nice
@connor-shorten 4 years ago
Thank you!
@2107mann 4 years ago
@@connor-shorten How can I contribute?
@spenhouet 4 years ago
I did not look at the paper, but that looks similar to Siamese neural networks.