DeepMind x UCL | Deep Learning Lectures | 4/12 | Advanced Models for Computer Vision

Views: 47,862

Days ago

Comments: 23

@leixun 4 years ago

DeepMind x UCL | Deep Learning Lectures | 4/12 | Advanced Models for Computer Vision

My takeaways:
1. Why we need to go beyond image classification 0:15
2. Plan for this lecture 3:40
3. Tasks beyond classification 5:20
3.1 Object detection 7:22
- Model: Faster R-CNN, two-stage detector 14:40
-- Identify good candidate bounding boxes
-- Classify and refine
- Model: RetinaNet, one-stage detector 20:55
3.2 Semantic segmentation 28:12
- Model: U-Net 32:45
3.3 Instance segmentation 36:31
- Reference model: Mask R-CNN
3.4 Metrics and benchmarks 37:48
- Classification: percentage of correct predictions; Top-1: the top prediction is the correct class; Top-5: the correct class is among the top 5 predictions
- Object detection and segmentation: intersection-over-union (IoU)
- Object detection and segmentation datasets: Cityscapes, COCO
3.5 Training tricks 42:58
- Transfer learning
4. Beyond a single input image: motion is an important cue 50:26
4.1 Pairs of images 59:16
- Model: FlowNet 1:00:35
4.2 Video input 1:03:56
- Apply a 2D model to each frame 1:04:12
- 3D convolutions 1:05:48
4.3 Applications
- Action recognition 1:09:30
-- Model: SlowFast 1:11:25
4.4 Training tricks 1:14:35
- Transfer learning
4.5 Challenges: labels are difficult to obtain; large memory requirements; high latency; high energy consumption 1:15:53
5. Beyond strong supervision 1:20:23
5.1 Data labelling is tedious 1:20:36
5.2 Self-supervised learning 1:21:40
- Standard loss: learn a mapping between inputs and output distributions/values
- Metric learning: learn to predict distances between inputs given some similarity measure (e.g. same person or not)
- State-of-the-art representation learning vs supervised learning on accuracy and number of parameters 1:29:41
6. Open questions 1:30:16
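For readers who want to see the metrics from item 3.4 concretely, here is a minimal sketch in plain Python. It is not code from the lecture: the function names `iou` and `top_k_accuracy` are hypothetical, and the corner-coordinate box format `(x_min, y_min, x_max, y_max)` is an assumption made for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes.

    Assumed box format (illustrative): (x_min, y_min, x_max, y_max).
    """
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def top_k_accuracy(scores, labels, k):
    """Fraction of examples whose true label is among the k highest scores.

    scores: list of per-class score lists; labels: list of true class indices.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)
```

With `k=1` this gives Top-1 accuracy and with `k=5` Top-5. For detection, a predicted box is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold.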

@TheAero 1 year ago

These lectures are interesting; however, they feel way too high-level.

@Marcos10PT 4 years ago

The object detection part was a little confusing: too much explanation with only a few images to refer to. I think a more visual explanation would work better 😊

@wy2528 4 years ago

I feel the same

@gringo6969 4 years ago

Great course! Thanks DeepMind, thanks Viorica! Just one small remark: it's hard to tell what she is pointing at with the laser pointer on the projection :)

@sayakchakrabarty 4 years ago

This lecture series is great and bound to spread great knowledge

@susmitislam1910 3 years ago

Great lectures, thanks! One small request: because the video shows only the lecturer's face and the computer slides, not the projector screen in the classroom, some of the more complicated slides are hard to follow, since the teacher is obviously pointing at parts of them as she speaks. If it's not inconvenient, please try to do that with a mouse pointer instead, so that it's clearer to YouTube viewers. Thanks again!

@abhishekyadav479 4 years ago

Correct me if I'm wrong, but Faster R-CNN is a one-stage detector and end-to-end differentiable, as opposed to what is stated in the lecture.

@ArshedNabeel 4 years ago

@abhishekyadav479 It's a single unit for the forward pass, but during training the RPN (region proposal network) is trained separately, using objectness scores for its loss.

@ninadesianti9587 4 years ago

Oh my goodness. I've fallen so far behind in this field. I don't know how to catch up. Thank you for the lesson!

@farhanhubble 4 years ago

Thank you for sharing this. It's a great walkthrough of how computer vision has improved.

@dsazz801 2 years ago

Thank you so much for the kind, simple, and well-explained lecture! The open questions part was great and gave some insight into the near future :)

@lukn4100 3 years ago

Great lecture and big thanks to DeepMind for sharing this great content.

@DatascienceConcepts 4 years ago

Quite useful

@ben6 4 years ago

I don't get why practically everyone in the CV research community quotes FPS without naming the hardware. Even children know that FPS is hardware-dependent, because they play games, and the same game will run at a different FPS with the same graphics settings on different machines, sometimes even on the same machine depending on cooling. Changing the performance of the hardware will drastically change the 'FPS', from 1 to 1000. This number is totally meaningless without indicating the hardware. I guess it's my job to guess which card you used and how many? In 2018 I would assume 1 or 2 NVIDIA GTX 1080 Tis.

@ArshedNabeel 4 years ago

@ben6 You raise a very valid point; this is one of my pet peeves about CV literature too! Numbers like FPS or even running time are too dependent on the underlying hardware to be meaningful without context. A more meaningful measure would perhaps be the number of computations per forward pass, or something similar. In this particular case, the 5 fps claim comes directly from the Faster R-CNN paper, which came out in 2015. The exact details of the hardware are not mentioned in the paper (or at least I couldn't find them). I assume it would be quite a bit faster on contemporary GPUs.
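As a rough illustration of the "computations per forward pass" idea, here is a minimal sketch that counts multiply-accumulate operations (MACs) for a single standard 2D convolution layer. It is an assumption-laden example, not anything from the lecture or the paper: the function name `conv2d_macs` and the layer shape below are hypothetical, and bias additions, grouped convolutions, and non-convolutional layers are ignored.

```python
def conv2d_macs(in_channels, out_channels, kernel_h, kernel_w, out_h, out_w):
    """Multiply-accumulates for one standard (non-grouped) 2D convolution.

    Each output value needs kernel_h * kernel_w * in_channels multiply-adds,
    and there are out_h * out_w * out_channels output values.
    Bias terms are ignored in this sketch.
    """
    macs_per_output = kernel_h * kernel_w * in_channels
    num_outputs = out_h * out_w * out_channels
    return macs_per_output * num_outputs


# Hypothetical layer: 3 input channels, 64 filters of size 7x7,
# producing a 112x112 output map.
layer_macs = conv2d_macs(3, 64, 7, 7, 112, 112)
print(f"{layer_macs:,} MACs")
```

Summing such counts over all layers gives a figure that does not change with the GPU, unlike FPS, although it still does not capture memory bandwidth or how well a given operation maps to particular hardware.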

@thomasdeniffel2122 4 years ago

thank you!

@lizgichora6472 3 years ago

Thank you.

@danielsoeller 3 years ago

I really like the series, and I try to watch half an episode per day. For me (not a native English speaker), this lecture was quite hard to follow. I don't want to offend; I just want to give feedback. I think it was great that you gave it a shot. Keep at it and you will become better :)