Thank you for being a champion of open education. These videos help a lot.
@GeorgeZoto 4 years ago
The legend of Deep Learning! Thank you Professor Andrew Ng for sharing your light with the world 🕯️and for teaching us this awesome new field 😀Forever grateful!
@alireza173 5 years ago
You’re a fantastic teacher. Thank you
@cipherxen2 3 years ago
If the first two words were "Andrew" and "Ng", the next two words would be "is" and "the best".
@blairt8101 a month ago
Explained extremely well!
@zhuoerlyu4705 4 years ago
So helpful, thank you
@ze2411 3 years ago
Andrew Ng is the ML G.O.A.T!
@siloquant 6 years ago
Best!
@wolfisraging 6 years ago
You are the best!
@mohomedarshad7252 5 years ago
Hi, thank you for your effort. I find your videos and explanations very instructive and detailed. But I was wondering if you could make a video about tree-to-string machine translation using tree transducers. It is something I can't quite grasp yet.
@danny-bw8tu 6 years ago
Thank you!
@jeonghwankim8973 6 years ago
It was an awesome video, sir. I just have one question: if we instantiate the network for each of the most probable word sequences, does that mean we should use separate GPUs to run the instantiated networks? Will it take more time if I use a single GPU? Just out of curiosity.
@jismonj1 5 years ago
Can you please help me with coding local beam search in Unity?
@Kareem-hl8hj 5 years ago
Thank you
@sandipansarkar9211 3 years ago
Quite tough.
@user-vm7we6bm7x 4 years ago
Nice accent and video.
@PRATEEK30111989 5 years ago
What is the computational complexity of the above method? Is the beam width only used in the first iteration?
@robertbracco8321 5 years ago
The beam width is used in every iteration. At each stage we evaluate every possibility for the 3 beams we carried over from the last stage (this produces 30,000 new possibilities in the example) and then reduce back down to just 3 (our beam width) before moving on to the next step.

The computational complexity for a search over a sequence of k words from a dictionary of n possible words, with beam width b, works out as follows: n evaluations for the first word of the sequence, then b*n for each of the additional k-1 steps, giving n + (b*n*(k-1)), which for simplicity's sake can be approximated as b*n*k. For a sequence of 10 words, a dictionary of 10,000 words, and a beam width of 3, that is 3*10*10,000, or 300,000 operations.

Beam search is just a pruned breadth-first search, so if the beam were infinite it would be identical to BFS. The complexity of BFS would be the size of the dictionary raised to the power of the length of the sequence (10,000^10 in our example). For even modest dictionary sizes and sequence lengths this quickly becomes infeasible, which is why we need beam search to narrow the possibilities while still giving us a high likelihood of finding the optimal result.
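The b*n*k pruning loop described in the reply above can be sketched as a small runnable toy. This is only an illustration, not the video's actual decoder: the vocabulary and the `next_log_probs` scorer below are made-up stand-ins for a real network's softmax output.

```python
# Toy beam search, following the prune-to-b-per-step analysis above.
# NOTE: next_log_probs is a made-up deterministic scorer for illustration;
# in a real decoder these scores would come from the network's softmax.
def next_log_probs(prefix, vocab):
    # Each word's fake log-probability depends only on prefix length and word index.
    return {w: -float(len(prefix) + i + 1) for i, w in enumerate(vocab)}

def beam_search(vocab, seq_len, beam_width):
    """Keep only the beam_width highest-scoring sequences at each step."""
    beams = [([], 0.0)]  # (sequence so far, cumulative log-probability)
    for _ in range(seq_len):
        candidates = []
        for seq, score in beams:
            # b beams * n vocabulary words = b*n candidates per step
            for word, logp in next_log_probs(seq, vocab).items():
                candidates.append((seq + [word], score + logp))
        # Prune back down to the beam_width best before the next step.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams

vocab = ["jane", "visits", "africa", "in", "september"]
beams = beam_search(vocab, seq_len=3, beam_width=3)
```

Each step scores beam_width * len(vocab) candidates and keeps only beam_width of them, which is exactly why the total work stays near b*n*k instead of growing like n^k.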
@shankarbhaskaran5778 5 years ago
One question on this one: if we increase the beam width, can "September" come up as a candidate among the first 3 words? Maybe the African sentence was literally translated as "September is the best time for Jane to visit Africa".
@samuelbarham8483 3 years ago
Sure! The beam search candidate sequences he used in his example were just that -- examples. I don't believe they were taken from a real neural network; they were meant to motivate the algorithm intuitively.
@koolcoy 5 years ago
During the decoding process, could beam search be replaced with something like MCTS?
@yatinarora9650 4 years ago
Can you please help me out with beam search? I'm not getting how I can implement it using Keras.
@raghadalqobali9680 a year ago
How is the algorithm's memory usage? Is it better than greedy?
@tengpan9847 3 years ago
Is there any difference in training?
@yatinarora9650 4 years ago
Did you get a solution for beam search?
@kutilkol 2 years ago
Is getting a decent microphone an issue these days in the tech community, or wtf?
@astitva5002 6 months ago
Idk man, it gives it a very raw feel; it's kinda nice to my ears.