Edit Distance of two strings - Real world application

  Рет қаралды 69,944

Gaurav Sen

Gaurav Sen

6 жыл бұрын

What is the edit distance of two strings? It is the minimum cost of operations to convert the first string to the second string. The operations allowed are addition, deletion and transition. We find the sequence of edits required to change the first to the second to in the cheapest way possible.
This algorithm using dynamic programming is used in many real world applications. The edit distance of two strings helps auto correct and word suggestion algorithms. These applications run when we type texts and messages on phones or computers.
The edit distance is also called the levenshtein distance between two strings. It has applications in auto correction, genetic material study and many other string processing applications. The most common use is that of auto correct in text editors, which use a variation of the edit distance algorithm to find closest matching words from a dictionary.
This is a dynamic programming solution which converts a recursive approach to a matrix. This matrix allows us to store values which are accessed repeatedly, and hence converts the simple recursion to dynamic programming. Each cell in the dp matrix depends on adjacent cells and the original strings for it's value.
I am looking forward to talking about approximate string matching in the next videos. Cheers!
References:
web.stanford.edu/class/cs124/...
www.geeksforgeeks.org/dynamic...
en.wikipedia.org/wiki/Levensh...
people.cs.pitt.edu/~kirk/cs15...
Code:
github.com/gkcs/Competitive-P...

Пікірлер: 120
@sarkarsh5537
@sarkarsh5537 4 жыл бұрын
Awesome video! So helpful!!! And the ---> "There'll never be a time when I'll add and then delete a character because that's stupid." was funny, had to pause the video to laugh for a moment. XD
@gkcs
@gkcs 4 жыл бұрын
😁
@utkarshjain8024
@utkarshjain8024 2 жыл бұрын
"Adding a character to A is equivalent to deleting a character from B", this was the one line that made me understand the entire problem. Really clear explanation!
@huangleonard1557
@huangleonard1557 2 жыл бұрын
This is the very first edit distance tutorial which gives me no feeling of what the fxxx are you talking about. Excellent!!
@aakashpathak21
@aakashpathak21 6 жыл бұрын
I was stuck for an hour because of the thing that how we can compute (i, j - 1) instead of ( i + 1, j)...saw a lot of videos but you are the one who actually explained :) Thanx bro!
@vlogs.anurag
@vlogs.anurag 4 жыл бұрын
exactly
@saurabhjoshi9319
@saurabhjoshi9319 3 жыл бұрын
I watch the whole video just because of this point Well explained
@vinayak186f3
@vinayak186f3 3 жыл бұрын
I'm still stuck ,unable to backtrack , the goal was to convert pqqrst to qqttps , but he ended up with qqts 😥
@BingumallaAmaresh
@BingumallaAmaresh 5 жыл бұрын
The first part of the video explaining the function and the logic that follows for DP was what I just needed. Thanks :)
@mohitksahu
@mohitksahu 2 жыл бұрын
I watched many videos but could not understand the addition point. Till now, this is the best explanation I have come across for this problem. Thanks a lot, Gaurav. You are the savior.
@sachinsharma905
@sachinsharma905 3 жыл бұрын
This is the most articulate explanation of this problem in the whole internet space.Thanks man.
@pranavgaur6399
@pranavgaur6399 4 жыл бұрын
The only reason I was able to understand the logic behind this problem(after watching several videos) was because you explained the logic behind insertion which no one else did.. keep up the amazing work
@studgaming6160
@studgaming6160 5 жыл бұрын
I didn't understand the how minimum function takes minimum of 3 elements of array from Tushar Roy's video, Then I came here and I UNDERSTOOD. Thank You brother. Keep making such awesome videos.
@heenaagarwal6795
@heenaagarwal6795 3 жыл бұрын
When I search for a problem explanation on youtube, and I find your video at the top of the output list of my search, I get a sense of relief and a smile on my face. Now I will watch the video.
@gkcs
@gkcs 3 жыл бұрын
So glad to hear that Heena, thank you 😁
@SurjitSingh14-c6i
@SurjitSingh14-c6i 3 жыл бұрын
@@gkcs what is wrong public: int ct=0; string lcs(string s, string t) { int dp[s.length()+1][t.length()+1]; for(int i=0;i0) { // replace ct+=1; } else if(x==0&&y>0) { // insert ct+=y; } } int x=s.length()-p1; int y=t.length()-p2; if(p2>=t.length()&&p1
@sanjarcode
@sanjarcode Жыл бұрын
Learning and thinking like this is way better than just saying DP and then learning it.
@lovehasija3995
@lovehasija3995 4 жыл бұрын
Awesome Explanation. I get shocked to see so many videos explaining the algorithm without the intuition or derivation. You did it really well, thanks for sharing. It's really helpful.
@gkcs
@gkcs 4 жыл бұрын
😁
@anushkachakraborty8635
@anushkachakraborty8635 2 жыл бұрын
Of all the edit distance explanation videos I have seen, This was the most passionately and relatable explained. Thankyou so much :)
@musicislife1844
@musicislife1844 4 жыл бұрын
Awesome video!! Thank you so much!! I have a humble request...please add more videos on dynamic programming problems the way you teach makes things look way easier than they are!! Keep it up!. Thanks a lot!!!
@rohangarg5473
@rohangarg5473 5 жыл бұрын
No doubts in Edit distance now, thanks👍
@rishabhb2688
@rishabhb2688 4 жыл бұрын
Very well explained ! Thank you !
@nikhilgumidelli6308
@nikhilgumidelli6308 4 жыл бұрын
Brilliantly explained !
@fullemptiness
@fullemptiness 4 жыл бұрын
Coming from coursera course, suck in explaining the problem. Thanks, your explanation helped!
@SATISHKUMAR-xk7cn
@SATISHKUMAR-xk7cn 5 жыл бұрын
Excellent explanation !! keep it up.
@adlurirohith9528
@adlurirohith9528 3 жыл бұрын
Adding a character in first string is logically deleting the last character in second string and the way that we do it is ( i+1 , j )-->( i , j-1) . I get it now. Great video btw.
@Ankitbomb1
@Ankitbomb1 5 жыл бұрын
That helped. Thanks man!
@MegaArti2000
@MegaArti2000 3 жыл бұрын
Had to watch it twice, but I think I got it. Thx.
@StarshipPapillon
@StarshipPapillon 5 жыл бұрын
thank you gaurav, saved my life :)
@gkcs
@gkcs 5 жыл бұрын
Glad it helped!
@jkrw3540
@jkrw3540 5 жыл бұрын
Can you explain, why you have added and deleted from the end of the string and not from other part of the string ? Also, how do I know what character I am going to add in the first string to make it the second string ?
@PriyankaSingh-pn8xv
@PriyankaSingh-pn8xv 3 жыл бұрын
Awesome explanation
@amitpurohit8816
@amitpurohit8816 4 жыл бұрын
Sir your explanation is awesome....I watched tonnes of videos for this problem...But yours was the most effective..thank you!!
@gkcs
@gkcs 4 жыл бұрын
Thank you :)
@amitpurohit8816
@amitpurohit8816 4 жыл бұрын
@@gkcs I'm surely gonna recommend your dp videos to all my friends who are beginners in dp
@vrdash
@vrdash 4 жыл бұрын
Edit Distance that eluded me for so long makes so much sense now. I thought I would have to just blindly believe in the Algorithm :p
@barshalamichhane6761
@barshalamichhane6761 2 жыл бұрын
Thank you so much!!
@sahilkakkar5628
@sahilkakkar5628 2 жыл бұрын
Good Explanation, really helpful!
@dznuts123
@dznuts123 Жыл бұрын
Just to clarify, adding a character to string 1 is not literally the same as deleting a character from string 2. The minus 1 from string 2’s index is an abstraction, because after implicitly adding a character to string 1, the character at the index in string 2 is taken care of. We don’t need to minus 1 from string 1 index, because the index did not change during the additive process.
@prateekkanujiya9775
@prateekkanujiya9775 5 жыл бұрын
U r awesome bro. Respect
@pranavkochhar9352
@pranavkochhar9352 3 жыл бұрын
VERY WELL EXPLAINED!!
@innocent1029
@innocent1029 5 жыл бұрын
could you do a video on finding if a string is interleaved of two other strings?
@aparichitsingh384
@aparichitsingh384 6 жыл бұрын
Again a nice one. Thanks for this :) Cheers!
@gkcs
@gkcs 6 жыл бұрын
Thanks Aparichit! :) I am sorry I must have forgotten, but I don't remember thinking of a follow up video on Centroid Decomposition. Was it something I said? Centroid Decomposition: kzbin.info/www/bejne/Y33KlZRja8R0n7s
@aparichitsingh384
@aparichitsingh384 6 жыл бұрын
Sorry for this! I didn't get any notification for : kzbin.info/www/bejne/opnYY2x5apata9U Going to watch it tonight. Thanks a lot for reply bhaiya :)
@gkcs
@gkcs 6 жыл бұрын
Have fun! Btw, it has a mistake, so I added a correction video in the description :)
@aparichitsingh384
@aparichitsingh384 6 жыл бұрын
yes I'm talking about that correction video! (didn't get any notification for that. so, misunderstood)
@rathinavelsankaralingam2929
@rathinavelsankaralingam2929 4 жыл бұрын
Thank you man! Just Thank You /\ You should live long :)
@AkhileshKumarCSE
@AkhileshKumarCSE 6 жыл бұрын
Hello Gaurav, could you please help me understanding the at 12:58 - 13:05 here you are pointing at 'p' then saying add a char to get to 'q' ?
@gkcs
@gkcs 6 жыл бұрын
I have replied to your other comment on this :)
@akashchatterjee5179
@akashchatterjee5179 4 жыл бұрын
Can anyone please say that why LCS -max length(S1,S2) logic fails...
@anirudhatalmale5575
@anirudhatalmale5575 5 жыл бұрын
outstanding
@ShivamKendre-fc3su
@ShivamKendre-fc3su 4 жыл бұрын
Really nice
@AjayKumar-qm5fu
@AjayKumar-qm5fu 6 жыл бұрын
Commendable effort bro
@gkcs
@gkcs 6 жыл бұрын
Thanks!
@yobro7051
@yobro7051 3 жыл бұрын
this guy always seems so much energetic
@saurabhsharma7123
@saurabhsharma7123 6 жыл бұрын
The concern of this video was why computing ( i + 1, j), instead of (i, j - 1) is wrong. That would lead to cyclic recursions!
@gkcs
@gkcs 6 жыл бұрын
Exactly!
@priyeshshrivastava8425
@priyeshshrivastava8425 5 жыл бұрын
Hi Gaurav, if we are converting i+1,j to i,j-1 why are we still taking Ca the cost of addition, and not Cd?
@sidharthsk3807
@sidharthsk3807 5 жыл бұрын
U IIT Kharagpur
@AayushThokchom
@AayushThokchom 4 жыл бұрын
​@@gkcs As Priyesh asked in the comment. My understanding is: this approach can only be applied if Cd < Ca. If Ca
@gkcs
@gkcs 3 жыл бұрын
@@priyeshshrivastava8425 Because although the operation is equivalent, we cannot change the costs of the operations. That's usually an input to the problem.
@iiillliii1386
@iiillliii1386 2 жыл бұрын
proud of you my friemd
@shreyanshtiwari3495
@shreyanshtiwari3495 Жыл бұрын
great!
@riturajjoshi5935
@riturajjoshi5935 6 жыл бұрын
Congrats on the new mic!
@gkcs
@gkcs 6 жыл бұрын
Thanks!
@madhukiranattivilli2321
@madhukiranattivilli2321 Жыл бұрын
I'm not sure if I clearly understood the concept of "adding to A is same as deleting from B". Lemme know if my writeup has flawa... (1) To implement dynamic programming technique, higher state should always reduce to a lower state. But in the case of adding a character to A, a higher state moves to an even higher state. So, this needs to change. 1 way is to delete character from B, instead (2) Difference in the sizes of strings A and B wud be 1 whether a character is added to A, or a character is deleted from B. So, to help the problem in (1), delete character from B (3) If A=pqqrst and B=qqttps, adding character to A would implicitly mean adding the final character of B to A as well. So A becomes pqqrsts. Now A and B have the same terminal characters. They have to be removed, because the algo does addition/deletion/transition always to the terminal characters, and so not deleting them will block the work of the rest of the algorithm. Hence, this change makes A,B = pqqrst, qqttp. And this would also be the state if s is deleted from B without making any changes to A (3) is the main point. (1) and (2) are supporting points Comments?
@himanshusingh694
@himanshusingh694 4 жыл бұрын
Why you are considering going right towards p and going down towards q as deletion operation?
@gkcs
@gkcs 4 жыл бұрын
Check out the other edit distance video on the channel, which explains this.
@LearnWithBahman
@LearnWithBahman 5 жыл бұрын
Officially, I am a fan now, hopefully, one day we work together.
@gkcs
@gkcs 5 жыл бұрын
😁
@archip3112
@archip3112 6 жыл бұрын
Hi Gaurav, Please explain the case when the given two strings are of different size?
@gkcs
@gkcs 6 жыл бұрын
Hey Archi, The same algorithm applies! The matrix is of size M*N, where they M and N are the lengths of the two strings. 😋
@archip3112
@archip3112 6 жыл бұрын
you are right. Thanks !
@yuvrajoberoi7834
@yuvrajoberoi7834 4 жыл бұрын
Excellent explanation of this problem. Also, you gave me a better appraoch to appraoch a proble. I have seen problems like design autocorrect feature etc. and there this thing can be very helpful right?
@gkcs
@gkcs 4 жыл бұрын
The autocorrect feature also has to take into consideration "probable words". For example, the word "runvwe" is closer to "timber" than we might expect. It's because our keyboards are designed that way. Our fingers also have typical ways of typing, and our mind make phonetic to spelling translations when we aren't sure of the spelling. All in all, it's a complex problem to solve. Neural networks have been seen to perform well in autocorrect applications. I would go with one of those :)
@yuvrajoberoi7834
@yuvrajoberoi7834 4 жыл бұрын
@@gkcs Hey, thank you for giving your time to explain this.
@kartikshet1
@kartikshet1 6 жыл бұрын
Great explanation. I have one issue with dynamic programming, even if I come up with the recursive function, I find it hard to convince myself that it leads to the solution, I find myself stuck there.
@gkcs
@gkcs 6 жыл бұрын
It'll improve soon 😁
@StarshipPapillon
@StarshipPapillon 5 жыл бұрын
I think it will help you to trace your program step by step just like dynamic programming
@AkhileshKumarCSE
@AkhileshKumarCSE 6 жыл бұрын
Hello Gaurav could you please help me understanding the concept of "Addition" here ? f(i + 1, j) = f(i, j - 1) ? Example given two strings "p", "q" , how will we convert "p" --> "q" using "Addition" operation ? As per logic which i understood from this video it would go like below :- f(1,1) f(1,0) = "q" --> " " (I guess this is 'j - 1' , effectively deleting from string "q" than adding in "p"). " " --> "p". (Addition : adding 'p' into NULL string). So basically we converted "q" --> "p" not "p" --> "q" which was the objective of Edit distance ? correct me if i understood wrong and help me understanding correctly please !
@gkcs
@gkcs 6 жыл бұрын
Both are equivalent. Converting A to B is exactly as expensive as converting B to A. So we pick any one and follow the steps. If the steps need to be printed, we can simply invert the operations that are optimal and the answer will be the same :)
@AkhileshKumarCSE
@AkhileshKumarCSE 6 жыл бұрын
Thank you !
@SportsMoments17
@SportsMoments17 2 жыл бұрын
13:13 we have not only one possibility, we have two possibilities because diagonally and vertically both are 4
@visheshsrivastav2107
@visheshsrivastav2107 6 жыл бұрын
Bhaiya can you upload a video for hardest gcd problem (codechef april cook off problem)
@gkcs
@gkcs 6 жыл бұрын
Coming up next :)
@sulekhakumari-hs4gy
@sulekhakumari-hs4gy 4 жыл бұрын
want more from strings
@alexanderrivera1938
@alexanderrivera1938 3 жыл бұрын
Can one of the strings be longer than the other?
@gkcs
@gkcs 3 жыл бұрын
Yes
@Shrimangal100
@Shrimangal100 4 жыл бұрын
in d[1][1] your output is 0+1, but in d[1][2] it is only 1 and not 1+1. Kindly explain
@hey.............
@hey............. 4 жыл бұрын
great explanation Gaurav. Thank you for the help.
@gkcs
@gkcs 4 жыл бұрын
Thanks Varun!
@hussainburhanuddin3705
@hussainburhanuddin3705 3 жыл бұрын
Wow, I really loved this video, Your explanation was really easy to understand. But, you didn't explain how the initial table would be created, at 10:59 , as I understood the table had 1,2,3,4,5,6 because Ca and Cd were = 1. I am not able to get my head around how the set up the initial table (end states) values for different values of Ca and Cd.
@gkcs
@gkcs 3 жыл бұрын
We make a table full of zero values, with dimensions equal to length of the two string m * n.
@hussainburhanuddin3705
@hussainburhanuddin3705 3 жыл бұрын
@@gkcs Okay, that makes sense. But how would you fill the values for row 0 and column 0 if Ca,Cd != 1???
@faizaomer5737
@faizaomer5737 4 жыл бұрын
Uffff Thanks Allah... Finally i got it output concept... Thank you Tutor
@ayushgarg5929
@ayushgarg5929 3 жыл бұрын
Jai Shri Ram🚩
@ilikememes9052
@ilikememes9052 4 жыл бұрын
Ohhh God !! I can't understand anything in algorithms 😭😭😭. Have to watch multiple times.
@mandeep2192
@mandeep2192 4 жыл бұрын
because u like memes :P
@anandchowdhury9262
@anandchowdhury9262 4 жыл бұрын
Tushar should see your videos..
@varshavasan2130
@varshavasan2130 4 жыл бұрын
@fullemptiness
@fullemptiness 4 жыл бұрын
In the Edit Distance problem you need to know the source and target strings before solving the problem, but in the text correction application, you haave no idea what is the target string!
@gkcs
@gkcs 4 жыл бұрын
Yes. A frequency based trie is useful in those cases.
@samarpanharit4268
@samarpanharit4268 6 жыл бұрын
I am unable to understand 😶😶
@gkcs
@gkcs 6 жыл бұрын
Which bit?
@samarpanharit4268
@samarpanharit4268 6 жыл бұрын
Got it now after watching it thrice😂 thanks for asking
@stalera
@stalera 4 жыл бұрын
using the approach described above for words "smitten" and "arbrikitten", the actual difference should be 5 but I get as 4. the issue I see is that i is repeated in the second string and that is causing the issue. I'm sure I'm missing something if you could help me understand. Thanks.
@gkcs
@gkcs 4 жыл бұрын
I tried the code: github.com/gkcs/Competitive-Programming/blob/master/src/main/java/main/java/videos/EditDistance.java The difference is 5. Where did you get 4?
@stalera
@stalera 4 жыл бұрын
@@gkcs my bad, shouldn't the difference be 6? arbrik 6 characters and remaining are same. If I use the LevenshteinDistance algorithm it gives 6. Am I missing something here? The issue is because arbrikitten has 2 'i' which is being considered both the times for 'i' in smitten. Shouldn't it be considered only once?
@stalera
@stalera 4 жыл бұрын
hey man, do you think I'm missing something here? it could be some silly mistake if you can help me figure out.
@gkcs
@gkcs 4 жыл бұрын
ideone.com/jjou1K
@stalera
@stalera 4 жыл бұрын
@@gkcs appreciate your efforts to respond to each of the comments man. that takes hell lot of patience. Hope you've not kept an assistant to do that ;)
@artihlec
@artihlec 5 жыл бұрын
young Addy Osmani
@axc4280
@axc4280 5 жыл бұрын
Far from Topic Can you tell me what is difference between C++ and Java when we learn it and from what kind of matter we have to be attentive...
@pallavisompalli2689
@pallavisompalli2689 6 жыл бұрын
Will you marry me? 😭
@thebibhuty
@thebibhuty 5 жыл бұрын
This channel is related to programming not bharat matrimony ;)
@karun4663
@karun4663 5 жыл бұрын
@@thebibhuty savage
@rajatpal1019
@rajatpal1019 5 жыл бұрын
lol :)
Ukkonen's algorithm for approximate string matching
21:55
Gaurav Sen
Рет қаралды 11 М.
A clash of kindness and indifference #shorts
00:17
Fabiosa Best Lifehacks
Рет қаралды 113 МЛН
What is an API and how do you design it? 🗒️✅
15:26
Gaurav Sen
Рет қаралды 718 М.
What is a MESSAGE QUEUE and Where is it used?
9:59
Gaurav Sen
Рет қаралды 955 М.
Minimum edit distance | Dynamic programming | Backtracking
28:52
The Egg Dropping Problem - Interview Question
17:25
Gaurav Sen
Рет қаралды 181 М.
Amazon Coding Interview Question - Recursive Staircase Problem
20:01
How Fuzzy Text Search Works
18:36
Big Python
Рет қаралды 13 М.
But what is a convolution?
23:01
3Blue1Brown
Рет қаралды 2,5 МЛН
What's an Event Driven System?
14:59
Gaurav Sen
Рет қаралды 310 М.
What is CONSISTENT HASHING and Where is it used?
10:50
Gaurav Sen
Рет қаралды 771 М.