LeetCode 196: Delete Duplicate Emails [SQL]

Рет қаралды 19,629

Күн бұрын

Solution and walkthrough of leetcode database problem 196: Delete Duplicate Emails. I'm using MySQL but this solution should work in any SQL dialect such as PostgreSQL SQL Server, etc.
Link to the problem: leetcode.com/p...

Пікірлер

@frederikmuller 2 жыл бұрын

If you want to have access to more free SQL problems, check out StrataScratch: stratascratch.com/?via=frederik

@tongkezhao2188 3 жыл бұрын

I like the way that you always give interview tips relate to each question at the end of a video. Good job!

@sheetalm5959 3 жыл бұрын

Thank you for the video, I am new to SQL coding. Can you please explain how did you get the 'Intuition' behind using self join here? Thanks.

@frederikmuller 3 жыл бұрын

Sure, the intuition behind using self join is that we compare the table to another copy of itself to see if we find any matching email addresses with higher ID values than the current row we're looking at (these will be the duplicates). We're then deleting these rows from our data base by using the DELETE statement since by the problem definition we only want to keep the entry with the lowest ID value. Hope that makes sense!

@sheetalm5959 3 жыл бұрын

@@frederikmuller Got it. Thank you🙂

@shuchivardan375 2 жыл бұрын

can we use row_number and partition by email function?

@frederikmuller 2 жыл бұрын

actually yes, and then just delete everything with row_number > 1 if you order by Id in ascending order.

@frederikmuller 2 жыл бұрын

good idea!

@andrew5291 4 жыл бұрын

Thank you! You are the best! Very helpful : )

@christophercastro6078 Жыл бұрын

I am assuming this wouldn't work if the table had more than 1 duplicate email. Only works if there is at most 1 duplicate email. The statement in the problem descriptions states: Write a solution to delete ALL duplicate emails... Which could be interpreted two ways.

@christophercastro6078 Жыл бұрын

Never mind, this works with several identical emails since there is only one id that is the smallest from the bunch!

@frederikmuller Жыл бұрын

Thanks for bringing this up, this will help people.

@utsavsagar6665 11 ай бұрын

Thank you for the help but still I am unable to understand what is wrong with this syntax.

@pranitmhatre8924 3 жыл бұрын

I got exactly the same problem in my coding Interview.

@utsavsagar6665 11 ай бұрын

Please let me know where i am wrong in this in syntax 🙏 delete from person where email in(select email from person group by email order by id having count(email) >1);

@shriyamjalan247 3 жыл бұрын

Hi Frederik, I was trying to write a query by selecting the duplicate emails and then deleting it. Can you help me explain what I am doing wrong ? Also, if it is possible to write a query in this way. I selected the duplicate emails in following way: select email from Person group by email having count(Email)>1 I put this in where clause and then tried deleting it.

@inzamamsafi5261 2 жыл бұрын

you need a minimum id while grouping. so take that into consideration as well

@mickyman753 2 жыл бұрын

you can do this select temp.email from (select email from Person group by email order by id) as temp having count(temp.Email)> 1;

@satyamgavane9345 Жыл бұрын

what mistake i have done in my code please can you tell me with cte as(select *,row_number() over (partition by email order by id) as r from Person) delete from cte where r>1;

@kwoktungwong9486 2 жыл бұрын

Hi, why do we delete p2 here not p1?

@frederikmuller 2 жыл бұрын

Because we’re supposed to keep the email entry with the lowest id. Since we have a filter on p1.Id < p2.Id, p2 will always be the email duplicate entry with the higher id. There can be more than one duplicate but this query deletes every p2 that can be joined based on the conditions: same email and bigger id as another entry (our p1). Hope that makes sense.

@uddipangoswami8427 2 жыл бұрын

Why this is not working in Oracle?

@ahmedboutaraa8771 4 жыл бұрын

thank you

@stellaueda4565 4 жыл бұрын

Thx again

@paper-studio Жыл бұрын

this code not working, not even the discussion code working, there is some issue I guess

@joshuaperez2271 Жыл бұрын

Thanks for explaining this concept!

@frederikmuller Жыл бұрын

Glad it was helpful!

@ekdumfarzi8039 2 жыл бұрын

delete p2 from... why did u use p2, we dont write delete statement like that...where can i learn more about that way of writing..

@frederikmuller 2 жыл бұрын

I just replied this to another comment asking the same question: Because we’re supposed to keep the email entry with the lowest id. Since we have a filter on p1.Id < p2.Id, p2 will always be the email duplicate entry with the higher id. There can be more than one duplicate but this query deletes every p2 that can be joined based on the conditions: same email and bigger id as another entry (our p1). Hope that makes sense. The DELETE statement works just like the SELECT statement but instead it deletes rows instead of selecting them for output. It doesn’t show up often in interviews so I don’t think you need to read up more on it. The trick with this question is to find what you want to delete by using a JOIN which makes the question tricky and non-standard.

@vasiliyosipov43 2 жыл бұрын

I was dissapointed about that too, because we can't write anything after delete except from. But I guess it's the MySql special option. Didn't work for PostgreSQL btw

@newtc93 3 ай бұрын

ure amazing

@frederikmuller 3 ай бұрын

no u

@sungwonjudycho 2 жыл бұрын

Hi Fredrik! Thank you so much for your videos; it's been really helpful while trying to study SQL by myself. I had some questions about your explanation and would be great if I could get some pointers on what I'm missing here. Q1. why "WHERE" is not accepted? [My Attempt] DELETE p2.id FROM Person p1 JOIN Person p2 ON p1.email = p2.email WHERE p1.id < p2.id Q2.I'm not sure I understand how the [AND] clause could be used on this Join table. Shouldn't ON only accept when the two columns are identical? [ACCEPTED ANSWER] DELETE p2.id FROM Person p1 JOIN Person p2 ON p1.email = p2.email AND p1.id < p2.id