TOP K FREQUENT WORDS (Leetcode)

TOP K FREQUENT WORDS (Leetcode) - Code & Whiteboard

Рет қаралды 6,054

Күн бұрын

A clean, short, and understandable to the Top K Frequent Words problem (*very* frequently asked by Amazon!!!)
Let me know down below how I can help you guys next! :)
leetcode.com/p...
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Let's connect on LinkedIn!
/ aleksmitrovic
Follow me on Medium!
/ mitrovic.aleksandar
Questions or suggestions for new videos? Email me!
babybear4812@gmail.com

Пікірлер: 43

@flslwl0414 3 жыл бұрын

Please correct me, but isn't the time complexity O(n log n), not O(n log k)?

@babybear-hq9yd 3 жыл бұрын

It would be O(n log k) because of step 3 where we're creaing the heap. Step 1 = O(1) --> Error checking Step 2 = O(n) --> creating a dictionary with n steps Step 3 = O(n log k) --> we have n items with a heap of size k; if it was a heap of size n, it would be O(n log n) Step 4 = O(k log k) --> ... I think! we're popping off k elements and rearranging the heap every time. Overall, O(n log k) is the most intensive operation!

@flslwl0414 3 жыл бұрын

@@babybear-hq9yd Thanks for the reply. I think the size of heap can grow up to N when the given list has distinct elements. Thus, I think the overall time complexity is O(n log n). Correct me if I am missing!

@babybear-hq9yd 3 жыл бұрын

@@flslwl0414 ah shit you're right, i reviewed it too quickly and missed the fact that it wasn't constrained to size K. I'll pin this comment so others see it :) thank you!

@tripsntreats2790 3 жыл бұрын

@@babybear-hq9yd @Leo Song - I think the time complexity will be O(klogn) because at max any time we will pop k elements from the heap of any size ..so for the heapify we can take the TC as O(logn) and for k pops it should be O(klogn)

@moorchini 3 жыл бұрын

I love you man, like a coder loves a dark background :D Thank you again.

@babybear-hq9yd 3 жыл бұрын

I love you too :D

@samuhvarta 3 жыл бұрын

Thank you for this video. Instead of pushing all N elements to max heap, push K elements to min heap and for every K+1 push, pop the one with lower frequency. Heap will only heapify K elements thereby reducing complexity to N(logK), no?

@babybear-hq9yd 3 жыл бұрын

Hmm I think you may be onto something. Just thinking out loud here -- if I keep a max heap, it will store the largest element at the top. However (and correct me if I'm wrong here) I'm not sure that I will have O(1) access to the _smallest_ element in a _max_ heap.. right? I believe the smallest elements would be like the leaves on a tree, but in no particular order (i.e. if I have a handful of leaf elements, there's no real relationship between them in terms of which is larger or smaller, as long as the parent is always larger.) Let me know what you think!

@samuhvarta 3 жыл бұрын

You are correct. The way I see it - Using a max heap of N items and popping k items will result in complexity of k(logN) since the heap will sort N elements on each pop and there are k pops - and the answer will be the array of popped elements in descending order. But having a min heap of size k, and only popping elements when the min heap grows beyond k will have N(logK) complexity since heap is only sorting k elements on N pops - and the answer would be what is left in the min heap. Depending on the difference in N and K one approach will be better than the other.

@babybear-hq9yd 3 жыл бұрын

@@samuhvarta Hmm interesting. I believe that a max heap of N items and popping k will still be NlogN though. Adding 1 item to the heap would be logN; doing it N times would be NlogN. If you have a working NlogK solution that passes all tests, would be interested to see :) Not sure how I'd implement it.

@paulonteri 2 жыл бұрын

N log K solution: ``` from typing import List import collections import heapq class Word: def __init__(self, word, count): self.word = word self.count = count def __gt__(self, other): if self.count == other.count: return self.word < other.word return self.count > other.count class Solution: def topKFrequent(self, words: List[str], k: int): word_count = collections.Counter(words) heap = [] for word, count in word_count.items(): heapq.heappush(heap, Word(word, count)) if len(heap) > k: heapq.heappop(heap) top_k_words = [] while heap: top_k_words.append(heapq.heappop(heap).word) top_k_words.reverse() return top_k_words ``` My backup of this and other solutions: github.com/paulonteri/data-structures-and-algorithms

@tripsntreats2790 3 жыл бұрын

Hey @babybear4812...this is by far the best solution that I have seen for this problem. I was trying to use the nlargest function of the heapq and that gives wrong answers lexographically. So this solution of yours was kinda neat. And I have a suggestion - do discuss the time and the space complexity of each question in the end as that kind of wrap things up and will be helpful for those viewers who are preparing for the interviews. And rest I feel a great video...keep up the good work man :)

@babybear-hq9yd 3 жыл бұрын

appreciate it! :) yeah i must've missed this one. I usually mention the asymptotic analyses in the video or description.. whoops!

@tabmax22 Жыл бұрын

Neat solution but would we be allowed to implement this in an interview? Just seems too dependent on built-in libraries doing the heavy-lifting for us. I've had an interviewer say to me once we're testing your implementation of algorithms not built-in libraries

@babybear-hq9yd Жыл бұрын

it's something i talk about a lot, and the answer is it depends. I would (and do) mention that you need to identify the pros & cons of your solution. Mention that in Python (and other languages) some of these data structures are pre-built, and likely optimized, so for time's sake you could rely on them. However, I would still offer to build it out from scratch for the purpose of the interview if that's what they'd like to see. Interviewers just want to see that you're thinking :)

@konark709 3 жыл бұрын

great video aleks!

@meetshah477 2 жыл бұрын

bro that one of the easy solution that I have seen.

@babybear-hq9yd 2 жыл бұрын

love to hear that my man!!

@tonynguyen6124 2 жыл бұрын

clean, well explained, and concise. thanks!

@babybear-hq9yd 2 жыл бұрын

thx buddy

@irisi3308 3 жыл бұрын

The best explanation ever! Also, you're too cute!

@babybear-hq9yd 3 жыл бұрын

Ahaha glad you likd it 🙇‍♂️

@rohitpunetha9506 Жыл бұрын

thanks man

@tl8035 3 жыл бұрын

Love your clear solution! However, due to line 20, isn't the solution O(k log n) worse case? How would you solve it in O(n log k) time like the problem asked?

@babybear-hq9yd 3 жыл бұрын

I agree with the statement that line 20 would make it O(k log n) because we're popping the top item of the heap (O(1)), then rearranging the heap (O(log n)) a total of k times, thus O(k log n). If I'm honest with you, I don't currently see any way to get the O(k log n). However, every solution I've seen (including both in the 'Discuss' section and the LC solution itself) that claims to be O(n log k) is actually O(k log n). Unless there's something blatantly obvious that I'm missing, I think that follow-up is misleading... though I wouldn't bet my life savings on this.

@amitupadhyay6511 2 жыл бұрын

We can use bucket sort and with that, the best I seems to get is O(nklog(k)) Can someone tell if my time complexity is right ( in below code)? class Solution: def topKFrequent(self, words: List[str], k: int) -> List[str]: import heapq dic={} #------------------------>O(1) for i in words: dic[i]=dic.get(i,0)+1 #------------------->(n) z=[[] for i in range(len(words)+1)] for i , j in dic.items(): z[j].append(i) #---------------------------->O(n) ans=[] for i in range(len(z)-1,-1,-1): #-------------------O(n) if len(z[i])==0: continue else: heapq.heapify(z[i])#-----------------------(O(k)) while k and z[i]: #---------------------klogk ans.append(heapq.heappop(z[i])) k-=1 return (ans) # time complexity= (O(nklogk))

@johnsu5772 Жыл бұрын

nice one!!!

@babybear-hq9yd Жыл бұрын

thx bud!

@VadimFilin 2 жыл бұрын

Please create heap manually for C and Go programmers.

@babybear-hq9yd 2 жыл бұрын

Doubt it’ll be up on this channel mate, creating heaps manually is something I’ve seen covered extensively already

@samw2066 3 жыл бұрын

Where does the code Sort the words with the same frequency by their lexicographical order ? it does not appear to, even though leetcode accepted it.

@babybear-hq9yd 3 жыл бұрын

it's sorted by frequency, then alphabetically when the array is heapified (i.e. it becomes a heap)

@parthvaghela7820 2 жыл бұрын

Lol thanks for sharing which company asked it.

@babybear-hq9yd 2 жыл бұрын

no worries!

@fgtdjkg Жыл бұрын

wawaweewa, it's a very nice!

@babybear-hq9yd Жыл бұрын

lmaoooo

@utkarshkumar5349 2 жыл бұрын

Is heap avaliable in all languages or getting inbuilt heaps specific to python?

@babybear-hq9yd 2 жыл бұрын

Other languages have it as well. I believe Java does, not sure about C++. At the end of the day, in your interviews you can either offer to: A) build a heap, and then use it, or B) ask the interviewer if they'd like you to assume taht you already have one available. My guess is they're more likely to go for option B as it'll allow you to focus on the essence of the problem. Asking the question, though, shows the level of thoughtfulness they're looking for :)

@taram1489 3 жыл бұрын

How about returning the top frequent in their alphabetical order, your solution doesn't do that.. That's what the problem is all about

@babybear-hq9yd 3 жыл бұрын

We actually do take care of that -- otherwise the solution wouldn't have passed. When adding to the heap, we're adding a tuple: (-frequency, word). By default, the heapq library creates a min heap for us. This is why we negate the frequency -- the most negative one will be at the top (i.e. the one that appeared most frequently). Subsequent to that, if we have a collision in frequency values, we compare the word alphabetically. Since it's a min heap, the word with a lower value will go above, thus being returned first in the final line. The words are compared letter-by-letter on (I believe) ASCII value, or some equivalent.

@tripsntreats2790 3 жыл бұрын

@@babybear-hq9yd - I have a question here - why did you add tuple as the data structure to the heap and not the list?

@babybear-hq9yd 3 жыл бұрын

@@tripsntreats2790 list should work just the same! no particular reason