Group Anagrams - Categorize Strings by Count

Group Anagrams - Categorize Strings by Count - Leetcode 49

Рет қаралды 502,287

Күн бұрын

Пікірлер: 406

@TheFazilaashraf 2 жыл бұрын

For those who don't know what is defaultdict(list). It means "Defining a dictionary with values as a list ". Also it will give you error so dont forget to use this in your code. from collections import defaultdict

@patrickwoolard4340 2 жыл бұрын

I was wondering what the "defaultdict(list)" did, thanks!

@ikthedarchowdhury8947 2 жыл бұрын

Thanks so much for the insight!

@ikthedarchowdhury8947 2 жыл бұрын

Also, we need to define the hashmap as res= collections.defaultdict(list)

@brhh 2 жыл бұрын

@@ikthedarchowdhury8947 I think it was clear that importing defaultdict from collections did the job, no need for collections.defaultdict

@jpkeys6000 Жыл бұрын

Thank you!

@bosteador 2 жыл бұрын

I went with a O(m*n*log(n)) solution. This one seems super advanced to come up with on the spot, but I would have never thought of sorting words and key them in that way. Both solutions are super smart.

@falcomomo Жыл бұрын

Same here, I got the O(m*n*log(n)) solution first fairly easily then failed the tests which require better time complexity. From when I last was grinding interview prep about 6 years ago I think I would've got the better solution, but right not I feel the same - no chance to think of on the spot in an interview. However, past experience tells me that it'll come back with practice. 10mo on from your comment, how you feeling?

@gradientO Жыл бұрын

@@falcomomo I get 6ms runtime for O(mnlogn) Java solution, but 22ms for O(mn). idk how

@gopalakrishnan9610 Жыл бұрын

@@gradientO Probably because there is a lot of overhead in the solution above. Initialising dictionary, converting list to tuples etc. Also, the MNLogN solution works. If you see in Leetcode the length of string is pretty less. It is

@dipanshi1588 Жыл бұрын

@@falcomomocan you tell me your strategy how you have done in m*nlogn

@tusharsnx 10 ай бұрын

Reason why O(M * N * logN) is better in this case: It's wild but in this case O(M * N * logN) is better than O(M * N * 26). Because in order for logN to be 26, you need N to be 10^26 which is practically impossible. That's why I consider logN to be constant because it literally is (in the programming context).

@theornament 8 ай бұрын

Just did this problem with O(m . nlog(n)) yay, just something to note: - The time complexity between sorting and using a map for character frequency is very similar (if not equal) depending on the average length of the strings. The problem in LeetCode specifies that the string lengths vary from 0 to 100, making it so the sorting time isn't as impactful on O. (Tested multiple times both solutions runtime as well) - The algorithm will take more space on the solution of frequency map than the sorting solution if the strings do not have much characters (depending on the sorting algorithm you use as well) as it will take O(n) space complexity for the algorithm to sort the string. This is because the strings may be way less longer than 26 characters, which is the size of the frequency map, so it's a thing to take into consideration for sure. So, in other words, if you see that the string length is specified as having the possibility of being very large then the frequency map is the way to go, otherwise the sorting algorithm is the best solution. Pretty sure you won't read this Neet but I just had a meeting with you and Pooja Dutt, both of you are awesome.

@AbhishekMishraiitkgp Жыл бұрын

If you don't want to use defaultdict() from collections, you may do this: def groupAnagrams(strs): res = {} for s in strs: count = [0] * 26 # a .... z for c in s: count[ord(c) - ord("a")] += 1 key = tuple(count) if key in res: res[key].append(s) else: res[key] = [s] return list(res.values())

@Mrorange016 7 ай бұрын

Also, you could use: res[key] = res.get(key, []) + [s] Instead of if/else

@bykhalifa1654 3 ай бұрын

@@Mrorange016That’s exactly what I was thinking. I am glad someone thought of this too lol

@QuadQuadQuadQuad 3 жыл бұрын

Always love the direct and no nonsense explanations! Great job!

@NeetCode 3 жыл бұрын

Thanks! Much appreciated

@ShivKatira 8 ай бұрын

@@NeetCode Why do you avoid using Counter()?

@etchris 2 жыл бұрын

Wow! Very different than my soultion. In my head I went about it by grabbing the word at strs[0], check to see if every char in that word exitists in any of the other words, and if so append all matching words to a group. After all possible words have been found we remove them from the strs list, so the element at strs[0] is now a new word that is not any of the anagrams of the orginal strs[0].

@nicholas_as 2 жыл бұрын

is the complexity o(n)^2 ?

@Sandeep-zu7gd Жыл бұрын

@@nicholas_as i think that'd be O(n^2 * k) where k is the length of the largest string

@geld5220 Жыл бұрын

ooh that make sense. So did you find that checking the length of the string should also be there - Because `eat` can be in `eaten` but they are not anagrams. ?

@harperbye Жыл бұрын

i did this also but i got "time limit exceeded"

@symbol767 2 жыл бұрын

This one was really difficult for me for some reason.. I need to come back and solve this again without looking at the solution. Thanks man, liked and comment for support.

@stefano_schmidt Жыл бұрын

it's been a year. Have u solved it?

@MiguelRodriguez-ie1hh 10 ай бұрын

Its been 6 months since @stefano_schmidt asked if you solved it. Have you solved it?

@StudyTime-u8g 10 ай бұрын

It's been 7 months since @stefano_schmidt asked if u solved it. Have u solved it?

@mmkamron 8 ай бұрын

It's been 8 months since @stefano_schmidt asked if you solved it. Have you solved it?

@sandeepmallina7 8 ай бұрын

It's been 10 days since @mmkamron asked if you solved it. Have you solved it?

@coding10yearold 2 жыл бұрын

first, thank you @NeetCode for such great videos so I am doing these problems on the Jupyter Notebook. Here are the minor changes to get the code to work: 1. add "from collections import defaultdict" 2. lowercase the word "List" def groupAnagrams(self, strs: list[str]) -> list[list[str]]: 3. change "return res.values()" to "list(res.values)"

@sf-spark129 Жыл бұрын

for #2, you need to import typing. from typing import List

@vaiterius Жыл бұрын

Thank you! Never would have thought of making the keys the ASCII character count, that is pretty clever

@sahildhawan22 2 жыл бұрын

For anyone wanting to write exact same solution in Javascript, here it is: var groupAnagrams = function (strs) { let res = {}; //mapping charCount to list of anagrams i.e "1e,1a,1t": ["eat", "tea", "ate"] for(let s of strs) { //To count each char: let count = new Array(26).fill(0); //for a....z //We'll index a to 0 till z to idx 25 for(let c of s) { count[c.charCodeAt(0) - "a".charCodeAt(0)] += 1; } let commaSeparatedCount = count.join(","); if(commaSeparatedCount in res) { res[commaSeparatedCount].push(s) } else { res[commaSeparatedCount] = [s]; } //console.log("res: ", res); //console.log("Object.values(res): ", Object.values(res)) } return Object.values(res); } Basically, I don't know Python, so I typed out line by line Neet's python solution to understand the output of each of the lines to reproduce this solution in Javascript. Thank you Neetcode!

@beyondlimits8159 2 жыл бұрын

do we know space complexity?

@sahildhawan22 2 жыл бұрын

@@beyondlimits8159 It should be O(26) because in worst case we can have combination against every letter.

@beyondlimits8159 2 жыл бұрын

@@sahildhawan22 But in worst case, we have unique words. And since our object maps our array of unique letters to a unique word, shouldnt the space be O(n), where n is length of our input array?

@amulyamakes Жыл бұрын

the join is so smart!

@DolaLado Жыл бұрын

thank you.

@imang3719 Жыл бұрын

Thank you @NeetCode for all the great videos! I just want to point out that the optimal approach is great for a large n but in case of this particular problem because n is limited to max of 100 (0

@endian675 Жыл бұрын

Agreed. A sorting-based approach is consistently a little bit faster in my testing.

@bossgd100 Жыл бұрын

👍

@MykolaPavluchynskyi Жыл бұрын

Also sorting based variant is more universal, because not limited by using only 26 characters. Adding uppercased letters won't double time complexity for example.

@rishabhjain4546 3 жыл бұрын

Thanks. Great work! Here is a solution without using defualtdict . I also used pretty print here to see the result . import pprint strs = ["eat" ,"tea","tan","ate","nat","bat"] result = {} for i in strs: key = [0] * 26 for c in i: key[ord(c) - ord('a')] += 1 if tuple(key) in result: result[tuple(key)].append(i) else: result[tuple(key)] = [i] pprint.pprint(result)

@roywastaken 2 жыл бұрын

I like this more! Good job.

@TheStrafendestroy 2 жыл бұрын

without importing the module you can use list(result.values())

@anshdholakia714 2 жыл бұрын

you can also use result[tuple(key)=result.get(tuple[key],[]) + [i]

@AbhishekKumar-cx1mo 2 жыл бұрын

@neetcode I think the O(m.n.logn) solution will always be optimum as log(n) will always be smaller than 26 in all of the cases

@efthymiosn3381 2 жыл бұрын

indeed lol

@qbertrtrtg 2 жыл бұрын

the average length of a string would have to be > 1.0e26. then neetcode's solution would be better. log(n) = 26 , n would need to be massive.. Neetcode is overlooking somethings here.

@gaureesha9840 2 жыл бұрын

Yes, correct, as max string length is 100 under constraints section

@ScaramangaG Жыл бұрын

On leetcode they are about the same for python3 with the sorting solution being a little bit better on the memory. sorting version - 104ms, 17.9mb counts array version - 104ms, 19.8mb.

@Kokurorokuko Жыл бұрын

@@qbertrtrtg 2^26, not 1.0e26, actually.

@nguyen-dev 2 жыл бұрын

According to the question, each string has only 100 chars => log(100) < 7 => O(mnlogn) < (7mn). Therefore, the first solution is much faster than the second solution, which is O(26mn). O(26mn) is better than O(m nlogn) only when the average string length is 67.108.864 chars.

@nguyen-dev 2 жыл бұрын

If max string length is very large and more than 2^26, here is my solution to combine both solutions: class Solution { fun groupAnagrams(strs: Array): List { // TO(m(26a + blogb)) / SO(mn) // m: strs.length // a: average length of elements which has 2^26 chars or more // b: average length of elements which has less than 2^26 chars // n: average length of all elements val hm = mutableMapOf() val switcherLength = 1 shl 26 // 26 could be adapted according to the # of accepted chars for (str in strs) { // O(m) val key = if (str.length < switcherLength) { // replace joinToString("") with concatToString() for better performance if using Kotlin 1.4 or newer str.toCharArray().apply { sort() }.joinToString("") // O(blog(b)) } else { val counts = mutableMapOf() for (char in str) { // O(26a) counts[char] = 1 + (counts[char] ?: 0) } counts } hm[key] = (hm[key] ?: mutableListOf()) + str } return hm.values.toList() } }

@lgami Жыл бұрын

First of all, thanks for your work. I coded first and second solutions in python3, solution 1 (using sorted strings) seems to perform better in speed use and in memory use on leetcode test cases.

@sola2943 6 ай бұрын

Just did this and you're right. According to leetcode, the ultimate solution of this video beats 27.88% solutions based on runtime and 5.36% solutions based on memory whereas the a sorted solution beats 87.87% solutions based on runtime and 84.54% of solutions based on memory. It varies but the second solution is never lower than in performance or space complexity (edit: just ran the non-sorted code again and it beat the an iteration of the sorted-code. I just read a reddit comment that says LC's % beat is basically a random number generator and I think that might be accurate). I'm still new at this but I suspect leetcode isn't very good at reporting how well a code actually performs.

@kushwanthaddepalli5236 4 ай бұрын

can i get the solution for this question using sorted strings

@kcprxkln 2 ай бұрын

@@kushwanthaddepalli5236 here you are anagrams = defaultdict(list) for word in strs: sorted_word = tuple(sorted(word)) anagrams[sorted_word].append(word) return list(anagrams.values())

@hassansuhaib9087 Жыл бұрын

I went with a different approach where I just check whether sorted(string) is in the dict, if not then just store an array against that sorted string in the dict

@Mustafa-099 Жыл бұрын

Such a clever solution, I wonder if I will get to this level of problem solving to come up with something similar on my own ._.

@kivanx Жыл бұрын

did you get

@Mustafa-099 Жыл бұрын

@@kivanx yes, but back then, I have been out of touch for DSA for quite a long time lol

@kivanx Жыл бұрын

@@Mustafa-099 why

@Mustafa-099 Жыл бұрын

@@kivanx I got caught up in full time job and have been prepping to go for grad school rn

@kivanx Жыл бұрын

@@Mustafa-099 same time? it must be hard af

@tomdarank1272 3 ай бұрын

For practical scenarios, I found that it's faster to sort each word. I actually got a better result on leetcode by sorting the words and then checking if they're in a dictionary.

@alexanderp7521 Ай бұрын

wow, your roadmap helps a lot - i've managed to solve this issue from the first try. thanks!

@shonsanchez6403 3 жыл бұрын

Great video! I want to point out that the group anagrams problem limits the string length to size 100 so the m*n*logn solution will be faster for all cases as the worst case would be for string size 100. m*100*log(100) -> m*100*2 < m*100*26.

@shan504 3 жыл бұрын

Thanks for the great point. Mind helping me? I don't get why 26 is 'technically' part of the complexity. Sure count has a length of 26, but inserting it into the dict is O(1), isn't it?

@shonsanchez6403 3 жыл бұрын

Hi @@shan504 , The best way I can explain is when inserting into a dictionary the dictionary has to do a look up to check if the key already exists in the dictionary. The look up portion of the insert is why we have 26 as part of our time complexity. Normally the key to a dictionary would be a primitive type (int, byte, short, long, float, double, boolean, and char) and during these cases a look up would take O(1). But as shown in the video the solution is using a list as it's key instead of a primitive type. Now when the dictionary does a look up it has to check each integer within the list(26 of them as opposed to 1) to figure out what the key needs to be. I hope this helps clear things up.

@shan504 3 жыл бұрын

@@shonsanchez6403 Hi Shon. Thank you! I was thinking that was the case, though I think in general, at least for an interview, I would just say the look up is essentially O(1). Thanks again :)

@TCErnesto 2 жыл бұрын

@@shonsanchez6403 the runtime is not O(26mn) but O(m(n + 26)). The O(26) comes before and after counting the characters, not for each character, so technically: 200m > 126m but this is incomplete. Sorting takes O(n log n) = 200 + convert the string to a tuple in O(n) = 100 + hash the tuple in O(n) = 100 = 400 The hash map approach takes: create an empty array of size 26 in O(26) + count each character in O(n) = 100 + convert the array to a tuple in O(26) + hash the tuple in O(26) = 178 400m > 178m so the hash map approach does less operations. All of this in the worst case.

@denysivanov3364 Жыл бұрын

@@TCErnesto but you can convert this hashmap into sorted string directly and vise versa. Hashmap can't be less then n log n this way.. If its faster this way we would have sorted using hashmap and converted it back. All O(N)"savings" here are from not counting operations using hashmap..

@manideeprepala6050 8 ай бұрын

dic = defaultdict(list) for s in strs: dic[tuple(sorted(s))].append(s) return dic.values() # this is one way of using built in function but what anna told is applies for all langs !

@slayerzerg Жыл бұрын

more efficient time space way would be to sort each str ele in strs and put the original form in a hashmap with hashmap values as lists of similar anagrams (ie map = { 'aet':['tea','ate'], 'abt':['bat','tab'], ... } then you just output the hashmaps values

@_thelegendaryduck_8523 Жыл бұрын

he mentioned this in the begining for the runtime within leetcode it definitely is faster but for over all big o notation has it as (m*nlogn) so if you have massive strings hundreds of thousands of characters his solution is much faster

@gopalakrishnan9610 Жыл бұрын

I did the same thing, The problem had a constraint where len(string)

@dimakavetskyy2082 2 жыл бұрын

awesome but I wish you went through one example step by step. i bet it's simple but it takes me an hour to fully understand the problem :(

@mlevvy96 7 ай бұрын

Wow, wouldn't have think of ASCII solution, I think sorted one is a bit simpler to understand and come up with and also for this particular LeetCode tests performance looks the same, but maybe for very long strings your proposed solution would pay back. Thanks for sharing!

@zZ-yt1yx 2 жыл бұрын

The type of res.values() is dict_values. You might need list(res.values()) to avoid type error if you encounter one.

@olivertang901 2 жыл бұрын

thanks bro

@josealvarezpulido Жыл бұрын

I was able to come up with a solution before looking it up :) not optimal but I was glad I was able to solve it. Thanks in part to your videos. def groupAnag(list): result = {} for item in list: itemResult = [] for letter in item: itemResult.append(letter) itemResult.sort() itemResult = ''.join(itemResult) if itemResult in result: result[itemResult].append(item) else: result[itemResult] = [item] my_list = [i for i in result.values()] return my_list

@martimconstantino2465 2 жыл бұрын

If we sort each string, the time complexity would be m*n*log(n). In the presented solution, it is m*n*26, which is represented by m*n. However, according to the constraints on leetcode, n

@khanhchung5207 2 жыл бұрын

7 and 26 doesn't matter for a computer can calculate 10^9 computations in a second. Basically, both provides similar complexity.

@pinakadhara7650 Жыл бұрын

You shouldn't stick so much to the constraints.

@HelplessFangirl Жыл бұрын

If you're like me and doing this cause you couldn't figure it out on your own, I usually take notes on my code in the comments so I understand the logic when I come back to it and to solidify it in my head. hope this helps!

@bigkurz 2 ай бұрын

Bro. I've been an engineer for like five years now, and you are so much better than me at this. I've decided to become really good. I want to be able to do this shit. I'm not going to burn myself out, but I'm going to train neetcode 150 until i can do all 150 problems without looking up solutions. Might take me a whole year or more, but i dont care. I'm going to be able to solve these. I know for a fact it will make me a better problem solver and engineer.

@souravbanerjee6457 2 жыл бұрын

Maintain a dictionary and a list, dict pairings (sorted_string: length of list), check if sorted string is already in dict if not just append it to list in list[list[]] type, if found take the length of list from dict as it is the position of sorted string and append it to the list inside the parent list, fastest method so far.

@jeffnguyen91 3 жыл бұрын

Thanks NeetCode! Great github link

@John7777tjk 3 жыл бұрын

As a developer with 5+ years of experience, I usually work a lot with comparing asc characters that's why it makes sense for the companies to throw this to the interview...Precious coding interviews

@swaroopacharjee2767 2 жыл бұрын

I have made a hashmap where I have put the sorted version of the string. For every element in the list, I check it it’s sorted version exists in the hashmap or not, it it exists, then I append it to the list to the respective sorted string, otherwise create a new one. Once done, I create a list of list of the values present in the dictionary. I have given my implementation in python below. Thank you for your explanation. :) :) class Solution: def stringSorting(self, inp: str) -> str: inp = sorted(inp.lower()) return "".join(inp) def groupAnagrams(self, strs: List[str]) -> List[List[str]]: pairs = {self.stringSorting(strs[0]):[strs[0]]} result = [] for i in range(1, len(strs)): tmp = self.stringSorting(strs[i]) if tmp in pairs: pairs[tmp].append(strs[i]) else: pairs[tmp] = [strs[i]] return [val for key, val in pairs.items()]

@coding10yearold 2 жыл бұрын

if you want the LeetCode's exact output (if order mattered) then the following code does it: from collections import defaultdict class Solution: def groupAnagrams(self, strs: list[str]) -> list[list[str]]: res = defaultdict(list) strs.sort() for s in strs: count = [0] * 26 # a ... z for c in s: count[ord(c) - ord("a")] += 1 res[tuple(count)].append(s) return sorted(list(res.values()),key=len)

@alexistm Жыл бұрын

This one worked! , the one on the video did not, it groups words that are not anagrams

@jyotirmoy-paul 2 жыл бұрын

It is given that n can be max of 100, and log(100) is 2 making the O(n·m·log(n)) solution faster than O(n·m·26). Let me know if I am thinking correctly.

@SuubUWU 2 жыл бұрын

Well technically speaking, when we use log, it's a log base 2. Your solution of 2 only works if you're using the default log base of 10. Either way, we get 6 and change when using log base 2, which is faster than a string input of 26. This solution of 26 is ONLY better if we're using strings roughly larger than: 100,000,000 log_10(100) = 2 [Log base 10 of (100) equals 2) log_2(100) = 6.644 [log base 2 of (100) equals 6.644)

@dorondavid4698 3 жыл бұрын

Isn't the overall time complexity for the sorting approach even longer than M*NlogN because you also have to do a string compare which is check N characters to N characters? Or is that factored into the M portion of the time?

@TCErnesto 2 жыл бұрын

if by string compare you mean converting the string to a hash map key, this is not true as the keys are hashed, there's a hash number associated to each string so the hash map doesn't compare strings

@X_platform 7 ай бұрын

Since big O is an expression of worst case compute complexity, @7:57 n is the 'longest' string. (instead of the 'average' length)

@Vastasoceans7532 10 ай бұрын

Here's a slightly shorter version of the same thing : class Solution: def groupAnagrams(self, strs: List[str]) -> List[List[str]]: res = defaultdict(list) for s in strs: res[frozenset(Counter(s).items())].append(s) return res.values()

@Codershaka 8 ай бұрын

Thanks neetcode lovely solution.I was going the opposite way using strings as keys and overcomplicated things thanks.

@koyorain4899 2 жыл бұрын

Even if the complexity of insert the "count" into the dictionary is o(n), isn't it still be O(m * n)? as the inserting happens after the inner loop. I do not get the O( m *n*26) part. I thought it is O(m * (n + 26))

@TCErnesto 2 жыл бұрын

inserting into a hash map is always O(1), but yeah the total runtime is O(m(n + 26)) = O(mn)

@shonsanchez6403 2 жыл бұрын

The reason why it’s O(m*n*26) is because of the lookup time for a dictionary. Because the key is no longer a primitive (int, char, float etc) but instead an array of characters the look up time is no longer 0(1) but O(size of the array I.e 26) because the map now has to compare each element of the key to check for its equality.

@shonsanchez6403 2 жыл бұрын

@@TCErnesto inserting into a hash map is not always O(1) that is only the case with a primitive key (ints, bool, char) but not with keys that are complex data structures like arrays and objects because of the hashing algorithm which will now take more than one element into account to create a hash key. In addition the dictionary has to compare the key with input to prevent a hashing collision and overriding existing data.

@TCErnesto 2 жыл бұрын

@@shonsanchez6403 yeah hashing an array will take O(x) where x is the size of the array. In this case hashing the tuple will take O(26), but this operation takes place after counting the characters which takes O(n). The tuple is not being hashed every time a character is counted, that would be the time you mention, O(26n). But first the chars are counted, and then only after that, the tuple is hashed, therefore the runtime is O(n + 26).

@shonsanchez6403 2 жыл бұрын

@@TCErnesto I believe you’re right about the complexity. Must have been thinking of the algorithm different in my head 😅 but looking at the actual code cleared things up. Good discussion.

@malymohsemahmed7032 3 ай бұрын

res = defaultdict(list) creates a defaultdict object named 'res' where each key will have a default value of an empty list. This is particularly useful when you want to append items to lists within a dictionary without having to initialize the list manually for each new key.

@PabitraPadhy 3 ай бұрын

Not sure how keeping the array as a key works. I just did an ansi sum of each character for the string as there could be a max of 100 characters in the string as per problem constraint, lowercase z being 122 * 100 = 12200 within the range of an int. so my map was int, vector still O(m*n) in time and O(n) in space in worst case.

@rahmansahinler1 Жыл бұрын

Maybe we can get rid of mapping strings with integer list and use sorted keys. Something like below worked for me: ans = defaultdict(list) for str in strs: sorted_str = sorted(str) ans[tuple(sorted_str)].append(str) return ans.values()

@fullnelsoncodes 9 ай бұрын

I think that this is the best solution

@onyandoonyando187 2 жыл бұрын

Instead of counting characters at all the 26 indices and checking for similar counts , I used sorted() with O(nlogn) time and went directly to appending.

@FilthySnob Жыл бұрын

Thats what i did before this vid as well but Thats still m.nlogn isnt it

@rohitsai6993 2 жыл бұрын

What is the space complexity of this solution? Will it be constant because we're using a single array of count 26 and we can reuse that array for every word? Or is it O(m*n) where m is the number of strings and n is the average length of the string?

@classified022 2 жыл бұрын

If every word is unique you need to store a count of that word in the dict, so it would O(26 * n) or O(n) where n is the number of strings. In this case the average length of the word does not matter since we compress that info down to 26 ints. Note if we clone the words its O(n * m) though if you had very long words you would likely store pointers to the words rather than duplicate the words

@jashwanthgottipati9262 7 ай бұрын

Hey Navdeep, just a small suggestion. In fact this could be a feature I think that would be useful on neetcode. Instead of giving a unsolved problem when you click shuffle, can you add a feature call revise which gives us a random solved problem so that we can re-do a solved problem?

@RohithMusic Жыл бұрын

You automatically assumed that sorting the alphabets of each string is O(nlog(n)). This only applies for a comparison based sort. However, you can sort using counting sort which will end up taking O(26) + O(n) time (or O(n)) plus O(n) additional space. If we serialize the output of counting sort as character followed by count, instead of constructing the whole sorted string, we will end up with exactly the same key as you got. A hashmap can then be used as usual to finish the problem.

@Reza1984_ Жыл бұрын

I initially thought about it and discarded it, but it actually beats the array or hashmap as a key solutions, nice!

@parkourbee2 9 ай бұрын

TIL about counting sort, wow.

@deepakthakur8781 2 жыл бұрын

I did this in O(n) time by creating my own hash function where elements should be same but their order does not matter (still failing some testcases probably due to collision) but it was so much tiring and this simple solution is so much better.

@gopalakrishnan9610 Жыл бұрын

Interesting, can you share your hash function. How did you manage to ensure no collisions?

@deepakthakur8781 Жыл бұрын

@@gopalakrishnan9610 I dont remember now exactly, but I used the ascii codes of alphabets with their position in word as a multiplier, still dont remember the whole hash function,

@haezsimpson3799 10 ай бұрын

I've firstly solved this problem with binary search by answer. Because we know how to check if two strings are anagrams, so we can use is_anagram(a: str, b: str) -> bool function in our binary search for the input array. But the time complexity is bad for this solution.. :)

@saisundar6020 2 жыл бұрын

class Solution: def groupAnagrams(self, strs: List[str]) -> List[List[str]]: sorted_map = defaultdict(list) for item in strs: sorted_map[tuple(sorted(item))].append(item) return sorted_map.values() I thought of this way

@faisalabdulkadir9439 2 жыл бұрын

Pls can u explain the function of the default dict to me ??

@icedmosha6375 2 жыл бұрын

Amazing solution, learnt a new pattern and new method today! thanks

@StoriesByDrew Жыл бұрын

I originally tried to solve this by creating a hash using the char codes of each char in a string and using the hash as the key to group the values. I had a ton of collisions as my hashing function was crappy. Was wondering if you'd revisit this and solve with hashing?

@olamilekanadebowale2804 Жыл бұрын

anagram_groups = {} for string in strings: canonical = ''.join(sorted(string)) if canonical in anagram_groups: anagram_groups[canonical].append(string) else: anagram_groups[canonical] = [string] return list(anagram_groups.values())

@saladuit Жыл бұрын

I have a question for the M * n* log(n) solutions. I also came up with that, but one has to return the unsorted strings. So I am assuming that you also made a deep copy of strs to not fiddle with the order of characters?

@kryddan 10 ай бұрын

I don't think I could ever have come up with the NC solution during an interview. Best I could do was using the hash() function: def groupAnagrams(strs: List[str]) -> List[List[str]]: hash_index_map = collections.defaultdict(list) for i, word in enumerate(strs): hash_value = 0 for char in word: hash_value += hash(str(ord(char))) hash_index_map[hash_value].append(strs[i]) return hash_index_map.values() This passes all the LC checks, though of course it isn't foolproof due to the possibility of distinct words adding up to the same hash value, so I doubt it would fly during an interview. IRL I'd probably just do: def group_anagrams(strings): res = defaultdict(list) for s in strings: res[tuple(sorted(s))].append(s) return res.values() And call it a day.

@ZackInTech 6 ай бұрын

This solution looks better def group_anagrams(words): # Step 1: Create an empty dictionary anagram_groups = {} # Step 2: Iterate through each word for word in words: # Step 3: Sort the letters of the word sorted_word = ''.join(sorted(word)) # Step 4: Check if sorted_word is in the dictionary if sorted_word in anagram_groups: # If it is, append the word to the list of anagrams anagram_groups[sorted_word].append(word) else: # If not, create a new key with sorted_word and set its value as a list with the word anagram_groups[sorted_word] = [word] # Step 5: Return the values of the dictionary return list(anagram_groups.values()) # Test the function print(group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))

@0xDolve 5 ай бұрын

Thank you so much

@falcan7752 5 ай бұрын

def group_anagrams(strings): anagram_groups = {} for string in strings: canonical = ''.join(sorted(string)) if canonical in anagram_groups: anagram_groups[canonical].append(string) else: anagram_groups[canonical] = [string] return list(anagram_groups.values())

@razewithodin Жыл бұрын

I'm confused. What about for a test case like ["ab","c"]. ab and c both = 3 so they would just go in same list even though they are not anagrams?

@macmac1283 11 ай бұрын

Great video!! just wondering, I see the point of the time complexity mn. However I keep wondering you will need a word of over 10^26 characters for mnlogn to become less efficient than (mn26)->mn. 10^26 is pretty much infinity already, I mean only for strings over 10^26 characters it would be better to do it this way. for instance, if n =10, nlogn =10 and n26=260, if n=10,000,000 nlogn = 70,000,000 and n26=260,000,000. I'm probably misunderstanding, could you please share some insight on this? Thank you!

@andrewferreira1261 5 ай бұрын

Could we just use the Counter method for strings or does the ordering of how the hash map comes out individually for each string make that unusable here?

@kirillzlobin7135 Жыл бұрын

Amazing... How come you can get to these ideas. It is simple, but it is hard to get to these solutions without hints :)

@TumblinWeeds 2 жыл бұрын

I kinda hate the optimal solution. I was like _this_ close to coming up with this answer, except dictionaries aren't hashable, and I couldn't figure out how to compare letters on the spot other than spending an hour making a letters dictionary ({a:0, b:1} etc). I feel like most people wouldn't come up with ascii on the spot in an interview. This question relies entirely on already knowing the "trick" so to speak

@ReadTheUnderstory 10 ай бұрын

Why do we use the average size of the strings in the Big O Time Complexity, rather than, say, the size of the largest string?

@TheMakownage Ай бұрын

My solution, which doesn't use defaultdict, and uses squares to avoid collisions: dict = {} for str in strs: hash = 0 for c in list(str): hash += ord(c)**2 dict.setdefault(hash, []).append(str) return dict.values()

@rafaftp5044 4 ай бұрын

how am i supposed to come up with this on the spot?

@aliquewilliams3080 10 ай бұрын

There's no need to subtract the ascii value of each character from A's ascii value. Just use the characters ascii directly as the key in the hashmap, so change *count = [0] * 256.*

@pythoncoding1092 10 ай бұрын

But why? It will increase memory costs

@aliquewilliams3080 10 ай бұрын

@@pythoncoding1092 A memory increase of 1840 bytes is completely insignificant. The increased code complexity is probably a bigger overhead.

@kenmatsuru3470 2 жыл бұрын

Should we trade the term log(n) to 26 in the time complexity? Is the test case that large?

@sp33r 2 жыл бұрын

Constant multiple like 26 does not contribute to big O complexity but multiplying by log(n) does. Assume large test cases when thinking about time complexity

@DavidLalo Жыл бұрын

@@sp33r I think You underestimate how small log(n) is. 2^26 = 67 million. I feel confident that the case wouldn't be that large, and that's just for breaking even

@sweetphilly2 Жыл бұрын

yeah I would've never thought of the ASCII value stuff. Twas thinking about going with Counter and using the result as a key if possible

@mahadreamz 2 жыл бұрын

Loved your explanation! Very crisp and neat.. Way to go.What would be the space complexity for this problem.

@Jimmy-ng3ci 2 жыл бұрын

It should be O(N) for the dictionary in which the worst case you have a key to value pair for each word in the list. The array is not counted because that is always a constant of 26 so becomes O(1) at the end which is less than O(N)

@chandlerbing8164 7 ай бұрын

Joye just noticed : constrains says length of word is going to be max 100 so in that case sorting (logN) is faster then 26 times looping.

@shinygoomy2460 2 жыл бұрын

Any able to do this solution cleanly in cpp? With using the set as a key for hash?

@jugsma6676 5 ай бұрын

This is with O(m*n*logn) solution: def groupAnagrams(strs: list[str]): sorted_strs = [''.join(sorted(s)) for s in strs] res = {} for i, s in enumerate(sorted_strs): if s in res: res[s].append(strs[i]) else: res[s] = [strs[i]] return list(res.values())

@latinochild3131 2 жыл бұрын

is there a reason why when i ran the first method suggest it gave a better time, than the method that was used as a solution?

@SuubUWU 2 жыл бұрын

Sorting time for the first method is log(n) This gives us a run time of O(m * nlogn) or O(m * n * logn) The second solution gives us a run time of O(m * n * 26) Meaning that our first solution is faster in cases where: logn < 26 We know from our Constraints that: 0 26 n > 67,108,864 Which even in the real world, i find it hard to imagine a string larger than 67 million.

@denysivanov3364 Жыл бұрын

[0] * 26 I guess this vector representing a..z number of a..z can be directly converted into sorted string.. (aabbbcccc => {0: 2,1: 3, 2: 4}) So its basically exactly the same == should be same nlogn if written in same language. If its faster then we discovered sort algorithm faster than n log n >.< Which is not likely to be the case.

@crankitsourav8686 Жыл бұрын

3:42 there is also one more constraint that 0

@balamageshkumar3048 2 жыл бұрын

Easier Code ig d={} #sorted string : list of anagrams for word in strs: key="".join(sorted(word)) if key in d: d[key].append(word) else: d[key] = [word] return d.values()

@jwastken8814 2 жыл бұрын

underrated comment

@simply.nonchalant6006 3 ай бұрын

Here's a more intuitive key-value pair that I used. KEY --> sorted word (just call sorted() on a word in strs) VALUE --> same as Neet's soln, the list of anagrams (not sorted) like in the video, you'll need to convert the keys to tuples

@sprolltobias4443 Жыл бұрын

shouldn't the first solution be faster then the 'optimal' solution for any string array with reasonable average string length? Just beacuse 26 >> log_2(n) up until n is O(10^6)?

@dibehemoth401 10 ай бұрын

What if we do not know about of what letters are our words. We can then use HashMaps for counting and as our keys, right?

@mruduladdipalli5417 Жыл бұрын

One doubt isn't the sorting of string will be of constant time? considering we have characters form a-z ?

@EE12345 9 ай бұрын

One thing I noticed in Python. If you do res += s, it appends each character of the string to the list separately. Only with append(s) does it add the entire string to the list. Why is that?

@SMBlolwut 5 ай бұрын

In CPP, is there any elegant solution to this? I can't make the HashMap with the array due to some limitation. The go to solution it seems is to literally just use the sort algorithm and hashmap with a string of the sorted words.

@treya111 6 ай бұрын

for the second solution, I dont get why time complexity is O(26mn)? Isn't it just O(mn)? Or is the author considering the cost of changing that count list (is it a big deal of cost to change the value of a list in python?) ???

@tusharsnx 10 ай бұрын

It's wild but in this case O(M * N * logN) is better than O(M * N * 26). Because in order for logN to be 26, you need N to be 10^26 which is practically impossible. That's why I consider logN to be constant because it literally is (in the programming context).

@e-techinfo9313 8 ай бұрын

Good explanation, thank you.

@rajchinagundi7498 8 ай бұрын

Easier solution, "from collections import defaultdict class Solution: def groupAnagrams(self, strs: List[str]) -> List[List[str]]: d=defaultdict(list) for st in strs: key=''.join(sorted(st)) d[key].append(st) return d.values() "

@catcoder12 8 ай бұрын

Interestinly, the sorting solution beats the non-sorting ones for some reason. Even I used the non-sorting solution and was wondering why my solution is not better than the ones who used sorting.

@swapnilkumarsahoo3591 5 ай бұрын

This solution just blew my mind

@TadesseDev Жыл бұрын

Nice explanation @NeetCode. Mind me but I have something to discuss, I'm not a master at this, just a thought I'm having. Considering we have 100 characters in a substring n, it will take us 100 characters to check the character and save it in the hash instead of just 26, wouldn't it be better to sort it with "log-n". I believe this might make the worst memory usage, since instead of counting, this time we will be saving the whole string.

@eddyqiu8883 Жыл бұрын

sorting should be nlogn

@ravi72munde 10 ай бұрын

I tried to replicated this in Java and for anyone else who tries as well. int [] won't work has a HasMap key because it has no hashCode implementation of it's own. Using ArrayList is key is an option but I found it to be slower

@yrrep27 Жыл бұрын

Set up a dictionary and output list. Iterate thru the input strings. For each input string, create a separate sorted string. If the sorted string is a key in the dictionary, append the input string to a list in the key's value If the sorted string is not a key in the dictionary, set up a new key value pair where the key is the sorted string, and the value is a list holding just the input string. Iterate thru the values of the the dictionary, append them to the output list. Return this output list.

@LawrenceAaronLuther 6 ай бұрын

way over my head, to just over my head, to within reach if I jump. many thanks

@none0n 2 жыл бұрын

As a JS developer, I was unable to understand this. Is there any chance to get the JS solution?

@SandeepKumar16 2 жыл бұрын

@babatundeololade6765 2 жыл бұрын

This is my JS solution var groupAnagrams = function(strs) { const hashMap = {} for(let i = 0; i < strs.length; i++) { const count = new Array(26); count.fill(0); const word = strs[i] for(let j = 0; j < word.length; j++) { count[ word.charCodeAt(j) - "a".charCodeAt(0) ] += 1; } if(count in hashMap) { hashMap[count].push(word) } else { hashMap[count] = [word] } } return Object.values(hashMap) };

@yohanmathew2013 Жыл бұрын

gosh when I tried it myself my solution was so much longer and tedious, but your solution looks beautiful lol

@nathannguyen6310 9 ай бұрын

Can't the O(m*n) solution also be n^2 in the worst case? Since m is the number of words in the string, and n is the avg length of each word, couldn't we have a case where the length of each word is as long as the list which leads to an n^2 worst case time complexity? Or would this not happen since n is the AVG length of each word. My question might not make any sense but if anyone can answer I would greatly appreciate it

@yeskez3470 2 жыл бұрын

Amazing solution! I was here thinking after 30 minutes of brainstorming, "how's he going to approach this one". 😆 p.s If you don't want to use the imported package def groupAnagrams(self, strs: List[str]) -> List[List[str]]: store = {} for word in strs: count = [0] * 26 for char in word: count[ord(char) - ord('a')] += 1 try: store[tuple(count)].append(word) except: store[tuple(count)] = [word] return list(store.values())

@nirupamsuraj1634 2 жыл бұрын

Or you can always use the get method to deal with uninitialised values, like - store.get(tuple(count),[]) store.get(tuple(count).append(word)

@Yougottacryforthis Жыл бұрын

You can sort a string s in o(|s|) time and o(1) time complexity, so I went with first solution: def groupAnagrams(self, strs): """ :type strs: List[str] :rtype: List[List[str]] """ classDict = {} for string in strs: #cannoinze letters = list(string) letters.sort() key = join(letters,",") if key not in classDict: classDict[key] = [] classDict[key].append(string) return list(classDict.values())

@leetcodemonkey 9 ай бұрын

class Solution: def groupAnagrams(self, strs: List[str]) -> List[List[str]]: hash_map = defaultdict(list) for word in strs: key = ''.join(sorted(word)) hash_map[key].append(word) return hash_map.values()

@michelef406 Жыл бұрын

Referring to around 2:50 in the video, does the count string looks something like this [1,1,1,0,0, ... 0] for the string "abc"? Just wanna check my understanding

@ruslankireev6421 Жыл бұрын

yes, thats true!

@godcomplex8251 5 ай бұрын

The thing that made this solution difficult for me to come up with is that I was using a hashmap for the frequency map and you can’t use a hash map as a key for another hashmap. This idea of using an array of size 26 for the letter count is actually pretty non intuitive, even then I didn’t know a tuple could be used as a hashmap key…

@sunlau-k5b 11 ай бұрын

In c++ ,you can use an integer as keys. And use bits operation to represent the words contains what alphabet. No matter what sequence the alphabet comes in.