Find the intersection between arrays: Coding Interview Question

Рет қаралды 56,141

Күн бұрын

Пікірлер: 103

@johndebord7802 6 жыл бұрын

If there are no duplicate numbers per array, I believe using a hash table to count the occurrences would be a good solution that would work for sorted and unsorted arrays.

@MorisonMs 5 жыл бұрын

It works also for duplicate elements, you just need to consider the minimum value out of all hash tables.

@mandeepubhi4744 5 жыл бұрын

I was thinking the same, dynamic programming (we can record occurrences)

@rakshitar4779 4 жыл бұрын

You can do it for duplicates as well. Subtract 2 from the total count (for count>=3) and append the corresponding number of times the element to the output array. For example, if the count is 3, you will append it once (3-2), if there is a single duplicate, the count will be 4 so you will append it twice (4-2) and so on.

@havalopez 3 жыл бұрын

@@rakshitar4779 you could also use strings. So you would have a table that shows something like 2 => “101” meaning it was in the first and third array. You could use the same representation with integers and anything >= 11 is duplicate or subtract 100, 10, 1 to figure out which arrays it was in.

@osquigene 6 жыл бұрын

If the input lists are unsorted then you can count (for each list) the number of occurrences of every element that appear in the union of the three lists. For example, if your lists are: L1 = [2,6,6,9,11,13,17] L2 = [3,6,6,7,10,13,18] L3 = [4,5,6,6,9,11,13] you will have in your counts: - 2: [1, 0, 0], - 3: [0, 1, 0], - 4: [0, 0, 1], - 5: [0, 0, 1] - 6: [2, 2, 2] By definition there is an intersection if all elements in a given count list are all greater than 1. If you do a strange intersection like suggested in the video then the number of intersection is equal to the minimum of you count. Time and memory complexity become O(kn) where k is the number of lists. In general this is a faster solution than sorting as long as k = o(log n). ------------------------------------------------------------ from collections import defaultdict from random import shuffle L1 = [2,6,6,9,11,13,17] L2 = [3,6,6,7,10,13,18] L3 = [4,5,6,6,9,11,13] # Total: n elements in k lists inputs = [L1,L2,L3] shuffle(L1) shuffle(L2) shuffle(L3) def intersection(lists): nb_lists = len(lists) count = defaultdict(lambda:[0]*nb_lists) # time: O(kn) (the k factor is hidden in the defaultdict that create a when new element value occurs), memory: O(kn) for i in range(nb_lists): for e in lists[i]: count[e][i] += 1 # time: O(kn) for value, frequencies in count.items(): for j in range(min(frequencies)): yield value print([e for e in intersection(inputs)]) ------------------------------------------------------------ Which prints [6,6,13] (note that 6,6 and 13 appear in arbitrary order).

@osquigene 6 жыл бұрын

If you only want a normal intersection (which return a single "6" here) you can have a bitmap instead of your count and save some space.

@serhiypidkuyko4206 6 жыл бұрын

let len(lists[i])=n_i, len(lists)=k, n_m=minimum(n_1,...,n_k) then the memory complexity will be only the n_m because you only need the dictionary with keys=lists[m] (these and only these elements can belong to the intersection)

@osquigene 6 жыл бұрын

I'm not sure to understand your suggestion. From what I understand: "you only need the dictionary with keys=lists[m]" -> "only the shortest list has interesting keys". Which is obviously true, but it's also true for all the other lists. If you had: L1 = [2,3,4] L2 = [2,2,2,3,3,3] L3 = [2,2,2,2,2,2,2,2,2] Then L1 is the shortest list, but it's the one that has the most different keys and therefore the one that would need most storage in the dictionary (if done correctly). If you wanted to do something better you should take the list that has the minimum number of keys. In the end that would allow you to save memory in some cases, but the complexity would remain the same, for example in this case: Li = [1, 2, ..., n] ... Lk = [1, 2, ..., n] You could try to optimize but in the end the complexity would be O(kn), unless you store things in a smarter way (which I don't say is impossible).

@serhiypidkuyko4206 6 жыл бұрын

ok, once more: let all the lists has the same length. so when you say that memory complexity is O(k*n) you are not correct. the memory complexity is ONLY O(n). because you ONLY NEED to add to the frequencies dictionary the keys with are the elements of ONLY ONE list (whichever you choose) (the intersection set is ALWAYS the subset of ANY of its items). So if you have the lists with different lengths you can choose the one with the minimum length (in your example above it will be 3). I don't need to analyse the content of L2 or L3 (to get less than 3 elements ). I only assert that the memory complexity is ("less or equal") O(n) and NOT O(k*n). For example: if you have 100 lists with of 10 elements every my assertion is the dictionary ALWAYS NEEDS ONLY (

@osquigene 6 жыл бұрын

Ahhh I think I get what you say this time! What I understood is: Instead of starting with an empty dictionary I should rather start with a dictionary with a fixed set of keys equal to those of one of the lists keys (say list L_m which indeed can be selected based on |L_i| only). By doing so I would have at most |L_m| different keys. And therefore |L_m|*k lists inside my dictionary. If we define n = min(|L_i|) in then end the complexity would be O(nk) (without any supplementary change). (Note that in the worst case, all lists have exactly the same size.) Maybe we got confused because my first solution actually was O(nk^2) and not O(nk) (my bad). But saying that we could change the dictionary to behave like 'min' instead of being a lists in which we add everything to be reduced in the end. I don't even know why I didn't think about it, but it have to admit it sounds kind of obvious xD ... This would give O(n) as we would only store a single value per key (when reading list L_i, then for any key x and j

@watcher_109 5 жыл бұрын

If the arrays are not sorted, we can use another approach: Keep the smallest array unsorted. Sort both the other arrays Traverse the smallest one and binary search for all its elements in both the 2nd and 3rd array, if found in both, store it in result. Time Complexity O(klogk) , k size for largest array

@mohamedkandeel6553 4 жыл бұрын

what about duplicates ?

@robinverma6901 4 жыл бұрын

it will work for duplicates as well... but in above sol time com is O(a1+a2+a3) 'a' rep size..

@robinverma6901 4 жыл бұрын

if duplicates are not there then mapping will do it in O(n) n is the size of largest ar

@Career_Growth_Teaching 6 жыл бұрын

Great solution if the arrays are sorted. If the arrays are not sorted, then we could 1. Iterate arr2 and arr3 and put elements respectively in two hashSets called set2 and set3. 2. Now iterate over arr1 and check if set2 and set3 contains the element. If yes then collect them into a collection lets say another set called intersectionSet. time Complexity=O(n). Space complexity O(n)

@raunakvijan5739 6 жыл бұрын

dhirendra sinha that won't work for duplicate elements

@marriagedance8657 6 жыл бұрын

we can use a hashmap and take the count of elements as the values of the keys

@sachinkalkur 5 жыл бұрын

@@marriagedance8657 Here we would be added additional space complexity.

@sachinpaul2111 6 жыл бұрын

Very important suggestion/part of these interviews. You need to add time complexity analysis. Let n = max(x,y,z) where x,y,z are the lengths of these arrays. For example , the brute force for this problem is O(n^3) yeah? Now sorting and doing the 3 iterators brought it down to O(nlogn) which is an amazing reduction. If we had 2^10 elements in the biggest array for example , your algorithm would only take 2.8 hours compared to 298261 hours otherwise.

@Eschen_ 6 жыл бұрын

You can't calculate the time that easily. Did you assume one line per second or something? I'm pretty sure in reality the faster one would take less than a millisecond for your example.

@KhanSlayer 6 жыл бұрын

The arrays were already sorted so the solution is O(n).

@sauravaggarwal98 5 жыл бұрын

We can use set. We will make 3 sets of 3 arrays and iterate through first array and everytime checking if that element is present between the sets

@volekvolek1 6 жыл бұрын

Use hasmap. Read arrays; Value of each array would be key in hashmap, go to the key or create a new key , increment value by one . At the end print any hashmap key which their value is equal to number of arrays(3 in this case)

@ankurharna1 5 жыл бұрын

it will not work as one array can have repeating elements...

@Bojl95 6 жыл бұрын

hey man thanks for the videos, they're very helpful! Can I just make one request/suggestion? Maybe show an example of how to handle a situation in which you don't know the answer / the answer doesn't come to you immediately, just to show how to handle that situation.

@NeymarJr-uj1wf 6 жыл бұрын

yes, great idea. We would be glad, bro

@ScarzChosenspokesmen 5 жыл бұрын

Exactly, its so easy to show a rehearsed video

@yadneshkhode3091 3 жыл бұрын

just google

@xVermyy 6 жыл бұрын

If the three arrays were not sorted, you could use three hash sets to store elements you've seen so far. Sure, it's not very space efficient, but it would allow you to solve it in O(n + m + k), n, m, and k being the sizes of the three arrays.

@maryamr.aliabadi8091 5 жыл бұрын

Using hash map, we are also able to find the intersection of three unsorted arrays with time complexity of O(n1+n2+n3).

@kamleshsilag5332 5 жыл бұрын

But then space complexity will become O(n1+n2+n3) .

@pradipnitw 6 жыл бұрын

Hey Irfan, how about binary searching the largest of the 3 indexes in the other 2 array... this way we can skip a lot of elements and overall complexity would order of N1 + 2lognN1 ( assuming N1 is the largest size )? This method would also give us fast exits ... in case of edge cases.. wdyt ?

@TheBestNameEverMade 5 жыл бұрын

I was also thinking about binary searching on every loop to find the next token. You could also do it from both ends. In the best case it would be Log(L+N+M) however the worst case where they all are the same values would be O((L+N+M)log(L+N+N)).

@manto7 4 жыл бұрын

Why not to use a for loop for iterating over first array. On each iteration, check if the current number of the target array is in other two arrays, if so push the number to results array. So it would be something like, for item in arrayA: if item in arrayB and item in arrayC: result.append(item)

@jugsma6676 5 жыл бұрын

My Implementation, that works for Sorted and Unsorted Arrays: private static Set findInterection(Integer[] arr1, Integer[] arr2, Integer[] arr3, int val){ Set set1 = new HashSet(Arrays.asList(arr1)); Set set2 = new HashSet(Arrays.asList(arr2)); Set result = new HashSet(); for (int i=0; i

@flavmusic 5 жыл бұрын

Using an array of lists (instead of having the 3 lists by argument), which is almost exactly the same thing : def intersectValues(lists): k = len(lists) hashtable = {} output = [] for lc in range(0, len(lists)): list = lists[lc] for i in range(0, len(list)): number = list[i] if number not in hashtable: hashtable[number] = [] for a in range(0, k): hashtable[number].append(1 if a == lc else 0) else: if hashtable[number][lc] == 0: hashtable[number][lc] = 1 for number in hashtable: if min(hashtable[number]) == 1: output.append(number) return output print intersectValues([[2, 6, 9, 11, 13, 17], [3, 6, 7, 10, 13, 18], [4, 5, 6, 9, 11, 13]])

@aribakodes5453 3 жыл бұрын

I think I am in love seriously... This is how I wanna keep myself through a high stake interview

@PALTUBABY 4 жыл бұрын

very nice technique with a linear time complexity. also works for finding intersection of 2 arrays. i have not tried. buy may be this technique will work for intersection of more than 3 arrays. thanks for the tutorial :)

@jordancoppert9894 6 жыл бұрын

Provided that you're using a language in which array length is stored as a property of the array (like in Java), or if you computed the lengths of the arrays prior to the loop and passed them into OOB, I believe the time complexity would be O(min(n1, n2, n3)). I'm assuming here that the summation of the array lengths being the time complexity in your example comes from the necessity to determine the lengths of the arrays in OOB(). The while loop should fail as soon as the end of the shortest array is hit.

@rahulpillai1271 5 жыл бұрын

I used two hashsets to solve this. Is this considered efficient ? : public static void main(String[] args) { int[] arr1 = {2,6,9,11,13,17}; int[] arr2 = {3,6,7,10,13,18}; int[] arr3 = {4,5,6,9,11,13}; HashSet set1 = new HashSet(); HashSet set2 = new HashSet(); for(int i=0;i

@shensean993 6 жыл бұрын

No matter sorted or not. 1 use hashmap 2. Key and value(counts) 3. return all the keys with value==3

@shensean993 6 жыл бұрын

getOrDefault

@Al-wt5kf 5 жыл бұрын

A optimization is to find max of the first elements, min of last elements to clip array.

@hrqnogueira 6 жыл бұрын

This is awesome! I would advise you to make the indentation of your code clear. It is not the case for this video, but I have seen others that the lack of indentation makes it difficult to understand.

@johncanessa2250 6 жыл бұрын

Try the following: static int[] arr1 = {1, 2}; static int[] arr2 = {1, 2}; static int[] arr3 = {1, 2}; arr: 1 2 arr.length: 2 arr: 1 2 arr.length: 2 arr: 1 2 arr.length: 2 pushing: arr1[0]: 1

@sandippatel7751 3 жыл бұрын

I guess there should be some provision for two numbers being equal but third being not for example take the following arrays: A = [1,3,5,7,8,10,12,13] B = [2,3,6,8,9,13] C = [1,2,3,7,8,10,12,13] when x = 1, y = 0 and z = 1 (indicies starting from 0) algorithm get struck. I may be wrong.

@TheRevAlokSingh 6 жыл бұрын

You should write a helper function to handle two arrays and use a fold/reduce to handle the case of n arrays.

@powder77777 6 жыл бұрын

List.retainAll(list1). Return intersection in Java.

@sinharakshit 6 жыл бұрын

Woah really? I didn't know that. Thanks for sharing

@arnab1prince 5 жыл бұрын

Hi Irfan, thanks for the video. However I'm confused as to why you have not compared x and z before incrementing index. What i thought is we have to increase the index of that array which is having the smallest value of that point of time. Please help. :)

@mudassirbit 5 жыл бұрын

Incrementing index condition looks wrong, what if for the same logic. Arr 3 as arr1. It won't work. X should increment if both y and z are greater. Not only y.

@adnanzaid6086 6 жыл бұрын

Good job man! Thanks for sharing.

@MiguelSoaresPessoa 6 жыл бұрын

Thank you for your videos! I really appreciate that.

@dimplescharles2539 5 жыл бұрын

can u make a video on classes,objects and methods

@ankitmishra-dp9tx 4 жыл бұрын

Can we have DFS problems please. They have a huge scope.

@anushree3744 5 жыл бұрын

Why have you stopped making videos??

@deepakmehrotra3426 5 жыл бұрын

Are all the arrays having same length?

@kasthurishravankumarhpc 5 жыл бұрын

Nice explaination

@ananyeagarwal9770 6 жыл бұрын

Got it in 15 seconds

@TuralFerhadov 6 жыл бұрын

Just considering this case : a1[x] = 2, a2[y] = 3, a3[z] = 0; x will be incremented , but z should be incremented. Am I missing something in the code?

@einSteppenwolf 6 жыл бұрын

Tural Ferhadov Z may get incremented in a later iteration.

@lkrsball 6 жыл бұрын

Although it might be more intuitive to increment the index of the smallest value, it actually doesn't matter which index you increment as long as you're not incrementing the index of the largest value. As ien Steppenwolf mentions- if there is a 3 in a1 and a3, it doesn't matter in which order you increment x or z. As long as you don't increment y, x and z will eventually reach the index where a1[x]=a2[y]=a3[z]=3.

@K-9_DZ 3 жыл бұрын

This Java implementation using two hashsets with time complexity O(n) with unsorted lists public class duplicates { public static void main(String[] args) { int array1[] = { 1, 2, 4, 6, 12, 50, 11 }; int array2[] = { 1, 5, 4, 12, 24, 11, 34 }; int array3[] = { 1, 5, 6, 12, 24, 78, 11, 64 }; HashSet hashValue = new HashSet(); HashSet hashValue2 = new HashSet(); for (int i = 0; i < array1.length; i++) { hashValue.add(array1[i]); } for (int x = 0; x < array3.length; x++) { hashValue2.add(array3[x]); } for (int j = 0; j < array2.length; j++) { if (hashValue.contains(array2[j]) && hashValue2.contains(array2[j])) { System.out.println("the duplicate value is " + array2[j]); } } } }

@ytravi 6 жыл бұрын

If you have never seen this problem.. how long do you think it would take to actually come to this solution?

@pjaveri 5 жыл бұрын

can we not take 1 array and then do binary search on the remaining two , the solution will parallelize well and will be much quicker Am I missing something here ?

@TheBestNameEverMade 5 жыл бұрын

Worst case when everything is the same it would be O(A×log(B+C)) best case it would be O(A), so more expensive in some cases then O(A+B+C). However if you where to take turns at binary searching all 3 arrays to move the marker you could get best case O(log(A+B+C)) and worst O((A+B+C)log(B+C)) (ie much better best case and a worse worse case).

@vivekv2948 6 жыл бұрын

What if A[x] is 2, B[y] is 3 and c[z] is 1 .. it still increments x

@einSteppenwolf 6 жыл бұрын

Vivek V Why not increment x? Is it possible that the intersection of the 3 arrays contains the value 2?

@pedastrianc3185 6 жыл бұрын

I passed the interview by talking shit and just used IEnumrable.Intersect. The difference is that the array is not sorted.

@yunuskocatas3879 2 жыл бұрын

this is my way

@CraftandDogs 5 жыл бұрын

It would be appreciated, if you tittle your videos. It would make it easier for us to navigate.

@mukeshgupta5366 6 жыл бұрын

what if arr1[[x]=2, arr2[y]=1,arr[z]=1? what will happen.

@mukeshgupta5366 6 жыл бұрын

got it! arr1[x] doesnot contain 1 . so move y .

@n_fan329 4 жыл бұрын

let res=[] nums1.forEach(val => nums2.includes(val) && nums3.includes(val) && res.push(val)) console.log(res) // [6,13]

@adeelsyed7291 5 жыл бұрын

This solution assumes sorted arrays of equal lengths, I doubt you will get such an easy question on a technical interview.

@eugnsp 4 жыл бұрын

It doesn't.

@FredoCorleone 5 жыл бұрын

That's optimal. You could have also built an hash table and check intersections in pairs (first array1 against array2, then their intersection against array3). Even if you walk 100 times over those arrays time complexity remains O(n) so doing in multiples steps with an hash table is not a problem. More than that with an hash table you don't need ordered arrays.

@wadhwa_neeraj 6 жыл бұрын

Can I suggest one programming problem?

@IrfanBaqui 6 жыл бұрын

By all means, please do

@wadhwa_neeraj 6 жыл бұрын

So the question is that given IDs of players and their strengths, we need to make two teams such that both teams should have equal number of players and total strength of every team should be maximum and the difference between the strengths of both teams should be minimum.

@osquigene 6 жыл бұрын

Basically you have weights and you want to have the best balance. Note that the "total strength of every team should be maximum" only makes sense if we don't use all listed players to build your teams, but in this case you'll have two optimums (since you can't satisfy the "best balance" and "max total value" at the same time). If you are asking for both you need to have a satisfaction function (how much do you value each objective). Example of a case were you can't achieve both objectives at the same time: Players: 19, 11, 2, 1 Team sizes: 1 minimise total difference = { {2}, {1} } maximise total value = { {19}, {11} } A problem that would have more sense as an interview question would be to minimise the teams total strength difference and use all players (number of players is even).

@osquigene 6 жыл бұрын

That actually sounds like a hard problem, unless you go for a brute force solution :p. But i'll try.

@osquigene 6 жыл бұрын

Do you prefer a fast estimate or a slow exact solution?