Google Data Scientist Algorithmic Coding Interview

  Рет қаралды 32,486

Jay Feng

Jay Feng

4 жыл бұрын

Today I'm joined by Sergey who is a data scientist consultant and former programming competition winner.
Question page: www.interviewquery.com/questi...
Want to be featured in the next mock interview video? Apply here: airtable.com/shrdQrwKK7xxGLm6l
👉 Subscribe to my data science channel: bit.ly/2xYkyUM
🔥 Get 10% off your next data science interview prep: www.interviewquery.com/pricin...
❓ Check out our data science interview course: www.interviewquery.com/course...
🔑 Get professional coaching here: www.interviewquery.com/coachi...
🐦 Follow us on Twitter: / interview_query
Sergey has worked at a variety of places such as a quant on wall street to a startup that got acquired by Grubhub. He joins me today in solving a very tough algorithms problem asked by Google. If you're looking to learn programming from him, check out Interview Query coaching!
More from Jay:
Follow me on Twitter: / datasciencejay
Follow me on LinkedIn: / jay-feng-ab66b049
Read More:
The Google data scientist interview guide: www.interviewquery.com/blog-t...
Example Google data science interview questions: www.interviewquery.com/blog-g...

Пікірлер: 38
@mikescott2257
@mikescott2257 3 жыл бұрын
I'm so happy that expert programmers also need a million goes to get something to run
@LJ-ph4hg
@LJ-ph4hg 2 жыл бұрын
tbh he got to the solution in around 6 minutes. Although I'm not sure why he didn't square the last location difference, must have looked past it
@sethdeneroff6490
@sethdeneroff6490 3 жыл бұрын
When you asked him to make it faster were you actually looking for a numerical method for speed or to make the code run faster algorithmically. The simplest speed up I can see is to store already computed distances for the pairs so you dont have to recompute. Might get a tiny bit tricky to do this in the best way possible but that seems like a solid way forward to me
@the_sunlit
@the_sunlit 4 жыл бұрын
Incredible! Very helpful to see the mindset and walkthrough - would be awesome to see more of these with a "feedback" of solution as well
@dougbopp2072
@dougbopp2072 3 жыл бұрын
Wow, Sergey is an intense thinker and this video is quite a pleasure to watch an expert problem solver in action!
@vignanreddy3491
@vignanreddy3491 4 жыл бұрын
This is great, thanks Jay!
@vladimir0681
@vladimir0681 3 жыл бұрын
If you have a million points, you can randomly sample 10000 to approximate a solution. If you do it a couple of times, you may get an idea of how stable solutions are (=how close they are to each other). That validates your answer.
@divijrao7314
@divijrao7314 3 жыл бұрын
The smaller problem would be for just one dimension. Assume 9 points at index 0 and one point at index 50. The centre of mass is at 5, but the real answer is to choose final location at index zero. Because every index move away from that, the 9 people have to move away and one person from the index 50 has to travel less. After playing with few more examples, I realized we need to think of geometric median. And ANY point between the two middle ones would be an optimal place. As example, consider below: 1,2,3,5,19,20 The solution above would be any point between 3 and 5, inclusive. If we choose 3, total distance is: 2+1+0+2+16+17 which would be same as if we chose 5 as host: 4+3+2+0+14+15. Hence the median. Any point between 3 and 5 would also suffice. Coming to two dimensions, or more - I couldn't think of a rigorous proof if this would work - but had an idea of doing PCA for the three dimensions, to reduce to one dimension and then do this geometric median. Kindly let me know any thoughts on this approach.
@taylorkaplan2614
@taylorkaplan2614 3 жыл бұрын
Optimization solutions are tough to do like this.
@youngwill6213
@youngwill6213 Жыл бұрын
U can solve this easily with Euclidean distance subtract each persons point from there point calculate dot product of that distance transposed with the distance then reshape the array take the mean of all points across each row and find the minimum of the 4,1 array
@philipodoemena2671
@philipodoemena2671 4 жыл бұрын
What resources would you suggest to prepare for coding interview with python?
@Russel_at_whatever
@Russel_at_whatever 3 жыл бұрын
use a clustering algorithm and then cluster the biggest cluster, and so on until you reach the center of the biggest cluster. That's a quick and practical solution (K-means, or HDBSCAN)
@samirfarid5539
@samirfarid5539 4 жыл бұрын
Great video
@ticTHEhero
@ticTHEhero 3 жыл бұрын
Aight, for this "host a party" problem. Im a flight mechanics engineer, heres my solution friends locations are given with respect to some arbitrary n^3 linear space right? 1.calculate the middlepoint with respect to all the locations (i.e. centroid) 2.move cordinate system to that middle point (i.e. change the basis) 3. calculate lengthes of new vectors 4. pick the smallest one, cuz he is the man in the middle 5. ??? 6. profit will attach sample code later. What do you think people? upd: ah mmm i see, he did this after all, aight
@svgtdnn6149
@svgtdnn6149 2 жыл бұрын
I am thinking of this too... But not sure if it is theoretically true
@THB_DX
@THB_DX 2 жыл бұрын
Yes, that's exactly how I thought of it. This is k-medoids with k=1. def find_host(friends): cX = sum([x['location'][0] for x in friends])/len(friends) cY = sum([x['location'][1] for x in friends])/len(friends) cZ = sum([x['location'][2] for x in friends])/len(friends) D = [] for i,friend in enumerate(friends): D.append(sum(tuple(map(lambda i,j: abs(i-j)**2,(cX,cY,cZ),friend['location'])))) return friends[D.index(min(D))]
@giorgikochiashvili6947
@giorgikochiashvili6947 3 жыл бұрын
this code gets it done in 2n: friends = [{'name':'bob', 'location':(5,2,10)}, {'name':'david', 'location':(2,3,5)}, {'name':'mary', 'location':(19,3,4)}, {'name':'skyler', 'location':(3,5,1)}] means = [0,0,0] for host in friends: means[0] += host.get('location')[0]/len(friends) means[1] += host.get('location')[1]/len(friends) means[2] += host.get('location')[2]/len(friends) best_host_index = None best_host_sum = 9999999 for idx,host in enumerate(friends): temp_sum = abs(host.get('location')[0]-means[0]) + abs(host.get('location')[1]-means[1]) + abs(host.get('location')[2]-means[2]) if temp_sum < best_host_sum: best_host_sum = temp_sum best_host_index = idx print(friends[best_host_index])
@sobhhi
@sobhhi 2 жыл бұрын
Are you sure that the closest point to the mean is also the closest point to everyone else…?
@rishhibala6828
@rishhibala6828 2 жыл бұрын
What about generating a MST structure and DFS through it with a count?
@yujiefu2251
@yujiefu2251 2 жыл бұрын
You are such a genius!
@kavyaprasad2019
@kavyaprasad2019 4 жыл бұрын
Shouldn't this output 0 in cases where we are computing costs with the same point? and in such case 0 would be the best cost? We have not taken into consideration to not calculate cost for points with itself.
@amangarg96
@amangarg96 3 жыл бұрын
In the question, we are not finding the two friends with minimum distance between them. Rather, we are finding the person which has the minimum total distance from all other friends (|x_0 - x1|+|x_0-x2|+......).
@zertulu
@zertulu 4 жыл бұрын
what about a solution using the centroid?
@iqjayfeng
@iqjayfeng 4 жыл бұрын
Can you elaborate?
@adibhatlavivekteja2679
@adibhatlavivekteja2679 4 жыл бұрын
@@iqjayfeng I think he meant, Just take the average of points(Centroid) and then measure distances of each point to the centroid. Whichever point has the least distance would be the host.
@filyok
@filyok 3 жыл бұрын
The centroid minimizes the sum of the squares of distances. The guy actually mentioned at 11:28 that if the problem was about squares of distances, he had a theoreical answer in mind. But for regular distances, the centroid would not work. What one needs is geomteric median, which according to wiki can be computed in almost linear time, with reference to a paper from 2016. Of course, I wouldn't expect anyone to come out with any of that during the interview.
@sicheng_wang
@sicheng_wang 3 жыл бұрын
I think medoid (extension of median to higher dimensions) may work. If it doesn't have to be at a house, one can use geometric mean.
@sobhhi
@sobhhi 2 жыл бұрын
@Qasim Wani Finding the “l1 centroid” as you call it is easy by taking the median of each coordinate. the issue is that this point may contain coordinates from 3 different people.
@sauravgupta7415
@sauravgupta7415 2 жыл бұрын
Sort the location and return the MEDIAN.
@sobhhi
@sobhhi 2 жыл бұрын
@Pruthvi Raj R G ee17b114 exactly… cannot sort in R^n when n > 1
@yujiefu2251
@yujiefu2251 2 жыл бұрын
Can we just calculate the middle point of the solution (mean for each dimension), then find out which location is the closest to the middle point? The time complexity is going to be O(N). However, I am not pretty sure, the location founded is the correct answer. Lack of mathematical proof
@maryamaghili1148
@maryamaghili1148 2 жыл бұрын
Exactly thinking about this
@bellahao1344
@bellahao1344 Жыл бұрын
I think we can't. The problem ask for the right person located in exactly his/her location.
@pingdingdongpong
@pingdingdongpong 5 ай бұрын
This is the correct solution. The mean is the closest spot to all the points. The closest to the mean will be be the best because the sum square function is a parabola sitting on the 3d space. Further you get from the min point, more you increase the sum distance.
@basicmaths3443
@basicmaths3443 2 жыл бұрын
I copied his first code and its not working. my answer is coming Bob with his code. how?
@anonymusann3575
@anonymusann3575 2 жыл бұрын
return best_host should be outside the for loop
@basicmaths3443
@basicmaths3443 2 жыл бұрын
@@anonymusann3575 yes i solved it..
@DistortedV12
@DistortedV12 3 жыл бұрын
Did he pass? 🤔
Uber Data Scientist Mock Interview: Ride Requests Model
31:06
The Amazon Data Science Interview
13:01
Jay Feng
Рет қаралды 33 М.
Increíble final 😱
00:37
Juan De Dios Pantoja 2
Рет қаралды 109 МЛН
Пробую самое сладкое вещество во Вселенной
00:41
Tom & Jerry !! 😂😂
00:59
Tibo InShape
Рет қаралды 56 МЛН
A/B Testing Interview with a Google Data Scientist
13:06
Jay Feng
Рет қаралды 33 М.
Google Machine Learning System Design Mock Interview
25:18
Jay Feng
Рет қаралды 50 М.
Three Tricky Analytics Interview Questions with Andrew
25:03
Jay Feng
Рет қаралды 79 М.
The Microsoft Data Science Interview
11:09
Jay Feng
Рет қаралды 20 М.
Google Coding Interview With A Facebook Software Engineer
49:59
Clément Mihailescu
Рет қаралды 924 М.
Data Scientist Interview Tips & Career Advice (Uber, ex-Amazon)
7:40
How I Failed the Google Coding Interview (and lessons I learned)
14:24