Timestamps: Intro: (0:00) Interview Question: (0:11) Explore the Dataset: (0:38) Solution Approach: (1:27) Coding - JOIN The Tables: (3:05) Coding - Explanation of the SQL Window Function: (4:22) Coding - Finishing up the Solution: (6:37)
@mohitupadhayay14392 жыл бұрын
2 months ago I was so afraid to even walk through an SQL video on KZbin. But look at me now! I am diving deeper and learning so much every day. Thankfully i didn't gave up!
@ProdbyTrehnt3 жыл бұрын
legit thought I sucked at SQL but doing some questions and reading deeper I was able to teach myself a lot of concepts just by practicing and seeing what I got wrong.... Thanks for the content Nate 👍
@stratascratch3 жыл бұрын
That's great! Keep on improving! Glad you've found the content helpful.
@aashishmalhotra2 жыл бұрын
Ok i write steps on my own by breaking down question. But how to implement it helped me. Also converting to float is learnt thanku
@stratascratch2 жыл бұрын
That is great. Keep moving forward.
@DaniMrtini2 жыл бұрын
I get the overall gist of solving questions like this but breaking jt down so nicely like you is something I'm strugglong with. Got final interviews with Amazon coming up and just looking around
@adhirajmajumder2 жыл бұрын
You're the best SQL/Data science KZbinr.
@stratascratch2 жыл бұрын
Thank you! And thanks for watching my videos!
@AdvikSSC20233 жыл бұрын
Really heart touching . With few seconds you will soled this query with beter understanding
@mercantilism9542 жыл бұрын
This is amazing intro for windows function. Love the video. For those who want to replicate his code, "order_cost" has been replaced with "total_rder_cost"
@ShivaniPandhare3 жыл бұрын
Very well explained. I am amazed how the difficult question seemed easy after the explanation!! 😁 I am going to try and use the approach you've used here. Thanks!!
@stratascratch3 жыл бұрын
Definitely give the framework a try! It'll help break the more complex portions of the problem into simpler bite size pieces.
@marinamondadorigessinger63903 жыл бұрын
thanks a million, it helped me a lot with the understanding of Partition by clause!
@alfredama3 жыл бұрын
That was really neat , i love the clear and concise train of thought before delving into code
@okygy21253 жыл бұрын
Very clean explanation. Short and sweet :-) Thanks a lot!
@jmmmmmmmmmk2 жыл бұрын
Very very good video and well explained. Thank you
@stratascratch2 жыл бұрын
Thank you. Glad you find it helpful.
@yasoram80073 жыл бұрын
Simple and excellent explanation Nate! . Thank you
@bachlam98413 жыл бұрын
Thank for your vid, I didn’t think about partitioning method, and intend to use sub query with group by id for calculate the percentage before selecting it
@angelomartinez98433 жыл бұрын
Great explanation of partition by. I was so lost before watching this video! Thank you!
@thulanidendere39862 жыл бұрын
You have explained this well sir,
@subhashgn17753 жыл бұрын
Very well explained. Thank you for providing the great resource.
@cabaymau51323 жыл бұрын
very well-explained, Thank you and big subcribed
@madhurasawarkar83733 жыл бұрын
Really Good tutorial Nate!!! Thanks for uploading such a good content
@stratascratch3 жыл бұрын
Thank you for watching. I'm glad you enjoyed the video and hope you take a look at the other sql videos I've posted.
@oscarherediamiranda6162 жыл бұрын
Hi guy. Thanks you for this video. You have a new subscriber now. I am studying software engineering here in peru and I would like to have a master at data engineering or data scientist.
@bakkster293 жыл бұрын
Amazing tutorial Nate!! Keep the awesome content coming!
@stratascratch3 жыл бұрын
Thank you for watching! Will keep churning out SQL videos. And will jump into some python stuff soon!
@dollajihukum3 жыл бұрын
so nice 🙏....
@sauravkumar94543 жыл бұрын
Your explanatory videos provide a great deal of knowledge. Thanks.
@stratascratch3 жыл бұрын
Thanks so much and thanks for watching! Let me know if you have any recommendations for topics and I can try to make a video.
@sauravkumar94543 жыл бұрын
@@stratascratch It would help If you can cover what questions are asked in the interview from other topics as well and then you can keep adding videos to that section just like SQL.
@stratascratch3 жыл бұрын
@@sauravkumar9454 Thanks for the suggestion. This is something I do plan on creating in the near future in addition to doing some python videos. Thanks for keeping engaged.
@sauravkumar94543 жыл бұрын
@@stratascratch Great. Will be tuned for the upcoming stuff.
@Holu542 жыл бұрын
@ Nate, thanks for the effort in putting these stuffs together. I observed that we were silent on an assumption that was made in the question "Assume each customer has a unique first name (i.e., there is only 1 customer named Karen in the dataset) and that customers place at most only 1 order a day". Don't you think this assumption would change the dynamics of the solution? Thanks
@stratascratch2 жыл бұрын
Yes, those are assumptions that would definitely change the solution. We were silent in the video but on an interview, I would definitely mention those assumptions. Thanks for watching!
@yusufbas0352 жыл бұрын
waow that was so nice nice work dude
@yusufbas0352 жыл бұрын
it looks like simple but it is helpful
@halildurmaz78272 жыл бұрын
Do interviewers allow us to create example tables (2 rows + Vertical), on paper? Because, it really helps me to think about the solution.
@stratascratch2 жыл бұрын
I think that is okay. Everyone has their own process of coming up with a solution. Talk to your interviewer.
@manuelsojan90933 жыл бұрын
Hi Nate...I did it like this, but I'm getting different values for the percentages for Eva: SELECT first_name, order_details, ROUND((ORDER_COST/TOTAL_PER_CUSTOMER) * 100,0) AS Percent_of_total_spend FROM (select *, SUM(order_cost) OVER(PARTITION BY cust_id) AS TOTAL_PER_CUSTOMER from orders INNER JOIN CUSTOMERS ON ORDERS.CUST_ID = CUSTOMERS.ID) AS D1 ORDER BY First_name
@stratascratch3 жыл бұрын
I like this solution. Nothing obvious right now but post this in the user discussion in the question. Someone from my team will help you.
@manuelsojan90933 жыл бұрын
@@stratascratch ok sure
@rakibraihanrimon87844 жыл бұрын
Well explained Bro, thanks
@stratascratch4 жыл бұрын
Thanks Rakib. If there's any videos you'd like me to create, let me know. I have more windowing functions videos coming up, case whens, and some data pivoting too.
@Daro202310 ай бұрын
woaw super helpful
@ashutoshanand40403 жыл бұрын
You got a new subscriber 🙂
@keifer78132 жыл бұрын
I'm really starting to like window functions. Question though - could this have been done using a group by?
@stratascratch2 жыл бұрын
Yeah, you could use a group by but window functions are a bit more flexible. Depending on the question, sometimes it's just more clear to use a window function rather than a group by. It really depends on the question!
@keifer78132 жыл бұрын
@@stratascratch Ahh got it. Thank you. Keep the videos coming!
@asholkkumar46853 жыл бұрын
Very good explanation .thanks for that. if u don't mind share the tables source code ..
@RobertoSolanoM3 жыл бұрын
great video man!! you won a follower
@stratascratch3 жыл бұрын
Thanks for watching man!
@HowardLinca2 жыл бұрын
GENIUS...
@GoktugAsc1232 жыл бұрын
That is a really clear explanation but aren't we supposed to use the order_quantity column to calculate the total cost per person, too?
@Vintagetube3103 жыл бұрын
Thank you for that clear explanation. For someone who wouldn’t know to use the partition clause, would there be another way to solve this?
@stratascratch3 жыл бұрын
The only other way I know is to create CTEs that would create views like the one the partition clause creates. Then you can join or use the CTE view so that it gets you the output you want.
@EminentJade2 жыл бұрын
lifesaver before interview, tyvm.
@chriskeo3922 жыл бұрын
Content 🔥
@gustavmoller89883 жыл бұрын
Nice tutorial :)
@Fux7042 жыл бұрын
I was less sophisticated than you on my attempt (btw the parameters of the challenged have changed a bit). Guess I just love subqueries 😅. Sum over partition was genius. SELECT c.first_name , o.order_details , round(total_order_cost/sub.total_spend,2) as perc_total_spend from orders o join (select cust_id, sum(total_order_cost) as total_spend from orders group by 1) sub on o.cust_id = sub.cust_id join customers c on o.cust_id = c.id order by 1,2,3
@stratascratch2 жыл бұрын
Amazing! Thank you.
@IrakliChitishvili3 жыл бұрын
Great. One question: Using the same window function, how to show total spent by customer? Like Eva: 180, Farida 260 etc without repeating sum over each order_cost?
@observer6982 жыл бұрын
I think we need to use group by then over name field using the SUM(order_cost)
@keifer78132 жыл бұрын
@@observer698 Would there even be a point in the window function if we just used a group by?
@gowthamjeevanantham61443 жыл бұрын
Excellent explanation thanks
@stratascratch3 жыл бұрын
Thanks so much for watching!
@alexlawson27504 жыл бұрын
This is really well made and well explained. Thank you so much for this! Quick question: is this possible with MySQL? (I would check, but I don't want to sit in front of my computer anymore today lol)
@stratascratch4 жыл бұрын
Hey! It is possible in MySQL. You'd use the same over() PARTITION BY functions. The only difference between postgres and MySQL is how one would cast data types. In postgres I used a ::FLOAT to change an integer to a decimal but in MySQL, I think you do something like cast(column_name as decimal). But you only need to do this if you're trying to convert it to a percentage. If you're going to keep it as a ratio (e.g., 0.55 for 55%), then no need to do any of that.
@anmolchawla6163 жыл бұрын
Thank you so much.
@nargisparvin42674 жыл бұрын
Hello sir, can you pls make videos on nodejs developer?
@stratascratch4 жыл бұрын
I wish I could =(. But I don't know anything about nodejs...
@meghasyam4273 жыл бұрын
That's a cool function
@msvrk1233 жыл бұрын
Hey Nate! Great problem, I found another solution to the problem: with sum_of_cost_per_cust(cust_id,tot_sum) as (select cust_id, sum(order_cost) from orders group by cust_id order by cust_id) select c.first_name,o.order_details, round(o.order_cost / s.tot_sum::FLOAT * 100) "Percentage" from customers c join orders o on c.id = o.cust_id join sum_of_cost_per_cust s on o.cust_id = s.cust_id order by c.first_name; By any chance could this query be optimized further?
@stratascratch3 жыл бұрын
I like it! I like to use CTEs like you've done as well in my code. I find that it really helps separate out logic. Thanks for sharing.
@amazhobner3 жыл бұрын
Clear explanation but I used to think order by was mandatory, partition was optional when using over clause, so can you please explain in which scenario order by is mandatory when using the over clause?
@localmartian90472 жыл бұрын
Order by is mandatory when using ranking functions like rank dense_rank row_number lead lag, but for normal aggregation it can be skipped
@thepogchamp77832 жыл бұрын
So I did the same thing without a window function by just having a nested select in the from... is that better or worse? And if I can do that, when do I really need a window function?
@stratascratch2 жыл бұрын
You can choose when to apply a window function vs another method. The nested select is slower because it's another operation inside a bigger operation that SQL will need to perform. It's more of an optimization and performance question when you compare the 2
@kli90053 жыл бұрын
Nice explanation!! Thank you. However I was a little distracted by “eVa”.
@manishbolbanda98723 жыл бұрын
Well explained 👍
@stratascratch3 жыл бұрын
Thank you so much!
@sophiema14333 жыл бұрын
Just thinking, what if there are customers with the same first name, is it better to partition by customer id?
@stratascratch3 жыл бұрын
Yup it is better. I think in a later version of the question, I had an assumption that all names are unique. In reality, partitioning by id is the best practice.
@SandyCoco12 жыл бұрын
The way you explain these sql queries are a 💯 Please what's the symbol used in converting int to float? Is it colon (:)
@stratascratch2 жыл бұрын
It's a double colon (::) for postgres. But for other db engines like MySQL it's cast() -- e.g., cast(column_name as int). You can use cast() for postgres too but most just use :: to save time.
@stratascratch2 жыл бұрын
Also thanks for the kind words!
@SandyCoco12 жыл бұрын
@@stratascratch Thank you for the extra tip. I'm learning on MySQL ❤
@gursimarkaur93402 жыл бұрын
Can this be done using a simple group by?
@stratascratch2 жыл бұрын
the edge here is that a group by will de-dup while a partition by will preserve all rows. So it might work on this dataset but a group by might fail with another dataset.
@jaychopra89322 жыл бұрын
what is the syntax for :: in mysql
@stratascratch2 жыл бұрын
It's a cast function in mysql. So something like cast(data_column as datatype)
@chrislin63084 жыл бұрын
Hi great video explanation on this question, I tried to use ROUND( order_cost / sum(order_cost) over (partition by first_name)::FLOAT , 2 ) to round the percentage to 2 decimal places but it doesn't work. Is there another way to do it?
@stratascratch4 жыл бұрын
It's because of a data type mismatch. Round() expects a decimal dtype but we're passing a FLOAT. If you run this query into the platform (link to the question in the description): SELECT c.first_name, o.order_details, round(((o.order_cost / sum(o.order_cost) over (PARTITION BY c.first_name)::decimal)*100),2) AS percentage_total_cost FROM orders o JOIN customers c ON c.id = o.cust_id This will return what you want.
@stratascratch4 жыл бұрын
Also, any feedback for me? Things I can do better? Is the code viewable? Do you want to see any other type of concepts? I have more window function videos coming soon and some CASE WHENs starting in Jan.
@chrislin63084 жыл бұрын
@@stratascratch You explained it very clearly, very easy to follow. I would love it if you can do a video that categorizes different type of sql questions that frequently appear in the interview. (what are the must-knows)
@stratascratch4 жыл бұрын
@@chrislin6308 Good idea. I actually have that in my queue of videos to make. I'll make that after a few other videos users have requested so stay tuned. I think I can make this video in Jan or Feb. Thanks for the great idea.
@VishalKumar-iv8gl2 жыл бұрын
Which platform you are using?
@stratascratch2 жыл бұрын
This is StrataScratch. The coding is in postgres but MySQL is also available. Python is also available on the platform.
@VishalKumar-iv8gl2 жыл бұрын
@@stratascratch Thanks..I will start practice on your site..😅
@saivarunkolluru3 жыл бұрын
Which one is better and faster, group by solution or window func?
@stratascratch3 жыл бұрын
Probably the group by I would imagine. The window function is a bit like a group by but it might be more complicated if you have functions and partitions in there and would take more time to execute compared to a simple group by.
@margimehta96093 жыл бұрын
Thanks for explaining it really well. I had a quick question: what if there are 2+ identical entries for order_details? In this example, what if Evan purchased a coat twice and slippers twice, still we wanted to sum both of those entries to give just one row per person per order_detail?
@stratascratch3 жыл бұрын
That's a great edge case. Right now there's no way of really knowing that without another column to make each row unique. It would be nice to have a date column to be able to differentiate these rows. That's something that I'd put in there. Or you can add an ID row in the very beginning to separate out all orders.
@Sam-zg4vc3 жыл бұрын
Should we create another partition by which will by on both columns first_name and order_details . That can be then decided with total cost
@pratikparbhane86772 жыл бұрын
Thanks A Lot !!!!!!!!!!!!!!
@sharanchakradhar3 жыл бұрын
Where can I find such questions more to crack ??
@stratascratch3 жыл бұрын
There's over 600+ coding questions on StrataScratch!
@rohankhubchandani36943 жыл бұрын
you made this so simple omg
@stratascratch3 жыл бұрын
Glad you understood it! Thanks!
@nikilkvn3 жыл бұрын
I am curious. why is that 'order_quantity' not included. It would not change the relative percentages. But, definitely changes the percentage value individually. The code when executed was a success by the system. can someone please explain. :( - Nice explanation BTW. Thanks Nate. :)
@stratascratch3 жыл бұрын
Thanks for watching! Would you be able to ask this in the discussion forum in the platform? Someone from my team or myself will help to answer it! Sorry about that, it's easier to handle these questions if they're logged on our side!
@immanuelsuleiman75503 жыл бұрын
Very cool
@stratascratch3 жыл бұрын
Thanks so much man! Let me know if you have any requests.
@malice1122 жыл бұрын
can this be solved with a GROUP BY instead?
@stratascratch2 жыл бұрын
Yes you can but there might be duplicate rows that effect the outcome. That's just something to keep top of mind while coding it up
@NotFound-iu8wx4 жыл бұрын
Hello Nate, I have a question, is it possible to 'loop' in sql
@stratascratch4 жыл бұрын
Yes you can. Here are 2 resources to help you write loops. 1. www.wiseowl.co.uk/blog/s348/loops-in-sql.htm 2. www.sqlshack.com/sql-while-loop-understanding-while-loops-in-sql-server/ But I would recommend just porting the loops to python and doing them there. It's much easier to write loops in python and you can even add automation to anything you're trying to do. I would stick to data manipulation for SQL and do all automation, loops, advanced stuff in python.
@mangoout86832 жыл бұрын
Why we didn’t take order_quantity into calculations? Apart from this solution is really easy if you know how to use partition by function
@96merluzzo3 жыл бұрын
Why did I discover your channel just today? I have an interview tomorrow! :( :'(
@96merluzzo3 жыл бұрын
Great content by the way
@stratascratch3 жыл бұрын
Good luck! You should binge watch all of this! Try out the platform too to get an idea to see if you're ready for the coding portion!
@96merluzzo3 жыл бұрын
@@stratascratch Thanks
@atharnayab35473 жыл бұрын
why not use group by first_name?
@fabio336ful2 жыл бұрын
Why didnt you use group by?
@teaspells99943 жыл бұрын
omg this is pretty hard! I can write some simple queries, but how can I learn the advanced material?
@stratascratch3 жыл бұрын
Keep practicing! Try the StrataScratch platform and others like Leetcode to get better. There's hundreds of hard questions on both platforms. And there's many free questions to try out.
@MashiroRedo2 жыл бұрын
I think the hard part is understanding the data and what they mean by total customer spend. If someone misreads this part it would be difficult.
@stratascratch2 жыл бұрын
Agreed and it's hard to write the question in a clear manner without giving away too much of the solution. On an interview, you can at least ask clarifying questions.
@tarushikhasharma53282 жыл бұрын
where can we practice SQL?
@stratascratch2 жыл бұрын
stratascratch.com is what you see in this video. Give it a try!
@rakeshorrikan3 жыл бұрын
Why don't we use group by?
@stratascratch3 жыл бұрын
If you use a group by with all those columns in the SELECT clause, it will cut the data by all those columns. We're just interested in summing the order cost by user name so we use a window function to bypass the order details and order cost columns in the SELECT. Give it a try using the link in the description. You can code up your solution and see if the solution matches mine. Thanks for watching and appreciate the comments.
@stratascratch3 жыл бұрын
@@hugomichaelnunezmejia3384 Yes that's true. So you'd group by first name and then the order details and percent of total purchases are aggregated by the group by. But to take the percentage you need the granularity of the order details (the numerator of the %) and then the overall total that is found by summing and grouping by first name (the denominator). So you lose the numerator by doing a groupby first name.
@mohit49023 жыл бұрын
There are 2 kinds of people - 1) That use rank over partition by 2) That create a subquery
@stratascratch3 жыл бұрын
Ranking is sometimes faster so if someone asks you about query optimization...use ranking > subquery =)
@rollinas14 жыл бұрын
God damn MVP you are :)
@stratascratch4 жыл бұрын
Thank you Rolandas =). Let me know if there's other concepts/topics you want me to cover! Happy to make them.
@Reemo3 Жыл бұрын
What is this using? :: does not convert the data type
@stratascratch Жыл бұрын
:: is a cast function in postgres. If you're using other sql engines, the function is usually cast(column_name as data_type)
@Reemo3 Жыл бұрын
@@stratascratch very cool and simple, seen it in Python. Hopefully will have some Postgres exposure soon
@alexandersfetcu11212 жыл бұрын
CAST(order_cost AS NUMERIC) / SUM(order_cost) OVER( PARTITION BY first_name) * 100 AS Ratio
@alexandersfetcu11212 жыл бұрын
Easy
@egor.okhterov3 жыл бұрын
Why PARTITION BY instead of GROUP BY?
@stratascratch3 жыл бұрын
Mainly because I wanted to sum all the orders by the user's name with still preserving the information in order details and order cost. Window partition was the only way to do it. Great question.
@jtanakala29762 жыл бұрын
Hi, This is the table called comments_and_translations id comment translation 1 very good 2 good 3 bad 4 ordinary 5 cdcdcdcd very bad 6 excellent 7 ababab not satisfied 8 satisfied 9 aabbaabb extraordinary 10 ccddccbb medium The output should be comment very good good bad ordinary very bad excellent not satisfied satisfied extraordinary medium please help me out with the query.
@emiliogarza64462 жыл бұрын
Edit: Just saw the data changed, mine appears as 20, 125, 60 and yours appears as 20, 100, 60. I tried solving this with a correlated subquery but it gave me different percentages than yours. For example, for Eva your percentages are 56, 11, and 33. Mine appear as 60.98, 9.76, 29.27. If you do manual verification, you'll see that Eva spent $205 total, so the percentages (125, 20, 60) should be 60.98, 9.76, 29.27. Weird thing is that I'm getting the same percentages as you in Farida, what could be the mistake here? This is my solution: SELECT c.first_name, c.id, o1.order_details, ROUND(o1.total_order_cost / (SELECT SUM(total_order_cost) FROM orders o2 WHERE o2.cust_id = o1.cust_id)*100, 2) AS per_total_spent FROM orders o1 JOIN customers c ON c.id = o1.cust_id ORDER BY c.first_name, 100
@stratascratch2 жыл бұрын
You're right; the question and the datasets have been updated in the meantime. There's no error; the data for Farida stayed the same. From what I see, your code should work OK and return the correct solution on the platform.