Data Science SQL Interview Question | Recommendation System | Complex SQL 13

  Рет қаралды 17,674

Ankit Bansal

Ankit Bansal

Күн бұрын

Product recommendation. Just the basic type (“customers who bought this also bought…”). That, in its simplest form, is an outcome of basket analysis. In this video we will learn how to find products which are most frequently bought together using simple SQL. Based on the history ecommerce website can recommend products to new user.
Playlist for complex SQL questions:
• Complex SQL Questions ...
Here is the ready script to practice it:
create table orders
(
order_id int,
customer_id int,
product_id int,
);
insert into orders VALUES
(1, 1, 1),
(1, 1, 2),
(1, 1, 3),
(2, 2, 1),
(2, 2, 2),
(2, 2, 4),
(3, 1, 5);
create table products (
id int,
name varchar(10)
);
insert into products VALUES
(1, 'A'),
(2, 'B'),
(3, 'C'),
(4, 'D'),
(5, 'E');

Пікірлер: 69
@devrajpatidar5927
@devrajpatidar5927 Ай бұрын
Hello sir , nice video Here is my approach :- with cte as ( select o1.product_id as p1,o2.product_id as p2,count(*) as purchase_freq from orders_recm as o1 inner join orders_recm as o2 on o1.order_id = o2.order_id where o1.product_id < o2.product_id group by o1.product_id ,o2.product_id ) select STRING_AGG(name,' ') as product_pair,A.purchase_freq from cte as A inner join products_recm as B on B.id = A.p1 or b.id = A.p2 group by A.p1,A.p2,A.purchase_freq ;
@PrashantSharma-sw2jr
@PrashantSharma-sw2jr 8 ай бұрын
with t1 as ( Select a.order_id,a.customer_id,p1.name as name1,p2.name as name2,(p1.id+p2.id) as pair_sum,monotonically_increasing_id() as idf from orders a inner join orders b on a.order_id = b.order_id and a.product_idb.product_id left join products p1 on a.product_id = p1.id left join products p2 on b.product_id = p2.id ) , t2 as ( Select order_id,customer_id,name1,name2,pair_sum, row_number() over(partition by order_id,pair_sum order by idf asc ) as rnk from t1 ), t3 as ( Select *, concat(name1, ' ',name2) as pair from t2 where rnk=1 ) Select pair,count(distinct order_id) as frequency from t3 group by pair order by 2 desc
@ujjwalvarshney3188
@ujjwalvarshney3188 Жыл бұрын
create temp table orders1 as ( select * from orders a join products b on a.product_id = b.id); select m2 as pair ,count(distinct order_id) as purchase_freq from ( select a.order_id ,a.customer_id , concat( case when a.product_id> b.product_id then b.name else a.name end , case when a.product_id< b.product_id then b.name else a.name end) as m2 from orders1 a join orders1 b on a.order_id = b.order_id and a.customer_id= b.customer_id and a.product_idb.product_id) group by 1 order by 2 desc , 1
@sivasrimakurthi206
@sivasrimakurthi206 Жыл бұрын
the step by step approach to solve a query is amazing, the steps you walk us through to put the thoughts in right direction is wonderful. Thanks for the valuable lesson and keep educating all :-) loved most of your videos as they teach learners in correct direction. Thanks again @AnkitBansal
@ankitbansal6
@ankitbansal6 Жыл бұрын
Glad to hear that 🙏
@shahrukhkhanpatan1707
@shahrukhkhanpatan1707 Жыл бұрын
What id the product ID is not an integer but some varchar. Will the Pr-ID1>Pr-ID2 work? Or any other approach should be taken?
@ankitbansal6
@ankitbansal6 Жыл бұрын
It will work
@shekharagarwal1004
@shekharagarwal1004 2 жыл бұрын
Thanks Ankit and removing the duplicates was tricky .
@amrutaborse2163
@amrutaborse2163 9 ай бұрын
with ord as (select order_id orid,cust cusid,product_id pid,name from products12,order2 where id = product_id) select distinct nm2,cnt from (select a.orid,a.cusid,a.pid, a.name||' '||b.name nm2 , count(*) over(partition by a.name,b.name order by a.name) cnt from ord a,ord b where a.orid = b.orid and a.pid b.pid and a.pid
@haleynguyen5721
@haleynguyen5721 Жыл бұрын
what do you think if the question is not 2 but 3,4,5, ... buy together
@harshSingh-if4zb
@harshSingh-if4zb 2 жыл бұрын
Amazing!! self join questions are always tricky for me :-( .. any suggestion for good self join resources ?
@ankitbansal6
@ankitbansal6 2 жыл бұрын
Check out this video kzbin.info/www/bejne/fWjZaIeQaN6feqs I will suggest after understanding it try yourself once..
@shaikalfina-u4z
@shaikalfina-u4z Жыл бұрын
I have used Lead() to get the next value : with cte1 as( select *,row_number() over(order by order_id,customer_id,product_id) as rn from orders join products on product_id=id ), cte2 as ( select name,lead(name) over(order by rn) as leaded from cte1 ) select concat(name,' ',leaded) as combo,count(*) as prod_freq from cte2 group by concat(name,' ',leaded) having len(concat(name,' ',leaded))>1
@Datapassenger_prashant
@Datapassenger_prashant 5 ай бұрын
amazing problem statement.. thanks for sharing this with solution.
@ankitbansal6
@ankitbansal6 5 ай бұрын
My pleasure
@arthurmorgan9010
@arthurmorgan9010 Жыл бұрын
Sir I tried this way: with cte as ( select o.*,p.name from orders o join products p on o.product_id = p.id ) select CONCAT(a.name,b.name) as va,count(CONCAT(a.name,b.name)) from cte a join cte b on a.product_id < b.product_id where a.customer_id = b.customer_id and a.order_id = b.order_id group by CONCAT(a.name,b.name)
@exanode
@exanode Жыл бұрын
This is how I solved as well
@anushakrishnaa
@anushakrishnaa Жыл бұрын
Hi Ankit here is my solution with cte as( select * from orders o inner join products p on o.product_id =p.id),cte2 as( select * ,LEAD(name,1) over( partition by customer_id order by (select null))next_name from cte), cte3 as( select * ,CONCAT(name,next_name) d from cte2) select d, COUNT(d)count_value from cte3 where next_name is not null group by d
@HuzaifaKhan-od6mz
@HuzaifaKhan-od6mz 8 ай бұрын
Hi , Any idea how will do suppose a user gives a product_id and according to that recommendation top 5 should pop up.
@adityakishan1
@adityakishan1 Жыл бұрын
with tb1 as( select A.order_id, A.product_id product1_id,B.product_id product2_id from orders A, orders B where A.order_id = B.order_id and A.product_id < B.product_id ) select B.name +' '+ C.name product_pair, count(1) as purchase_freq from tb1 A, products B, products C where A.product1_id = B.id and A.product2_id = C.id group by B.name +' ' + C.name
@rrohit1713
@rrohit1713 2 жыл бұрын
Hey btw could you suggest me a course from where I could learn SQL and my primary goal is hands on learning , while building and playing with it . I couldn't find a decent course anywhere . Everywhere they are only focussing on specific topics and it's just basics mostly.
@ankitbansal6
@ankitbansal6 2 жыл бұрын
Thanks for asking. I need to check on this . Will let you know.
@nivadonga6980
@nivadonga6980 Жыл бұрын
with joined_table as (select o.*, pr.name from orders o LEFT JOIN products pr ON o.product_id=pr.id order by order_id, product_id) select concat(name_A,'',name_B) as product_pair, count(1) as frequency from( select a.name as name_A, b.name as name_B from joined_table a inner join joined_table b on a.order_id=b.order_id where a.product_id
@SudhirKumar-rl4wt
@SudhirKumar-rl4wt 2 жыл бұрын
Thanks for the question and solution.
@taniyasaini6830
@taniyasaini6830 2 жыл бұрын
Is it a good idea to do group by on names? because two products with different ids can have the same name.
@ankitbansal6
@ankitbansal6 2 жыл бұрын
Since we need product names in output we did it. In real time you will just have to store IDs
@anirvansen2941
@anirvansen2941 10 ай бұрын
MYSQL Solution with recommendation as ( select o.*,p.name from orders o inner join products p on o.product_id = p.id), order_count as ( select *,count(1)over(partition by order_id) as cnt from recommendation), base as ( select customer_id,name from order_count where cnt > 1 ) select concat(b1.name,b2.name) as pair,count(1) as frequency from base b1 inner join base b2 on b1.customer_id = b2.customer_id and b1.name < b2.name group by concat(b1.name,b2.name)
@praveensinghrathore4542
@praveensinghrathore4542 2 жыл бұрын
WOW, mind = blown, this is so good
@ankitbansal6
@ankitbansal6 2 жыл бұрын
Thanks Praveen 😄
@vikaskasaraneni6111
@vikaskasaraneni6111 2 жыл бұрын
I have tried the same but not getting the same in oracle console.. Receiving duplicate records by doing self join
@ankitbansal6
@ankitbansal6 2 жыл бұрын
For removing duplicates you need to put that condition P1>P2
@PranayDutta18
@PranayDutta18 Жыл бұрын
Hi Ankit, How to remove duplicate rows of products if the product_id is alphanumeric ? @6:07
@ajayk9605
@ajayk9605 Жыл бұрын
We can use greater than for alphabets also, it will impliment it by alphabatical order
@shankrukulkarni3234
@shankrukulkarni3234 Жыл бұрын
what is this bro I adicted to your real time senario sql complex queries😜
@Prasad1487
@Prasad1487 12 күн бұрын
Ankit == Ultimate thats it
@ayushshubhankar858
@ayushshubhankar858 2 жыл бұрын
I tried this...please anyone tell if its correct or not:- select distinct(concat(p1.name, p2.name)) as nm, count(*) as freq from orders o1 join orders o2 on o1.order_id= o2.order_id join products p1 on p1.id= o1.product_id join products p2 on p2.id= o2.product_id where o1.product_id< o2.product_id group by 1
@2412_Sujoy_Das
@2412_Sujoy_Das 11 ай бұрын
Key Takeaway: CROSS JOIN is equivalent to INNER JOIN b/w same table on same column; WITH CTE_1 AS (Select A.order_id, A.customer_id, concat(C.name,' ',D.name) as pair from orders A CROSS JOIN orders B INNER JOIN products C ON A.product_id = C.id INNER JOIN products D ON B.product_id = D.id WHERE A.product_id < B.product_id AND A.order_id=B.order_id) Select pair, count(pair) as freq FROM CTE_1 GROUP BY pair;
@garvitchaudhary4499
@garvitchaudhary4499 10 ай бұрын
how do u people solve these with different solutions. I cant do any one of these hard ones without looking at solution. whats the secret?
@2412_Sujoy_Das
@2412_Sujoy_Das 10 ай бұрын
@@garvitchaudhary4499 Even I faced the similar issue too, but I followed this approach: 1) Try it yourself first - This gives you the power to think of the ways you can use a syntax properly 2) After being unsuccessful, see the solution and jot it down in your notebook 3) Literally revise them once done - There's no shame in doing so!!!
@AVIRALBHARDWAJD
@AVIRALBHARDWAJD 2 жыл бұрын
Jhakaaasss bhai thanks
@ankitbansal6
@ankitbansal6 2 жыл бұрын
:)
@vijaypalmanit
@vijaypalmanit 2 жыл бұрын
Superb !
@ankitbansal6
@ankitbansal6 2 жыл бұрын
Thanks 🙏
@arpanscreations6954
@arpanscreations6954 4 ай бұрын
My solution before watching the video: with orders2 as ( select o.order_id, p.name as product_name from orders o inner join products p on o.product_id = p.id ) , pairs as ( select distinct o.order_id, LEAST(o.product_name, o2.product_name) AS P1, GREATEST(o.product_name, o2.product_name) AS P2, CONCAT(LEAST(o.product_name, o2.product_name) , GREATEST(o.product_name, o2.product_name)) as pair from orders2 o inner join orders2 o2 on o.order_id = o2.order_id and o.product_name o2.product_name ) select pair, count(1) as freq from pairs group by pair
@nikhitamaruvada9385
@nikhitamaruvada9385 8 ай бұрын
with cte as ( select o.*, p.name, lead(name,1) over (partition by order_id) as leed from orders o join products p on o.product_id = p.id ), cte2 as ( select *, lead(leed,1) over (partition by order_id) as leed1 from cte ), cte3 as ( select *, concat(name,leed) as prod , concat(name,leed1) as prod1 from cte2 where leed is not null ), cte4 as ( select prod as products from cte3 UNION all select prod1 as products from cte3 ) select products , count(products) as cnt from cte4 where length(products) = 2 group by products
@nihal2725
@nihal2725 2 жыл бұрын
good method
@florincopaci6821
@florincopaci6821 2 жыл бұрын
Thank you!!!
@ankitbansal6
@ankitbansal6 2 жыл бұрын
Hope it is clear now 🙂
@florincopaci6821
@florincopaci6821 2 жыл бұрын
@@ankitbansal6 Yes Thank you so much! All the best!
@piyushjoshi3910
@piyushjoshi3910 Ай бұрын
with cte as ( select id,order_id,name,isNull(lag(name) over(partition by order_id order by name),LAST_VALUE (name) over(partition by order_id order by name rows between current row and unbounded following))as l from orders2 join products on orders2.product_id = products.id ), cte1 as( select case when name != l then CONCAT_WS(' ',name,l) else null end as concatcols from cte) select *,count(concatcols) as countcols from cte1 group by concatcols having count(concatcols) != 0 order by countcols desc ; this is another approach
@shwetasaini6892
@shwetasaini6892 Жыл бұрын
My version with cte as ( select product_id, added_product, p.name as product_name, f.name as added_product_name, count(*) as frequency from ( select s.*, r.product_id as added_product from orderss s join orderss r on s.order_id = r.order_id and s.product_id < r.product_id )as x left join products p on x.product_id = p.id left join products f on x.added_product = f.id group by product_id, added_product, p.name, f.name ) select concat(product_name, added_product_name) as products, frequency from cte
@tupaiadhikari
@tupaiadhikari 2 жыл бұрын
I am a Machine Learning Engineer, and was very curious to solve this. This is my solution before watching your video, but I took learnings from your another video : Self Join Powerful ``` WITH cte AS ( SELECT a.*, b.name FROM ordersDS AS a JOIN productsDS AS b WHERE a.product_id=b.id ) SELECT -- c.order_id, -- c.name AS ITEM1, -- d.name AS ITEM2, CONCAT(c.name," ",d.name) AS ItemList, COUNT(CONCAT(c.name," ",d.name)) AS TotalCount FROM cte AS c JOIN cte AS d ON c.order_id=d.order_id AND c.named.name AND c.name
@ankitbansal6
@ankitbansal6 2 жыл бұрын
Awesome
@tastebuds6762
@tastebuds6762 11 ай бұрын
Here's my solution with orders_cte as ( select order_id, customer_id, product_id, name from orders o join products p on o.product_id = p.id ) ,final_cte as ( select name, prd_nm, count(*) as purchase_freq from ( select o1.order_id, o1.customer_id, o2.name, case when o2.product_id < o1.product_id then o1.name else null end as prd_nm from orders_cte o1 join orders_cte o2 on o1.order_id = o2.order_id and o1.customer_id = o2.customer_id) where prd_nm is not null group by name, prd_nm) select name||prd_nm as pair, purchase_freq from final_cte order by 1;
@130_vanshika3
@130_vanshika3 24 күн бұрын
with my_cte as ( select a.order_id,a.product_id as first,b.product_id as second from orders as a join orders as b on a.order_id = b.order_id where a.product_id != b.product_id order by a.order_id, a.product_id ) ,cte2 as ( select * from my_cte where first
Family Love #funny #sigma
00:16
CRAZY GREAPA
Рет қаралды 33 МЛН
这是自救的好办法 #路飞#海贼王
00:43
路飞与唐舞桐
Рет қаралды 102 МЛН
A/B Testing Interview with a Google Data Scientist
13:06
Jay Feng
Рет қаралды 38 М.
Data Analyst Spotify Case Study | SQL Interview Questions
22:52
Ankit Bansal
Рет қаралды 21 М.
An Awesome Tricky SQL Logic | Complex SQL 10
7:05
Ankit Bansal
Рет қаралды 30 М.
Learn SQL In 60 Minutes
56:24
Web Dev Simplified
Рет қаралды 2,1 МЛН
Family Love #funny #sigma
00:16
CRAZY GREAPA
Рет қаралды 33 МЛН