I'd like to thank the author. You really do a great job. Everything is structured, decomposed and coherent. Some guys just jump into complex coding without really explaining what's going on.
@gardnmi 4 years ago
The best new feature with merge is the validate option to make sure your join is 1:1, 1:M, etc. This is very useful for machine learning projects or end user reports that rely on upstream data that is updated regularly. It's saved me headaches a few times.
@dataschool 4 years ago
The "validate" option is great, I agree! I also like "indicator", which I explained here: twitter.com/justmarkham/status/1153653794829418496
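A minimal sketch of the two merge options mentioned above, validate and indicator, on made-up data. The DataFrames and column names here are assumptions for illustration; only the parameter names come from pandas.

```python
import pandas as pd

# Hypothetical toy data: each movie appears once, ratings can repeat a movie.
movies = pd.DataFrame({"movie_id": [1, 2, 3], "title": ["A", "B", "C"]})
ratings = pd.DataFrame({"movie_id": [1, 1, 2, 4], "rating": [5, 3, 4, 2]})

# validate raises pandas.errors.MergeError if the stated relationship is violated.
# The left keys are unique here, so "one_to_many" passes.
merged = pd.merge(movies, ratings, how="outer",
                  validate="one_to_many", indicator=True)

# The _merge column reports whether each row came from left, right, or both.
print(merged["_merge"].value_counts())

# Claiming a 1:1 join fails because movie_id 1 appears twice in ratings.
try:
    pd.merge(movies, ratings, validate="one_to_one")
    one_to_one_ok = True
except pd.errors.MergeError:
    one_to_one_ok = False
```

Running the validated merge as the first step of a report is one way to catch upstream duplicates early instead of discovering them in the output.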
@milindbebarta2226 2 years ago
You are really good at explaining things. One of the better teachers on youtube. Thanks a ton for this video and I hope there's more coming.
@dataschool 2 years ago
Thank you!
@jonass7456 3 years ago
Dude! Let me tell you, you saved me a lot of time and work! Thank you so much!
@dataschool 3 years ago
Great to hear!
@themustknowfacts510 3 years ago
I'm not able to read the file "u.item". I copied the same code from GitHub, but pandas wasn't able to read it; it showed a Unicode error. How do I solve this?
@ChrisMao_708 3 years ago
Add encoding='latin-1' to the call that reads the file and you will be fine.
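A small sketch of the encoding fix suggested above. The file contents are a hypothetical stand-in for u.item (pipe-separated, Latin-1 encoded), written to a temporary directory so the example is self-contained; the column names are assumptions.

```python
import os
import tempfile
import pandas as pd

# Hypothetical stand-in for u.item: pipe-separated, Latin-1 encoded bytes.
raw = "1|Toy Story (1995)\n2|Les Misérables (1995)\n".encode("latin-1")

# Decoding these bytes as UTF-8 fails, which is the root of the Unicode error.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    print("UTF-8 decoding fails:", err.reason)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "u.item")
    with open(path, "wb") as f:
        f.write(raw)

    # Telling read_csv the real encoding lets it decode the accented character.
    movies = pd.read_csv(path, sep="|", header=None,
                         names=["movie_id", "title"], encoding="latin-1")

print(movies)
```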
@citizen_deb 4 years ago
Thank you so much, Kevin. Your neat explanation, along with the file you shared, makes it so clear. I really needed it!
@NiireNolweva 3 years ago
Very clear and informative. Thank you very much.
@dataschool 3 years ago
You're very welcome!
@jaysoni7812 4 years ago
Where is the link to the dataset used in this video? I want to practice with your dataset. Can you please send me the link?
@BC-gc7bv 4 years ago
You are an excellent teacher!!! I'm a fan. TY.
@tommonks2490 4 years ago
Excellently explained as always. Keep up the great work!!
@dataschool 4 years ago
Thank you!
@SamSam-mh5jt 3 years ago
Thank you so much for the clear and concise explanation
@dataschool 3 years ago
You're welcome!
@joseluisbeltramone599 3 years ago
Thank you very much for the precise explanation, just what I needed to know!
@dataschool 3 years ago
You're very welcome! 🙏
@nowyouknow2249 4 years ago
Thanks a lot, Kevin. We have missed you.
@dataschool 4 years ago
Thank you! 😊
@SR-lf3ic 2 years ago
Hi, when I used pd.concat([df1, df2]), I got a tuple object instead of a DataFrame. I am using a Python 3.9 environment. What should I do to get a DataFrame rather than a tuple?
@zapy422 4 years ago
Thank you for this video. I have been struggling with merge and concat today :)
@dataschool 4 years ago
You're very welcome! Glad it's helpful to you!
@fschmidkonz 3 years ago
You're a great teacher! I see that despite having a large 100K-row file, the number of rows does not expand after the merge. It stays the same, and the merge just adds the movie titles to the reviews. Can you comment on why this is not always the case? I have tried it, and my output gets expanded by a few rows (17 out of 1000), and I have not been able to figure out why. I have checked multiple videos; some offer impractical solutions (like requiring the files to be the same size) or arbitrarily eliminate any dups (even though some may be valid rows), but none explain the reason or how to identify the rows that could be dups. Your comments are appreciated.
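One common cause of a merge growing by a few rows is a key that occurs more than once in the lookup table. The sketch below reproduces that on made-up data and lists the offending rows; the table and column names are assumptions, not the commenter's actual files.

```python
import pandas as pd

# Hypothetical data: key 2 is duplicated in the lookup table.
reviews = pd.DataFrame({"movie_id": [1, 2, 3], "rating": [5, 4, 3]})
titles = pd.DataFrame({"movie_id": [1, 2, 2, 3],
                       "title": ["A", "B", "B (dup)", "C"]})

# Every review for movie 2 matches two title rows, so the result grows.
merged = pd.merge(reviews, titles, how="left")
print(len(reviews), "rows before,", len(merged), "rows after")

# keep=False flags every occurrence of a repeated key, not just the later ones.
dups = titles[titles["movie_id"].duplicated(keep=False)]
print(dups)
```

Inspecting dups before merging shows which keys are repeated, so you can decide per key whether the extra rows are valid or should be removed.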
@AsMa-eg 3 years ago
Thank you so much. Very clear and to the point.
@dataschool 3 years ago
You're welcome!
@Isabel-ec2sq 4 years ago
Thank you!! I finally got the dataframe I wanted!
@zezodiaa1025 2 years ago
Great video. My question is: when I'm working on a project, when exactly do I have to combine DataFrames?
@Moc2Talk 3 years ago
Your slow speaking pace is very helpful to me. I have two questions. First: what if I want to merge only one column (rating) from the ratings DataFrame into the movies DataFrame? Second: what if I want to sum the ratings for each Movie_Id? Thank you so much; I'm looking forward to your answer.
@ayodejiakinfenwa 1 year ago
Please, I am trying to merge two datasets as you have explained, but it is giving an error saying that I should check for duplicates.
@christleiroezi8878 4 years ago
I have a DataFrame, a list, and a tuple, and I want to merge all three together. I am aware that merge can only combine two tables at a time, but do you have any helpful hints on how to go about merging the DataFrame, the list, and the tuple? I want the result to be a new DataFrame.
@mohammadj.shamim9342 4 years ago
Dear Kevin, I am having some difficulty fine-tuning PLSRegression (sklearn.cross_decomposition.PLSRegression). Could you please cover this topic one day?
@dataschool 4 years ago
Thanks for your suggestion!
@michael3226 2 years ago
The resulting dataset I got has null values. What do I do?
@saikiranhr 2 years ago
Thanks for the amazing video. One simple question: how do I join tables on multiple indices (like 4 or 5)?
@tirtha9 3 years ago
Let's say a pandas df and a MySQL table have columns A, B, C and the same schema, and column A in SQL is the primary key. How do I upsert the pandas df into the MySQL table? When the primary key conflicts, update the remaining columns; when it doesn't exist, do an INSERT INTO. What's the most efficient way to do this?
@AnoNymous-dh2sv 2 years ago
What's the concat video? You say there is one, but I can't find it with search.
@dataschool 2 years ago
It's at the end of this video: kzbin.info/www/bejne/Z2bUXpypbbWSfpY Hope that helps!
@bommubhavana8794 3 years ago
I am a beginner in Python, and I am not sure which join is best to use in different scenarios. Can you help me with it? I have genuinely learnt a lot from your videos. I would really appreciate your help. Thank you in advance.
@pradeepkapoor355 3 years ago
Thanks for putting up some amazing content on pandas data manipulation and analysis. Can you please make a video on how to get the unmatched rows after performing a join/merge? In real-world scenarios, there are often unmatched records from two DataFrames that need to be checked for a match in a third DataFrame. So please help by explaining this piece as well.
@dataschool 3 years ago
Thanks so much for your suggestion! I'll consider it for the future.
@pradeepkapoor355 3 years ago
@dataschool Looking forward to it.
@lualmeidasouza 4 years ago
How do I merge df1 and df2 on two columns (fields) in the on clause? For example: dfUltStatus = pd.merge(dfUltStatus, dfDescStatus, on=['CODIGO_STATUS', 'SUB_CODIGO_STATUS'], how='left'). The goal is to merge the two DataFrames through these two fields to bring in the description field.
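A sketch of the two-column merge from the comment above, run on made-up rows. Only the DataFrame names and the two key column names come from the comment; PEDIDO, DESCRICAO, and all values are hypothetical.

```python
import pandas as pd

# Hypothetical rows; only the two key column names come from the comment.
dfUltStatus = pd.DataFrame({
    "CODIGO_STATUS": [1, 1, 2],
    "SUB_CODIGO_STATUS": [10, 20, 10],
    "PEDIDO": ["a", "b", "c"],
})
dfDescStatus = pd.DataFrame({
    "CODIGO_STATUS": [1, 1],
    "SUB_CODIGO_STATUS": [10, 20],
    "DESCRICAO": ["open", "closed"],
})

# A row matches only when BOTH key columns are equal.
# how="left" keeps unmatched left rows and fills DESCRICAO with NaN.
result = pd.merge(dfUltStatus, dfDescStatus,
                  on=["CODIGO_STATUS", "SUB_CODIGO_STATUS"], how="left")
print(result)
```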
@job2k6 3 years ago
Very helpful, thank you.
@dataschool 3 years ago
You're welcome!
@WaqasAhmed-om8ph 4 years ago
Sir, I hope that you and your family are well and healthy. I have two questions, if you have time to answer. (1) pandas has a lot of functions, and each function has many parameters. How do I remember all the functions and their parameters? (2) Although I practice, every new exercise has a lot of bugs waiting for me. Thank you!
@dataschool 4 years ago
Practice! That's all there is to it.
@WaqasAhmed-om8ph 4 years ago
@dataschool Thanks!
@ДмитрийИгнатьев-з5т 4 years ago
Hello, many thanks for your tutorial. It's great! But I'm stuck: is there any technique to join two DataFrames if one of them is stacked and the other is not?
@joshuabarragan8414 1 year ago
I need help
@codewithluq 3 years ago
Hi Kevin, I have a troublesome question. I am analyzing a dataset that is entirely textual. I want to grade certain text in each column by adding a new Grading column for each existing column. I have achieved this with a for loop, but I can't save the DataFrame it creates because each iteration overwrites the previous one. I need help. Code of the for loop:
for (ColumnName, ColumnData) in b_questions.iteritems():
    b_questions['Grading'] = b_questions[ColumnName].map({'Consistently Good': 4, 'Outstanding': 5, 'Satisfactory': 3})
    data = b_questions.loc[:, [ColumnName, 'Grading']]
    print(data)
@dataschool 3 years ago
If I'm understanding your question, I think you just need to run this one line of code: b_questions['Grading'] = b_questions['Insert column name here'].map({'Consistently Good':4,'Outstanding':5,'Satisfactory':3}) Hope that helps!
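If the goal is one grading column per question rather than a single Grading column, a loop that writes to a differently named column on each pass avoids the overwriting. This is a sketch under that assumption; the column names Q1 and Q2 and the "_Grading" suffix are made up. It iterates over the column names directly, since DataFrame.iteritems was removed in pandas 2.0.

```python
import pandas as pd

grades = {"Consistently Good": 4, "Outstanding": 5, "Satisfactory": 3}

# Hypothetical survey answers; the column names are made up.
b_questions = pd.DataFrame({
    "Q1": ["Outstanding", "Satisfactory"],
    "Q2": ["Consistently Good", "Outstanding"],
})

# Snapshot the original column names, then add one new column per question,
# so no iteration overwrites the result of a previous one.
for col in list(b_questions.columns):
    b_questions[col + "_Grading"] = b_questions[col].map(grades)

print(b_questions)
```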
@shaheenalhirmizy9648 4 years ago
Hi Kevin, how are you doing? Is there any way, using pandas or another library, to do conditional merging, if I want to choose from two datasets? Thank you very much.
@dataschool 4 years ago
Could you describe in more detail what you mean by "conditional merging"? Thanks!
@shaheenalhirmizy9648 4 years ago
I mean, if we have two different tables that have the same number of columns and we want to merge them, but not all the data, only the rows we want, selected using conditional formulas.
@dataschool 4 years ago
You should perform the operation in two steps: first do the filter, and then do the merge.
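The two-step approach described above, filter first and then merge, might look like this. The tables, column names, and the threshold of 100 are all hypothetical.

```python
import pandas as pd

# Hypothetical tables.
orders = pd.DataFrame({"cust_id": [1, 2, 3], "amount": [50, 500, 900]})
customers = pd.DataFrame({"cust_id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})

# Step 1: filter the rows you care about with a boolean condition.
big_orders = orders[orders["amount"] > 100]

# Step 2: merge only the filtered rows.
result = pd.merge(big_orders, customers, on="cust_id")
print(result)
```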
@SM-ie7ge 4 years ago
Thanks for another great video. How do we join on multi-index?
@cvishnuteja597 4 years ago
Hi, can you please post a video on reading a large real-world CSV file with millions of rows using chunks or Modin, and on how to merge those chunks after importing them into pandas?
@dataschool 4 years ago
Thanks for your suggestion! FYI, if your computer does not have enough RAM to load a large DataFrame into memory, reading the DataFrame in chunks will not solve that problem. It will be just as large once you merge the chunks back together (which you can do using the "concat" function.)
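As the reply notes, gluing every chunk back together gives a DataFrame just as large as the original. Chunking can still help when each chunk is reduced before concatenation. A minimal sketch, assuming an in-memory CSV in place of a real file and an arbitrary filter:

```python
import io
import pandas as pd

# Hypothetical CSV held in memory; in practice this would be a file path.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 10}" for i in range(10))

# chunksize makes read_csv return an iterator of DataFrames.
chunks = pd.read_csv(io.StringIO(csv_text), chunksize=4)

# Shrink each chunk (here: keep even ids) BEFORE gluing the pieces together.
kept = [chunk[chunk["id"] % 2 == 0] for chunk in chunks]

# ignore_index=True rebuilds a clean 0..n-1 index for the combined result.
df = pd.concat(kept, ignore_index=True)
print(df)
```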
@yassaryelurkar3631 2 years ago
Where do I add column names?
@wilsonmupfururirwa6523 4 years ago
Hi, I wanted to ask how you check for data consistency in columns, like checking for integers in a string column or trying to find values like 2A in a column with double-letter values, e.g. AA, BB, etc.
@dataschool 4 years ago
Great question, though there's no "one way" to catch all of these issues! Here are some tricks that might be helpful, though: kzbin.info/www/bejne/iJ2smombnsxmnsU
@veddev8493 4 years ago
Can you upload a tutorial on Dask DataFrames, since Dask is necessary for working with large datasets, or any tutorial on PySpark?
@dataschool 4 years ago
Thanks so much for your suggestion!
@diegorosa2292 3 years ago
This is the video I was looking for, thank you so much; very well explained. Just one question: when you join on two keys with different names, I noticed that, unlike in your first example (where the ID column had the same name in both tables), both key columns show up in the result. In my case I have "Subj ID" and "ID", which are the same except for the name. When I use pd.merge(db1, db2, left_on="Subj ID", right_on="ID"), both keys show up as two separate columns (and this happens in your tutorial as well). Is there a way to remove one? At the moment I have an extra column that shows the same key. Thank you very much!
@Octaphea 2 years ago
Hey have you figured this out?
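Two ways to end up with a single key column after a left_on/right_on merge, sketched on made-up data. The frame and key names follow the question above; the age and score columns are assumptions.

```python
import pandas as pd

# Hypothetical frames using the key column names from the question.
db1 = pd.DataFrame({"Subj ID": [1, 2, 3], "age": [30, 41, 25]})
db2 = pd.DataFrame({"ID": [1, 2, 3], "score": [0.5, 0.7, 0.9]})

# Option 1: merge, then drop the redundant key column.
merged = pd.merge(db1, db2, left_on="Subj ID", right_on="ID").drop(columns="ID")

# Option 2: rename first so both keys share one name, then merge on it.
merged2 = pd.merge(db1, db2.rename(columns={"ID": "Subj ID"}), on="Subj ID")

print(merged)
```

Both options give the same result for an inner join like this one. With an outer join the two key columns can differ on unmatched rows, so renaming first is usually the safer choice there.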
@ranjithphd516 3 years ago
In Python, how do I turn row data into columns?
@tanmaysinghi1868 2 years ago
Thanks for the content. I'd appreciate it even more if you taught at a quicker pace; playing the video at 1.25x makes it better.
@jochenbrosien9556 4 years ago
Kevin, I like how easy you make it look. But here's my question: after watching, I tried to apply the knowledge. I have df1 with 3 columns and dtype='object', and df2 with 7 columns and dtype='object'. When I apply pd.merge(df1, df2) and check .shape, I only get the column headers, no rows. What am I doing wrong?
@calluma8472 3 years ago
This means you don't have any matching data between the two DataFrames. Sounds like you are looking for pd.concat, which just blindly glues them together.
@smstoaj 4 years ago
I need help solving a problem. Assume two DataFrames:
df1 = pd.DataFrame({'Text': ['Some text 1', 'Some text 2', 'The monkey eats a banana', 'Some text 4']})
df2 = pd.DataFrame({'Keyword': ['apple', 'banana', 'chicken'], 'Type': ['fruit', 'fruit', 'meat']})

df1
   Text
0  Some text 1
1  Some text 2
2  The monkey eats a banana
3  Some text 4

df2
   Keyword  Type
0  apple    fruit
1  banana   fruit
2  chicken  meat

The preferred outcome would be:
   Text                      Type
0  Some text 1               -
1  Some text 2               -
2  The monkey eats a banana  fruit
3  Some text 4               -

The problem, however, is that banana appears inside a sentence, not as a standalone value. Thanks in advance.
@yufeizheng5149 4 years ago
May I ask how to use "on"? Thank you!
@karakol86 4 years ago
Can you do a video about group by and agg?
@IntotheLloyd 4 years ago
I very rarely join on indexes, as most data I analyze already has a unique identifier in the core table that I keep left joining to. At a high level, I understand how indexes work and what an index is, but I was just wondering if anybody has a practical reason why you would join on an index?
@andreacazzaniga8488 4 years ago
The groupby gives you an indexed df where the index is the field of the groupby. If you are crazy enough you can skip resetting the index and keep working with the indexed df. Save a line of code, fuck up an entire codebase.
@dataschool 4 years ago
Some people like to put a unique (and meaningful) identifier in the index, whereas other people prefer it as a column. If you prefer the former, then it's most natural to join on an index. Does that make sense?
@IntotheLloyd 4 years ago
Andrea Cazzaniga I always just add .reset_index() at the end of my groupby and merge it back to my dataframe.
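The two styles discussed in this sub-thread, joining against the groupby index or resetting it first, can be compared side by side. A sketch on made-up ratings; the column names and the mean_rating label are assumptions.

```python
import pandas as pd

# Hypothetical ratings table.
ratings = pd.DataFrame({"movie_id": [1, 1, 2], "rating": [5, 3, 2]})

# groupby puts movie_id in the index of the aggregated result.
mean_by_movie = ratings.groupby("movie_id").agg(mean_rating=("rating", "mean"))

# Either join the left column directly against that index...
via_index = pd.merge(ratings, mean_by_movie,
                     left_on="movie_id", right_index=True)

# ...or turn the index back into a column first and merge on it.
via_column = pd.merge(ratings, mean_by_movie.reset_index(), on="movie_id")

print(via_column)
```

Both routes attach the same per-movie mean to every rating row, so the choice is mostly about which form the rest of your code expects.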
@saragordon6902 3 years ago
Could you do a video on how to compare two columns from two Excel files in pandas to see if they match and, if they do, add a column called matches to the first Excel file with results of true or false?
@barefootalex 3 years ago
df['match_column'] = df['col1'] == df['col2']
df['match_column'] = df['match_column'].apply(lambda x: "Match" if x else "No match")
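A runnable sketch of the column comparison above. It assumes both columns already sit in one DataFrame with rows aligned (for example after loading each sheet with pd.read_excel and merging on a shared key); the data and the match_label column name are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the two Excel columns, already aligned.
df = pd.DataFrame({"col1": ["a", "b", "c"], "col2": ["a", "x", "c"]})

# Boolean flag: True where the two columns agree row by row.
df["matches"] = df["col1"] == df["col2"]

# Optional text label derived from the flag.
df["match_label"] = np.where(df["matches"], "Match", "No match")

print(df)
```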
@maamounhajnajeeb209 2 years ago
Thank you very much
@dataschool 2 years ago
You're welcome!
@sreecharandyaga7577 4 years ago
Legend @16:26
@fanwang6279 5 months ago
Good stuff
@dataschool 5 months ago
Thanks!
@_rsk_ 4 years ago
Thanks a lot for the video, Kevin. It helped me understand pandas merge better. The pandas documentation doesn't mention pd.merge(df1, df2) and suggests using df1.merge(df2). What's the difference between the two? Also, there are two references for merge in the pandas documentation: 1. pandas.merge (pandas.pydata.org/pandas-docs/stable/reference/api/pandas.merge.html#pandas.merge) and 2. pandas.DataFrame.merge (pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html#pandas.DataFrame.merge). While the contents and examples in both are the same, it creates confusion :-(
@dataschool 4 years ago
I agree that it's confusing! The first page that you linked to, pandas.merge, is pd.merge. That's because pd is just an alias for pandas. The second page you linked to, pandas.DataFrame.merge, is the DataFrame merge method. The first is a top-level function, and the second is a DataFrame method. The pandas documentation doesn't recommend one or the other, but I definitely recommend the first. Hope that helps!
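A quick check, on made-up frames, that the top-level function and the DataFrame method described above produce the same result when given the same arguments:

```python
import pandas as pd

# Hypothetical frames.
df1 = pd.DataFrame({"key": [1, 2], "a": ["x", "y"]})
df2 = pd.DataFrame({"key": [2, 3], "b": ["p", "q"]})

top_level = pd.merge(df1, df2, how="outer", on="key")  # function form
method = df1.merge(df2, how="outer", on="key")         # method form

# DataFrame.equals compares values, dtypes, and labels.
print(top_level.equals(method))
```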
@_rsk_ 4 years ago
@dataschool Thanks Kevin. That helps.
@islamalmsarrhad2152 2 years ago
thank you
@dataschool 2 years ago
You’re welcome!
@JoshDan12 4 years ago
... K!
@jeongbaekim3610 4 years ago
Love your videos; they are very helpful to me in learning pandas, and thank you for your consistent updates. After writing this I found that your link above (CODE FROM THIS VIDEO: nbviewer.jupyter.org/github/j... ) is broken. Could you please fix it?
@dataschool 4 years ago
Thanks for your kind words! I see that the nbviewer service is currently having problems rendering new notebooks, so you can use this link instead to access the notebook: github.com/justmarkham/pandas-videos/blob/master/pandas_merge.ipynb
@summerzhang9484 4 years ago
Thanks for the videos, Kevin! I love your teaching style and how you make each concept so crystal clear. Please keep making these videos! I just signed up to become a patron of yours and am taking your course on DataCamp (I wish you taught more courses there!). Once I master pandas, I will try out your machine learning course too :) P.S. Your son is so adorable.
@dataschool 4 years ago
You are too kind, Summer! Thank you SO much for your kind words AND for becoming a patron! 🙌
@rawanfouda2291 4 years ago
That was honestly really good! thank you so much for your work
@lolkids7833 4 years ago
Thanks, Kevin.. this is the clearest explanation of the merge I have seen.
@dataschool 4 years ago
Thank you so much!
@vitoroliveira6363 4 years ago
Wonderful! I loved your slow-paced English; that helped me a lot.
@dataschool 4 years ago
Glad it helped!
@mschuer100 4 years ago
This, by far, is the best explanation of these concepts. Thanks for sharing.
@dataschool 4 years ago
Wow, thank you so much for your kind words! 🙏
@dannylockett9445 2 years ago
I really enjoy your tutorials, thanks so much! I have 5 CSV files that come out daily, each containing a date column. I want to merge them all using the date as the merge field. I tried a basic merge with 2 of the CSV files, and date was used as the merge-on field by default, so it worked. Ultimately I just need one date column in my master file with all the other column data merged. Should I continue to do this, or is it better to set the date column as the index, or something else?
@autonish 4 years ago
Brilliant stuff, all videos are awesome. All fundamentals clearly explained. Thanks for making this stuff easy. On a different note, you remind me of "Sheldon" from the TV series The Big Bang Theory, and this is a compliment. :)
@dataschool 4 years ago
Ha! So many people have said that 😄
@alndr4u 1 year ago
How do I merge two DataFrames based on 4 common columns with repetitive elements?
@LuisRivera-ce9lm 2 years ago
I just wanted to thank you for such a great explanation of joins. I did not have it explained to me and struggled for the longest time to understand them. It takes a good teacher and someone who can understand it simply for one to understand it. Seriously, you are amazing!!
@dataschool 2 years ago
Thank you so much! 🙏
@da_ta 4 years ago
Thanks Kevin I have been looking for this for long time!
@dataschool 4 years ago
Awesome! I'm so glad to hear this is the video you needed! 🙌
@JustJoelTV 2 years ago
Great video, informative and clear. Thanks
@dataschool 2 years ago
You're welcome!
@Octaphea 2 years ago
Great video. However, I have a little issue. I have 3 DataFrames that I am trying to merge together. The first is a pretty long database with columns (cust_id, gained_on, gained_from_supplier, lost_to_supplier, sales_channel_id). The second is the supplier DataFrame (supplier_name, supplier_id). What I am trying to do is merge the supplier name from the second DataFrame into the database DataFrame, matching on the supplier ID using left_on/right_on, but instead it returns both columns: the supplier ID and the name from both DataFrames. Then I want to do the same with the channel DataFrame (sales_channel_name, sales_channel_id): merge it with the sales_channel_id in the database DataFrame and show the name instead. Any help would be appreciated, thank you!
@akinsikuelizabeth5780 4 years ago
Superb!!! I got every explanation, thanks
@dataschool 4 years ago
You're welcome!
@amrita301157 2 years ago
This is one of the best ever videos on pandas functions that I have watched. Well done Data School. I will look forward to more such videos.
@dataschool 2 years ago
Thank you so much! 🙏
@sanjay123644 3 years ago
Excellent way of teaching. Thanks Kevin
@dataschool 3 years ago
Glad it was helpful! 🙌
@omidadib5052 3 years ago
Awesome tutorial, Thank you very much man!
@dataschool 3 years ago
You're welcome!
@anahata1710 4 years ago
*Python over Skype. I'll teach you to think outside the box. We solve problems and build utilities and games. Data Science and everything related to it. My Telegram is in my contacts. Message me.*
@osmanhussein3893 3 years ago
This is very helpful. Thank you so much.
@dataschool 3 years ago
You're very welcome!
@bilalahmad9177 3 months ago
You are a great instructor. I have learned a lot from you about pandas. The video titled "How do I merge DataFrames in pandas?" has left some questions in my mind. I would be thankful if you could clear those up too. What type of join is used in movie_ratings = pd.merge(movies, ratings)? If it is an inner join, it should result in 1682 rows in total in the movie_ratings DataFrame, as the movies DataFrame has 1682 rows. But in the video I observed that movie_ratings has 100,000 rows.
@vipinamar8323 3 years ago
Nice teaching method. Precision over pace.
@dataschool 3 years ago
Glad it was helpful!
@svengunther7653 4 years ago
You are doing a really great job with this. Thank you so much! :)
@dataschool 4 years ago
Thanks!
@bharatapar3937 4 years ago
Hi, if there is a duplicate record in the second DataFrame (like 'Houston' in the case below), I want only one Houston on the left side of the final output after doing an outer join (as only one Houston is present in the first DataFrame). But that is not happening: in the final output after the outer join, Houston is repeated in the left DataFrame's columns as well, as shown below. Please see the test data below. Please help.

DataFrame 1
ID1  City        Population
1    CHICAGO     3000
5    HOUSTON     14000
7    NEW JERSEY  18000
7    NEW JERSEY  20000

DataFrame 2
ID2  City        POPULATION
4    ARIZONA     2000
5    HOUSTON     3000
5    HOUSTON     4000
5    HOUSTON     5000
7    NEW JERSEY  3000
8    MICHIGAN    4000

det = pd.merge(df1, df2, left_on=['ID1'], right_on=['ID2'], how='outer', indicator='indicator', suffixes=('_A', '_B'))

Actual output:
   ID1  City_A      Population  ID2  City_B      POPULATION  indicator
0  1.0  CHICAGO     3000.0      NaN  NaN         NaN         left_only
1  5.0  HOUSTON     14000.0     5.0  HOUSTON     3000.0      both
1  5.0  HOUSTON     14000.0     5.0  HOUSTON     4000.0      both
1  5.0  HOUSTON     14000.0     5.0  HOUSTON     5000.0      both
4  7.0  NEW JERSEY  18000.0     7.0  NEW JERSEY  3000.0      both
5  7.0  NEW JERSEY  20000.0     7.0  NEW JERSEY  3000.0      both
6  NaN  NaN         NaN         4.0  ARIZONA     2000.0      right_only
7  NaN  NaN         NaN         8.0  MICHIGAN    4000.0      right_only

Expected output:
   ID1  City_A      Population  ID2  City_B      POPULATION  indicator
0  1.0  CHICAGO     3000.0      NaN  NaN         NaN         left_only
1  5.0  HOUSTON     14000.0     5.0  HOUSTON     3000.0      both
2  5.0  NaN         NaN         NaN  HOUSTON     4000.0      both
3  5.0  NaN         NaN         NaN  HOUSTON     5000.0      both
4  7.0  NEW JERSEY  18000.0     7.0  NEW JERSEY  3000.0      both
5  7.0  NEW JERSEY  20000.0     NaN  NaN         NaN         both
6  NaN  NaN         NaN         4.0  ARIZONA     2000.0      right_only
7  NaN  NaN         NaN         8.0  MICHIGAN    4000.0      right_only
@vijayreddy1730 4 years ago
Hi Kevin, first of all, thanks for the wonderful lecture. I am facing a problem merging the two DataFrames shown below.

DataFrame 1:
BackupServer | BackupDay | StartDate | ClientName | BackupStatus | Backup re-run(Y/N) | Incident | Reason for the Backup Failures | Backup Final Outcome
RGSIBAK004 | 01-05-2020 | 2020-04-30 06:40:29 | RGBPLNM110 | Completed | NaN | NaN | NaN | NaN
RGSIBAK004 | 01-05-2020 | 2020-04-30 06:53:07 | RGPIAPP037 | Completed | NaN | NaN | NaN | NaN
RGSIBAK004 | 01-05-2020 | 2020-04-30 15:32:38 | RGPIISD001 | Failed | Yes | IN893523 | VM disconnected | Failed
RGSIBAK004 | 01-05-2020 | 2020-04-30 18:00:08 | RGPPFTP005 | Completed | NaN | NaN | NaN | NaN
RGSIBAK004 | 01-05-2020 | 2020-04-30 18:00:02 | RGPQWEB069 | Completed | NaN | NaN | NaN | NaN

DataFrame 2:
BackupServer | BackupDay | StartDate | Client Name | Backup Status | Backup Rerun (Y/N) | Incident | Failures | Backup Final Result
RGPIAUN003.FDNET.COM | 2020-05-01 | Thu Apr 30 21:00:03 EDT 2020 | rgpqbda112.fdnet.com | Activity completed successfully. | NaN | NaN | NaN | NaN
RGPIAUN003.FDNET.COM | 2020-05-01 | Thu Apr 30 21:00:03 EDT 2020 | rgpppcc051.fdnet.com | Activity completed successfully. | NaN | NaN | NaN | NaN
RGPIAUN003.FDNET.COM | 2020-05-01 | Thu Apr 30 21:00:03 EDT 2020 | rgpppcc050.fdnet.com | Activity completed successfully. | NaN | NaN | NaN | NaN
RGPIAUN003.FDNET.COM | 2020-05-01 | Thu Apr 30 21:00:03 EDT 2020 | rgpppcc011.fdnet.com | Activity completed successfully. | NaN | NaN | NaN | NaN
RGPIAUN003.FDNET.COM | 2020-05-01 | Thu Apr 30 21:00:03 EDT 2020 | rgpdbda105.fdnet.com | Activity completed successfully. | NaN | NaN | NaN | NaN

Although the two DataFrames share three column names ("BackupServer", "BackupDay", and "StartDate"), the content in those columns is different, and I am not able to merge the two DataFrames into one. Can you help me with this?
@sch0ll1 3 years ago
Thanks man! You saved my weekend :*
@dataschool 3 years ago
Glad I could help!
@shashi_kamal_chakraborty 2 years ago
Thanks! very nicely explained. Now, I can perform joins using Pandas, quite effortlessly.
@dataschool 2 years ago
Glad it helped!
@shashi_kamal_chakraborty 2 years ago
@dataschool Yeah! Besides books, I follow you, especially for pandas. Great help. Thanks!
@dataschool 1 year ago
You're welcome!
@JunaidInHenan 4 years ago
The logic above is beautifully explained. Hi Kevin, I have a question, if you could please reply. I have three CSV files, csv1 (20000 rows), csv2 (20000 rows), and csv3 (20000 rows), and I want to merge them into a single DataFrame without losing a single record. That is, I want to read these files into one DataFrame that should ideally have 60000 rows. P.S.: All the files have the same columns (PostID, time, tweetURL, Content, RetweetNum, LikeNum, CommentsNum, Verified, Following, Follower). In the resulting DataFrame I want all these columns as the header, and all 60000 rows. Is it possible? Kevin, I will wait for your reply. I know this post is old, but maybe you will read my question. Thank you.
@jqts6490 3 years ago
Thanks for the video. I was able to successfully merge and find some errors from IDs that I did not find using VBA VLOOKUPs. I was curious: is there a way to highlight differences between columns in this merged DataFrame? For example, Number of Vehicles_SS: 7 vs. Number of Vehicles_SA: 2, and it would highlight the row, or even just those values, based on the ID it was merged on. I am having a hard time finding this. I'm trying to get rid of VBA, which I currently use for this, but it is SUPER slow with the data I have to process.
@vinayakchikkorde8151 3 years ago
I have a source file and a target file, and I have to compare 140 columns and show whether each one matches or not. For example, there is a column named Country1 in the source and Country2 in the target. To compare them I would use: if source['country1'] == target['country2'], return True, else return False. Comparing 140+ columns this way will take time, and the columns in the two files are not in the same order. How can I solve this?
@gregf9160 4 years ago
Thank you so much for the concise clear explanation. Much appreciated.
@jeevakumara5599 2 years ago
Hi bro, I am currently working on a project. My mentor says to use foreign keys and primary keys in pandas and create tables with those keys. So my question is: is the use of foreign and primary keys possible in pandas? If not, what should I do to merge two tables that contain the same column, as we do in MySQL? Thank you.
@mochammadirfanbaihaqi279 3 years ago
Love the way you explain it, thanks for your vids. Keep it up (thumbs)
@cgpmth6449 2 years ago
How do I merge multiple large DataFrames quickly? I joined them with the usual merge(), but it seems too slow. I found a hint about using pandas.Index() with the merge method, but I don't know how to use it.
@hardikvegad3508 4 years ago
Sir, what if we have hundreds of columns without names? How can we name them using pandas and a for loop or lambda function? If we try to name them using names=[], it will be a very time-consuming process. The column names can be col1, col2, col3, etc.
@mehnazjabeen 2 years ago
After merging two DataFrames, how do I verify, using a simple comparison operator in Python, that all the columns were included in the merged DataFrame?
@hectoralvarorojas1918 4 years ago
Great work as always. Very useful. Thanks for sharing it! By the way, any chance you could do a video about PySpark? It would be very useful to treat this from the beginning, first with examples based on a local connection (one computer) and then with a couple of examples emulating a cluster connection.
@dataschool 4 years ago
Thanks for your kind words as always, Hector! Sorry, I don't have any videos about PySpark, but I appreciate the suggestion! 👍
@hectoralvarorojas1918 4 years ago
@dataschool I would love for you to do that. I am positive that you will get a lot of interested viewers, me among them of course. My best regards!
@BHARATHEEYUDU. 4 years ago
I am looking for Python data analyst jobs. What prerequisite tools and technologies should I learn? Is Django a must for pandas data analysis? Please advise me.
@АлексейДуховный-ф1г 3 years ago
The only English-speaking person you can understand without knowing English.
@ruthliganad8274 3 years ago
How about not a specific file? For example, all .csv or all .tsv files? How do I concatenate a header to those files? Thanks.
@eliasaudi2877 2 years ago
What would we use to show ONLY the values that do not match, i.e. anything other than an inner join?
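One way to see only the non-matching rows is an outer join with indicator=True, followed by a filter on the resulting _merge column. A sketch on made-up frames; the column names are assumptions.

```python
import pandas as pd

# Hypothetical frames.
left = pd.DataFrame({"id": [1, 2, 3], "l": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "r": ["x", "y", "z"]})

# An outer join keeps everything; indicator=True labels each row's origin.
both = pd.merge(left, right, on="id", how="outer", indicator=True)

# Keep only the rows that failed to match on either side.
unmatched = both[both["_merge"] != "both"]
print(unmatched)
```

Filtering on "left_only" or "right_only" instead narrows the result to rows missing from one specific side.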
@CristianBittel 4 years ago
Great as teacher, calm, taking your time to clearly explain fundamentals!
@dataschool 4 years ago
Thanks so much for your kind words, I truly appreciate it!
@shivamsaway6803 4 years ago
Can it happen that, when merging two DataFrames, only the column headers get merged and no data ends up in the new DataFrame?
@ramachalprajapati1176 3 years ago
How do I get the common mobile numbers from two different CSV files that have different column names?
@vighneshmane2080 4 years ago
Can you explain how to print the row header and column header when a particular condition holds in the row? Example:
         day1  day2  day3  day4  day5
place 1     2     5     6     7     8
place 2     1     1     1     6     8
place 3     2     3     5    10    11
The condition here is a place with more than 5 units on a particular day. The output I need is:
place 1  day2
place 2  day4
place 3  day3
Help me with this.