Super. Thank you. I am kneeling in front of you; I finally understand these charts.
@eatbreathedatascience9593 3 years ago
I've been watching your seaborn series. I wonder, after plotting, how do I interpret the plot? How do I translate the plot into useful information in EDA, to make recommendations or provide input to ML modelling? Thanks very much for the video. I really appreciate your work.
@olgakim9637 2 years ago
Thank you for your helpful videos! Please let me know how to remove the 'ns' statistic annotation from box/bar seaborn plots when the p-value is above the threshold.
@bggaming2146 2 years ago
How do you make a bin range for a barplot? If you can't, is there another command that could?
@web64dev 2 years ago
Very easy to understand! Subscribed
@MarathiVoiceAudioRecording 3 years ago
In the case of order... what do we do for big datasets?
@MarcusVinicius-lq3fe 3 years ago
How do I choose a specific color for each bar based on its name? Let's say 'penguin' = green bar color, etc. I did not find this in the documentation.
@Hullo52 3 years ago
Hey Kimberly, great video! Would it be possible to increase the font size of the species names /individual bar plots? I have thirteen "bars" in my barplot and their names are very small. I tried using " plt.rcParams["axes.labelsize"] = 50 " but that only works on the overall axis labels. Any help would be great! Thanks.
@KimberlyFessel 3 years ago
Yes -- very good question. Seaborn's fonts can be tricky. Couple options for you. You could increase the font size for everything. To do this, include the code "sns.set(font_scale=1.5)" before your barplot. That increases all font sizes by 50% for everything. And all the fonts on every plot you create after this will also be larger... you can always do sns.set() to reset if you don't want that. Or if you just want the tick labels to be larger, you could import matplotlib's pyplot as plt and then add "plt.xticks(fontsize=20)" after your barplot. That would increase the x tick fonts to be 20 point for that specific plot.
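A minimal sketch of both options, assuming a small made-up DataFrame (the column names and values are placeholders, not data from the video):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a barplot with several categories
df = pd.DataFrame({
    "species": ["Adelie", "Chinstrap", "Gentoo"] * 2,
    "body_mass_g": [3700, 3733, 5076, 3800, 3650, 5000],
})

# Option 1: scale every font by 50% (this also affects all later plots)
sns.set(font_scale=1.5)
ax = sns.barplot(x="species", y="body_mass_g", data=df)

# Option 2: enlarge only the x tick labels on the current plot
plt.xticks(fontsize=20)

tick_sizes = [label.get_fontsize() for label in ax.get_xticklabels()]
plt.close("all")
sns.set()  # reset seaborn to its default font sizes
```

Option 1 is a global setting, so the reset at the end matters if later plots should go back to the default size.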
@mojomonk6737 3 years ago
Your material is so helpful!
@KimberlyFessel 3 years ago
Oh wonderful! Very glad to hear that 😀
@laxmanbisht2638 3 years ago
Hi, I wanted to annotate the bar plot. Could you make a short video on that as well? Thanks!
@KimberlyFessel 3 years ago
Hi there 👋 - as far as I know, there is no seaborn barplot argument to easily label the bars with their values. I would probably revert to using the matplotlib pyplot text function to loop through the values and label the bars like I did in this video: kzbin.info/www/bejne/hHO8q4aXiceimq8 You will just need to have the aggregate values from the barplot and perhaps you could get those values by doing a pandas groupby.
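The text-label approach described above could look roughly like this. It is a sketch under assumptions: the DataFrame is made up, and the explicit `order` argument is added here so that bar positions 0, 1, 2 line up with the groupby index.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: two observations per species
df = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Chinstrap", "Chinstrap", "Gentoo", "Gentoo"],
    "body_mass_g": [3600, 3800, 3700, 3760, 5000, 5150],
})

# Aggregate the same way barplot does by default (mean per category)
means = df.groupby("species")["body_mass_g"].mean()

# Fix the bar order so position i corresponds to means.index[i]
ax = sns.barplot(x="species", y="body_mass_g", data=df, order=list(means.index))

# Loop through the aggregate values and write each one above its bar
for position, value in enumerate(means):
    plt.text(position, value, f"{value:.0f}", ha="center", va="bottom")

labels = [text.get_text() for text in ax.texts]
plt.close("all")
```

If the barplot uses a different estimator (for example `np.max`), the groupby aggregation would need to match it.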
@hameedkaryab 1 year ago
Thanks for the clear explanation. Would you please make a video on how to visualize the top/bottom ten values in seaborn barplots? That would be very nice of you.
@KimberlyFessel 1 year ago
Thanks for the suggestion! I would probably do this by filtering down to the top/bottom values with pandas first, saving that as a new variable, and then just plotting those rows. 👍
@basharatghani6950 3 years ago
Can you please make a video on how to put the count or percentage values on the bars?
@lindak.1081 3 years ago
Thanks Kimberly for all these tips. Is there any chance you know how to customize a plot with a palette we've created with hex codes?
@KimberlyFessel 3 years ago
Oh yes -- I love building my own color palettes with hex codes! If you know how many colors you need, you can just pass a list of color codes to the palette argument... for example, if I have three bars, I could include palette = ['#FF0000', '#000CFF', '#FFFF00'] as an argument in the barplot. If you have more than 3 categories, seaborn will just cycle back through that list again. I'm planning to do a "colors in Seaborn" video eventually 😀
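A short sketch of passing a hex-code list as the palette, assuming a made-up three-category DataFrame. Recent seaborn releases may emit a warning when `palette` is given without `hue`; that behavior depends on the installed version.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data with exactly three categories
df = pd.DataFrame({
    "species": ["Adelie", "Chinstrap", "Gentoo"],
    "body_mass_g": [3700, 3733, 5076],
})

# One hex code per bar, in the order the bars are drawn
my_palette = ["#FF0000", "#000CFF", "#FFFF00"]

ax = sns.barplot(x="species", y="body_mass_g", data=df, palette=my_palette)

# Collect the face colors actually used (seaborn desaturates them slightly by default)
bar_colors = {tuple(patch.get_facecolor()) for patch in ax.patches}
plt.close("all")
```

The drawn colors will not match the hex codes exactly unless `saturation=1` is also passed, since barplot tones colors down by default.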
@lindak.1081 3 years ago
@@KimberlyFessel Thank you!!!
@rajatgupta7344 4 years ago
With sns.displot(data=penguins, x="species", hue="sex", multiple="stack") we can plot stacked bars in seaborn, but I am not sure about the sum case. I tried some combinations but have not had success so far. I hope this helps you; if it does, please tell me how you did it.
@KimberlyFessel 4 years ago
Yes! -- this is a cool new feature in seaborn 0.11 for all of the distribution plots I think (displot, histplot, kdeplot, etc.) You can now do stacking for distributions... but these stack quantities like observation counts, probabilities, or densities. Not sums of columns from what I have seen. Though maybe you could use columns as distributional weights as a hack... Here, I had to resort to the large amount of code I showed at the end of the video to do the sum barplot like pandas can do. Maybe we will see a stacked barplot in the next version of seaborn! 😄
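For reference, the pandas route to a stacked sum barplot mentioned above can be sketched as follows. This is not the code from the video; the DataFrame and column names are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical penguin-like data
df = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo", "Gentoo"],
    "sex": ["Male", "Female", "Male", "Female", "Female"],
    "body_mass_g": [4000, 3400, 5500, 4700, 4600],
})

# Sum body mass for every species/sex combination: rows = species, columns = sex
sums = df.pivot_table(index="species", columns="sex",
                      values="body_mass_g", aggfunc="sum")

# pandas stacks the columns on top of each other within each row's bar
ax = sums.plot(kind="bar", stacked=True)
n_segments = len(ax.patches)
plt.close("all")
```

Each bar here is one species, split into one segment per sex, with segment heights equal to the column sums.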
@rajatgupta7344 4 years ago
@@KimberlyFessel Yep, true. I tried using 0.11, but I also think we have to wait for the next version.
@boubkersaoud7338 3 years ago
Hi Kimberly, thanks for your videos. Do you know how to show only the top 10 countries on a barplot, sorted by amount? I currently have "sns.barplot(x='Country', y='Amount', data=data_foo, ci=None)". Thanks for your help.
@KimberlyFessel 3 years ago
Oh good question. It would be really nice if the seaborn barplot automatically came with a sorting feature, but I would probably solve this one with pandas. So if I have data_foo as my dataframe, I might do top_ten = data_foo.groupby('Country').Amount.mean().sort_values(ascending=False)[:10] or you could do .sum() if you want to add up the amounts. Then either do a barplot on the top_ten dataframe with the index reset (sns.barplot(x='Country', y='Amount', data=top_ten.reset_index())) or if you want the confidence intervals you could do sns.barplot(x="Country", y="Amount", data=data_foo[data_foo.Country.isin(top_ten.index)], order=top_ten.index) where the order argument sorts them into the proper order.
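Put together as a runnable sketch, with a tiny invented `data_foo` and a top 3 instead of a top 10 so the toy data stays small:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: two amounts per country
data_foo = pd.DataFrame({
    "Country": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Amount": [10, 20, 5, 7, 40, 50, 1, 3],
})

top_n = 3  # the thread uses 10
top = (data_foo.groupby("Country")["Amount"]
       .mean()
       .sort_values(ascending=False)[:top_n])

# Option A: plot the already-aggregated values (no confidence intervals)
sns.barplot(x="Country", y="Amount", data=top.reset_index())
plt.close("all")

# Option B: keep the raw rows for the top countries so seaborn can still
# draw confidence intervals, and use order= to sort the bars
subset = data_foo[data_foo["Country"].isin(top.index)]
bar_order = list(top.index)
sns.barplot(x="Country", y="Amount", data=subset, order=bar_order)
plt.close("all")
```

Swapping `.mean()` for `.sum()` ranks countries by total amount instead.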
@boubkersaoud7338 3 years ago
@@KimberlyFessel Thanks a lot for your help, it's perfect
@mavi_kiz123 3 years ago
Hello, I have just seen your informative video. Thanks so much. It helped me a lot.
@KimberlyFessel 3 years ago
Very happy to hear my video helped - cheers!
@ignaciosanchez9306 4 years ago
Hi Kimberly, I just saw all your videos and I learned a LOT! Thank you!!! I have a question: for me the most difficult part is learning to filter the data before plotting. Do you have any advice on where I can start to learn this?
@KimberlyFessel 4 years ago
Glad you enjoyed the videos and that they have been helpful! Definitely agree with you -- filtering is a big part of working with pandas and often a necessary step before jumping into seaborn. I remember learning a lot about this by watching Wes McKinney's videos on pandas. After that it was all about practicing over and over again with as many new datasets as possible. But I will definitely keep this in mind as a potential topic to make videos about!
@ignaciosanchez9306 4 years ago
@@KimberlyFessel thank you! I'll keep practicing! I can't wait for more videos 🎉
@YogeshNavandhar 1 year ago
@@KimberlyFessel I actually wanted this advice only for my question, thanks Ma'am...
@emirhanpakyardm9142 3 years ago
Thanks for the video. I just want to ask a question. I have IMDb top 250 data with columns like "genre" and "actors". I can see the number of values with the code "imdb.genre.value_counts()", which gives me a result like "Action: 132 Thriller: 35 Crime: 128". But I couldn't figure out how to display them with sns.barplot. "sns.barplot(x = "genre", y = imdb.genre, data = imdb);" is not working since it's not numeric. What should I do? Thanks in advance.
@KimberlyFessel 3 years ago
Sounds like you need the seaborn countplot instead of the barplot. The barplot aggregates numerical data over categories, but the countplot displays the count of observations per category (like number of movies per genre). I just made a video about the countplot recently if you want to check it out: kzbin.info/www/bejne/boaYmWZ7f9qobq8
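A minimal countplot sketch for this case, assuming a tiny stand-in for the `imdb` DataFrame (the real data is not shown in the thread):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in: one row per movie, with a categorical genre column
imdb = pd.DataFrame({
    "genre": ["Action", "Crime", "Action", "Thriller", "Crime", "Action"],
})

# Same numbers that value_counts() reports
counts = imdb["genre"].value_counts()

# countplot does the counting itself, so no numeric y column is needed;
# order= sorts the bars from most to least frequent
ax = sns.countplot(x="genre", data=imdb, order=list(counts.index))
heights = [patch.get_height() for patch in ax.patches]
plt.close("all")
```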
@emirhanpakyardm9142 3 years ago
@@KimberlyFessel I'll check it out. Thank you
@rajatgupta7344 4 years ago
How does bootstrapping with np.max work, and which value does it plot: the "mean of the bootstrap" or the "max of the bootstrap"? If it is the max of each bootstrap sample, does the interval also cover 95% of all the maxes, since we get a strip here too, like a confidence interval?
@KimberlyFessel 4 years ago
Yes -- the height of the bar will be the overall, true max for each species of penguin. Then the confidence intervals will span 95% of the maxes seen in the bootstrapped samples. So sometimes the max will be smaller since the true max will sometimes get left out of the bootstrap samples. Note how the confidence intervals dip below the height of the bars but not above since the bar heights are the true max values. (There is a bit of styling going on so it may look like the CI peeks over top of the bar, but it actually does not 😄)
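The idea can be illustrated with plain NumPy. This is a sketch of the bootstrap logic described above, not seaborn's internal implementation, and the sample is randomly generated rather than real penguin data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical body masses for one species
body_mass = rng.normal(loc=4200, scale=800, size=150)

# Bar height: the max of the observed data
true_max = body_mass.max()

# Resample with replacement many times and record the max of each resample
boot_maxes = np.array([
    rng.choice(body_mass, size=body_mass.size, replace=True).max()
    for _ in range(1000)
])

# 95% interval of the bootstrapped maxes
ci_low, ci_high = np.percentile(boot_maxes, [2.5, 97.5])
```

Because every resample is drawn from the observed values, no bootstrapped max can exceed `true_max`, which is why the interval can only extend downward from the top of the bar.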
@rajatgupta7344 4 years ago
@@KimberlyFessel Thanks, now I've got it. I was actually confused at 5:40 - 5:53, where you said: "We are getting the sense here what is the maximum body mass for each species and also how does the max potentially varies when we doing the bootstrap sampling." Because of that line, I thought that seaborn does the bootstrap and then plots the max of each species as the height of the bar 😄. Thanks, you are one of the best on YouTube doing great work. I'd love to see more of your videos.
@sarbanchakraborty2053 3 years ago
Fantastic video, thanks a lot. How do I make a box plot of a data set with different values in different columns?
@KimberlyFessel 3 years ago
Hi - seaborn also has a boxplot function. If you want to see boxes for all of the numerical variables in your dataset, you can do "sns.boxplot(data=df)" where sns refers to seaborn and df refers to your pandas DataFrame.
@sharonsfella 3 years ago
Hi, love your videos. Is it possible to group years of sales data into decades, then get the mean of each decade and put that in a bar plot? Thanks in advance.
@KimberlyFessel 3 years ago
Sure you can! If you have a datetime column in your dataframe, you could just create a new column, say, df['decade'] = df.my_date.dt.year//10*10 (that rounds down to the nearest decade). Then you could use that column as your "x" in the seaborn barplot and the column you want to average as your "y".
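A small pandas sketch of the decade column, assuming an invented DataFrame with a datetime column named `my_date` and a `sales` column:

```python
import pandas as pd

# Hypothetical sales data with a datetime column
df = pd.DataFrame({
    "my_date": pd.to_datetime(["1987-05-01", "1992-03-15", "1999-12-31", "2004-07-04"]),
    "sales": [100, 200, 300, 400],
})

# Integer-divide the year by 10, then multiply back: 1987 -> 198 -> 1980
df["decade"] = df["my_date"].dt.year // 10 * 10

# The per-decade means that a barplot of x="decade", y="sales" would show
decade_means = df.groupby("decade")["sales"].mean()
```

From here, `sns.barplot(x="decade", y="sales", data=df)` would draw one bar per decade with the mean as its height.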
@sharonsfella 3 years ago
@@KimberlyFessel excellent, thank you so much
@ashishverma7667 3 years ago
How can I create multiple layers using hue?
@atharvasharma8777 3 years ago
This is exactly what I needed ❤👍
@KimberlyFessel 3 years ago
Oh very glad to have helped! 👍
@AS36588 4 years ago
Hi! Can you explain how to calculate the area under the curve please? I've done some research and I've found np.trapz.
@KimberlyFessel 4 years ago
Sure, numpy's composite trapezoid integration (np.trapz) should provide you with a decent integral estimate. I have also used scipy's quadrature function (docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.quad.html). This blog post compares the two methods a bit as well: kitchingroup.cheme.cmu.edu/blog/2013/02/02/On-the-quad-or-trapz-d-in-ChemE-heaven/
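As a quick check of the trapezoid approach, here is a sketch that integrates sin(x) over [0, π], where the exact area is 2. The curve is chosen only for illustration. Newer NumPy versions name the function `np.trapezoid`, so the sketch picks whichever name is available.

```python
import numpy as np

# Use np.trapezoid where it exists (newer NumPy), otherwise fall back to np.trapz
trapezoid = getattr(np, "trapezoid", None) or np.trapz

# Sample the curve y = sin(x) at 101 evenly spaced points on [0, pi]
x = np.linspace(0, np.pi, 101)
y = np.sin(x)

# Composite trapezoid rule: sums the areas of the trapezoids between samples
area = trapezoid(y, x)

# The exact integral of sin(x) from 0 to pi is 2
error = abs(area - 2.0)
```

The estimate improves as more sample points are used; for data that only exists as samples (rather than as a function), the trapezoid rule is usually the practical choice.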
@AS36588 4 years ago
Thanks for your time and all the support, that's awesome
@mohab551 4 years ago
Hi! I got stuck using the two categorical variables, the ticks weren't in the middle of the bar plot, they are shifted either right or left. I dug online and found out that I might not have enough data on "hue". I am clueless, can you recommend something to make it more neat?! My code: sns.barplot (x="07 Objekt-Straße", y="Average", data=df, palette="dark", hue="Fuel", dodge="False")
@KimberlyFessel 4 years ago
Oh interesting -- I think I know what you mean. Do you have at least one observation of every "Fuel" type for every "07 Objekt-Straße" type? If there are missing combinations, then you will get gaps in your bar plot. One option is to do what you are doing -- but you want to set dodge=False (without the quotation marks). That will stack the bars on top of each other.
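The reason the quotation marks matter can be shown in a short sketch: any non-empty string, including "False", is truthy in Python. The DataFrame below is invented and deliberately leaves out one street/fuel combination.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# A non-empty string counts as True, so the string "False" does not switch dodging off
string_is_truthy = bool("False")

# Hypothetical data: street "B" has no "Oil" observation
df = pd.DataFrame({
    "street": ["A", "A", "B"],
    "fuel": ["Gas", "Oil", "Gas"],
    "average": [10, 14, 8],
})

# With the boolean False, bars for each hue level are drawn at the same
# x position, so a missing combination no longer leaves an off-center gap
ax = sns.barplot(x="street", y="average", hue="fuel", data=df, dodge=False)
plt.close("all")
```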
@mohab551 4 years ago
@@KimberlyFessel This went magical! Thank you. I am a big fan btw.
@KimberlyFessel 4 years ago
@@mohab551 Oh great -- glad to hear that worked. Cheers! :)
@מריהגרוס 3 years ago
Dear Kimberly, first of all THANKS A LOT for your AMAZING tutorials! I have a question about barplots: how can I annotate a vertical bar plot, so that on top of each bar I'll see the corresponding value? I tried to search online and I understand it has something to do with "patches", but I can't wrap my head around it. THANKS A LOT IN ADVANCE! Maria.
@KimberlyFessel 3 years ago
You are most welcome - glad that you are enjoying my videos! There currently is not a seaborn barplot argument to easily label the bars with their values. So, I would probably revert to using the matplotlib pyplot text function to loop through the values and label the bars like I did in this video: kzbin.info/www/bejne/hHO8q4aXiceimq8 You will just need to calculate the aggregate values of the barplot separately and use the values (0, 1, 2, ...) for the positions of the categories. But I will definitely look into this more and consider making a video to answer this question since it seems to be a common one!
@ayushisrivastava8377 2 years ago
How do I add labels to a seaborn bar plot with many bars?
@aditijain6551 3 years ago
Hi, can we do a cumulative bar plot?
@KimberlyFessel 3 years ago
Hi 👋 The seaborn barplot doesn't come with a cumulative option since it's used for separate categories that are typically not summed across. If you are showing a data distribution, however, the histplot and the kdeplot do have "cumulative" options as part of their keyword arguments. Or you can use an ecdfplot to display an empirical cumulative distribution function plot, which is inherently cumulative 😀
@aditijain6551 3 years ago
Thanks, Kimberly Fessel, for your response. I was looking for the y value with respect to the x value in cumulative form, to check where the curve of y flattens out along x, which requires a barplot in cumulative mode. As a workaround, I modified my data set using .cumsum() and generated a bar graph from that. This now suits my application.
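The .cumsum() workaround could look like this in pandas. The column names `x` and `y` and the values are assumptions, since the commenter's data is not shown.

```python
import pandas as pd

# Hypothetical data: one y value per x category, already in the desired order
df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [5, 3, 0, 2],
})

# Running total of y; a flat stretch means y added nothing at that x
df["y_cumulative"] = df["y"].cumsum()
```

Plotting `sns.barplot(x="x", y="y_cumulative", data=df)` then gives bars whose heights never decrease, and the flat step between x=2 and x=3 shows where y contributed zero.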
@seoshinyoun9331 3 years ago
Assignment lifesaver
@KimberlyFessel 3 years ago
To the rescue! 💪 😄
@TheSambita20 3 years ago
thanks a lot, it's very helpful
@KimberlyFessel 3 years ago
Most welcome -- glad it helped!
@philippekerr3766 3 years ago
Hello Kimberly! Thank you for your amazing tutorials - they are amazing. I'm currently trying to change my legend labels. The dataset I'm currently working with has the variable 'sex' coded as 1= female and 2= male, so the default legend labels are 1 and 2. I'm trying to switch this back to female and male and struggling. Thanks in advance for your time and help, I really appreciate it :) Philippe
@KimberlyFessel 3 years ago
Why thank you! Glad to hear you are enjoying my content 😄
@agungsukariman 2 years ago
Love you 3000!
@game_of_trades 2 years ago
Thank you
@elastvd7503 1 year ago
Perfect, thanks
@chethankm3862 3 years ago
How do I increase the width of the bars?
@KimberlyFessel 3 years ago
This one is a bit more complicated than I first thought! Unfortunately, you can't just pass a "width" keyword argument through to the matplotlib bars. Looks like you can do this by actually updating the underlying patches though, and the answer in this post worked for me: stackoverflow.com/questions/34888058/changing-width-of-bars-in-bar-chart-created-using-seaborn-factorplot
@KimberlyFessel 3 years ago
Oh it also looks like this will be updated in a future iteration of seaborn 😄🎉 github.com/mwaskom/seaborn/issues/2508
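A sketch of the patch-updating approach, written from the general idea rather than copied from the linked answer. The data and the target width of 0.5 are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: one value per category
df = pd.DataFrame({
    "species": ["Adelie", "Chinstrap", "Gentoo"],
    "body_mass_g": [3700, 3733, 5076],
})

ax = sns.barplot(x="species", y="body_mass_g", data=df)

new_width = 0.5  # assumed target width, in x-axis units
for patch in ax.patches:
    # Remember where the bar is centered before resizing it
    center = patch.get_x() + patch.get_width() / 2
    patch.set_width(new_width)
    # Shift the left edge so the bar stays centered on its tick
    patch.set_x(center - new_width / 2)

widths = [patch.get_width() for patch in ax.patches]
plt.close("all")
```

Re-centering matters because a bar's x position is its left edge; changing only the width would push the bars off their tick marks.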
@shubhamtalks9718 3 years ago
How do I plot barplots in sorted order?
@KimberlyFessel 3 years ago
Great question -- I sure wish seaborn's barplot had some sort of sorting included, but the best options I have seen are to either sort your data before plotting it (with pandas for example) or to figure out what order you want your bars in and then send that to the order argument of barplot as a list or a pandas series or index.
@shubhamtalks9718 3 years ago
Can we plot Treemap in Python?
@KimberlyFessel 3 years ago
There isn't an easy way to do this with Seaborn or matplotlib that I know of... but I have heard about a Python package called squarify that will make treemaps. I have been meaning to check it out and put together a demo. Hopefully soon!
@shubhamtalks9718 3 years ago
@@KimberlyFessel Thanks for the reply. I found the answer. Plotly library has Treemap. Any plans of making tutorials on the Plotly library, as it is also popular?
@KimberlyFessel 3 years ago
@@shubhamtalks9718 Yes -- Plotly is another great visualization tool. I have been considering making videos about it as well. It has many excellent interactive elements, too.
@victoriabraimoh8994 1 year ago
df['Decade'] = [Age//10*10 for Age in df.age]. I tried this code, but it says unsupported operand types.