Data Analysis 0: Introduction to Data Analysis - Computerphile

  275,008 views

Computerphile

A day ago

Big Data does not equate to Big Knowledge - unless you use data analysis. This is part 0 of the Data Analysis Learning Playlist: • Data Analysis with Dr ...
This Learning Playlist was designed by Dr Mercedes Torres-Torres & Dr Michael Pound of the University of Nottingham Computer Science Department. Find out more about Computer Science at Nottingham here: bit.ly/2IqwtNg
This series was made possible by sponsorship from by Google.
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottsco...
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments: 182
@Computerphile 5 years ago
Check out the full Data Analysis Learning Playlist: kzbin.info/aero/PLzH6n4zXuckpfMu_4Ff8E7Z1behQks5ba
@lenoraballantyne8367 3 years ago
@Z3U5 ]
@TheChondriac 5 years ago
Mike is definitely my favorite from Computerphile. As a college computer science student, these videos are amazing for me. I definitely think that this channel should host a niche playlist of more involved programming videos for those of us that are aspiring programmers! Thanks for all of this great content.
@kuhluhOG 5 years ago
He is my second favourite. (My favourite is Professor Brailsford )
@benanderson3701 5 years ago
Big up mike, he's my favourite
@megumi_0 4 years ago
agree
@TheFoosie 5 years ago
10 part series with mike pound yes plz
@JesusisAlive_33 5 years ago
i love this guy, the way he explains things so clearly and his voice
@leonhardeuler9839 5 years ago
With open source datasets
@chethelesser 5 years ago
>computerphile >Starting counting from 1 Yeah right
@theinnominatoidk3821 3 years ago
@@JesusisAlive_33 his accent is cool too
@RubiconDota 5 years ago
I love it when someone who is actually a Dr explains complex topics to me in an accent I can understand.
@hamedal-khateeb7360 2 years ago
What accent were you used to hearing?
@tarkamlokar9709 A year ago
@@hamedal-khateeb7360 Probably indian.
@israelcyabukombe2267 5 years ago
When YouTube recommends you a useful data analysis video to help you understand how you always get recommended the other stuff.
@spockfan2000 5 years ago
Numberphile, NetFlix style: whole season marathon! :) Love it! (hehe... I meant Computerphile!)
@rialyandriamiseza9814 5 years ago
Beginning my thesis on Learning Analytics. "LA is the measurement, collection, analysis, and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs" - Siemens, G., & Gašević, D. (2012). Special issue on learning and knowledge analytics. Educational Technology & Society, 15(3), 1-163. Your videos are very helpful :) Thanks a lot
@xnossisx5950 5 years ago
yay two series in one day extending over 2 1/2 hours each how wonderful
@brantwedel 5 years ago
Using R for the explanations, but using a 0 based index for the series titles ... That's not going to be confusing at all 😆
@caughtbynothing 5 years ago
Nope, not confusing at all. 0 ist the first index of EVERY enumeration, makes sense for all of us ;)
@RIURIU4 5 years ago
@@caughtbynothing R is not zero indexed
@andre9897 5 years ago
@@caughtbynothing really?
@MaximilianBerkmann 4 years ago
@@RIURIU4 Literally every computer language aside from R, Matlab and Fortran are zero-indexed.
@BrunoCarlin 4 years ago
@@MaximilianBerkmann Lua
@tamnguyentri6405 5 years ago
Love for counting the episodes from 0. It helps me a lot with the terms as well. Thank you.
@ninja_goose4360 5 years ago
The instance I hear 'data mining' in is usually finding data where it is not easily accessible. For example, you can data mine the console log of a video game to figure out how the game works and whatnot.
@Plan36c 5 years ago
You have a talent for teaching and technology. I’m currently interviewing for my first analysis position so this is very reassuring information. Thank you for sharing.
@wandersgion4989 5 years ago
Is computerphile transitioning into being a tutorial channel? I like it.
@adityamankar8723 5 years ago
Thank you so much for this amazing playlist. As a CS student, this was very helpful.
@crystalsoulslayer A year ago
I absolutely love the product descriptions for the saws.
@RamkrishanYT 5 years ago
When you realize that it's a 10 part series, but it uses R *A small price to pay for salvation* When you see that R indexes from 1 *Reality is often disappointing*
@sherwinparvizian2414 5 years ago
@@krakensoup7439 Mostly convention. A lot of languages use 0-based indexing.
@caughtbynothing 5 years ago
​@@sherwinparvizian2414 Have never worked with a programming language that does not start the enumeration at 0. Once more you see that R was developed by statisticians
@VandalIO 5 years ago
@Matthias Julia also indexes from 1, and there are other languages out there that index from 1 too. R is built on FORTRAN, and FORTRAN also indexes arrays from 1.
@PasselivreEdicoes 5 years ago
Was looking forward for this playlist :`). Thanks Mike and Computerphile.
@mohammaddawas481 5 years ago
This is the series I have always been waiting for from Mike Pound! Thank you! One note, though: at 13:16, high-dimensional data (samples with a large number of variables) is not necessarily big data; it's just very high-dimensional data.
@kalebjuliu7944 5 years ago
You have no idea how much i need this, Thanks!
@chandrahasaroori317 5 years ago
Can you guys do a machine learning series as well? This is really amazing!
@Jupiter__001_ 5 years ago
Lol the text on the DIY website is really funny! The chainsaw even had The Lumberjack Song xD
@skydrow4523 5 years ago
I see Dr. Mike, I click.
@raadwan 5 years ago
Feel bad for all the people that skipped this video and went to Part 1. lol
@nonaviandino8387 5 years ago
R index starts at 1 lol
@Wolves2314 5 years ago
Just Being Socially Awkward no it's not
@Wolves2314 5 years ago
@Just Being Socially Awkward you don’t know what you’re talking about and you should feel bad
@Elite7555 5 years ago
@@Wolves2314 But R can suffer from really bad performance; often it is even much worse than CPython. So in that sense I would just use Python or C++. There are some great libraries out there for statistical analysis. Or use Julia, which serves the same purpose as R but is much faster and also has awesome tools for plotting.
@MaximilianBerkmann 4 years ago
@@nonaviandino8387 True and it shouldn't be the case. But aye, it was designed for statisticians who don't count from 0.
@matzeh3498 5 years ago
ARE YOU SERIOUS, I CANNOT WATCH THIS AND TOM SCOTT AT THE SAME TIME
@pgnhd_ 5 years ago
Same
@BicyclesMayUseFullLane 5 years ago
$DEITY DAMN IT, GOOGLE. I CAN ONLY PAY ATTENTION TO ONE THING AT A TIME.
@s0ngf0rx 5 years ago
tom scott?
@iamtheusualguy2611 5 years ago
yup, same
@nobodykid23 5 years ago
same dude, same
@zohebdholakia3782 5 years ago
Beautifully explained
@znull3356 4 years ago
Doing this in R instead of Python probably won't age well, but I'll still watch it because of Dr. Pound.
@garmands 5 years ago
And then another series with Robert Miles
@iraklismoutidis204 5 years ago
Amazing series.
@teekanne15 5 years ago
just had a course "introduction into data analysis in R" at uni a few weeks ago. This is a nice revisit :D
@rag3r926 4 years ago
Birkbeck, UOL?
@andrarias 5 years ago
I always thought that Data Mining was acquiring raw data from 3rd parties, like Twitter, Facebook and the like, using their API. For example, I once knew of someone trying to predict earthquakes in Japan with Twitter (I don't know how successful this attempt was), but it makes sense to me that this is what Data Mining is, as opposed to working with data you or your company collect, or the cleaner data that other organizations publish.
@EliteGamerpk 5 years ago
Man I Can't miss any of these videos, ringing those bells :D
@faustin289 5 years ago
An intelligent model is the one that would predict what I NEED; not what is similar to what I looked at in the past.
@chloem.872 4 years ago
Are you ready for that? Tell me, truly, are you ready for an algorithm to predict what you need? Imagine the data they need in order to do that. Smart houses coming to a neighborhood near you...
@AndreAmorim-AA 5 years ago
Data Analysis is important but... Is it possible to have some talk about the boundaries of data analysis regards to personal privacy?
@Youtuber111-p2x 5 years ago
your personal privacy is protected. the names and IPs would be data that would not be used, just the purchases.
@ricma9710 5 years ago
@@Youtuber111-p2x yes but they still get to know all about you while you don't know nothing about people that analyse you, your privacy might be protected from other civilians but not from companies that crunch your data
@dudeguy8864 5 years ago
god bless mike pound 🙏🏾
@beautyofsylence 5 years ago
>starts series at episode zero >uses program for series which starts arrays at 1
@heksqer1022 5 years ago
To think this and Tom's playlist come out at the same time :D
@hiqwertyhi 5 years ago
A+ product descriptions for those saws
@JemMawson 5 years ago
I always knew "Data Mining" to be about sourcing data (e.g. web scraping), not pre-processing your existing data.
@StvnTheFuzzy 5 years ago
I suppose data sourcing would imply categorizing the actual data you're scraping, right? So in a sense you're pre-processing the data that has yet to be sourced, which of course in turn you're able to (pre-)process further for analysis. Come to think of it, using 'mining' as a definition for it would imply you're digging for gold in land that you might or might not own, hence the ambiguity and why Mike describes it as a buzz word.
@4.0.4 5 years ago
Buzzword soup: "data mining big data with AI-based cloud computing"
@zerokelvin3626 5 years ago
I will binge watch this
@subschallenge-nh4xp 5 years ago
Peter Parker:like Datacamp comment
@KUBKO17 4 years ago
Dr Mike is Bear Grills of the IT 👍
@sepgorut2492 5 years ago
This tutorial might not be in Python but Python appears at 11:23 with the lumberjack song. Anyone else notice this?
@matthewrussell5448 3 years ago
I'm a lumberjack and I'm OK, I sleep all night and I work all day...
@tallwaters9708 5 years ago
Jasus... this is a great idea for a series, but even an introduction is going to take a year of videos lol. Good luck.
@DominicGo 4 years ago
love this guy (i could listen to his voice all day 🤣)
@gooadam 5 years ago
We were using "data mining" in place of scraping in the early 2000's. That's likely where that got fuzzed into analyzing and finding use for data.
@bruinflight 5 years ago
BRILLIANT. THANK YOU!!!
@marcinszubryt2088 5 years ago
Will there be subtitles for this series? What does it depend on whether YT generates them? Automatically generated ones are fine.
@palmpixiplus 5 years ago
❤️ loved this!!!
@exm3266 5 years ago
If the first part of the actual series is episode [0], then what number do we give the introduction without some people interpreting it as the last episode in the list?
@1zl541 5 years ago
i
@alexgarratt5693 5 years ago
10 part series with Dr Mike, sign me up
@chickenshieee 5 years ago
Great teacher
@wiez543 3 years ago
R right?
@pureatheistic 2 years ago
I would let Dr. Mike POUND, take me to POUND town any day.
@silentinferno2382 5 years ago
Google: let's sponsor Computerphile and Tom Scott to make long playlists that you have to watch, and make people spend more hours on YouTube than 24/day.
@o0December0o 5 years ago
Thumbs up for using R!
@Afdac247 5 years ago
I think 'you may also like' works horribly because it's based on data about the number of people who bought the item, not on data about what they see as missing in their lives. It's not peer pressure to have a webpage say 'people also bought', but companies should pay each other to advertise based on that missing concept.
@vhz04 5 years ago
Great content. Is there a repository where we can get the data files?
@TheSam1902 5 years ago
13:00 I wouldn't agree with your definition of data mining. In a company setting, it's hard to get your hands on a lot of data to start preprocessing, analysing etc, that's when data mining is required: you ask around each department what data they have lying around on hard drives, collect that or spam an API to get more data. So that's prior to preprocessing (because you need the raw material first).
@BiancaAguglia 4 years ago
I think most of us make that mistake when we first hear the term "data mining". 😁Once we learn what data mining actually is, then the term makes sense: we "mine" the data we already have to discover valuable information in it. The process of getting data from other departments or other sources is called, prosaically enough, data gathering or data collection. 😊
@satoshinakamoto171 5 years ago
please make more on AI and ML and data science
@nashaut7635 5 years ago
I thought it was just me until I checked another audio source: sound seems to be «deaf», i.e. lacking a fair amount of treble. I seem to encounter this with more and more youtube videos/channels although I haven't done exhaustive nor detailed analysis, just an observation. Anyway it feels a bit uncomfortable to listen to. Is it also perceived that way by anyone else?
@wiez543 3 years ago
Maybe get your ears checked?
@nashaut7635 3 years ago
@@wiez543 Sorry, can't hear you.
@wiez543 3 years ago
@@nashaut7635 🙉
@Jamie-st6of 5 years ago
these videos should REALLY have subtitles
@juliavanderkris5156 5 years ago
Eyyyy Mike uses a thinkpad! My man!
@tenseikenzx-3559 5 years ago
Standard issue at uni I guess?
@rodrigolara6733 3 years ago
How can I get him to teach me more about computers? Is there a place we can go so that he can specifically teach us or pay for his classes?
@Dusk-MTG 4 years ago
How big is big? I don't know.
@daanielacosta2395 5 years ago
Why all the professors or scientists in the videos of CF use the same kind of sheet? Are you recycling or something?
@FAKETV96 5 years ago
Visualization*
@Ankit-vc3gx 5 years ago
Expecting in python ....
@charlieangkor8649 3 years ago
"sponsorship from by Google" - was this piece of English generated by Google's AI?
@leonhardeuler9839 5 years ago
Mike, what are you going to do with that saw?
@ruairihair 5 years ago
No mention of sas?
@jasondads9509 5 years ago
Darn, don't really like R
@M1cr0p47 5 years ago
I usually say data mining when I can't think of the word parsing.
@samuelposejpal2983 5 years ago
What program did you use to create Browserisor(fake browser)? Thanks
@balping 5 years ago
What about FORTRAN?
@NF-ru8on 5 years ago
11:20 "...and then you end up having 10 saws...don't know how to use any of the saws, but you know, the retailer's job is done..." lol I'm excited about learning how to analyse data using tools like R; but capitalism sure is depressing.
@Littellittel1 5 years ago
The eggs with AI on them are secretly brilliant in Germany. In German, 'Ei' means egg, and it is pronounced just like 'I'.
@daft_punker 5 years ago
I'm Mike, and you've just been Pounded
@kkslidersnan 3 years ago
Why doesn't this comment have more likes?!
@bahtiyarozdere9303 2 years ago
This is beautiful 🤩
@Momentaryoratory 5 years ago
Forgive my noob question but how does it know the dimensions of the matrix without a header file?
@chloem.872 4 years ago
This is my best answer, someone more knowledgable please feel free to correct me if I'm wrong. read.csv() automatically returns a data frame, which is very similar to a matrix, but has additional properties and operations that can be performed on it. If you pass a data frame with higher dimensions, then the function will automatically treat each element in it as 2D unless told otherwise. Judging by how difficult it is to figure out just how to assign higher dimensions to data frames, I'm going to guess that most csv files are 2D and that this isn't a common problem. I'm not sure what you mean by header file, unless you're coming from C++ where header files are a sort of prototype for classes you'd like to implement. C++ is a compiled language, and these header files are preprocessed by the compiler (all lines are gotten from the files) and IIRC, they are not included in the compiled object files. The compiler refers to these files for class and function definitions, so it can know how the class or function is supposed to be structured. It's considered a 'strongly typed' language, where arguments given to a function or class are checked for validity and rejected if invalid according to the definition. R is in contrast to this, because it's an interpreted language and is much more relaxed about type checking. In such languages, types are automatically assumed unless told otherwise. For example, you may need to frequently convert between "factor" and "character" data types in R, because characters are often assumed to be factors (at least, in RStudio they are). Beyond this, you can leave arguments out or explicitly state which argument you're passing to it by name, making the number and order of arguments irrelevant, eliminating the purpose of a header entirely. I hope this helped answer your question! :)
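To make the point about dimensions concrete, here is a minimal sketch (in Python rather than R, purely for illustration) of how a CSV reader can work out a table's shape from the file contents alone, with no separate header file. The function name `infer_shape` and the sample data are made up for this example, and this is not a description of how R's `read.csv()` is implemented internally.

```python
import csv
import io


def infer_shape(csv_text, has_header=True):
    """Return (n_rows, n_cols) inferred purely from the CSV text itself."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return (0, 0)
    # The column count comes from the number of fields on the first line.
    n_cols = len(rows[0])
    # The row count is the number of records, minus the header line if present.
    n_rows = len(rows) - (1 if has_header else 0)
    return (n_rows, n_cols)


# Hypothetical sample data, not taken from the video.
sample = "id,price,weight\n1,9.99,0.5\n2,24.50,1.2\n3,5.00,0.3\n"
print(infer_shape(sample))
```

The idea is simply that the delimiters and line breaks in the file already encode a two-dimensional layout, so the reader only has to count fields and records.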
@DarshanSenTheComposer 5 years ago
Spider-boy has grown up!
@Smile8924 5 years ago
ML|AI == Rectangle|Squares
@nimagameroil1099 3 years ago
Challenge: take a shot Everytime he says Data
@perschistence2651 4 years ago
What is data? Am I data?
@duydangdroid A year ago
In this series, we will explain how to get people to buy things they'll never use for profit.
@mileswilliams527 5 years ago
Who gave this a dislike? Why?
@wouldntyaliktono 5 years ago
Lol, an example in R indexed at zero
@andrestone 5 years ago
"Big data", "Machine Learning", "Data mining", "AI": Good old computer science and statistics rebranded so Silicon Valley can market old tech as new tech. And investors are buying like Apple fans.
@samirz7 4 years ago
"We as a species produce a lot of data."
@marin.aldimirov 5 years ago
4:38 - vectrices?!
@wiez543 3 years ago
Somehow I feel like YouTube wants me to become a data scientist.
@postmodernist1848 2 years ago
Of course you start counting from 0
@dijoxavier 5 years ago
Excellent, I am python guy, should I learn R ? Is it a good investment of time to learn R ?
@LordOfTheBing 5 years ago
No. With python, you get everything R can do with pandas, matplotlib, and numpy, but faster, and with ML with tensorflow/keras. And it's an actual programming language, you can do some actual process with your data then.
@jaffreyjoy 5 years ago
What does this guy not know about 🙌
@JesusisAlive_33 5 years ago
i like the 3d effect
@JonnyRobbie 5 years ago
Ok, R does not equal rstudio. rstudio is one of many IDEs for R and you don't have to use it. I kinda got triggered by the list at the beginning.
@AshishAgrawal-br1ji 5 years ago
Which IDE is that?
@Iknash 5 years ago
RStudio
@valizeth4073 5 years ago
Data analysis isn't bound to a single language though. Any real language can do it well. The problem with languages like python for data analysis, is that usually you're not just analysing the data, perhaps you need a fast and powerful neural network to implement something using the fetched data. Then it would better to perhaps use C and just pipeline R or Python, Ruby.. you name it.
@doctorpex6862 5 years ago
Training neural network... same as training population
@verybrd 5 years ago
So data/bits are like quantum particles, they do not really exist, unless observed... Do i "get" quantum physics now?