Protecting Privacy with MATH (Collab with the Census)

  442,617 views

minutephysics

Days ago

Comments: 816
@burtosis 5 years ago
How I protect my privacy with math. Step 1 I start doing math. Step 2 Person approaches me and sees I'm doing math. Step 3 They slowly back away Step 4 Privacy!
@Abdega 5 years ago
Then one person is like “Oh cool! Privacy jitter calculations!”
@opsoc777 5 years ago
So THIS is why I never get laid...
@illiiilli24601 5 years ago
Instead of running away, you're approaching me?
@DefektoPrime 5 years ago
I wanted to read your comment, but i saw the word "math", and then slowly backed away
@MirceaKitsune 5 years ago
Easier way: When the census takes place, don't answer your door or go away from home in a trip. Though the US loves a bit tyranny of its own, they can't fine thousands of people for not being at home at a certain moment in time. Problem solved!
@qqq1701 5 years ago
New census data: There may or may not be people in the USA.
@llllllllllllllllllllllllIIIIl1 5 years ago
The probability is 50-50 right ? Either yes there are people, and no there isn't 🤔🤣🤣
@R2Cv1 5 years ago
@@llllllllllllllllllllllllIIIIl1 Not necessarily, depends on how you acquire people. If you randomly blink people in and out of existence, then the probability of "no" could get very low if possible cases include 10 people, or 300 people, or 129 people, or 2000000000 people... how likely would it be that you'd get to zero at some point?
@thinboxdictator6720 5 years ago
@@R2Cv1 1/n
@R2Cv1 5 years ago
@@thinboxdictator6720 (That was rhetorical)
@thebreakfast8055 5 years ago
Schrödinger's population
@davidgustavsson4000 5 years ago
I wonder how many minutephysics episodes have been scrapped because they couldn't think of a closing pun.
@sagemccarthy4115 3 years ago
I bet 1 decillion
@ggsap 3 years ago
@@sagemccarthy4115 on what?
@nerfgunguy4575 2 years ago
@@ggsap I'm pretty sure he meant that he bets 1 decillion videos have been scrapped
@hahanamegobrrr6667 5 years ago
beginning of video : citizenships end of video : algebra and probability
@conficturaincarnatus1034 5 years ago
and a nifty pun that almost ruined the sense of wonder almost.
@tanmayuniyal6603 5 years ago
Read the name of the channel, might give you an idea
@holypython4418 5 years ago
It's not that complicated
@veggiet2009 5 years ago
Highschool Students Everywhere: Algebra is stupid, when will I use this in life? minutephysics: Well...
@onelazynoob15 5 years ago
Actually, since we're looking for the slope at a peak it's more calculus. Derivatives are ez pz tho.
@NoahPillow 5 years ago
So they can jitter my census data to ensure privacy, but when I jitter my tax returns, all I'm ensuring is jail time? Doesn't seem fair
@diedewip7912 5 years ago
Unless you add positive amounts to it
@papa_pt 5 years ago
also remember when Equifax didn't do their one job
@niter43 5 years ago
That's a meaningless comparison. Government is supposed to know accurate information about its citizens that's needed to operate properly. And so it does. Data jittering is done for purposes of keeping individual privacy when publishing data to the *public* (keeping privacy as in making sure that a third party can't restore individual data from published data, not as in keeping secrets from government). Pretty sure they store correct census data of each individual for internal purposes as well.
@MRDAVIDJHEMMINGER 5 years ago
@@niter43 Last time I checked I am part of the government too!! and I really hope this government "of the people, by the people, for the people, shall not perish from the earth"!!! But I also don't want us to be vulnerable to Nazi behavior. What if census form asked if you wanted your data jittered? (like a micro point system [+/- 1 PersonPoint]) If there was a way to do this then the more people who opt out of the jitter the more transparent our data can become, and, it can become this way democratically. We could some day brag that we are so proud and brave that we do not fear others being able to pinpoint us based on our demographics because, for example, those around us our neighbors and fellow citizens would stand up for our human dignity in spite of the differences between us.
@goldengryphon 5 years ago
@@MRDAVIDJHEMMINGER Um ... That sounds amazingly dangerous. Granted, I was one who was all for free access to information years ago - had an open Wi-Fi, used Linux, did all the things - then was accused of storing illegal stuff on my computer. Not a fun experience and one that pushed me to re-examine how well I knew my neighbors and how good an idea having Open Source stuff and sharing information really was. End result, if I understood the video correctly, your vote to *not* 'jitter' your information can have the effect of allowing my information to be more easily discovered. (As this whole video reminded me of those word problem set ups you do when trying to find out who of three people with red hair rode a bicycle and ate a pie in a six person neighborhood. I'm not good with maths, so I'm trusting y'all to keep me honest.) In the course of using your un-jittered information, my information is more easy to extrapolate. Basically, your freedom to share your information is also sharing my information, that I may not want out there. I don't mind standing for human dignity. Nor do I really mind being part of certain kinds of scientific study. I do, however, mind being "voluntold" for a study I don't want to participate in just because someone else thinks it's a great idea. The freedom to *not* be included is also important and should be protected. I don't feel a need to have to brag to my great-grandchildren about how "brave and proud" I was to allow social scientists access to my personal data. I can feel brave and proud through having actual accomplishments like doing a job well, or writing a book, or creating something artistic. Being part of a movement to end privacy and personal information from being shared with whomever can cough up the money to purchase access to census data is not on my List of Things To Do. But thanks, anyway.
@ilayws4448 5 years ago
hmm... are you really 31 or did you roll a die and add it to your real age to confuse us...?
@adama7752 5 years ago
2 d20s baby!
@Apersonl0l 5 years ago
Adam A wait he can be -9 yo?
@carlwheeser140 5 years ago
@@Apersonl0l No, no, of course not. You multiply them!
@nuumi7813 5 years ago
Hmm... so we are left with a total of 6 possible answers. Great job Sherlock!
@nuumi7813 5 years ago
Actually the math was wrong... it's around 21 possible answers... yikes
@DelphinidaeZeta 5 years ago
Well now, it's neither a minute nor physics.
@ick-625 5 years ago
Been like that for a long time but I couldn't care less about title mismatch.. I am still getting my head around some of his vids when it was still minute (or two!) and physics!
@MichaelSteeves 5 years ago
So basically Sudoku with census data and supercomputers.
@Soken50 5 years ago
more like Einstein's riddle or zebra puzzle, but on a country-wide scale :)
@JorgetePanete 5 years ago
@@Ranakastrasz It's*
@JorgetePanete 5 years ago
@@Ranakastrasz puzzle*
@opsoc777 5 years ago
@@Ranakastrasz or*
@kmacdough 5 years ago
@@opsoc777 for*
@EDoyl 5 years ago
"I looked forward in time, I saw 14,000,605 futures." "How many were plausible?" "One."
@mattwinward3168 4 years ago
Eoin Doyle Thanos should have jittered his time lines better.
@Mpire101 5 years ago
As someone who actually publishes differential privacy research, I just would like to mention that a privacy budget of 30 is absurdly high; there are cases where a privacy budget of 30 would allow you to reconstruct someone's data with over 95% accuracy. On our research team, we would never consider a privacy budget above 5, and the gold standard was .01.
@ninetails6218 5 years ago
So how is this 30 split? Wouldn’t a higher number mean they have more to jitter the results?
@vermillion8521 4 years ago
Nine Tails yea, .01 means its basically the same as the original data. I think Monica Moniot is just trying to be smart
@polokan 4 years ago
@@vermillion8521 Read that again. Privacy budget. That means how much privacy you can sacrifice. 0.1 is less than 30, therefore the first case has more privacy and less accuracy.
@chaslesvie2417 3 years ago
ty for the insight !
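A minimal sketch of the trade-off argued over in this thread, assuming the textbook Laplace mechanism (the thread does not say which mechanism the Census Bureau actually uses, so treat this as an illustration only). The function names `laplace_noise`, `noise_scale`, and `noisy_count` are hypothetical. The point it shows: the noise scale is sensitivity divided by the privacy budget, so a budget of 30 adds almost no jitter while a budget of 0.01 adds a lot.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noise_scale(epsilon, sensitivity=1.0):
    """Laplace scale for a query; a smaller budget (epsilon) means more noise."""
    return sensitivity / epsilon


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with Laplace noise calibrated to the privacy budget."""
    return true_count + laplace_noise(noise_scale(epsilon, sensitivity), rng)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    for eps in (30.0, 5.0, 0.01):  # the budgets mentioned in the thread
        print(f"epsilon={eps}: scale={noise_scale(eps):.3f}, "
              f"released={noisy_count(100, eps, rng):.2f}")
```

Under this assumption a count query (sensitivity 1) gets typical jitter of about 0.03 at a budget of 30, about 0.2 at 5, and about 100 at 0.01, which matches the reply above: a lower budget means more privacy and less accuracy.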
@kevinfontanari 5 years ago
"And how do you protect your privacy?"
@anand.suralkar 5 years ago
Nord vpn
@MelvinGundlach 4 years ago
Kevin Fontanari Only that those don’t really protect privacy but just move the problem.
@phsopher 5 years ago
This is a trailer for Henry's spinoff channel '12minutestatistics'.
@kevinmorrill8347 4 years ago
I worked for the US Census Bureau in 2005 or so, it was an interesting job, and we all took privacy and confidentiality very seriously. I hope all the men and women working on the 2020 census get to see this video.
@samposyreeni 2 years ago
It's truly rare to see a public service announcement involving math. Nicely done!
@tomsmith6878 5 years ago
congrats on getting married
@Apersonl0l 5 years ago
Thomas Smith gitters
@user-vn7ce5ig1z 5 years ago
*jitter
@patu8010 5 years ago
Because of the jittering, we can only know that he has 1±1 spouses
@votalis4089 5 years ago
@@cameronbigley7483 1:02
@gabor6259 5 years ago
Congrats on liking ice cream.
@besmart 5 years ago
The census is a very important tool for keeping our government fair and functioning, and even though it’s been politicized lately, I’m really glad people like Henry are talking about it. Good policies start with good data.
@vkmishra364 4 years ago
But aren't we (humans, or any other intelligent life) violating the Universe's privacy to understand it better and also create new stuff that is both good and bad?
@Meekseek 4 years ago
They already have all the information they need.
@dasbootyliciousness271 5 years ago
This has literally been the best educational video I have seen this year on YouTube. Why? Because it gives a solution for how companies that collect huge amounts of data should act! Thank you.
@kaitlyn__L 5 years ago
yeah. when selling this stuff they should only sell jittered data, and only averages (even if the source data could be partially reverse engineered). right now data sold to and between ad networks is even worse than un-jittered correlatable averages, it's often just a selection of stats that omits your name, completely bare for the companies to correlate themselves without even having to try and unpack anything first. they're all technically not selling identifiable data if they say "she's a girl and likes cats" but with enough overlap between the information sharing, you only need one ad network picking up your location from a web search and they can eventually tie that together so they know all your preferences, your age, where you live.. just not technically your name. of course that last step, getting the name, is one of the easiest
@Rabcup 5 years ago
So perhaps the DMV shouldn’t be selling people’s data to PI’s...
@klobiforpresident2254 5 years ago
What is a PI?
@Abdega 5 years ago
Klobi for President Private Investigator I’m guessing
@JorgetePanete 5 years ago
PIs*
@feronanthus9756 5 years ago
You should take that up with your state government.
@R2Cv1 5 years ago
@@JorgetePanete (No, it's PI's) (Case is for abbreviations, numbers, individual letters, etc)
@radicalxedward8047 5 years ago
Most people would take a shiny box in exchange for their privacy. That’s what’s really scary.
@musikali1 5 years ago
You are so right... The census is nothing compared to the most intimate data corporations have on EVERYBODY
@axiostechno 5 years ago
This would have been a perfect video to be sponsored by nordvpn or dashlane
@QlueDuPlessis 5 years ago
Kaspersky advertised on it. Not sure how they engineered that, but I'm guessing Google was complicit. And given that Google already knows way more about each of us than all those census records...
@albingrahn5576 5 years ago
i can hear the segue in my head lol
@ryuuji159 5 years ago
stop
@Flamingbob25 5 years ago
@@QlueDuPlessis Well it was probably by the tags/title, I believe that's how ads are chosen. That's why you will sometimes get ads for something on videos hating on that product.
@Markle2k 5 years ago
@@QlueDuPlessis Kaspersky advertised on _your_ view. That just tells us what Google thinks about you.
@zoatheperson3012 5 years ago
When you get sponsored by a literal department of the US government...damn.
@anand.suralkar 5 years ago
Lol and not by nord vpn
@shawniscoolerthanyou 5 years ago
I wish the government sponsored more quality shit that I like. Not McDs hamberder dinners for athletes and F-35s.
@leovin00 5 years ago
“We’ve implemented complex mathematical algorithms to protect our data” Russian hacker: im about to end this mans whole career
@shrey1265 5 years ago
*I LOVE HOW HE CAN ADD SCHRODINGER'S CAT TO EVERYTHING*
@udayy9897 5 years ago
Hands down, one of the best creators on the internet!
@chadtarheel 5 years ago
Only ONE allowed: peak or peek. Never the two shall meet.
@alexwang982 5 years ago
Endor what about a peeking peak?
@Ethan_N_A 5 years ago
MATH: Make America Think Harder #YangGang
@sonetagu1337 5 years ago
Ironic your profile pic is american flag.
@ninetails6218 5 years ago
This is hilarious considering he just chickened out of the running
@AniaKovas 5 years ago
One of the most remarkable videos you've made IMHO. Thanks for all your hard work in explaining things.
@TheADHDNerd 5 years ago
First known census: 1086. 2019: "Oooo privacy!"
@CasshernSinz1613 5 years ago
@@davidbechart7674 true
@klobiforpresident2254 5 years ago
@@davidbechart7674 Funnily enough we know a census happened there at the time (more than one, actually) but we also know it cannot be the one the Bible describes, unless the Bible describes it incorrectly.
@lonestarr1490 5 years ago
Well, there haven't been too many supercomputers around at 1086, have they?
@jorisd6584 5 years ago
@@klobiforpresident2254 Well maybe the bible scrambles the data to ensure privacy /s
@wissamelkadamani9750 5 years ago
Me: Why the hell does anyone need privacy Minutephysics: *ICECREAM*
@freesk8 5 years ago
The census is authorized in the Constitution to count adults. But it is not authorized to collect information about their sex, race, income, sexual orientation, etc. So I always leave these blank on the census form. Leaving these out globally would increase the privacy of the census, without reducing its accuracy.
@kevburger 5 years ago
Exactly. The government is allowed to count people, nothing more. Every other demographic statistic is none of their business.
@saadisave 1 year ago
@@kevburger you cannot govern without knowing things about those whom you govern
@joshmckinney3254 5 years ago
This is something I am really interested in. I loved learning about combinatorics in college. I am certain that if we are going to continue to advance in technology at the rate that we are, we must revolutionize how privacy is handled. I am totally guilty of giving Google enormous amounts of information about my spending habits and hobbies, both willingly and unwillingly. I believe that the best way to create more accurate and advanced algorithms (especially with neural networks) is to "feed the beast" with as much data as possible; plus, I feel like I have nothing to hide. That being said, I am growing increasingly wary of who is getting their hands on that data and how it is being used.
@joshuadeyoung5540 5 years ago
The Level of alliteration in this video is remarkably satisfying! !
@xigong3009 3 years ago
Thank you for this video, minute physics! I have been watching this channel since middle school; back then most of the videos were about physics. Now, as a senior in college, I have found differential privacy to be one of my research interests. I am really grateful and surprised to see this excellent expository video on DP, from a channel I loved as a kid!
@yaitz3313 10 months ago
Assuming there was no jitter, how much computing power would it take to get any remotely useful privacy-violating information out of the Census?
@artified3498 5 years ago
Moral, give him anything he gives u back MATH...EQ....and...ya...CATS......
@b-init1221 5 years ago
See math is everywhere, you can't run from it... Even after being the President
@battleroyale1760 5 years ago
😥😥
@jamesdolph437 5 years ago
but he is president ... and you are not
@ninetails6218 5 years ago
Wago Howard He dropped out LOL
@ujjwaLoL 5 years ago
So I can tell that I am 5 years less or more than 13. So I am 18 technically and 8 also so I can play 18+ games
@ujjwaLoL 5 years ago
@foolish fellow OK bro
@chatboss000 5 years ago
Actually, no. Jittering is for privacy reasons, not evidence - based ones. If your age is 13 +/- 5 years , there's no solid evidence that you're old enough to buy a game and you probably won't be able to buy it. You're sacrificing your ability to buy anything at or below your age to protect your privacy - but that's your call to make :)
@klobiforpresident2254 5 years ago
@@chatboss000 Even worse, he could claim to be 22±5 and he couldn't buy those games. Sorry privacy conscious 26 year old OP.
@anand.suralkar 5 years ago
Hmmm thats not right duh
@TheScienceBiome 5 years ago
Certainly an odd sponsor, but amazing video nonetheless!
@wesleyrm76 5 years ago
I would be happy if every government agency did things when they're doing public outreach. Plenty of science agencies, especially NASA, have been doing this for years.
@azmyadzkiansyah279 5 years ago
9:21 Henry, that's not how seesaws work. If one side goes up the other goes down.
@burtosis 5 years ago
That's only Euclidean seasaws.
@LeBonkJordan 5 years ago
In my opinion, the concern with privacy isn't that everyone can know an individual's information; it's that single large groups can know disproportionately large amounts of information about large quantities of people. I believe if one group can access not-necessarily-sensitive data like my age and sex, then everyone should be able to as well.
@alliesakat 5 years ago
"Prominence of peaks on the possibility plot"
@Bellonging 5 years ago
The summary at the end of this video is absolutely wonderful. The whole video was great but dense, and the summary helps contain it all.
@dm_nimbus 5 years ago
Adam Smith was one of my professors in grad school. I really enjoy this topic; thanks for covering it!
@ReimuandCirno 5 years ago
Is jittering the only tool census takers have at their disposal for protecting privacy? What other (mathematical) methods have been explored?
@sabouma 5 years ago
And this lads is now called Minutephysics' uncertainty principle
@cavemaneca 5 years ago
Oh hey, it's one of my favorite channels, TwelveMinuteMath
@deep.space.12 5 years ago
But... can't you extrapolate private information from previous non-rigorously-scrambled census data?
@harpiesd96 5 years ago
yes, but we gotta start sometime right?
@deep.space.12 5 years ago
@@harpiesd96 Right. I just wonder if the "averaging the noise" at 7:38 applies to scrambling within the same census, or across time. Similarly, if the "privacy loss budget" applies across time. Unlike your password, which you can change, if census data has been "leaked" anytime in the past, the privacy will be forever lost unless that person dies. Otherwise "15 yrs old male black arab" will be "20 yrs old male black arab" 5 years later.
@TheTrueRandomness 5 years ago
Differential privacy actually addresses this issue quite neatly: All the 'plausibility' stuff is always formulated in terms of changes between prior beliefs (what an attacker already knew before we released our answer) and posterior belief (what he thinks is plausible now, knowing our answer). So while the 2020 census obviously cannot undo any privacy loss from previous years, it will guarantee that even someone who did attacks on previous years will learn (very close to) no additional information about anybody from the 2020 census. Sure, it's not perfect, but you can't undo information leakage, no matter how much jittering you add to the new census ;) At least this formal thing guarantees that it won't help make attacking the old stuff easier and that you will learn basically nothing additional about the data.
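For readers who want the formal statement behind the prior/posterior framing in the comment above, here is the standard textbook definition of ε-differential privacy and the odds bound it implies. The symbols are notation chosen for this note, not taken from the video: M is the release mechanism, D and D' are datasets differing in one person's record, S is any set of outputs, and o is one observed output.

```latex
% Definition: M is epsilon-differentially private if, for all neighbouring
% datasets D, D' and every set of outputs S,
\[
  \Pr[M(D) \in S] \;\le\; e^{\varepsilon}\, \Pr[M(D') \in S].
\]
% Consequence via Bayes' rule: seeing the output o can shift an attacker's
% odds between D and D' by at most a factor of e^epsilon, whatever the prior.
\[
  \frac{\Pr[D \mid o]}{\Pr[D' \mid o]}
  \;=\; \frac{\Pr[o \mid D]}{\Pr[o \mid D']} \cdot \frac{\Pr[D]}{\Pr[D']}
  \;\le\; e^{\varepsilon} \cdot \frac{\Pr[D]}{\Pr[D']}.
\]
```

The prior odds Pr[D]/Pr[D'] can already include anything learned from earlier censuses. The bound only limits how much the new release moves those odds, which is the point the comment makes about not being able to undo past leakage.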
@OpreanMircea 5 years ago
I love the new longer format, I wasn't really sure I was watching minutephysics
@MrWvid 5 years ago
Minutephysics posts a video which is about math, and which lasts 12 minutes. Me: Cool, that is what I subscribed for(no sarcasm)
@horselover19 5 years ago
Great video! Thanks Question: Do you know if these models take into account the noise inherently present in any data collection, especially on this scale? I.e., assuming even zero perturbation of the data, the knowledge you get from it is still a proxy of the real information (due to human errors, intentional misinformation, etc.), so taking that into account might give you some leeway in your "privacy budget". Maybe this can be modeled as an increase in budget without harming privacy? Just a thought :)
@Kavukamari 5 years ago
it's nice to know that at least someone in the government cares about our privacy
@CatholicElectrician 5 years ago
12-minute physics These longer videos require more attention but you definitely get more accurate info. I wish this guy could release videos more often
@binaryglitch64 5 years ago
Nice explanation of the importance of considering the repercussions of de-anonymization algorithms.
@dux2508 5 years ago
In Sweden, if you ask the government, you can get a list of the citizens of Sweden with where they live, what they earn, the Swedish form of social security number etcetera. Much information is public.
@cosmicreciever 5 years ago
Thought this was going to be yet another video on encryption and was pleasantly surprised. Nice work!
@willis936 5 years ago
Information theory was the hardest class in my MS but I got a lot out of it. I think everyone should learn the stats that were talked about in this video, it really grants a useful perspective.
@cube2fox 5 years ago
What's MS?
@willis936 5 years ago
Trurl master of science
@cube2fox 5 years ago
@@willis936 What subject? Statistics? I did a bachelor's degree in computer science and we never had a class on information theory. It just popped up a tiny bit in the context of error correction algorithms. Our class on theoretical computer science focused on formal languages, computability, and complexity -- information theory wasn't mentioned once.
@willis936 5 years ago
Trurl Electrical engineering. Information theory is likely only an undergrad course for math majors (and is likely optional in most programs). It would be an optional grad course for EE, CS, and (probably) physics. For the practice of computer science (and not just programming), information theory isn’t terribly relevant, but it is necessary for a lot of practical applications (compression and encryption).
@cube2fox 5 years ago
@@willis936 Sounds interesting. Encryption is unfortunately a topic we also skipped. Since I'm using Bayesian probability theory (in philosophy, not computer science) I noticed the relevance of information theory. Hard to get into it on your own, unfortunately.
@Brocseespec 3 years ago
2:45 "hiMOmIluvYou" XD i just died laughing at that
@p07a 5 years ago
This is the most comprehensible minutesphysics video ever. Did somebody tell you to summarize your points like... three times in this video? (Very unlike your previous videos)
@chickenman297 5 years ago
Database reconstruction theory is used in mechatronics to determine the state variables of a system given limited sensor information.
@Zipo214 5 years ago
Although I am reluctant for huge descriptive statistics for demographics on the census for people who are still alive. Yet, the census is a massively important document for ancestry and family history documenting and discovery. The more information that is on the census, the more accurate you can find your family. It’s important. Even though it can be concerning when people get ahold of this census early. It should be noted that the census doesn’t release a census for 50 years after. So the 1970 census isn’t even publicly available yet.
@greenredblue 5 years ago
I don’t know why but I got halfway through your comment before realizing it was _not_ a brilliant riff on “I am the very model of a modern major general”. Very disappointed, two stars.
@mattwinward3168 5 years ago
Honestly though, someone could figure out way more information about you by data mining your public internet accounts. And with internet accounts there is far more relevant data at stake; like your mother’s maiden name could be found through inference from facebook posts, or the bank you use could be found from pictures, GPS data, or off hand comments. That’s with a few more questions would be enough to change the password on your bank account and wire money out. It probably would even use less processing power than bitcoin.
@cl759 5 years ago
Aaaannnnd you need the 1970 census to find your gammy?? Ah, diddums...
@klobiforpresident2254 5 years ago
You use the census for that? What happened to the ancestral chart, registrar's office and family bible?
@goldengryphon 5 years ago
@@greenredblue I am already trying to figure a way to get that song out of my head. "I've information vegetable, animal, and mineral; I know the kings of England, and I quote the fights historical, from Marathon to Waterloo, in order categorical." Argh. Double Argh.
@lodevijk 5 years ago
Can't an attacker extrapolate the new census data from the "protected" statistics combined with past census data?
@sabriath 5 years ago
Sounds to me that bayes theorem against a random chance of false data can come in handy. Similar to asking people if they do something that is specifically embarrassing without knowing whether they actually do that specific thing....by having them roll a die or flip a coin to answer the question (a flip of heads means "answer truthfully" and a flip of tails means "always answer yes", leads far more people to answer honestly without the test taker knowing if it's true). So if you give everyone say a 10% chance of false data at a 50% range, then you end up with only a 5% draw down on the overall survey without revealing much information individually.
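The coin-flip survey described in the comment above is usually called randomized response. A minimal simulation sketch, assuming a fair coin and the exact rule the commenter gives (heads: answer truthfully, tails: always answer yes). The function names `randomized_response`, `estimate_true_rate`, and `simulate` are hypothetical. Since P(yes) = 0.5 * p + 0.5 under that rule, the true rate p can be estimated as 2 * P(yes) - 1 without knowing any individual's real answer.

```python
import random


def randomized_response(truth, rng):
    """Heads: report the true answer. Tails: always report yes (True)."""
    heads = rng.random() < 0.5
    return truth if heads else True


def estimate_true_rate(responses):
    """Invert P(yes) = 0.5 * p + 0.5 to recover p, clamped to [0, 1]."""
    yes_rate = sum(responses) / len(responses)
    return min(1.0, max(0.0, 2.0 * yes_rate - 1.0))


def simulate(true_rate, n, seed=0):
    """Survey n simulated people and return the estimated true yes-rate."""
    rng = random.Random(seed)
    truths = [rng.random() < true_rate for _ in range(n)]
    responses = [randomized_response(t, rng) for t in truths]
    return estimate_true_rate(responses)


if __name__ == "__main__":
    # With a large sample the estimate should land close to the true 30%.
    print(simulate(true_rate=0.3, n=100_000))
```

One design caveat with this particular rule: a reported "yes" is deniable, but a reported "no" can only come from a truthful "no", so it protects the embarrassing answer and not the other one. The estimate is also noisier than a direct survey of the same size, which is the accuracy cost being traded for privacy.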
@drac124 5 years ago
I already lie about my age or date of birth in things that don't matter, such as hair salon registration, forum registration, even Facebook. So if you lie to the US Census (in a way that you think won't matter), like saying you are 31 instead of 28, and then the US Census changes that a little more, you get a bunch of wrong data all over.
@abhishekchary6585 5 years ago
I feel smart now! Truly underrated channel. It deserves more.
@abhishekchary6585 7 months ago
Back again after 4 years
@buckyball2003 5 years ago
Yay! My favourite channel, 12minutemaths has uploaded a new video! (I do actually love the video, I’m not hating, I just think it’s funny.)
@videogyar2 5 years ago
Why does privacy matter so much in this regard? It's not like there's any sensitive information in it.
@St0RM33 5 years ago
What if each participant writes a random value which is then used to introduce noise to each census (their own or of a different participant)? This wouldn't introduce enough random noise that the final results will be both accurate and impossible to reverse with 100% accuracy? Kinda like one-way encryption
@ConManAU 5 years ago
It's a valid option, although in practice you'd probably get the Census Bureau to do that to the data before they calculate what they publish. It's got some benefits over adding noise to the published figures, but there are also drawbacks, mostly around that problem of keeping the results accurate enough to be useful.
@Jo_Wick 5 years ago
Short Answer: no, because the attack on privacy of sensitive information relies mostly on the published figures, and as such, when the attackers try to find the plausible inputs, they find every person's fake, random data in addition to the real data from the rest of them. Privacy would still be violated for most persons; the noise must be implemented into the published figures, which fundamentally change everyone's most probable data in such a way as to be rendered incorrect to the attacker.
@samtibbitts 5 years ago
1:41 but the datasets *are* kept secret for 72 years. Without the datasets the averages and totals don’t reveal private information.
@jdbrinton 5 years ago
The thing is that in nearly any given community epsilon is soooo small as to be practically unusable. I worry that this video will leave many thinking that we're dealing with epsilons that have useful value beyond being intellectually stimulating. I'm glad we're implementing matrix smoothing, but for everyone else who doesn't understand the math, there's nothing for you to worry about and you should really complete your census. Us nerds are busy battling privacy concerns long long before they should be of concern for people who aren't mathematicians.
@dreffon9213 5 years ago
A problem I see with this is that a lot of the useful information we gain from the survey is by looking at the evolution of data, compared to previous census. But in the end you might just be looking at random noise
@tec4303 5 years ago
Well the census bureau protected your privacy but many other government agencies still violate it.
@abhishekchary6585 7 months ago
Thanks!
@Merennulli 1 year ago
I love seeing how this shakes out in extreme cases like the dying towns that are down to a population in the single digits. The one single-person town that I know of infamously has a census population of "2" because of this jitter but obviously ALL data from that town is that one woman with jitter applied.
@hitwalkhook3831 5 years ago
Dear minutephysics, I have a question. I am currently a science student in chemistry, I recently saw a video about quantum teleportation, and I want to ask you something. Is it possible to use quantum computers, with every electron being a kind of data (like 0 and 1), to make accurate measurements when scanning something very far away? If so, could we use that data to determine the chemical composition of that object? And could we also use quantum computers and high-energy sources to accurately control light, with controlled heat as well, at great distances to break chemical bonds light years away? I am asking because, if that is right, could we possibly make chemical reactions occur very far away in space? (And thinking about it more, is it possible to teleport someone like that if it is accurately controlled?) By accuracy I mean close to, but not exactly, 100%.
Bonus question: will it be possible in the future that, instead of electrons determining the value of a bit as 0 or 1, we would use their spin quantum number to determine a 0 or 1, then for a secondary array... I will just state this below:
[0,0,0,0] [it is spinning to the right, it is on the 0 magnetic quantum number, the shape of the orbital is like a sphere, it is on the 1st shell] 1s1
[1,1,1,1] [it is spinning to the left, it is on the 1st magnetic quantum number, the shape of the orbital is like a dumbbell, it is on the 2nd shell] 1s2, 2s2 2p3
[1,0,0,0] [it is spinning to the right, it is on the 0 magnetic quantum number, the shape of the orbital is like a sphere, it is on the 1st shell] 1s2
And so on. Or is it already like that? Or would it have some measurement problems? Thank you for your attention.
Sidenote: it is currently 0:00 am here and I am half dead, so sorry if I messed up the quantum numbers in any way.
@실험싫엄 5 years ago
Can we create "high plausibility peaks" at the positions deliberately located "off the accurate value"? So that the result from the brute force mechanism might lure the hackers into the wrong conclusion?
@LaunchPadAstronomy 5 years ago
Brilliant demonstration and discussion of data privacy. Thanks, and great job!
@aresharesh8671 5 years ago
I thought that the unevenness of the distribution would be measured in something like bits of entropy. Is there a particular reason they use maximum slope instead?
@aBigBadWolf 5 years ago
I have the same question. I thought it would be something based on maximum entropy.
@ConManAU 5 years ago
Well, "slope" in this case means "relative difference in probability between two adjacent possible input data sets", and in the worst case the difference between those two sets is "I'm in the data" and "I'm not in the data". So it's an attempt to measure the likelihood that someone could identify you uniquely in the data.
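A minimal sketch of the "slope" idea in the comment above, assuming the Laplace mechanism applied to a simple count (a standard differential privacy technique, used here only for illustration and not necessarily the Census Bureau's method). The names `laplace_pdf` and `max_log_ratio` are hypothetical. For two adjacent datasets whose true counts differ by one person, it checks that the log of the ratio of output probabilities stays within epsilon for every published value tried.

```python
import math

def laplace_pdf(x, mu, scale):
    """Density of a Laplace distribution centred on mu."""
    return math.exp(-abs(x - mu) / scale) / (2.0 * scale)

def max_log_ratio(count_a, count_b, epsilon, outputs):
    """Largest |log ratio| of output densities between two datasets."""
    scale = 1.0 / epsilon  # assumption: a count changes by at most 1 per person
    return max(
        abs(math.log(laplace_pdf(y, count_a, scale) / laplace_pdf(y, count_b, scale)))
        for y in outputs
    )

epsilon = 0.5
outputs = [i / 10.0 for i in range(-200, 401)]  # candidate published values
worst = max_log_ratio(10, 11, epsilon, outputs)  # "I'm in the data" vs "I'm not"
print(worst <= epsilon + 1e-9)
```

A smaller epsilon means a flatter plausibility curve and more noise in the published count.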
@aBigBadWolf 5 years ago
@@ConManAU Maximum slope would be a garbage replacement for maximum likelihood. Imagine a probability distribution that is a smooth analogue of the Heaviside step function, shifted such that most of the probability mass is uniform with a small step on the edge. The entropy of this distribution is still very high, but so would be your maximum slope. Maximum slope seems such a naive way of measuring this; surely there are better methods for that.
@ConManAU 5 years ago
@@aBigBadWolf What you're describing is a case where there is a distinct point separating "possible" and "impossible" input data. From an attacker's view, this means that they now know something - any facts that are only true in the "impossible" data are not true of the real data. For example, maybe that means they can tell for sure that the true count of people in a particular area *must* be in a particular range since the probability drops to zero outside that range. That said, there's an alternative formulation of differential privacy that introduces a second parameter that effectively means you only compare the slope between adjacent possibilities that have a probability above a certain cutoff. Also, one thing the video didn't really mention is that differential privacy is designed to measure the privacy loss of a disclosure control mechanism agnostic of the underlying data (in the case of the ice cream example, it means that you can measure the delta value of drawing the random number regardless of how many people are actually in the data, how many of them like ice cream, and so forth). I suspect that an entropy-based measure would be more specific to the data, which makes it less useful for measuring the effectiveness of different mechanisms being applied in different circumstances.
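The two-parameter variant described above matches what is usually called approximate, or (ε, δ), differential privacy. As a reference sketch in standard notation (the symbols M for the mechanism, D and D' for adjacent datasets, and S for a set of outputs are not from the thread):

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta
\quad \text{for all adjacent } D, D' \text{ and all output sets } S.
```

Setting δ = 0 recovers the single-parameter definition, where the bound on the ratio must hold everywhere.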
@aBigBadWolf 5 years ago
@@ConManAU 1.) Yes, the Heaviside analogy is a setting where we can distinguish likely datasets from unlikely datasets. I understand that this means the attacker's certainty has increased a little. My objection with the Heaviside analogy is that the attacker's uncertainty over the true dataset can still be huge even if the maximum slope is very steep. (I'm assuming a permutation of the set of all datasets that would also maximise the slope. I don't know how you can even talk about a slope when the possibilities cannot be ordered. It's a set, after all.) 2.) What is the name of that alternative formulation? I would like to read more about it. 3.) I'm not very familiar with differential privacy. I couldn't quickly find anything about such a maximum slope technique. Entropy would also be data agnostic. It is just the negative average log probability of every conceivable dataset. If you use log base 2, it can be interpreted as the average number of yes/no questions you would have to ask to become 100% certain which dataset resulted in the statistics you see. If the entropy is 0, it means that the dataset is identifiable (like in the ice cream example in the video); if the entropy is 10, it means that there are 2**10 datasets that are equally good candidates. Maximum slope makes no sense. Change my mind.
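The entropy reading in the comment above can be sketched in a few lines of Python. This is only an illustration of the commenter's two cases; the function name `entropy_bits` and the example distributions are made up for the sketch.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

identifiable = [1.0]               # only one dataset fits the published statistics
ambiguous = [1.0 / 2**10] * 2**10  # 1024 equally plausible candidate datasets

print(entropy_bits(identifiable))
print(entropy_bits(ambiguous))
```

The first case gives zero bits (the dataset is pinned down); the second gives ten bits, one per yes/no question needed to single out a candidate.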
@Nakimi190 5 years ago
I think the cats at minutephysics are evolving at an alarming rate... THE AGE OF CATS IS UPON US!
@klobiforpresident2254 5 years ago
I for one welcome our new furry fuzzball overlords.
@debonairrose 5 years ago
@@klobiforpresident2254 go see a doctor!
@Mr.D.C. 5 years ago
Drinking game: take a shot every time Henry says a word that starts with p
@madjedi2235 5 years ago
I never thought a video on the census would be so cool
@AKT1610 5 years ago
"Don't get into those maths. Maths has not helped Einstein discover gravity"
@shichengrao5314 4 years ago
To whoever said that (maybe not you): So? Math helped with about ten quadrillion things. I’m also pretty sure math helped discover gravity, so joke’s on you.
@MeisterHaar 5 years ago
I like the last part about surveys. I am currently in university and I have never heard about that happening. I study a social science, so we make and also take part in studies and surveys. Now I think most of them could be traced back to each individual, including me. I also think we should force online surveys to do that kind of stuff.
@SciSky 5 years ago
Is the jitter added taken from a normal distribution or a uniform distribution? Or something else entirely?
@Zw285 5 years ago
I would love to see a video on the prisoner's dilemma.
@cube2fox 5 years ago
This is a topic from game theory, but he is a physicist. Then again, he could work with experts as in the current video.
@rubes3927 4 years ago
There’s something so relaxing about these stick figure drawings 😂 also the cat obsession is real
@CrushGM 5 years ago
When Minnesota is mentioned as an example. 1/50 odds. I'm buying a lotto ticket TODAY!
@FailasaurusRex 5 years ago
any opinions and errors are my own, cuz the census bureau is infallible..
@FailasaurusRex 5 years ago
@@robertjansen6019 I agree with your point, I didn't make the implication you seemed to take from my comment
@FailasaurusRex 5 years ago
Here is why that bothered me originally though: it almost sounded sponsored. I don't think it is, but the wording just throws me off a bit. These disclaimers aren't made for every source he ever cited, so why go out of the way to excuse the Census Bureau.. conspiracists maybe, and yeah, I also get that point. I just wouldn't have said it this way or made that point at all.
@FailasaurusRex 5 years ago
unless it is sponsored, in which case the way it was stated is even more sketchy and a lot less honest than if it isn't sponsored content
@FailasaurusRex 5 years ago
if it is sponsored then the fact that i'm confused about the issue says enough i think..
@FailasaurusRex 5 years ago
lil tipsy, excuse my spelling and grammar and punctuation mistakes xD not fixing it
@8is 5 years ago
USA: We go through a lot to ensure the privacy of our citizens. Sweden: We don't do that here.
@rea8585 5 years ago
Privacy, what is that? 😀
@holypython4418 5 years ago
The opposite of China
@unflexian 5 years ago
Dead.
@doemaeries 5 years ago
@@holypython4418 and Google
@sodiboo 5 years ago
Oh you wouldn’t know, your pfp looks like its taken from your facebook
@JorgetePanete 5 years ago
@@sodiboo it's*
@RahulAgarwal97 5 years ago
Wow. Nice video, quite in depth. Just a question: isn't computer security also just maths? (Encryption is just a bunch of mathematical operations that are easy to compute in one direction and almost impossible in the other.)
@yogiHalim 5 years ago
If the jittering is insignificant, couldn't a brute-force attack just round it off and get accurate data?
@alexwolfeboy 5 years ago
A big thing to consider is also what specific data is being released. Your age is no big deal, seriously, so if the “privacy” of that data is broken it is unimportant. It’s one thing to give away addresses; it’s another to give away age and gender.
@kennethkho7165 5 years ago
Rigorously perfectly balanced, as all things should be
@cavios8889 3 years ago
But can't the old, unguaranteed datasets be used to extrapolate the true data and thus understand the jitter used for the newer datasets?
@danielwick7756 5 years ago
This video is awesome. Complicated statistical concept made much easier to understand. Thank you!
@Nosirrbro 5 years ago
It’s nice to see sponsorships by things other than corporations. At least, not directly the same.
@Arthur0000100 5 years ago
But how many statistics (data) are released? I thought that the amount was negligible compared to the 50+ mil population
@Sivah_Akash 5 years ago
What is the problem if anyone comes to know our age, gender and other personal data?
@jackiecs8190 5 years ago
I noticed that you modeled gender as "M," "F," and "T." This is a poor model. Transgender people are not a separate gender; many of us are male or female. I think you were thinking about nonbinary people, who are a subset of transgender people. In the situation where you have to use one letter, the proper choice for nonbinary people is "X." This is common on official IDs in the US and other countries. It is inappropriate to lump all transgender people in with nonbinary folks because it takes away our clearly identified binary gender.
@sUmEgIaMbRuS 4 years ago
The notion of "gender" is a non-solution to a non-problem.
@Eric14492 5 years ago
I would like to see you do a video on attempting an actual count, v. using a more accurate sampling survey. Since this is, unfortunately, a political issue, I didn’t expect it to be mentioned in a video sponsored by the Census Bureau.