Jes Ford - Getting Started Testing in Data Science - PyCon 2019

  6,519 views

PyCon 2019

5 years ago

"Speaker: Jes Ford
How do you know if your data science results are correct? Robust software usually has tests asserting that certain conditions hold, but as a data scientist it’s often not straightforward or obvious how to integrate these best practices. Our workflow includes exploration, statistical models, and one-off analysis. This talk will give concrete examples of when and how testing should play a role, and provide you with enough introduction to get started writing your first data science tests using `pytest` & `hypothesis`.
Slides can be found at: speakerdeck.com/pycon2019 and github.com/PyCon/2019-slides"

Comments: 9
@anantharamaniyer9135 2 years ago
Very well presented. Many thanks for presenting this, especially the section on testing dataframes, which was quite clear and succinct.
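For readers who have not seen the dataframe-testing idea mentioned above, here is a minimal sketch of what such a test can look like with `pytest`-style test functions and `pandas.testing.assert_frame_equal`. The function `normalize_columns` and its data are hypothetical, invented for illustration; they are not taken from the talk.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical function under test: strip and lowercase column names."""
    out = df.copy()
    out.columns = [name.strip().lower() for name in out.columns]
    return out


def test_normalize_columns():
    # Small hand-written input and the exact frame we expect back.
    raw = pd.DataFrame({" Price ": [1.0, 2.5], "QTY": [3, 4]})
    expected = pd.DataFrame({"price": [1.0, 2.5], "qty": [3, 4]})

    # assert_frame_equal compares values, dtypes, index, and column labels,
    # and reports where the frames differ if they do not match.
    assert_frame_equal(normalize_columns(raw), expected)
```

Saved in a file such as `test_columns.py` (an assumed name), this would be picked up by running `pytest` in the same directory.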
@lalligood 5 years ago
This talk had me pumped when I got back to work after watching Jes Ford demonstrate the hypothesis library to fabricate a test pandas dataframe. I also think that getting some testing practices into folks doing data science is long overdue. What a fantastic presentation!
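As a rough illustration of what "fabricating" a test dataframe with Hypothesis can look like, the sketch below uses `hypothesis.extra.pandas` to generate many small dataframes and checks a few properties on each one. The function `add_total`, the column names, and the value ranges are assumptions made up for this example, not the code shown in the talk.

```python
import pandas as pd
from hypothesis import given, settings, strategies as st
from hypothesis.extra.pandas import column, data_frames


def add_total(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical function under test: add a price * quantity column."""
    out = df.copy()
    out["total"] = out["price"] * out["quantity"]
    return out


@settings(max_examples=25, deadline=None, database=None)
@given(
    data_frames(
        columns=[
            # Bounded floats, so NaN and infinity are not generated.
            column("price", dtype=float,
                   elements=st.floats(min_value=0, max_value=1e6)),
            column("quantity", dtype=int,
                   elements=st.integers(min_value=0, max_value=1000)),
        ]
    )
)
def test_add_total(df):
    result = add_total(df)
    # Properties that should hold for every generated frame,
    # including the empty one.
    assert len(result) == len(df)
    assert list(result.columns) == ["price", "quantity", "total"]
    assert (result["total"] >= 0).all()
```

Instead of hand-writing one input, the test states properties that must hold for any valid input, and Hypothesis searches for a frame that breaks them.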
@erectlocution 5 years ago
Fantastic presentation. Not only is it a nice introduction to testing in general and then to specific libraries, it's also a nice peek at a couple of practical analytical methodologies.
@mailsiraj 3 years ago
Fantastic and very practical presentation. I really loved the division of work into three buckets (one-off, exploratory, and defined work) and the idea of applying a slightly different testing strategy to each, rather than being pedantic about testing. I learnt a number of useful ideas to improve my pandas testing. I am going to check out the Hypothesis library.
@Nino234mff 5 years ago
An excellent talk! My wording differs from Jes's, but I think she would agree with me: in many situations data scientists do two jobs, science and engineering. I practice TDD for the engineering and quick defensive programming for the science. As Jes noted, the science part is too exploratory for TDD; it's just not well suited to it. However, you will thank yourself if you write tests for the engineering part, such as feature engineering.
@orianabaldizan8209 5 years ago
Please don't use "asserts" in production code. Try catch and properly handle exceptions.
@user-nm6ns2cf6o 3 years ago
Could you explain the pros and cons of that approach?
@yangyu7309 3 years ago
@user-nm6ns2cf6o Not the OP, but if you run `python -O script.py`, Python skips all assert statements. I believe some CI software runs Python in -O mode; -O stands for optimize mode. Asserts can also be disabled by setting the PYTHONOPTIMIZE environment variable. Because `assert` can be turned off globally, using it outside of tests is not recommended. With `assert` you also can't choose the exception type, since it always raises AssertionError. stackoverflow.com/questions/40182944/difference-between-raise-try-and-assert
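The difference described in this thread can be sketched in a few lines. The function names and the `InvalidFractionError` class below are hypothetical, chosen only to contrast a check that disappears under `python -O` with one that always runs and raises a specific exception type.

```python
class InvalidFractionError(ValueError):
    """Hypothetical domain-specific error for this illustration."""


def validate_with_assert(fraction: float) -> float:
    # This check is removed when Python runs with -O or PYTHONOPTIMIZE set,
    # and it can only ever raise AssertionError.
    assert 0.0 <= fraction <= 1.0, "fraction must be in [0, 1]"
    return fraction


def validate_with_raise(fraction: float) -> float:
    # This check always runs, and callers can catch a specific error type.
    if not 0.0 <= fraction <= 1.0:
        raise InvalidFractionError(
            f"fraction must be in [0, 1], got {fraction!r}"
        )
    return fraction


if __name__ == "__main__":
    try:
        validate_with_raise(1.5)
    except InvalidFractionError as err:
        print(f"handled: {err}")
```

Inside test files, plain `assert` remains the usual style with `pytest`; the concern raised above applies to validation in production code paths.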