Very well presented. Many thanks for presenting this, especially the section on testing dataframes, which was quite clear and succinct.
@lalligood 5 years ago
This talk had me pumped when I got back to work after watching Jes Ford demonstrate the hypothesis library to fabricate a test pandas dataframe. I also think that getting some testing practices into folks doing data science is long overdue. What a fantastic presentation!
@mailsiraj 3 years ago
Fantastic and very practical presentation. I really loved the division of work into 3 buckets (one-off, exploratory, and defined work) and applying slightly different testing strategies to each of them, rather than being pedantic about testing. I learnt a number of useful ideas to improve my pandas testing. I am gonna check out the Hypothesis library.
@Nino234mff 5 years ago
An excellent talk! My wording differs from Jes's, but I think she would agree with me: in many situations data scientists do two jobs, science and engineering. I myself practice TDD for the engineering and quick defensive programming for the science. As Jes noted, the science part is too exploratory for TDD; it's just not well suited to it. However, you will thank yourself if you write tests for the engineering part, such as feature engineering.
@orianabaldizan8209 5 years ago
Please don't use `assert` in production code. Use try/except and handle exceptions properly.
@АртёмСелезнёв-ь7л 4 years ago
Could you explain the pros and cons of that approach?
@yangyu7309 4 years ago
@АртёмСелезнёв-ь7л Not OP, but if you run `python -O script.py`, Python skips all assert statements. I believe some CI tools run Python in -O mode. -O stands for optimize mode, and it can also be switched on with the PYTHONOPTIMIZE environment variable, which disables asserts the same way. Because asserts can be turned off globally, using `assert` outside of tests is not recommended. With `assert` you also can't choose the exception type: it always raises AssertionError, with at most an optional message. stackoverflow.com/questions/40182944/difference-between-raise-try-and-assert
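To make the point above concrete, here is a minimal sketch, not taken from the talk. The function names `check_with_assert`, `check_with_raise`, and `assert_survives` are hypothetical and exist only for illustration. It contrasts an assert-based check with an explicit `raise`, and uses a child interpreter to show that `-O` removes the assert.

```python
import subprocess
import sys


def check_with_assert(value):
    # Hypothetical validation helper. This check is compiled away
    # when Python runs with -O or with PYTHONOPTIMIZE set.
    assert value > 0, "value must be positive"
    return value


def check_with_raise(value):
    # Hypothetical validation helper. This check always runs, and
    # the exception type is one the caller can catch specifically.
    if value <= 0:
        raise ValueError("value must be positive")
    return value


def assert_survives(optimize):
    """Run `assert False` in a child interpreter and report whether it fired."""
    cmd = [sys.executable]
    if optimize:
        cmd.append("-O")
    cmd += ["-c", "assert False, 'boom'"]
    # A non-zero exit code means the AssertionError was raised.
    return subprocess.run(cmd, capture_output=True).returncode != 0
```

Calling `assert_survives(True)` should report that the assert did not fire under `-O`, whereas `check_with_raise(-1)` raises `ValueError` regardless of interpreter flags.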