Jiaqi Liu - Fuzzy Search Algorithms: How and When to Use Them - PyCon 2017

PyCon 2017

Speaker: Jiaqi Liu
Fuzzy searching, or approximate string matching, is powerful because text data is often messy. For example, shorthand and abbreviated text are common in many data sets, and output from OCR or voice-to-text conversion tends to be imperfect. We therefore want to make the most of our data by extracting as much information from it as possible.
In this talk, we will explore the various approaches to fuzzy string matching and demonstrate how they can be used as a feature in a model or as a component in your Python code. We will dive deep into algorithms such as Soundex, trigram/n-gram search, and Levenshtein distance, and discuss the best use cases for each. We will also discuss situations where it is important to take into account the meaning or intent of a word, and demonstrate approaches for measuring semantic similarity using NLTK and word2vec. Furthermore, we will demonstrate via live coding how to implement some of these fuzzy search algorithms using Python and/or the built-in fuzzy search functions in PostgreSQL.
Slides can be found at: speakerdeck.com/pycon2017 and github.com/PyCon/2017-slides
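
The three string-level algorithms named in the description can be sketched in plain Python. This is a minimal illustration written for this page, not code from the talk or the slides: the function names (`levenshtein`, `trigram_similarity`, `soundex`) are hypothetical, the trigram padding is only loosely modeled on what PostgreSQL's pg_trgm does for a single word, and the Soundex variant is the common American one.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def trigrams(text: str) -> set:
    """Set of 3-character windows over the lowercased text.
    Assumption: pad with two leading spaces and one trailing space,
    treating the whole input as a single word."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (Jaccard index)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


# Consonant groups used by American Soundex.
_SOUNDEX_CODES = {
    letter: digit
    for digit, letters in {
        "1": "bfpv", "2": "cgjkqsxz", "3": "dt",
        "4": "l", "5": "mn", "6": "r",
    }.items()
    for letter in letters
}


def soundex(name: str) -> str:
    """Four-character phonetic code: first letter plus three digits."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    previous = _SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        code = _SOUNDEX_CODES.get(c, "")
        if code and code != previous:
            encoded += code
        if c not in "hw":  # h and w do not separate repeated codes
            previous = code
    return (encoded + "000")[:4]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))               # 3
    print(round(trigram_similarity("word", "words"), 3))  # 0.571
    print(soundex("Robert"), soundex("Rupert"))           # R163 R163
```

Roughly speaking, edit distance suits typos and OCR noise, trigram overlap suits ranking many candidates against a query, and Soundex suits names that sound alike but are spelled differently. None of the three captures meaning, which is where the semantic-similarity approaches mentioned in the description come in.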

Comments: 1
@rberry932, 7 years ago
Great talk. Didn't know hardly anything about this subject before.