Choosing Indexes for Similarity Search (Faiss in Python)

21,416 views

James Briggs

A day ago

Comments: 18
@Nick-vs1zp 2 years ago
Great explanations, especially for IVF - it's probably the best explanation for how it works that I've seen.
@jamesbriggs 2 years ago
Thanks, Nick!
@narayansharma8797 3 years ago
Thanks a bunch for this, James! It would be really great to see a couple of them explored in depth. Also, if you could benchmark FAISS against ScaNN, it would help a few of us noobs a hell of a lot. Great content! Lovely command of your material. Really need more of this.
@jamesbriggs 3 years ago
Hey Narayan, there is a video released already covering the 'traditional' version of LSH, and two more videos that will be released at 1200 ET today on the random projection version of LSH (used in Faiss) - and there are plenty more of these on the way ;) I love the FAISS vs ScaNN idea too, will be working on it soon!
@narayansharma8797 3 years ago
@@jamesbriggs Sold!
@harshitjaitly6850 3 years ago
Super Informative Content! Thank you so much for this.
@katehan9623 2 years ago
Thank you for your video. Most valuable channel. Do you use a GPU for indexing in these projects?
@ChrisZuo 3 months ago
Thank you! The drawings are cute!
@grayrigel7091 2 years ago
Hi James. Thanks for such a wonderful tutorial. Really useful. A quick question: for a new query vector, is it possible to return the IVF cell/partition that it belongs to, instead of returning the neighbors? I think I can measure the distances to the centroids and return the closest centroid. However, I was wondering whether there is a built-in way.
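The manual approach described in this question (measure the distance to every centroid and keep the closest) can be sketched in plain Python. This is only an illustration, not Faiss code: the function name `assign_cell` and the toy centroids are assumptions made for the example. In Faiss itself the trained coarse quantizer is exposed as `index.quantizer`, and searching it with the query vector is one way to get the nearest centroid id.

```python
import math

def assign_cell(query, centroids):
    """Return (cell_id, distance) for the centroid closest to `query` under L2."""
    best_id, best_dist = -1, math.inf
    for cell_id, centroid in enumerate(centroids):
        dist = math.dist(query, centroid)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_id, best_dist = cell_id, dist
    return best_id, best_dist

# Toy centroids standing in for the ones an IVF index learns during training.
centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
cell, dist = assign_cell((9.0, 1.0), centroids)
```

Here the query (9, 1) lands in cell 1, the partition whose centroid is (10, 0).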
@mohammadyahya78 1 year ago
Does the IVF algorithm work with high-dimensional data, e.g. around 100 dimensions?
@haneulkim4902 1 year ago
Thanks for the amazing video! Do you know why simple k-means is not used for these MIPS problems?
@itheenigma 3 years ago
Super useful! Thanks for this video James. For IVF, can we retrieve the clusters that each datapoint belongs to after training (also cluster centroids)?
@jamesbriggs 3 years ago
Yes you can, there is info on it here: gist.github.com/mdouze/904e0b538ef7767c9e83a45ac1b57d1b
The code you need to write (after training and adding your data to 'index') is:

    import faiss

    invlists = index.invlists
    all_ids = []
    for l in range(index.nlist):
        ls = invlists.list_size(l)
        if ls == 0:
            continue  # skip empty cells
        # copy the ids stored in inverted list l into a NumPy array
        all_ids.append(
            faiss.rev_swig_ptr(invlists.get_ids(l), ls).copy()
        )
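Once the ids stored in each inverted list are available, getting the cluster of each datapoint is a matter of inverting that mapping. A minimal sketch in plain Python follows; it assumes the ids have been gathered into a dictionary keyed by list number (the name `cell_to_ids` and the toy values are hypothetical, not part of Faiss). Keeping the list number as the key matters because empty lists are skipped in the loop above, so a position in `all_ids` does not necessarily match the list number.

```python
def invert_lists(cell_to_ids):
    """Map each vector id to the IVF cell (inverted list) that stores it."""
    id_to_cell = {}
    for cell, ids in cell_to_ids.items():
        for vector_id in ids:
            id_to_cell[vector_id] = cell
    return id_to_cell

# Hypothetical contents of three inverted lists; cell 1 is empty.
cell_to_ids = {0: [4, 7], 1: [], 2: [0, 1, 9]}
id_to_cell = invert_lists(cell_to_ids)
```

Looking up `id_to_cell[7]` then gives the cell that vector 7 was assigned to when it was added.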
@itheenigma 3 years ago
@@jamesbriggs legend. Will give it a go. Ta!
@viorelteodorescu 1 year ago
What does IP stand for?
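In Faiss naming, IP stands for inner product, as in `IndexFlatIP`; MIPS, mentioned in an earlier comment, is maximum inner product search. The sketch below is a plain-Python illustration with made-up toy vectors, not Faiss code. It shows that ranking by inner product can pick a different vector than ranking by L2 distance, because the inner product also rewards vector length.

```python
def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Toy data: both vectors point the same way, but the second is longer.
database = [(1.0, 0.0), (3.0, 0.0)]
query = (1.0, 0.0)

# L2 search keeps the smallest distance; IP search keeps the largest score.
nearest_l2 = min(range(len(database)), key=lambda i: squared_l2(query, database[i]))
best_ip = max(range(len(database)), key=lambda i: inner_product(query, database[i]))
```

With these values `nearest_l2` is 0 (the identical vector) while `best_ip` is 1 (the longer vector). The two metrics agree on the ranking when all vectors are normalized to unit length.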
@nareshsandrugu6057 2 years ago
Could you share a video on this? Assume I have binary train and test data and need to calculate the Hamming distance. I couldn't find any videos that do this with Faiss; if you could share one, it would be very helpful.
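For the binary case raised here, the Hamming distance is the number of bit positions where two vectors differ. The sketch below is a brute-force illustration in plain Python, assuming the vectors are packed into `bytes` (8 dimensions per byte); the helper names `hamming` and `knn_hamming` are hypothetical. Faiss provides binary indexes such as `IndexBinaryFlat` for this kind of data, which this sketch does not use.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length packed binary vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def knn_hamming(query: bytes, database, k: int):
    """Return the k closest database entries as (distance, index) pairs."""
    scored = sorted((hamming(query, vec), i) for i, vec in enumerate(database))
    return scored[:k]

# Toy 8-bit "train" vectors and one "test" query.
database = [bytes([0b11110000]), bytes([0b11111111]), bytes([0b00001111])]
query = bytes([0b11110001])
neighbors = knn_hamming(query, database, k=2)
```

The query differs from the first vector in 1 bit, from the second in 3 bits, and from the third in 7 bits, so the two nearest neighbors are entries 0 and 1.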
@mohammadyahya78 1 year ago
What is nbits at 10:21, please?
@basedscienxe6632 20 days ago
I believe this is the number of bits used for the precision of each component in the vector.