18 Months of Pgvector Learnings in 47 Minutes (Tutorial)

Views: 6,154

Days ago

Comments: 22

@NatColley-t4z A month ago

Excellent, excellent, excellent. It does even more than I had merely hoped for. Forgive me, postgres, I should never have doubted you.

@theointechs A month ago

Massively underrated video! So much valuable info, thank you so much!

@TimescaleDB 24 days ago

Thanks! Glad you found it helpful.

@gauthamvijayan 18 days ago

With this single video, I was able to understand what I need to become an AI Engineer by leveraging PostgreSQL extensions and vector databases, and then to consume them in my React/React Native applications. Thanks a ton for making these videos. The instructor needs a raise for putting everything together so well.

@TimescaleDB 16 days ago

That's awesome - thanks for sharing! Glad we could help.

@dbanswan 3 months ago

Amazing video, learnt a lot. Will make time to read the Timescale blog regularly.

@TimescaleDB 3 months ago

Thanks! Much appreciated.

@BruntPixels1234 3 months ago

You should do more tutorials like these

@TimescaleDB 3 months ago

What additional topics would you like to see? Let us know and we can make it happen.

@renobodyrenobody A month ago

Well, after trying the whole thing, I think the caveat here is that pgai depends on OpenAI: it is not local, you have to pay for the tokens, your data leaves your machine, and it is a black box. So I found another way: coding some functions locally in Postgres to use Ollama with local models, with no privacy or data leak and no token cost. This is what I understood, but I am a rookie.

@TimescaleDB 24 days ago

pgai supports Ollama, so you can use local models for greater privacy and lower cost. Check out the GitHub repo for more. The example in the video uses OpenAI, but pgai also supports Ollama, Cohere, and Anthropic models.

@renobodyrenobody A month ago

Mmm... Old-school engineer here, spent more than 30 years with DB systems. And now I understand I don't want a black-box RAG system; I want to implement AI stuff with PG! For me there is one small thing you could do better: add some examples of retrieving data without and with the vectors, especially when there is a WHERE clause. Other than that, your video is a big source of inspiration. Thanks a lot.

@TimescaleDB 24 days ago

Thanks for the feedback!
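
The comparison requested above (retrieving data without and with vectors, including a WHERE clause) might look roughly like the sketch below. It is not taken from the video: the `documents` table, its `category`, `contents`, and `embedding` columns, and both helper functions are hypothetical. The `<=>` operator is pgvector's cosine-distance operator, and the `%s` placeholders assume a psycopg-style driver. The block only builds the SQL text and parameters and does not connect to a database.

```python
# Hypothetical schema: documents(id, category, contents, embedding vector).

# 1. Plain relational retrieval: a WHERE clause, no vectors involved.
PLAIN_FILTER_SQL = """
SELECT id, contents
FROM documents
WHERE category = %s
LIMIT %s;
"""

# 2. Pure vector search: rank every row by cosine distance to the query.
VECTOR_SEARCH_SQL = """
SELECT id, contents, embedding <=> %s::vector AS distance
FROM documents
ORDER BY distance
LIMIT %s;
"""

# 3. Vector search combined with a WHERE clause: filter, then rank.
FILTERED_VECTOR_SEARCH_SQL = """
SELECT id, contents, embedding <=> %s::vector AS distance
FROM documents
WHERE category = %s
ORDER BY distance
LIMIT %s;
"""


def to_vector_literal(values):
    """Format a sequence of numbers as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def filtered_search_params(query_embedding, category, limit=5):
    """Return parameters in the order FILTERED_VECTOR_SEARCH_SQL expects."""
    return (to_vector_literal(query_embedding), category, limit)
```

With a psycopg cursor, the third query would be run as `cur.execute(FILTERED_VECTOR_SEARCH_SQL, filtered_search_params(embedding, "faq"))`, where `embedding` comes from whatever embedding model was used to populate the table.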

@ram8849 2 months ago

Hi, your presentation gave me a clear idea of vector DBs (I am new to them). May I ask a question about the example at 18:03? If I understand what's going on correctly, you encode every row's columns (or combinations of them) into the vector data type, then do the same to the verbose text query using the model text-embedding-3-small, so you can compare them with cosine similarity and return the top results. We can then take the data in those columns and send it WITH the original verbose text query to the LLM as usual. 1. Is this the idea behind RAG? 2. If so, since what is stored in the rows/columns is raw data (string, int, whatever), and a date, for example, could be a password expiry date, a member-since date, a birthday, or anything else, should we embed the original data directly, or turn it into verbose text BEFORE encoding it to a vector in order to get better results? Or does it depend on the situation? For example, with the columns sex, age, verbose description, embedded(verbose description), embedded([sex, age]), a row might be: m, 18, "this is a man in age 18", [0.01, .......], [.........val in vector]
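
The flow described in this question (turn each row into verbose text, embed it, embed the query, rank by cosine similarity, then send the top rows together with the query to the LLM) can be sketched as follows. This is only an illustration under stated assumptions: `toy_embed` is a bag-of-words stand-in for a real embedding model such as text-embedding-3-small, and every function name and row field here is hypothetical.

```python
import math
from collections import Counter


def row_to_sentence(row):
    """Turn a structured row into a short sentence before embedding it."""
    return f"This is a {row['sex']} aged {row['age']}. {row['description']}"


def toy_embed(text):
    """Stand-in for a real embedding model: a sparse bag-of-words vector."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    return Counter(words)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=1):
    """Return the k rows whose verbose form is most similar to the query."""
    query_vec = toy_embed(query)
    scored = [
        (cosine_similarity(query_vec, toy_embed(row_to_sentence(row))), row)
        for row in rows
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:k]]


def build_prompt(query, rows):
    """Combine the retrieved rows with the original query for the LLM."""
    context = "\n".join(row_to_sentence(row) for row in rows)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The sketch embeds the verbose sentence rather than the raw values, which is one of the two options the question raises. Whether that beats embedding raw values directly depends on the data and the embedding model, so it is worth measuring on real queries.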

@awakenwithoutcoffee 2 months ago

Lovely as usual. There is indeed a lot to learn, but we're getting closer :) Bless you. P.S. Regarding storing structured and unstructured data in the same table: are you using the technique of storing complete structured tables inside a JSONB? We thought about this approach but dropped it in favor of separating structured from unstructured data, to prevent mismatching and allow for better isolation/scaling. We are still experimenting, but currently our setup creates one table per structured document and infers the schema dynamically on upload, plus the embedding. Then the agent decides at run time which tables to query. Unstructured documents can be bundled together more easily, but could placing all document types together give false-positive search results?

@afaha2214 2 months ago

What is the Postgres SQL client being used? It looks like Supabase.

@jroy3427 2 months ago

PopSQL. It was acquired by Timescale a few months ago.

@TimescaleDB 24 days ago

It's PopSQL

@orenmizr A month ago

Give me more videos like this, please :)

@renobodyrenobody A month ago

Also, where does StreamingDiskANN come from? It seems only IVFFlat and HNSW are available, but not diskann: SQL Error [42704]: ERROR: access method "diskann" does not exist! Ha, got it: pgvectorscale!

@TimescaleDB 24 days ago

Correct, install pgvectorscale and you can access the StreamingDiskANN index.
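
For anyone hitting the same "access method does not exist" error, the fix described above might look like the sketch below. The statements follow the form recalled from the pgvectorscale README and should be checked against the installed version. The `documents` table, the `embedding` column, the index naming scheme, and the `diskann_setup_statements` helper are all hypothetical, and the block only builds the SQL text.

```python
def diskann_setup_statements(table="documents", column="embedding"):
    """Build the SQL that enables pgvectorscale and creates a diskann index.

    Table and column names are illustrative defaults, not from the video.
    """
    index_name = f"{table}_{column}_diskann_idx"
    return [
        # CASCADE also installs pgvector if it is not present yet.
        "CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE;",
        # The diskann access method only exists once vectorscale is installed.
        f"CREATE INDEX {index_name} ON {table} "
        f"USING diskann ({column} vector_cosine_ops);",
    ]


if __name__ == "__main__":
    for statement in diskann_setup_statements():
        print(statement)
```

Running the first statement before the second is the important part: the error quoted above is what Postgres reports when the index is created before the extension exists in that database.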

@SageRap 3 months ago

Appreciate the video. Just FYI, you're pronouncing the word "build" like "bulled" throughout the video, but most native speakers pronounce it like "billed"