Very cool lib indeed! Thanks for all your videos, Mark. There's always something to learn from them.
@learndatawithmark 3 months ago
Awesome - glad they're helpful :)
@willTryAgainTmrw 3 months ago
Great! Also loved your parquet series when I watched it probably a year back. Subbed.
@learndatawithmark 3 months ago
Good to have you back and thanks for the kind words!
@Kk-ed1gr 3 months ago
absolutely useful tip
@aryamasingh3413 3 months ago
I didn't understand how you performed the document/paragraph chunking. Is each line a "chunk"? Also, when you create the embeddings for each chunk, where are you storing them? In a vector database?
@learndatawithmark 3 months ago
Yeah, each line is a chunk. In that dataset (extracted from a BBC news article) they had one-sentence paragraphs, so I just made each sentence a chunk.
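For anyone wanting to try the line-per-chunk idea, here is a minimal sketch, assuming the article is plain text with one sentence per line. The `chunk_by_line` helper and the sample text are hypothetical illustrations, not the code from the video.

```python
def chunk_by_line(text: str) -> list[str]:
    """Treat every non-empty line as one chunk (hypothetical helper)."""
    # Strip surrounding whitespace and skip blank lines between paragraphs.
    return [line.strip() for line in text.splitlines() if line.strip()]


# Made-up sample text standing in for an article with one-sentence paragraphs.
article = (
    "First sentence of the article.\n"
    "\n"
    "Second sentence, in its own paragraph.\n"
    "Third sentence.\n"
)

chunks = chunk_by_line(article)
for index, chunk in enumerate(chunks):
    print(index, chunk)
```

This only works so simply because each paragraph is a single sentence; longer paragraphs would probably need a sentence splitter or a size-based chunker instead.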