Building a Semantic Search Engine on Hugging Face | Future of Data & AI | Data Science Dojo

6,250 views

Data Science Dojo



Comments: 6
@Datasciencedojo a year ago
Hello everyone, you can find all the remaining tutorials here: kzbin.info/aero/PL8eNk_zTBST-Yjhzsvnq075UntUkhMLcz
@anapaunovic8405 a year ago
Is there a link for this notebook?
@brijboda a year ago
Hello Hamza, thank you for the video. I am not from a computer science background, but I could still follow this explanation. Could you please elaborate on how to create a database for the search engine? You combined all the comments, and the query then goes into this database for all further processing. Also, how did you apply the embedding step, and how were all the reviews converted into a dataframe?
@Datasciencedojo a year ago
Hello Brij, here is an overview of how a database can be created for a search engine, as well as how to apply embeddings and convert reviews into a dataframe.

1. Creating a database: To create a database for a search engine, you first need to determine what data you want to include. This may include product information, customer reviews, ratings, and other relevant data. Once you have decided, you can create a database schema that defines the structure of the database: its tables, fields, and the relationships between them. Next, you can use a database management system (DBMS) such as MySQL, PostgreSQL, or MongoDB to create the database and populate it with the data. In a relational DBMS this is done with SQL INSERT statements; in a document store such as MongoDB it is done with that system's own insert operations.

2. Applying embeddings: To apply embeddings to the reviews, you first need to choose an embedding technique such as Word2Vec or GloVe. These techniques map words to vectors in a high-dimensional space, which can capture semantic relationships between words. Once you have chosen a technique, you can use a pre-trained embedding model or train your own model on a corpus of text data. To embed a review, you tokenize the text into individual words and map each word to its corresponding embedding vector. You can then compute an average or weighted average of those vectors to obtain a single embedding vector that represents the overall content of the review.

3. Converting reviews into a dataframe: To convert reviews into a dataframe, you can use a Python library such as Pandas. First, you retrieve the reviews from the database using SQL queries. You then store the reviews in a Pandas dataframe, with each row representing a single review and each column representing a feature of the review, such as the text, the rating, the date, and the product ID. You can also apply embeddings to the review text and store the resulting embedding vectors as additional columns in the dataframe. This allows you to perform similarity searches or clustering on the reviews based on their semantic content.

Overall, creating a database for a search engine and applying embeddings to the data can be a complex process, but it can help to improve the accuracy and relevance of search results.
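The three steps in the reply above can be sketched end to end in Python. This is a minimal illustration under stated assumptions, not the notebook from the video: the table layout, the sample reviews, the three-dimensional `TOY_VECTORS`, and the helper names `embed`, `cosine`, `build_database`, and `search` are all hypothetical. It uses the standard-library `sqlite3` module in place of a full DBMS and plain Python lists in place of a Pandas dataframe; with Pandas installed, `pandas.read_sql_query` could load the same rows into a dataframe instead.

```python
import math
import sqlite3

# Hypothetical toy word vectors (assumption for illustration only).
# A real system would load pre-trained embeddings such as Word2Vec or GloVe.
TOY_VECTORS = {
    "great": [0.9, 0.1, 0.0],
    "good": [0.8, 0.2, 0.0],
    "bad": [-0.8, 0.1, 0.0],
    "terrible": [-0.9, 0.2, 0.0],
    "battery": [0.0, 0.1, 0.9],
    "screen": [0.0, 0.9, 0.1],
}
DIM = 3


def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def build_database(reviews):
    """Step 1: create an in-memory database and insert the reviews."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reviews ("
        "id INTEGER PRIMARY KEY, product_id TEXT, rating INTEGER, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO reviews (product_id, rating, text) VALUES (?, ?, ?)",
        reviews,
    )
    conn.commit()
    return conn


def search(conn, query, top_k=2):
    """Steps 2 and 3: fetch reviews, embed them, rank by similarity to the query."""
    query_vec = embed(query)
    rows = conn.execute("SELECT id, text FROM reviews").fetchall()
    scored = [(cosine(query_vec, embed(text)), text) for _, text in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


SAMPLE_REVIEWS = [
    ("p1", 5, "great battery"),
    ("p1", 1, "terrible screen"),
    ("p2", 4, "good screen"),
]

if __name__ == "__main__":
    connection = build_database(SAMPLE_REVIEWS)
    for score, text in search(connection, "good battery"):
        print(f"{score:.3f}  {text}")
```

Averaging word vectors is the simplest way to get one vector per review, and it ignores word order. A sentence-level model from a library such as Hugging Face would replace only the `embed` function here; the database and ranking steps would stay the same.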
@apollosbangalu3717 a year ago
Hello, thank you for the video. I would like to learn more about semantic search.
@Datasciencedojo a year ago
Hello Apollo, your feedback is noted. We will try to host some more live sessions on this topic. Stay tuned with us!