Automate like a pro - join our AI automation community 👉 www.theaiautomators.com/?C1
And check out this video to learn how I compiled the spreadsheet to load into the RAG system - kzbin.info/www/bejne/d2LanqWmlqicjaM
@AIEmployeesWithCosmo 17 days ago
For the JSON, a great way in Make is to use the Create JSON module and then use the JSON variable as input for the API call. That module makes the variable JSON-safe :)
@TheAIAutomators 17 days ago
Thanks for that ... Yes, that solution would have worked here. I've created other Make automations where the structure of the input JSON is dynamic, so the Create JSON module doesn't really work as a solution. That's where this hack came from, and it's now my go-to! Make really needs to come up with a "MAKE_JSON_SAFE" function. It's ridiculous that you need to formally create a "data structure" simply to escape some characters before passing them into a web service call!
@AIEmployeesWithCosmo 17 days ago
@@TheAIAutomators Yeah, fully agree! When it's dynamic you have a problem. This seems like an obvious standard tool that Make should provide :) Love how you have a solution for dynamic JSON as well. Btw, also learning lots from you guys! You have great content. Keep up the good work.
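For anyone who wants to see the escaping idea outside Make, here is a minimal Python sketch of what a "JSON-safe" step could look like. It is not the Make formula from the video; `make_json_safe` is a hypothetical helper name, and the payload shape (`"input"`) is an assumption chosen for illustration.

```python
import json


def make_json_safe(text: str) -> str:
    """Escape text so it can sit between double quotes in a JSON body."""
    # json.dumps returns the escaped string wrapped in quotes;
    # slicing off the first and last character keeps only the escaped body.
    return json.dumps(text)[1:-1]


# Text with quotes, a newline and a backslash, which would break a naive template.
raw = 'He said "hi"\nand left a back\\slash'

# Hypothetical request body built by string templating, as in a dynamic payload.
body = '{"input": "' + make_json_safe(raw) + '"}'

# The body now parses, and the original text survives the round trip.
parsed = json.loads(body)
```

When the whole structure is known up front, serializing the full object with `json.dumps` is simpler; escaping a single value like this is mainly useful when the surrounding JSON is assembled dynamically.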
@mysound777 1 month ago
Insanely valuable. Liked and subscribed. THANK YOU
@TheAIAutomators 1 month ago
You're very welcome!
@amzpro5734 1 month ago
That safe JSON string cleanup variable is badass 🔥🤘
@TheAIAutomators 1 month ago
Ha! Thanks - Make seriously needs to build a native function for this
@amzpro5734 1 month ago
@@TheAIAutomators Yes totally 👍 Your code fixed a JSON bug I had with the Gemini API too 🙏 One question: What if the website also had 500 blog articles? Would this data be upserted into the same Pinecone DB as the product info? E.g. would all data to do with website X (product page data, articles, company pages, how-to guides etc.) go into one central Pinecone DB so it could then be efficiently queried by the AI model?
@TheAIAutomators 1 month ago
Yeah, that's a good question. It really depends on your use case, so you'd need to do some testing and trial and error with it. Having everything in a single vector store could result in no products being returned for queries where you only want products, for example. So it may be better to have the different entities in separate namespaces and have logic around which vector store to call and when (similar to how AI Agents work with function calling). There's no one-size-fits-all; it's specific to requirements really.
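As a rough illustration of "logic around which vector store to call", here is a minimal Python sketch that routes a query to a namespace by keyword. Everything in it is an assumption: the namespace names, the keyword lists, and the `pick_namespace` helper are hypothetical, and a real setup would more likely let an LLM choose via function calling, as mentioned above.

```python
# Hypothetical namespace names mapped to trigger keywords (assumptions for illustration).
ROUTES = {
    "products": ("price", "buy", "stock", "product"),
    "articles": ("how to", "guide", "blog", "article"),
}


def pick_namespace(query: str, default: str = "products") -> str:
    """Return the first namespace whose keywords appear in the query."""
    lowered = query.lower()
    for namespace, keywords in ROUTES.items():
        if any(keyword in lowered for keyword in keywords):
            return namespace
    # No keyword matched, so fall back to the default namespace.
    return default


# The chosen namespace would then be passed to the vector store query.
chosen = pick_namespace("Any guide on descaling?")
```

Keyword matching is brittle, so this is only a starting point for testing which routing behaviour suits a given site.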
@juanpasalagua2402 2 months ago
Great job! Thanks
@TheAIAutomators 2 months ago
You're welcome!
@StephenGPope 8 days ago
Wonderful video, only 1/4 of the way through.
@TheAIAutomators 8 days ago
Thanks Stephen, appreciate it! Love your work
@RickKristalijn-AI 2 months ago
Incredible video!! Would love to see how to set up the ranking system.
@TheAIAutomators 2 months ago
Thanks! For reranking, there are a few options. Pinecone has a reranking endpoint that you can pass the query and all the retrieved documents to, which uses an open-source reranker model: www.pinecone.io/learn/series/rag/rerankers/ and huggingface.co/BAAI/bge-reranker-v2-m3 Another option is to use a foundation model (GPT-4o, for example): pass it the query and a collection of documents and ask it to provide an ordered scoring that you can then filter by. The approach will likely depend on the length of text in each document.
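To make the "ordered scoring that you can then filter by" step concrete, here is a minimal Python sketch of the post-processing only. It assumes the model has already been prompted to reply with a JSON array of objects; the field names `id` and `score`, the 0.5 threshold, and the `rerank` helper are all assumptions for illustration, and no model is actually called.

```python
import json


def rerank(llm_json: str, min_score: float = 0.5) -> list:
    """Parse scored items, drop low scores, and sort best-first."""
    items = json.loads(llm_json)
    # Filter step: keep only items the model scored at or above the threshold.
    kept = [item for item in items if item["score"] >= min_score]
    # Rerank step: highest score first.
    return sorted(kept, key=lambda item: item["score"], reverse=True)


# Stand-in for a model reply; a real reply would come from the LLM call.
sample_reply = (
    '[{"id": "a", "score": 0.2},'
    ' {"id": "b", "score": 0.9},'
    ' {"id": "c", "score": 0.6}]'
)
ranked = rerank(sample_reply)
```

In practice the reply should be validated before use, since a model can return malformed JSON or omit a field.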
@RickKristalijn-AI 2 months ago
@@TheAIAutomators Thanks a lot!!
@davidlorincz7339 2 months ago
Nice one. Thx.
@agsvk-com 1 month ago
Hi, Thank you so much! Very thorough. Will sign up later as a member. Can you show how to implement the reranking based on another model @28:00? Also, which model is best for reranking? I was watching other local storage vector solutions. Are they any good for accuracy of results returned compared to Pinecone or other vector databases?
@TheAIAutomators 1 month ago
You're very welcome and we'd love to welcome you to the community!
On the re-ranking question, it really depends on your use case. There are dedicated lightweight reranker models both in Pinecone and on Hugging Face if your use case requires very high volumes of queries. These can be fine-tuned if it's a specialized subject matter.
If your use case is lower volume (like in this video), you could use an LLM to carry out the reranking. I've done a version of this in one of our micro templates in the community where I build an ecommerce chatbot on Botpress. You can see it around the 36-minute mark here - kzbin.info/www/bejne/ZmOqqYCuYruhirM?si=7-EWTjCutPYQD02e In it I ask the LLM to review the 15 items returned from Pinecone and filter out irrelevant results (so it's more of a filter than a reranker). I also do a version of it in our automated news system where, again, I ask it to provide a score based on an input and a context, and if that score is over a certain threshold, the automation progresses.
You can achieve a lot via prompt engineering and trial and error here, either in a single shot or a multi-stage pipeline of prompts. If you were getting back 15 results, for example, you could feed those in all together and get the LLM to use chain of thought, based on a detailed brief, to provide the rationale for the relevance of each item, then provide a score and output an array of objects in JSON. Then you can reorder based on those scores. Or if there's too much data, you can send them in smaller batches or one at a time and aggregate and rerank after.
With this approach you could use any LLM really. LLMs are more sophisticated than vector search, which is usually done using the likes of cosine similarity and distances between vectors. So with the right prompting and a bit of patience, you should get some quality outputs. The LLM approach to reranking can slow down the output, which was a problem in the ecommerce chatbot example above, so speed and cost are considerations as well.
On the question about local vector solutions vs Pinecone and whether they're any good in terms of accuracy of results: in theory, if they use the same algorithms, like cosine similarity, then they should be exactly the same. The variable is which embeddings model you use.
Hope some of that helps,
Daniel
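For reference, cosine similarity itself is a short formula, which is why two stores that both compute it exactly on the same embeddings should agree. A minimal pure-Python sketch follows; the toy vectors are made up for illustration, and real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors: the second points the same way as the first, the third is orthogonal.
query = [1.0, 2.0, 3.0]
same_direction = [2.0, 4.0, 6.0]
orthogonal = [3.0, 0.0, -1.0]

score_same = cosine_similarity(query, same_direction)
score_orthogonal = cosine_similarity(query, orthogonal)
```

Because the score depends only on direction, not length, the choice of embeddings model is what changes the results.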
@jackmckague 2 months ago
Amazing video mate. I would love to know what you used to create the spreadsheet from the website, with all the information regarding the products, the prices and the various pages, shown at the beginning of the database.
@TheAIAutomators 2 months ago
Thanks Jack! I thought I got away with it! Turns out creating the spreadsheet is almost more work than doing the automation itself. So to keep the video around the 30-minute mark, I just left it out completely! I have a video on how I created this spreadsheet which you can watch below (from minute 11 to 22) kzbin.info/www/bejne/ZmOqqYCuYruhirM In a nutshell, it's a mixture of data exports from the ecommerce store, along with building a custom crawler to extract the data, and then VLOOKUPs between the different data sources. Anyway, not straightforward!
@jackmckague 2 months ago
@@TheAIAutomators fair on!!😂😂
@TheAIAutomators 10 days ago
Hey Jack - just published a video on how I created that spreadsheet - kzbin.info/www/bejne/d2LanqWmlqicjaM
@aimakerslab 1 month ago
What are the options for doing this in a secure way for a client that needs to comply with security measures? Like health tech and so on.
@TheAIAutomators 1 month ago
Great question! Running everything locally or in a private cloud would be one way to go about it. FAISS is a local vector store that you could use instead of Pinecone. You can run local embeddings models and LLMs on Ollama or LM Studio, for example. Possibly tie it all together with a local n8n instance. It's a more hands-on setup than what I showed in the video, but a lot of larger companies do it. Another approach is the anonymisation of personally identifiable information before creating the embeddings or sending to the LLM. Then you may be able to use the likes of OpenAI on Azure cloud, which supports stricter data security requirements.
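As a very rough sketch of the anonymisation step, the Python below masks email addresses and phone-like numbers before text would be embedded or sent to an LLM. The regular expressions and the `redact` helper are assumptions for illustration only; they will miss names, addresses and many other identifiers, so a regulated setting such as health tech would need a proper PII detection tool and a compliance review.

```python
import re

# Simplified patterns (assumptions): real-world PII detection needs far more than this.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


# Example input with one email address and one phone number.
note = "Contact jane.doe@example.com or +1 555 123 4567."
clean = redact(note)
```

The redacted text, not the original, is what would go on to the embeddings model.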
@cristobalgomezgutierrez8546 1 month ago
28:00 Hi, thanks!! Could you please expand on the re-ranking phase? No developer here... and thanks! God bless the reader
@TheAIAutomators 1 month ago
Hi! I just left a comment above for another viewer where I talk through some options on how to build the reranker. Hope that's helpful