Hi Greg and Wiz. Great tutorial. I am actually applying it to my own application. I was wondering what you would suggest if the whole document is large, more than 700 pages. It won't fit in the contextual chunking function, and if I take the 500 pages around the chunk, the caching won't work. Can you please advise? Thanks, Saurabh
@AI-MakerspaceКүн бұрын
I would build some metadata in this case, like a summary/outline, and use that to generate contextual augments.
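A minimal sketch of that idea, assuming the OpenAI Python client (any chat-completion endpoint works) and hypothetical helper names: summarize the whole document once, then pass only that summary plus each chunk when generating the contextual prefix, so the prompt stays small no matter how long the source is.

# Sketch: use a document-level summary (built once) instead of the full text
# when generating contextual augments for each chunk. Helper names are
# illustrative, not from the tutorial notebook.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def summarize_document(full_text: str, max_chars: int = 12_000) -> str:
    """Build a reusable outline/summary of the (possibly huge) document."""
    # For 700+ pages you would summarize section by section and merge the
    # partial summaries; here we truncate to keep the sketch short.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": "Summarize this document as a detailed outline:\n\n"
                              + full_text[:max_chars]}],
    )
    return resp.choices[0].message.content

def contextualize_chunk(summary: str, chunk: str) -> str:
    """Generate a short contextual prefix for one chunk using only the summary."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Document outline:\n{summary}\n\n"
                              f"Chunk:\n{chunk}\n\n"
                              "Write 1-2 sentences situating this chunk "
                              "within the overall document."}],
    )
    return resp.choices[0].message.content + "\n\n" + chunk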
@PhiliBuster-i7t3 күн бұрын
I'm super happy about this video; Supernova was on my radar last month.
@MegaClockworkDoc6 күн бұрын
I am trying to understand the unique concepts of this paper. It sounds like this is a workflow of agents and programmatic validators to synthetically generate DPO data. Is the system self-learning as well?
@AI-MakerspaceКүн бұрын
It can be online, yes.
@rakeshkumarrout26296 күн бұрын
Hey, this is quite useful. Can you help me understand how large action models work, for example the recent Claude computer use, OmniParser, or how Rabbit's large action model works? Can you point me to code references or an implementation? Thank you.
@AI-Makerspace6 күн бұрын
We're planning to cover computer use in an upcoming event soon - probably Nov 13. Stay tuned!
@hoopNscoops6 күн бұрын
Great information here! Thanks for making it public. I think you're going to get a sizeable community around you because of these live streams. Q: where in the code is prompt caching invoked?
@AI-MakerspaceКүн бұрын
Caching is offered by Anthropic's endpoint by default - and is being taken advantage of under the hood here.
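If you want to control it explicitly rather than rely on defaults, Anthropic exposes a cache_control marker on prompt blocks; a rough sketch (the model name is illustrative, and older SDK versions needed the prompt-caching beta header):

# Sketch: marking a large, reused prompt prefix as cacheable with Anthropic's
# cache_control parameter. Model name and beta-header requirement depend on
# your SDK/API version, so treat this as illustrative.
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set

BIG_DOCUMENT = open("document.txt").read()

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=512,
    system=[
        {
            "type": "text",
            "text": BIG_DOCUMENT,
            # The prefix up to and including this block is cached and reused
            # across subsequent calls that share the same prefix.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Situate this chunk in the document: ..."}],
    # On older SDKs caching was behind a beta flag; newer versions may not need this.
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
)
print(response.content[0].text)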
@AI-Makerspace7 күн бұрын
Calibrated reward: github.com/huggingface/trl/pull/2155/files
Mixture of judges: github.com/huggingface/trl/pull/2159/files
CGPO Trainer (single task, single objective): github.com/huggingface/trl/pull/2190/files
Event Slides: www.canva.com/design/DAGVDvGDG54/kEflcFEuGxDKMTYb6Rj2vA/view?DAGVDvGDG54&
@MarkDavisRocks7 күн бұрын
At 00:24:15 you give a formula for faithfulness; I think it is slightly flawed. It should be (# claims from the answer which exist in the context) / (# claims in the answer); otherwise the result could be > 1.
@AI-Makerspace5 күн бұрын
Can you be more specific about what the flaw is? Also, why do you choose the word "exist" rather than "inferred from"?
Here's what appears to be true from the documentation: "To calculate this, a set of claims from the generated answer is first identified. Then each of these claims is cross-checked with the given context to determine whether it can be inferred from the given context or not."
Three steps to the calculation:
1. Break the generated answer into statements
2. For each statement, verify whether it can be inferred from the context
3. Calculate faithfulness!
It seems that the condition "if (and only if) it can be inferred from the context" will keep the faithfulness calculation from going higher than 1.0.
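A tiny numeric sketch of why the ratio stays bounded (the claims and verification flags are made up):

# Toy sketch of the faithfulness ratio: supported claims / total claims.
# Because each claim is either supported by the context or not, the score
# can never exceed 1.0.
def faithfulness(claims: list[str], supported: list[bool]) -> float:
    assert len(claims) == len(supported)
    if not claims:
        return 0.0
    return sum(supported) / len(claims)

# e.g. 3 claims extracted from the answer, 2 of them inferable from the context
print(faithfulness(["c1", "c2", "c3"], [True, True, False]))  # 0.666...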
@niting19788 күн бұрын
Great work here Richard and Gil - loved the demo
@richardgower48908 күн бұрын
Love this guys. Great job!
@safiKhan-zl1ri11 күн бұрын
Hi Sir, Do you need a professional youtube thumbnail designer...?
@johnini13 күн бұрын
Hello!! Thank you a lot for the videos! What is the best way to interact with a workflow in a sort of chat engine or chat loop?
@AI-Makerspace13 күн бұрын
Can you expand on your request?
@johnini13 күн бұрын
@@AI-Makerspace Thank you for answering! I'm curious about the best practices for building a chat engine or chatbot that can interact in a continuous loop with a workflow. Currently, we receive one response at a time from the workflow, but I was wondering if we could enhance this by buffering the chat memory and continuing the conversation. Should this be achieved with a loop? I seem to remember a LlamaIndex or LangChain tool that kept the chat engine running, but I might be mistaken; maybe I was just re-querying. Also, how can I ensure other workflows share the same context? Additionally, is it possible to store interactions as vectorized semantic and episodic memories in a vector database, allowing the system to recall past conversations and, in the future, query from those memories as well as the RAG index, and maybe do some type of reranking?
@johnini13 күн бұрын
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from colorama import Fore, Style, init

init(autoreset=True)

def chat():
    print(f"{Fore.CYAN}Loading documents...")
    index = VectorStoreIndex.from_documents(
        SimpleDirectoryReader("./data").load_data()
    )
    chat_engine = index.as_chat_engine()
    print(f"{Fore.GREEN}Ready! Type 'quit' to exit")
    while True:
        query = input(f"{Fore.GREEN}You: {Style.RESET_ALL}").strip()
        if query.lower() == 'quit':
            break
        if query:
            print(f"{Fore.BLUE}Assistant: {Style.RESET_ALL}{chat_engine.chat(query)}")

if __name__ == "__main__":
    try:
        chat()
    except Exception as e:
        print(f"{Fore.RED}Error: {e}")
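One way to keep the conversation going across turns instead of re-querying is an explicit memory buffer; a sketch assuming the same pre-0.10 llama_index layout as the snippet above (import paths and chat modes differ in newer releases):

# Sketch: keep chat history in an explicit memory buffer so follow-up turns,
# or other workflows handed the same buffer, share context.
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=3000)

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("./data").load_data())
chat_engine = index.as_chat_engine(
    chat_mode="context",  # retrieval-augmented chat that reads from the memory buffer
    memory=memory,
)

print(chat_engine.chat("What does the document say about chunking?"))
print(chat_engine.chat("And how does that compare to the second approach?"))  # follow-up uses the buffered history
# The same `memory` object can be passed to another engine or workflow to share context.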
@MegaClockworkDoc13 күн бұрын
Great video, but using text that the model was already trained on is a bad test case
@AI-Makerspace13 күн бұрын
Agreed! We typically stick with easy-to-consume toy examples, however!
@solyarisoftware13 күн бұрын
Hi! I really appreciated your video. BTW, I wrote an article titled "SWARMing Conversational AI: Integrating No-Code and Code in Agent-Based Workflows," which you can find online. I would love to hear your feedback on my perspective (SWARM's emphasis on blending no-code instructions with hardcoded conversational steps). Thanks! Giorgio
Sure thing, just pinned the slides and notebook in a comment!
@enespacalar17 күн бұрын
Congratulations dude
@danielusvyat17 күн бұрын
Great video! I'm excited to dive into contextual retrieval next week. When it comes to productionizing hybrid retrieval with BM25, I'm considering using Elasticsearch; any other recommendations? My main concern with hybrid retrieval is the added complexity it brings to production.
@AI-Makerspace14 күн бұрын
Elasticsearch is a great tool for this!
@cmagganas18 күн бұрын
🚨MERT ALERT
@seanbergman892720 күн бұрын
Great video and demo as always! I learn much from your content. The contextual retrieval paper said that if your corpus is less than 200k tokens, you should just skip RAG, dump the entire corpus into the prompt for every question, and let them cache it (but only for a short time), relying on long-context Q&A. I didn't see them publish any metrics comparing long context to RAG, so I take it with a grain of salt. They do want customers to spend as many tokens as possible... But I'm very intrigued at the same time. Maybe you could do a video comparing the two methods? That would be amazing research.
@AI-Makerspace18 күн бұрын
Great insights and instincts @Sean! We'll keep the content recommendation in mind for sure! This is the farthest we've gotten on long-context and evaluation for big-window LLMs: kzbin.infoBrwhbjh3boU?si=V24z6pagQ0EQ8Ms1
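For anyone trying that comparison, a rough routing heuristic is just a token estimate on the corpus; a sketch (the 4-characters-per-token figure is an approximation, not a real tokenizer):

# Sketch: route small corpora to long-context Q&A and large ones to RAG.
# The 4-chars-per-token estimate is a rough heuristic, not an exact count.
from pathlib import Path

LONG_CONTEXT_BUDGET = 200_000  # tokens, per the contextual retrieval post

def estimate_tokens(text: str) -> int:
    return len(text) // 4

corpus = "\n\n".join(p.read_text() for p in Path("./data").glob("*.txt"))

if estimate_tokens(corpus) <= LONG_CONTEXT_BUDGET:
    print("Small corpus: put it all in the (cached) prompt and do long-context Q&A.")
else:
    print("Large corpus: chunk, index, and retrieve (RAG).")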
@micbab-vg2mu21 күн бұрын
thanks :)
@AI-Makerspace21 күн бұрын
Would the results be even better when combined with semantic chunking? Answer: research.trychroma.com/evaluating-chunking
@AmanBansil21 күн бұрын
RAG-ception 0:55 - Context of the contextually generated chunks. Got it...got..it.......got it....ok wait what? Need to watch the whole thing.
@AI-Makerspace21 күн бұрын
Re: Would the results be even better when combined with semantic chunking? For more on semantic chunking strategies: research.trychroma.com/evaluating-chunking
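For reference, the core of embedding-based semantic chunking is just splitting wherever adjacent-sentence similarity drops; a minimal sketch with sentence-transformers (model choice and threshold are arbitrary, not from the video):

# Sketch of semantic chunking: embed sentences and start a new chunk wherever
# the similarity to the previous sentence falls below a threshold.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_chunks(sentences: list[str], threshold: float = 0.6) -> list[list[str]]:
    if not sentences:
        return []
    embs = model.encode(sentences, normalize_embeddings=True)
    chunks, current = [], [sentences[0]]
    for i in range(1, len(sentences)):
        sim = float(np.dot(embs[i - 1], embs[i]))  # cosine, since embeddings are normalized
        if sim < threshold:
            chunks.append(current)
            current = []
        current.append(sentences[i])
    chunks.append(current)
    return chunks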
Where is the complete video? I'd like to understand the loss function and the Hessian matrix.
@AI-Makerspace24 күн бұрын
Hey Givanildo! The full event is here: kzbin.infoxmaG4al2A6E?si=bdHM0wzlll5XkXWJ To learn more about loss functions, check out this one! kzbin.infoiB8FWR9aD5Q?si=4oABKIf-DDNQQv1R
Thanks for the great video. Subscribed! Question - We saw here that "similar pairs" were trained where the pair implies a (question , context). Is it possible to get good results by fine-tuning on a "similar questions" dataset i.e. (question1, question2) and the difference between those 2 questions is usually one word/phrase. So question1 would have the full-form of an entity; and question2 the acronym of the same entity. Reason I'm doing this is that I'm storing a mix of questions and contexts in my vector Database. If the user's query matches a question then I look up the corresponding answer (static answer that almost never changes - so no LLM required). If the match is a context instead, then LLM generation takes over.
@AI-Makerspace20 күн бұрын
Yes, that is a decent way to approach that problem.
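A minimal sketch of that setup using sentence-transformers' MultipleNegativesRankingLoss (the pairs, model name, and hyperparameters are illustrative, not from the event):

# Sketch: fine-tune an embedding model on (full-form question, acronym question)
# pairs so both phrasings land near each other in the vector space.
# Uses the classic sentence-transformers fit() API; the data is illustrative.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

pairs = [
    ("What is the Net Promoter Score policy?", "What is the NPS policy?"),
    ("How do I reset multi-factor authentication?", "How do I reset MFA?"),
]

model = SentenceTransformer("all-MiniLM-L6-v2")
train_examples = [InputExample(texts=[q1, q2]) for q1, q2 in pairs]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=2)
loss = losses.MultipleNegativesRankingLoss(model)  # other pairs in the batch act as negatives

model.fit(train_objectives=[(train_loader, loss)], epochs=1, warmup_steps=10)
model.save("acronym-aware-embedder")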
@nazmusas28 күн бұрын
How do I get a scholarship?
@AI-Makerspace18 күн бұрын
We don't currently have scholarships available @nazmuss! We are working to get our business model right and to grow our partnerships in the US so we can best serve our community members around the world in the long-term moving forward! In short, stay tuned!
@yerson55729 күн бұрын
Where does ground truth come from? Is this a human-annotated property? I understand the ground truth in RAGAS refers to the correct answer to the question. It's typically used for the context_recall metric. But how do we get this? Human in the loop? LLM-generated? More documents from retrieval? Thank you!
@AI-Makerspace29 күн бұрын
"Ground Truth" can come from any of these sources! Of course, getting it straight from the people who perform whatever tasks you're automating is the right idea, but this can be very expensive. In the case of RAGAS the "Ground Truth" is represented by the output you get when you provide [question, retrieved context] pairs as input to a generator. That is, we are not actually using a RAG system, but passing "correct" [question, context] pairs as input. These are "correct" because they were synthetically generated and are known to be correct; see Synthetic Test Data Generation: docs.ragas.io/en/stable/concepts/testset_generation.html Note that Ground Truth is different than "Answer" because "Answer" actually uses the RAG application that you're building, while "Ground Truth" passes [question, context] pairs in direclty.
@DataScienceandAI-doanngoccuongАй бұрын
On the scale used to rank chunking techniques, semantic chunking and agentic chunking sit at levels 4 and 5. Experiments show that agentic chunking using LLMs gives the best results.
Level 1: Character splitting - simple, static character-based chunks
Level 2: Recursive character text splitting - recursive splitting driven by a list of separators
Level 3: Document-specific splitting - different splitting methods for different document types (PDF, Python, Markdown)
Level 4: Semantic splitting - embedding-based chunking. This technique splits text into chunks based on meaning rather than fixed length.
Level 5: Agentic splitting - the Agentic Chunker automatically groups related propositions into chunks. When a new proposition arrives, the system decides whether to add it to an existing chunk or start a new one.
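As a point of comparison with levels 4 and 5, level 2 is only a few lines; a sketch with LangChain's RecursiveCharacterTextSplitter (the import path differs across LangChain versions):

# Sketch of level-2 (recursive character) splitting, for comparison with the
# semantic and agentic levels described above.
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=100,
    separators=["\n\n", "\n", ". ", " ", ""],  # tried in order, most to least structural
)
chunks = splitter.split_text(open("document.txt").read())
print(len(chunks), chunks[0][:200])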
@shivafeaz882Ай бұрын
It looks like LlamaIndex's version of Microsoft Prompt Flow 😅
@AI-MakerspaceАй бұрын
Perhaps! We haven't looked closely into that specific tool. We do our best to focus on the patterns that underlie the toolsets so we can make connections like this though - good stuff!
@D_Analyst007Ай бұрын
Bro, it's 1,999 USD. For people like me from Asian countries, can you please adjust it to at least PPP? The course is good, but at 1,999 USD I can get my college degree here in India. The ideal price is 299-499 USD.
@AI-MakerspaceАй бұрын
Thanks for the feedback! The "ideal price" is especially helpful! We understand that the current price point makes it difficult for some people around the world to take part in our courses. AI Makerspace is committed to building transformational learning experiences that encourage the growth of our global community, and as part of that we aim to build pathways, programs, and products that make it easy for people to achieve their AI career goals, no matter where they're located. However, as an early-stage bootstrapped startup searching for product-market fit, we are working to build out our core product line and company processes first. This will ensure the financial stability that will allow us to serve people like you around the world even better in the years to come. Stay tuned!
@D_Analyst007Ай бұрын
@@AI-Makerspace Thank you for your response. If I may add, you can charge fully, but in installments; that could be a viable option.
@AI-MakerspaceАй бұрын
@@D_Analyst007 Thanks for the additional feedback! This is something that we have on our list to test out in 2025!
@bradhatch8302Ай бұрын
Love this. Thanks for explaining this step by step (pun not intended).
@AI-MakerspaceАй бұрын
LAWL GOTTEM
@736939Ай бұрын
Can you please share the Jupyter file (Colab link)?
@AI-MakerspaceАй бұрын
Here you go! colab.research.google.com/drive/128vrBmON3535EGy5cW-j_JPLJr6ARxqM?usp=sharing
Limited mindset; solved it on a consumer product in like 4 days. Non-deterministic vs. deterministic system validation 🙏
@Max-hj6nqАй бұрын
It's so product-dependent. This conversation is happening too early in the tech adoption cycle to hold any merit.
@AI-MakerspaceАй бұрын
@@Max-hj6nq "It depends" is always the meta answer. It's just not very useful! Hard to talk about these things in general, but they are the questions people want answered!
@seanbergman8927Ай бұрын
Great topics! Sounds like lots of people must be in the phase of moving from demo to prod. Exciting times! I'm in the midst of this now after pitching my RAG POC to corporate execs. I found that streaming the LLM outputs with a pipeline that involves functions with multiple query tools is not easy! I need to get asynchronous streaming working next. Another topic for prod is going to be switching to FastAPI and a React widget for the web team, moving to AWS or Azure, and integrating with their tracing and guardrail tools. Also on the agenda is red-teaming on the dev website before launch. I also need an admin panel and a scraping & indexing pipeline if the context comes from your company website content (mine is starting with blogs, etc.)
@AI-MakerspaceАй бұрын
WTG Sean awesome stuff, man! These are great additional topics for us to consider events on in the future - thanks!
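On the async streaming piece, a minimal sketch of a FastAPI endpoint that streams tokens with StreamingResponse (run_rag_pipeline_stream is a hypothetical stand-in for your own chain or agent):

# Sketch: an async streaming endpoint using FastAPI's StreamingResponse.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

class Query(BaseModel):
    question: str

async def run_rag_pipeline_stream(question: str):
    # Placeholder async generator: yield tokens as your RAG pipeline produces them.
    for token in ["Streaming ", "tokens ", "from ", "the ", "pipeline."]:
        yield token

@app.post("/chat")
async def chat(query: Query):
    return StreamingResponse(run_rag_pipeline_stream(query.question),
                             media_type="text/plain")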
@aneerpa8384Ай бұрын
👍
@孙姣姣Ай бұрын
That's very helpful to me. Thank you
@AI-MakerspaceАй бұрын
Love to hear it! For even deeper dives on RAGAS, check out our videos on RAG Assessment for LangChain RAG (kzbin.infoAnr1br0lLz8?si=Lf8cmhSUw3u0IpMD) and Synthetic Data Generation (SDG) (kzbin.infoY7V1TTdEWn8?si=-fTs08wrKGYattkA)!
@lucindalinde4198Ай бұрын
OMG such a helpful sesh. I had heard these terms before but never understood how they work together and what they mean. Love the care you put into explaining important concepts and illustrating them with real examples.
@AI-MakerspaceАй бұрын
Thank you Lucinda! We're looking forward to continuing to build upward into production from here!