Chain of Thought with Retrieval for LLMs

Views: 3,402

Vivek Haldar


Comments: 5

@NLPprompter 8 months ago

ahhhh another excellent content

@aishaal-harbi1929 5 months ago

Great video!

@huveja9799 11 months ago

For production, it would be necessary to see whether the increase in efficiency compensates for the increase in cost (inference is more expensive, and retrieval can also be more expensive).

@ThomasTomiczek 8 months ago

Yes and no. First, the question is whether the cost matters at all: if the answer is otherwise not good enough, the choice is between using AI or not, and cost is not a factor between different AI implementations. Second, while you have a good case NOW, what about in a year or two? AI costs come down. FAST. Mamba models are 5x more compute-efficient than transformers on SMALL setups. So costs are an implementation detail that one has to watch - but heck, things are brutally fluid here. It may well be "ok, too expensive, let's wait some months" or "let's pay the higher cost until the next price reduction".

@huveja9799 8 months ago

@ThomasTomiczek As you say, the answer is yes and no. The cost variable depends on your context. If the computing cost of running a more powerful model exceeds my budget, I'm going to have to use a worse model until the one I want becomes more efficient. On the other hand, if the most powerful model takes much longer at inference and my clients leave because they won't wait, I will again have to use a worse model until the other one improves its efficiency. The cost is relative to one's pocket and the customer's tolerance, as it always was.