Analyzing the Costs of Large Language Models in Production

4,711 views

TensorOps

1 day ago

Comments: 15
@CresGallego 8 months ago
Really great insights. Economics is well explained.
@tensorops 8 months ago
Thank you!
@mohamedfouad1309 9 months ago
😊
@loopaal 7 months ago
fantastic
@tensorops 7 months ago
Thank you so much 😀
@billykotsos4642 8 months ago
Being handed a bill based on tokens generated by a model is preposterous... These LLM apps cost so much right now that you need to have a solid use case in mind... otherwise just wait a couple more years until inferencing these LLMs won't be as expensive... the only reason these LLMs are so expensive to run is that they are SOTA and Nvidia is the only player right now.
@billykotsos4642 8 months ago
The economics are broken because the hardware setup just isn't there... instead of paying by the hour you pay by the token/call, which is insane... The cloud was built on the idea that you fire up an instance and you know what you pay... but these days you need huge cloud instances to run these huge models... The costs to run these models will go down significantly in about 3 years... you won't have to think about these things...
@lionhuang9209 9 months ago
Where can we download the slides?
@balainblue 7 months ago
Can you explain the math of 5 requests per minute translating to $9,000 per month?
@tensorops 7 months ago
We recommend looking at gptforwork.com/tools/openai-chatgpt-api-pricing-calculator. Assuming 220K requests per month, with typical prompts of 1,000-2,000 tokens, you can reach these costs. Also keep in mind that a single request to an LLM application often triggers more than one API call to an LLM.
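The arithmetic behind the reply above can be sketched as follows. This is a back-of-the-envelope estimate, not figures from the video: the per-1K-token prices and the prompt/completion token counts per request are illustrative assumptions (roughly GPT-4-class pricing at the time).

```python
# Back-of-the-envelope LLM cost estimate for 5 requests/minute.
# All prices and token counts below are ASSUMPTIONS for illustration.
REQUESTS_PER_MIN = 5
MINUTES_PER_MONTH = 60 * 24 * 30              # 30-day month
requests_per_month = REQUESTS_PER_MIN * MINUTES_PER_MONTH   # 216,000 (~220K)

PROMPT_PRICE_PER_1K = 0.03        # assumed $/1K prompt tokens
COMPLETION_PRICE_PER_1K = 0.06    # assumed $/1K completion tokens
prompt_tokens = 1000              # assumed tokens per request (prompt)
completion_tokens = 200           # assumed tokens per request (completion)

cost_per_request = (prompt_tokens / 1000) * PROMPT_PRICE_PER_1K \
                 + (completion_tokens / 1000) * COMPLETION_PRICE_PER_1K
monthly_cost = requests_per_month * cost_per_request
print(f"{requests_per_month:,} requests/month -> ${monthly_cost:,.0f}/month")
# 216,000 requests/month -> $9,072/month
```

Under these assumptions, 5 requests/minute lands in the ballpark of $9,000/month; heavier prompts or multiple LLM calls per request push it higher.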
@balainblue 7 months ago
@@tensorops Thank you so much.
@balainblue 7 months ago
@@tensorops Can you please elaborate on that? "A single request to an LLM application triggers more than one API call to an LLM"
@tensorops 7 months ago
@@balainblue We'll give an example in the next webinar where one query triggers many LLM calls. Even simple chains like Map-Reduce or Refine can cause many LLM calls to OpenAI for an action as simple as summarization.
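The amplification effect described above can be sketched with a toy map-reduce summarizer. Everything here is a hypothetical illustration: `llm_call` is a stand-in for a real LLM API call, and the chunk size is arbitrary.

```python
# Hypothetical sketch: one user request fans out into many LLM calls
# in a map-reduce summarization chain. llm_call is a placeholder, not
# a real API; it only stands in for a billable LLM invocation.
def llm_call(prompt: str) -> str:
    return f"summary({len(prompt)} chars)"      # pretend LLM response

def map_reduce_summarize(document: str, chunk_size: int = 1000):
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    # "Map" step: one LLM call per chunk.
    partials = [llm_call("Summarize: " + c) for c in chunks]
    # "Reduce" step: one more LLM call to combine the partial summaries.
    final = llm_call("Combine these summaries: " + " ".join(partials))
    return final, len(chunks) + 1               # total billable calls

_, n_calls = map_reduce_summarize("x" * 5000)   # one user request...
print(n_calls)                                  # ...makes 6 LLM calls
```

So a single "summarize this document" request is billed for every map call plus the reduce call, which is why per-token pricing compounds quickly in chained applications.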
@balainblue 7 months ago
@@tensorops Thank you. I look forward to it.