Exploring the fastest open source library for LLM inferencing and serving | vLLM

9,109 views

JarvisLabs AI

1 day ago

Hello everyone, welcome to our latest video. Today we will explore vLLM, an open source library for LLM inferencing and serving. Our founder Vishnu Subramanian will take you through what vLLM is and how to use it.
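For anyone who wants to follow along, a minimal offline-inference sketch with vLLM's Python API looks like the following; the model id and prompts here are illustrative placeholders, not necessarily the ones used in the video.

    from vllm import LLM, SamplingParams

    prompts = ["What is vLLM?", "Explain PagedAttention in one sentence."]
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

    # Load any Hugging Face model id; this one is just an example.
    llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.1")
    outputs = llm.generate(prompts, sampling_params)

    for output in outputs:
        print(output.prompt, "->", output.outputs[0].text)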
Connect with us on social media:
LinkedIn: / jarvislabsai
Instagram: / jarvislabs.ai
Medium: / jarvislabs
Connect with Vishnu:
X: / vishnuvig
LinkedIn: / vishnusubramanian

Comments: 20
@HermesFibonacci 1 day ago
Very interesting, I listened to the very end, and it gave me some ideas for prepping my model. Thanks for the explanation and demo. May I ask: do you think an Nvidia Jetson AGX Orin DevKit 64 GB would be suitable for running LLMs locally for fine-tuning, training, and later deploying to a server once developed (both locally and on an Ubuntu server)?
@alecd8534 8 months ago
Thanks for your video. It is interesting. I am new to LLMs and have one question: when you run JarvisLabs in your demo, does it mean you are running a server locally to provide the API endpoint? Please advise.
@JarvislabsAI 8 months ago
In the demo, I was running on a GPU-powered instance. The vLLM server in this case is running on the Jarvislabs instance. You can use the API endpoint from anywhere.
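For context, a setup like the one described might look as follows; the host name below is a placeholder for your own instance's address, and the model id is illustrative.

    # On the GPU instance, start vLLM's OpenAI-compatible server:
    #   python -m vllm.entrypoints.openai.api_server --model mistralai/Mistral-7B-Instruct-v0.1
    # Then, from any machine, query the endpoint:
    import requests

    resp = requests.post(
        "http://<your-instance-host>:8000/v1/completions",  # placeholder host
        json={
            "model": "mistralai/Mistral-7B-Instruct-v0.1",
            "prompt": "What is vLLM?",
            "max_tokens": 64,
        },
    )
    print(resp.json()["choices"][0]["text"])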
@alecd8534 8 months ago
@@JarvislabsAI thanks so much. I have an Nvidia T500 GPU on my laptop, but it has only 4 GB. Can it run vLLM? Do we need to install JarvislabsAI on our local machine? What does JarvisLabs do? Thanks
@JarvislabsAI 8 months ago
Not sure if it will be possible to run vLLM on a T500 GPU. Jarvislabs offers a GPU instance on which you can use vLLM.
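As a rough, untested illustration of the knobs involved: vLLM exposes settings that shrink its memory footprint, though even with them a 4 GB card realistically fits only small models.

    from vllm import LLM

    llm = LLM(
        model="facebook/opt-125m",    # a small model is a realistic target for 4 GB
        gpu_memory_utilization=0.8,   # cap the fraction of VRAM vLLM reserves
        max_model_len=512,            # limit context length to shrink the KV cache
    )
    print(llm.generate(["Hello"])[0].outputs[0].text)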
@bernard2735 8 months ago
This was a nicely paced and clear tutorial. Thank you. Liked and subscribed.
@JarvislabsAI 8 months ago
Thanks for the support :)
@Akshatgiri 7 months ago
Super useful. Thanks for breaking it down.
@dineshgaddi1843 8 months ago
Thank you for sharing this information.
@JarvislabsAI 8 months ago
Glad it was helpful!
@kaiwalya_patil 8 months ago
An excellent one! Thank you so much for sharing. Any idea about the possibility of fine-tuning my own LLM (like Llama/Mistral), uploading it back to HF, and then putting it into production using vLLM?
@JarvislabsAI 8 months ago
Yeah, definitely possible. We'll make one soon.
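In the meantime, the serving half of that flow is simple: once the fine-tuned weights are pushed to the Hugging Face Hub, vLLM can load them by repo id. The repo id below is hypothetical.

    from vllm import LLM, SamplingParams

    # Hypothetical repo id for fine-tuned weights pushed to the HF Hub.
    llm = LLM(model="your-username/your-finetuned-llama")
    out = llm.generate(["Test prompt"], SamplingParams(max_tokens=64))
    print(out[0].outputs[0].text)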
@kaiwalya_patil 8 months ago
@@JarvislabsAI Thank you, looking forward!
@YajuvendraSinghRawat 4 months ago
It's a wonderful video, clearly and concisely explained.
@JarvislabsAI 4 months ago
Glad you liked it
@fxhp1 8 months ago
hey, I also have an AI channel. I tried Mistral's model and it didn't finish its execution, looping over the input forever; I had slightly better luck with the instruct version. Did you ever get Mistral to work?
@JarvislabsAI 8 months ago
We tried it with vLLM and remember it working. I will probably check again.
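One common cause of endless generations with Mistral is a missing chat format or stop condition; a hedged sketch of the usual fix (Mistral-Instruct's [INST] tags plus explicit stops) is:

    from vllm import LLM, SamplingParams

    llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.1")
    # Wrap the prompt in instruct tags and set explicit stop strings so
    # generation terminates instead of looping over the input.
    params = SamplingParams(max_tokens=256, stop=["</s>", "[INST]"])
    out = llm.generate(["[INST] Write a haiku about GPUs. [/INST]"], params)
    print(out[0].outputs[0].text)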
@Ian-fo9vh 7 months ago
Thank you, it was interesting.