Exploring the fastest open source library for LLM inference and serving | vLLM

9,959 views

JarvisLabs AI

1 day ago

Comments: 21
@bernard2735 10 months ago
This was a nicely paced and clear tutorial. Thank you. Liked and subscribed.
@JarvislabsAI 10 months ago
Thanks for the support :)
@Akshatgiri 10 months ago
Super useful. Thanks for breaking it down.
@HermesFibonacci 2 months ago
Very interesting; I listened to the very end, and it gave me some ideas for prepping my model. Thanks for the explanation and demo. May I ask: do you think an Nvidia AGX Orin devkit (64 GB) would be a good fit for running LLMs locally for fine-tuning and training, and later deploying to a server once developed (both locally and on an Ubuntu server)?
@JarvislabsAI 2 months ago
Have not tried it. No idea.
@kaiwalya_patil 11 months ago
An excellent one! Thank you so much for sharing. Any idea about the possibility of fine-tuning my own LLM (like Llama/Mistral), uploading it back to HF, and then putting it into production using vLLM?
@JarvislabsAI 11 months ago
Yeah, definitely possible. Will make one soon.
@kaiwalya_patil 11 months ago
@@JarvislabsAI Thank you, looking forward!
@dineshgaddi1843 11 months ago
Thank you for sharing this information.
@JarvislabsAI 11 months ago
Glad it was helpful!
@fxhp1 11 months ago
Hey, I also have an AI channel. I tried Mistral's model and it didn't finish its execution, looping over the input forever; I had slightly better luck with the instruct version. Did you ever get Mistral to work?
@JarvislabsAI 11 months ago
We tried it with vLLM and remember it working. I will probably check again.
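
On the looping issue in this thread: a base (non-instruct) checkpoint often never emits an end-of-sequence token, so generation runs until the context fills. A hedged sketch of bounding generation with vLLM's sampling parameters (the model id and [INST] prompt format follow Mistral's instruct convention; the stop string and limits are illustrative assumptions):

from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.1")
# Cap the output length so generation cannot run forever, and cut off
# if the model starts echoing a new [INST] block instead of stopping
params = SamplingParams(max_tokens=512, temperature=0.7, stop=["[INST]"])
# Mistral instruct checkpoints expect the [INST] ... [/INST] wrapper
out = llm.generate(["[INST] Explain PagedAttention briefly. [/INST]"], params)
print(out[0].outputs[0].text)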
@YajuvendraSinghRawat 7 months ago
It's a wonderful video, clearly and concisely explained.
@JarvislabsAI 7 months ago
Glad you liked it
@alecd8534 11 months ago
Thanks for your video, it is interesting. I am new to LLMs and have one question: when you run JarvisLabs in your demo, does it mean you are running a server locally to provide the API endpoint? Please advise.
@JarvislabsAI 11 months ago
In the demo, I was running on a GPU-powered instance. The vLLM server in this case is running in the Jarvislabs instance. You can use the API endpoint from anywhere.
@alecd8534 11 months ago
@@JarvislabsAI thanks so much. I have Navida T500 GPU card on my laptop. But it has only 4 gb. Can it run vLLM? Do we need to install JarvislabsAI on our local machine? Does JarvisLab do? Thanks
@JarvislabsAI 11 months ago
Not sure if it will be possible to run vLLM on a T500 GPU. Jarvislabs offers GPU instances in which you can use vLLM.
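
To make the thread above concrete, a minimal sketch of the server/endpoint split, assuming a GPU instance reachable from your machine (the address below is a placeholder):

# On the GPU instance, start vLLM's OpenAI-compatible server:
#   python -m vllm.entrypoints.openai.api_server --model mistralai/Mistral-7B-Instruct-v0.1
# Then, from any machine that can reach the instance:
import requests

SERVER = "http://<instance-address>:8000"  # placeholder address
resp = requests.post(
    f"{SERVER}/v1/completions",
    json={
        "model": "mistralai/Mistral-7B-Instruct-v0.1",
        "prompt": "[INST] What is vLLM? [/INST]",
        "max_tokens": 128,
    },
)
print(resp.json()["choices"][0]["text"])

Nothing heavy needs to run locally; the model and vLLM live on the instance, and any HTTP client can query the endpoint.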
@Ian-fo9vh 10 months ago
Thank you, it was interesting.
Related videos:
vLLM on Kubernetes in Production | 27:31 | Kubesimplify | 4K views
Fast LLM Serving with vLLM and PagedAttention | 32:07 | Anyscale | 28K views
The architecture of mixtral8x7b - What is MoE (Mixture of experts)? | 11:42
Accelerating LLM Inference with vLLM | 35:53 | Databricks | 8K views
How to pick a GPU and Inference Engine? | 1:04:22 | Trelis Research | 4.9K views
Unleash the power of Local LLM's with Ollama x AnythingLLM | 10:15 | Tim Carambat | 129K views
How I Made AI Assistants Do My Work For Me: CrewAI | 19:21 | Maya Akim | 923K views
Accelerating LLMs 10x with Pure PyTorch: No Custom Libraries | 14:42 | JarvisLabs AI | 1K views
vLLM Office Hours - FP8 Quantization Deep Dive - July 9, 2024 | 56:09 | Neural Magic | 1.6K views
vLLM - Turbo Charge your LLM Inference | 8:55 | Sam Witteveen | 17K views