Install and Run the Llama 3.3 70B LLM Locally in Python

778 views

Aleksandar Haber PhD

1 day ago

#meta #llm #llama #llama3.1 #llama3.3 #ai #machinelearning #largelanguagemodels
It takes a significant amount of time and energy to create these free video tutorials. You can support my efforts in the following ways:
Buy me a Coffee: www.buymeacoffee.com/AleksandarHaber
PayPal: www.paypal.me/AleksandarHaber
Patreon: www.patreon.com/user?u=32080176&fan_landing=true
You can also press the Thanks YouTube Dollar button.
In this tutorial, we explain how to install and run the Llama 3.3 70B LLM in Python on a local computer. The Llama 3.3 70B model offers performance similar to the older Llama 3.1 405B model; however, it is much smaller and can run on computers with lower-end hardware.
Our local computer has an NVIDIA RTX 3090 GPU with 24 GB of VRAM, 48 GB of system RAM, and an Intel Core i9-10850K CPU.
Llama 3.3 works on this computer; however, the inference speed is not fast. We can speed up inference by changing model parameters, as sketched below. More about this in future tutorials.
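As a preview of that speed-up, the Ollama Python library accepts a per-request options dictionary. The sketch below assumes the setup described later in this tutorial; the num_ctx and num_thread values are illustrative assumptions, not tuned settings:

import ollama

# A minimal sketch: a smaller context window (num_ctx) lowers memory use,
# and num_thread should roughly match your physical CPU core count.
# These values are assumptions; adapt them to your machine.
response = ollama.chat(
    model="llama3.3:70b-instruct-q2_K",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    options={
        "num_ctx": 2048,
        "num_thread": 8,
    },
)
print(response["message"]["content"])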
In this tutorial, we explain how to install and run a quantized Llama 3.3 model, denoted by 70b-instruct-q2_K. Installing this heavily quantized model requires about 26 GB of disk space. You can also try the regular model, which requires about 40 GB of disk space.
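If you prefer to script the download instead of typing it in the command line, the Ollama Python library exposes a pull function. A short sketch, using the model tags mentioned above:

import ollama

# Downloads the heavily quantized build (about 26 GB on disk).
# Use "llama3.3:70b" instead for the regular model (about 40 GB).
ollama.pull("llama3.3:70b-instruct-q2_K")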
The installation procedure is:
1) Install Ollama on the local computer. Ollama is a framework for running LLMs on local machines; using it, you can start a model from the command line and ask questions to LLMs.
2) Once Ollama is installed, manually download and run the Llama 3.3 70B model.
3) Create a Python virtual environment, install the Ollama Python library, and run a Python script (a minimal sketch is given below).
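For step 3, here is a minimal sketch of the Python script, assuming the model has already been downloaded and the Ollama server is running locally (the prompt text is just an example):

# Create and activate a virtual environment first, for example:
#   python -m venv ollama-env
#   source ollama-env/bin/activate   (on Windows: ollama-env\Scripts\activate)
#   pip install ollama
import ollama

# Send one question to the local quantized Llama 3.3 model and print the reply.
response = ollama.chat(
    model="llama3.3:70b-instruct-q2_K",
    messages=[
        {"role": "user", "content": "In one paragraph, what is a quantized LLM?"},
    ],
)
print(response["message"]["content"])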

Comments: 2
@aleksandarhaber
@aleksandarhaber 11 days ago
It takes a significant amount of time and energy to create these free video tutorials. You can support my efforts in this way: - Buy me a Coffee: www.buymeacoffee.com/AleksandarHaber - PayPal: www.paypal.me/AleksandarHaber - Patreon: www.patreon.com/user?u=32080176&fan_landing=true - You can also press the Thanks YouTube Dollar button
@jackrorystaunton4557
@jackrorystaunton4557 6 hours ago
looking forward to future tutorial on model parameters for inference speed-up :)