#meta #llm #llama #llama3.1 #llama3.3 #ai #machinelearning #largelanguagemodels
It takes a significant amount of time and energy to create these free video tutorials. You can support my efforts in the following ways:
Buy me a Coffee: www.buymeacoff...
PayPal: www.paypal.me/...
Patreon: www.patreon.co...
You can also press the Thanks YouTube Dollar button.
In this tutorial, we explain how to install and run the Llama 3.3 70B LLM in Python on a local computer. The Llama 3.3 70B model offers performance comparable to the older Llama 3.1 405B model; however, it is smaller and can run on computers with lower-end hardware.
Our local computer has an NVIDIA RTX 3090 GPU with 24 GB of VRAM, 48 GB of system RAM, and an Intel Core i9-10850K CPU.
Llama 3.3 works on this computer; however, the inference speed is not fast. We can speed up inference by changing model parameters. More about this in future tutorials.
In this tutorial, we explain how to install and run a quantized Llama 3.3 model, denoted by 70b-instruct-q2_K. Installing this heavily quantized model requires 26 GB of disk space. You can also install the regular model, which requires 40 GB of disk space.
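To give a rough idea, once Ollama and its Python library are installed (see the steps below), the quantized model can also be downloaded from Python. This is a minimal sketch, assuming the 70b-instruct-q2_K tag is available in the Ollama model library:

import ollama

# Download the quantized model (about 26 GB on disk); this is equivalent to
# running "ollama pull llama3.3:70b-instruct-q2_K" on the command line.
# Use "llama3.3" instead for the regular model (about 40 GB).
ollama.pull("llama3.3:70b-instruct-q2_K")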
The installation procedure is as follows:
1) Install Ollama on the local computer. Ollama is a framework for running LLMs on local computers. Using Ollama, you can start a model from the command line and ask it questions.
2) Once Ollama is installed, manually download and run the Llama 3.3 70B model.
3) Create a Python virtual environment, install the Ollama Python library, and run a Python script (a minimal sketch follows this list).
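As a rough sketch of step 3, a minimal script might look like the following; the prompt is only an example, and the model tag is assumed to match the quantized model mentioned above:

import ollama

# Send a single chat message to the locally running Llama 3.3 model
# and print its reply.
response = ollama.chat(
    model="llama3.3:70b-instruct-q2_K",
    messages=[{"role": "user", "content": "What is 2-bit quantization?"}],
)
print(response["message"]["content"])

Before running the script, create and activate a virtual environment and install the library, for example with python -m venv venv, then source venv/bin/activate, then pip install ollama.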