Using the same temperature and sampling settings, give it a try ;)
@j05hau 11 days ago
Your question is too woke.
@PrivateLLM 7 days ago
:)
@TheFelixAlabi 13 days ago
Can I have this on an iPhone 13 Pro? Will it make my device hot while running? Do I need to download models separately after downloading from the App Store? And most importantly, does your model have access to the internet if I actually do need it?
@PrivateLLM 8 days ago
The iPhone 13 Pro, with 6 GB RAM, can comfortably run smaller 3B models like Qwen 2.5/Llama 3.2/Gemma 2 2B. For best performance, we recommend devices with 8 GB RAM, such as the iPhone 15 Pro or iPhone 16, to run larger models like Qwen 2.5 7B or Llama 3.1 8B. Local LLM inference is computationally intensive, so your device may get warm depending on usage. Models need to be downloaded separately after installing the app (a base model comes pre-installed). The app processes everything locally, and models don't have internet access.
@TheFelixAlabi 7 days ago
@PrivateLLM thanks for clarifying
@devnulldroid 14 days ago
Another excellent video. Thanks. Keep 'em coming. Happy New Year!
@PrivateLLM 14 days ago
Happy New Year!
@czr5014 14 days ago
The app is not working. I try to download the Llama LLM and it hangs at 95% complete and never finishes downloading.
@PrivateLLM 14 days ago
Sorry to hear you're having trouble. This could be happening for several reasons. Are you using a corporate network or VPN that might block Hugging Face? Also, what device are you using (Mac/iPhone/iPad)? Is your internet connection stable? Let us know these details so we can help troubleshoot the download issue.
@czr5014 7 days ago
@PrivateLLM Running on an iPhone SE 2020. Deleted it, redownloaded, uninstalled, reinstalled, did everything, but it's not working.
@PrivateLLM 7 days ago
@czr5014 Running LLMs locally has its pros and cons. One of the cons is that if you have an older device, your experience is pretty limited in terms of the models you can use. The iPhone SE 2020 has 3 GB of RAM, which restricts the models it can support effectively. For reference, devices with 6 GB of RAM, such as the iPhone 13 Pro or newer, perform much better. On the latest models, like the iPhone 15 Pro with 8 GB of RAM, you can even run larger models such as 7B or 8B with ease. That said, the base model that comes preloaded with the app should still work on your device. Could you confirm whether that is functioning as expected? If not, please let me know, and we'll investigate further to resolve the issue. If the app isn't meeting your needs, you can find instructions for requesting a refund in our FAQ here: privatellm.app/refund. Refunds are processed directly through the App Store.
@PrivateLLM 7 days ago
Also, could you confirm if your device has enough available storage?
@czr5014 7 days ago
@PrivateLLM Your FAQ page doesn't include a hardware requirements bullet. Also, 3 GB is not big enough to run Llama even though it's like half a gig?
@metatron3942 16 days ago
I was thinking of picking up an M4 Max MacBook Pro for just this purpose. It would help if you could show more performance videos, so I know what to expect.
@PrivateLLM 15 days ago
Thanks for the feedback. We will be uploading more videos soon :)
@PrivateLLM 14 days ago
Uploaded a new video of uncensored Llama 3.3 running on an M4 Max Mac: kzbin.info/www/bejne/rancXn2dlJd3sLM
@SimplemanOfficial-u7u 16 days ago
My iPhone 13 constantly crashes when processing a prompt using Gemma 2 2B; please fix it.
@PrivateLLM 16 days ago
Sorry to hear about the trouble! It sounds like the crash may be related to memory limitations. Local AI processing can be memory-intensive, especially with models like Gemma 2 2B. Could you please try closing any memory-intensive background apps and give it another shot? You can also check your device's available memory at the bottom of the Help screen within the app. Alternatively, you might want to try the Llama 3.2 1B model, which is lighter on memory and could work more smoothly on your device. Let us know if the issue persists, and we'll be happy to assist further!
@SimplemanOfficial-u7u 14 days ago
Yes, maybe that's the problem. I noticed that my phone feels like it's burning when using that model. Maybe consider optimizing or removing that model until it can be fixed.
@PrivateLLM 14 days ago
@SimplemanOfficial-u7u Thanks for the feedback! You raise a good point about device heating. While some users prefer having access to larger models like Gemma 2 2B despite the performance trade-offs, we understand the concerns about device stress. For a smoother experience on the iPhone 13, we recommend trying our lighter models like Llama 3.2 1B or Qwen 2.5 0.5B. These are specifically optimized for older devices like yours and should run much cooler.
@tak4272 1 month ago
Does Private LLM have an OpenAI-compatible API? If it doesn't, then being somewhat faster at inference than Ollama won't be a significant advantage. Many software applications are compatible with OpenAI's API, so using Ollama offers various benefits. I think without an API, it would just be a chatbot.
@PrivateLLM 21 days ago
@tak4272 We're working on adding an HTTP API. We've always supported extension through macOS Shortcuts, which llama.cpp wrappers lack. Also, Ollama has additional features that we'll never be able to match: slow inference and low-quality RTN-quantized models.
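In the meantime, for anyone curious what an OpenAI-compatible endpoint enables: any client that speaks the standard /v1/chat/completions schema can target a local server. Here's a minimal sketch in Python; the base URL, port, and model name are hypothetical placeholders, since our HTTP API hasn't shipped yet.

```python
# Sketch of a chat completion request against a hypothetical local
# OpenAI-compatible endpoint. The URL, port, and model name below are
# placeholders, not a published Private LLM API.
import requests

BASE_URL = "http://localhost:8080/v1"  # assumed local server address

payload = {
    "model": "local-model",  # server-defined model identifier
    "messages": [
        {"role": "user", "content": "Summarize Hamlet in one sentence."}
    ],
    "temperature": 0.7,
    "top_p": 0.95,
}

resp = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```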
@Lp-ze1tg 1 month ago
Is there a tutorial for Private LLM?
@PrivateLLM 21 days ago
You can check this out: privatellm.app/blog/run-local-gpt-on-ios-complete-guide
The article is slightly dated, and we need to revise it with new model recommendations. We will do that soon.
@pawanjenu 1 month ago
Can you direct me to the GitHub link?
@christiancrow 1 month ago
M4 Mac Mini? Or Pro?
@PrivateLLM 21 days ago
Either one should be fine, as long as you have 48 GB of RAM. This demo was run on an M2 Max Mac Studio, but an M4 Mac Mini should work as well.
@christiancrow 21 days ago
@PrivateLLM 2 grand for a 48 GB system; I would be interested in the base model. I wonder if the newest Llama could run faster on it.
@PrivateLLM 21 days ago
@christiancrow The base model with 16 GB of RAM can easily run Gemma 2 9B, Llama 3.1 8B, and Qwen 2.5 14B. Check out the model list on our website for a full list of models along with RAM requirements: privatellm.app/en#models
@christiancrow 1 month ago
❤❤❤❤❤ thank you
@PrivateLLM 17 days ago
You are so welcome
@NotJnshFire 1 month ago
Personal ChatGPT
@RA-dq4ro 2 months ago
Isn't this going to damage the phone?
@PrivateLLM 13 days ago
Not really
@rodelpelegria4448 4 months ago
App name?
@PrivateLLM 13 days ago
privatellm.app
@therahay6849 8 months ago
It only shows me 3 models to download, and all of them are kinda shitty. I came in hoping to find Llama 3 at least.
@ccentralchannel4492 7 months ago
The problem is that they're limited by the number of parameters a model has: the more parameters, the more "intelligent and capable" the model tends to be. But more parameters also mean more resources, since LLMs are resource-hungry, especially when it comes to RAM.
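A rough rule of thumb makes this concrete: the weights alone need about parameters × bits-per-weight ÷ 8 bytes, before counting the KV cache and runtime overhead. A back-of-the-envelope sketch (the parameter counts and the 4-bit assumption are illustrative):

```python
# Back-of-the-envelope RAM estimate for quantized LLM weights:
# bytes ≈ parameters × bits_per_weight / 8. Real usage is higher once
# the KV cache, activations, and the OS itself are counted.
def weight_ram_gb(params_billions: float, bits_per_weight: float) -> float:
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9  # decimal GB

for name, params in [("1B-class", 1.2), ("3B-class", 3.2), ("8B-class", 8.0)]:
    print(f"{name} @ 4-bit: ~{weight_ram_gb(params, 4):.1f} GB of weights")

# Prints roughly 0.6, 1.6, and 4.0 GB respectively. That's why a 3 GB
# phone handles 1B-class models but 7B/8B models want 8 GB or more of RAM.
```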
@PrivateLLM 13 days ago
Please try again. We've added newer models. Llama 3.2 1B and Qwen 2.5 1.5B should work well even on older devices.
@JasonXiao89 8 months ago
Is there going to be an API for things like a React Native app with LangChain JS to access?
@PrivateLLM 13 days ago
It's on our roadmap, yes!
@jasonjefferson6596 9 months ago
What is the main reason why Private LLM is faster?
@PrivateLLM 9 months ago
We use an auto-tuning and compilation-based approach from mlc-llm and Apache TVM for LLM inference. This means the inference pipeline is optimized to extract the best possible performance from the underlying hardware for each model architecture.
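To illustrate the auto-tuning idea in miniature: the tuner benchmarks several candidate schedules for the same computation on the actual target hardware and keeps the fastest one. The toy sketch below uses plain NumPy and a single tiling knob; real TVM searches over far richer GPU kernel schedules.

```python
# Toy auto-tuning: time the same matrix-vector product under several
# candidate tile sizes and select the fastest, mimicking (in spirit)
# how TVM picks kernel schedules for the hardware it runs on.
import time
import numpy as np

def matvec_tiled(A, x, tile):
    # Compute A @ x one row-block at a time; tile size is the schedule knob.
    out = np.empty(A.shape[0], dtype=A.dtype)
    for i in range(0, A.shape[0], tile):
        out[i:i + tile] = A[i:i + tile] @ x
    return out

A = np.random.rand(4096, 4096).astype(np.float32)
x = np.random.rand(4096).astype(np.float32)

best_tile, best_time = None, float("inf")
for tile in (64, 256, 1024, 4096):  # candidate schedules
    start = time.perf_counter()
    for _ in range(5):
        matvec_tiled(A, x, tile)
    elapsed = time.perf_counter() - start
    print(f"tile={tile:5d}: {elapsed:.4f}s")
    if elapsed < best_time:
        best_tile, best_time = tile, elapsed

print(f"selected tile size: {best_tile}")  # the 'tuned' configuration
```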
@UncleDavid 10 months ago
Why no 2016 Mac support?
@PrivateLLM 10 months ago
We'll support it if you manage to get macOS Ventura running on it. :)
@thegodofmoneymaking 10 months ago
Please make it also speak the answers to me. Me talking to it isn't necessary; I can text it no problem, but getting an answer with speech is much better than mere text.
@UncleDavid 10 months ago
and next it can wipe your lazy ass too?
@PrivateLLM 13 days ago
Have you tried using it along with the Text to Speech action in Apple Shortcuts?
@thegodofmoneymaking 10 months ago
What do Temperature: 0.70 and Top-P: 0.95 change if I put those on max?
@PrivateLLM 13 days ago
Temperature controls how random the next-token choice is (higher means flatter, more surprising output), and top-p limits sampling to the smallest set of tokens whose combined probability reaches that threshold. Maxing both makes the output as diverse, and as incoherent, as the model allows. You can learn more about temperature and top-p settings in this article: privatellm.app/blog/run-local-gpt-on-ios-complete-guide
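For a feel of what those knobs do mechanically, here's a minimal, self-contained sketch of temperature plus top-p (nucleus) sampling; it's illustrative only, not the app's actual sampler.

```python
# Temperature rescales the logits (higher = flatter distribution, more
# random picks); top-p then keeps only the smallest set of tokens whose
# cumulative probability reaches the threshold. Maxing both keeps every
# token in play with near-uniform odds: maximum diversity, minimum focus.
import numpy as np

def sample(logits, temperature=0.7, top_p=0.95, rng=np.random.default_rng()):
    scaled = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-6)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]                 # tokens, most likely first
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    keep = order[:cutoff]                           # the nucleus
    return rng.choice(keep, p=probs[keep] / probs[keep].sum())

logits = [3.0, 2.0, 1.0, 0.5]  # toy 4-token vocabulary
print([int(sample(logits)) for _ in range(10)])                               # focused
print([int(sample(logits, temperature=2.0, top_p=1.0)) for _ in range(10)])  # maxed: noisier
```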
@totallyrandomtalk8303 11 months ago
Will this run okay on an iPad 9th Gen?
@PrivateLLM 13 days ago
The iPad 9th Gen has 3 GB of RAM, which means you will be limited to small models like Llama 3.2 1B or Qwen 2.5 1.5B.
@Z060 1 year ago
Is it 4-bit quantised?
@CharlesG28 1 year ago
The fact that this is local is totally bonkers.
@UncleDavid 1 year ago
Add a feature to respond via Siri instead of the app needing to open.
@PrivateLLM 13 days ago
The difference in functionality between iOS and macOS regarding background processing stems primarily from Apple's hardware usage policies. On iOS, Apple restricts background execution of tasks that require intensive GPU usage. This limitation is enforced to preserve battery life and maintain system performance. According to Apple's guidelines, apps attempting to run a Metal kernel in the background will be terminated immediately to prevent unauthorized resource use.

For Private LLM, while we can run operations in the background on macOS leveraging the GPU, iOS versions are constrained to CPU processing when the app is not in the foreground. Running Private LLM's AI-driven tasks on the CPU is technically possible, but it would be significantly slower, more than 10 times slower compared to GPU processing. This slow performance would not provide the seamless, efficient user experience we strive for.

We are hopeful that future updates to iOS might offer more flexibility in how background processes can utilize system resources, including potential GPU access for apps like Private LLM. Until then, we continue to optimize our iOS app within the current constraints to ensure you get the best possible performance without compromising the health of your device or the efficiency of your applications.

For more technical details, you can refer to Apple's official documentation on preparing your Metal app to run in the background: developer.apple.com/documentation/metal/gpu_devices_and_work_submission/preparing_your_metal_app_to_run_in_the_background
@antoineyolo 1 year ago
Great app!!! I got it on macOS. Would it be feasible to have audio output using the Mac's speech synthesis? Can't wait to see the next updates 😊
@PrivateLLM 1 year ago
Thank you! I'm delighted you like our app, and I think we can do that. We're currently working on adding more models to the macOS software, in particular a new 7B model from the French startup Mistral and a few more 13B models. Also, try our iOS app, which is a free download since you already have the macOS app.
@Z060 1 year ago
When did the 16 Pro Max release?
@PrivateLLM 1 year ago
Corrected the typo: it's the iPhone 14 Pro Max. Thank you for bringing this to our attention.