My M5Stack off-line Large Language Model (LLM) module arrived yesterday.
It is another example of how China is excelling in AI. Everyone has heard about DeepSeek, which drove costs down; my LLM module cost only $US49.95, while the CoreS3 computer is another $US59.
I have coupled the LLM module with my M5Stack CoreS3 computer and did the programming in UIFlow2, a simple block-based language, adding an application layer on top of the LLM.
The model embedded in the module's firmware is based on Qwen 0.5B (0.5 billion parameters), so you can hardly call it a large language model.
The whole model fits within the module's eMMC memory, which is limited to 32 GB, while the RAM is 4 GB.
Compare that with GPT-4, whose parameter count is estimated to be in the trillions.
Despite that, I had my tiny LLM (which answers to the prompt "Hi Jimmy!") respond to several questions and requests, including:
"I want to stay healthy, please help me"
"How do you make noodles?"
"I want to apply for a job as a teacher, please help"
In addition, though not shown in the video, I asked Jimmy to provide the Arduino code to flash an LED every second. Jimmy gave me a Python answer rather than C++ for the Arduino IDE.
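For comparison, a minimal sketch of the C++ answer I was after is below. It assumes the Arduino IDE and a board that defines LED_BUILTIN; the pin constant may need changing for a given board.

```cpp
// Minimal Arduino sketch: flash an LED once per second.
// LED_BUILTIN is board-dependent; substitute your pin number if needed.
#include <Arduino.h>

const int LED_PIN = LED_BUILTIN;

void setup() {
  pinMode(LED_PIN, OUTPUT);  // configure the LED pin as an output
}

void loop() {
  digitalWrite(LED_PIN, HIGH);  // LED on
  delay(500);                   // half a second on
  digitalWrite(LED_PIN, LOW);   // LED off
  delay(500);                   // half a second off, so one flash per second
}
```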
I also compared the performance of the off-line model (Qwen) against an on-line model using the same prompts. Both produced similar results.
Despite this, my tiny LLM is interesting, and my goal is to develop a hybrid AI (artificial intelligence) model to control a robot with both an LLM and an AI camera. The camera will pass on objects recognised in images, and these will be used to formulate a robot navigation plan. Just what NASA needs for its Mars rovers!
I am also interested in other hybrid models for tiny applications that can route requests beyond their context on to an OpenAI API server. Perhaps, for a simple system, the tiny LLM could answer 80% of requests.
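The routing idea could be sketched as below. This is only an illustration of the control flow, not working device code: the two answer functions, the topic list, and the fallback rule are all assumptions standing in for the on-device Qwen call and the cloud API call.

```python
# Hypothetical sketch of a hybrid router: try the on-device tiny LLM first,
# and fall back to a cloud API when the request is beyond its scope.

def answer_locally(prompt):
    """Stand-in for the on-device Qwen 0.5B call; returns None when unsure."""
    simple_topics = ("health", "noodles", "job")
    if any(topic in prompt.lower() for topic in simple_topics):
        return f"[tiny LLM] answer to: {prompt}"
    return None  # signal that the request should be routed onward

def answer_via_cloud(prompt):
    """Stand-in for an OpenAI API call (network code omitted)."""
    return f"[cloud LLM] answer to: {prompt}"

def route(prompt):
    """Answer locally when possible, otherwise route to the cloud."""
    local = answer_locally(prompt)
    return local if local is not None else answer_via_cloud(prompt)

print(route("How do you make noodles?"))
print(route("Explain quantum chromodynamics"))
```

In a real system the local model would report its own confidence (or fail to answer within its context window) rather than matching keywords.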
Other applications include developing models for people with language disabilities. Instead of Jimmy answering, he or an assistant would ask the questions to encourage a response. The application layer may have to intercept the responses.
I have already loaded a Qwen 1.5B model and cannot wait to test it. That is a 3x increase in parameters over what I have already tried.
Ref
M5Stack LLM Module (shop.m5stack.c...)
DeepSeek (www.deepseek.com/)
What are small language models? (www.ibm.com/th...)
Researchers use large language models to help robots navigate (news.mit.edu/2...)
Qwen on-line: Qwen2.5 0.5B (huggingface.co...)