Glad you clarified the definition of an agent! Many people confuse it with making multiple consecutive LLM calls, which is a "pipeline", not an agent. An agent needs the autonomy to plan, think, and decide. Also, I'm creating a Colab to challenge the model's function-calling ability; I'll share it soon for you to use and review. Another thing: I wonder whether their AWQ quantization is 4-bit or 8-bit. The table suggests 4-bit: AWQ preserves accuracy well and scores better than the other 4-bit methods, yet it scores lower than the 8-bit results, which points to 4-bit. I think 8-bit AWQ is the best choice for most cases.
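For illustration, here is a minimal Python sketch of that pipeline-vs-agent distinction. The `call_llm` helper is hypothetical; plug in whatever chat-completion API you actually use.

```python
# Hypothetical stand-in for any chat-completion API.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model API here")

# Pipeline: a fixed, hard-coded sequence of LLM calls. No autonomy.
def pipeline(document: str) -> str:
    summary = call_llm(f"Summarize: {document}")
    return call_llm(f"Translate to French: {summary}")

# Agent: the model decides, step by step, which action to take next.
def agent(goal: str, tools: dict, max_steps: int = 5) -> str:
    history = f"Goal: {goal}"
    for _ in range(max_steps):
        decision = call_llm(
            f"{history}\nChoose the next action from {list(tools)} "
            "or answer 'FINISH: <result>' if done."
        )
        if decision.startswith("FINISH:"):
            return decision.removeprefix("FINISH:").strip()
        tool_name, _, arg = decision.partition(" ")
        observation = tools.get(tool_name, lambda a: "unknown tool")(arg)
        history += f"\nAction: {decision}\nObservation: {observation}"
    return "stopped after max_steps"
```

The pipeline always runs the same two calls; the agent loops, observes, and chooses, which is exactly the autonomy being discussed.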
@engineerprompt 7 months ago
I have seen a lot of confusion around function calling and agents and thought it would be helpful. Would love to see the Colab :) I think they probably were using AWQ 4-bit. There are some other formats as well, and it will be interesting to see how they compare. They also did some testing on inference speed with different frameworks (TGI, vLLM, etc.). The results are on their page and are really interesting. Will probably create a video on that soon.
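As a rough sketch of what that framework testing looks like in practice, here is how an AWQ checkpoint can be loaded in vLLM. The model ID below is an assumption; substitute whichever AWQ build they actually published.

```python
# Sketch: loading an AWQ-quantized model in vLLM for a quick speed/quality check.
from vllm import LLM, SamplingParams

# Assumed checkpoint name for illustration only.
llm = LLM(model="Qwen/Qwen2-7B-Instruct-AWQ", quantization="awq")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["What is function calling in LLMs?"], params)
print(outputs[0].outputs[0].text)
```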
@unclecode 7 months ago
@@engineerprompt Exactly, and to be honest, I think some people do it deliberately for fundraising or hype. It's a marketing move, like when "big data" or "cloud computing" were the hot terms. Now I see people using a function call and calling it an agent. Experts like you have to clarify this for newcomers, so they understand what an agent truly is.
@xXWillyxWonkaXx 7 months ago
I have a random question regarding massedcompute. How many hours do you typically run the LLM for? An hour? More? Less? And how long do you spread it across? A week (assuming you're testing it or building something with it)? Also, what's the overall cost?
@engineerprompt 7 months ago
I usually test new LLMs on their VMs when I need an Nvidia GPU, so it can be an hour or two when a new LLM is released, or more if I am working on a project. They charge per hour. You can run their VM for days or weeks, and I haven't encountered any issues with it. For pricing, I would suggest checking out massedcompute.com/home/pricing/. If you use my code PromptEngineering, you will get a discount on their VMs. I do get a small commission out of it :)
@MrDenisJoshua 7 months ago
Which type of subscription did you get at massedcompute, please? I also wonder how the hours are calculated. Thanks for the video.
@engineerprompt 7 months ago
I usually use them on an hourly basis, whether I am testing a model or doing a training run. Different GPUs have different rates there; I normally use an A6000 on their platform. I'd recommend checking out their pricing page (massedcompute.com/home/pricing/). If you decide to use them, you can use my code PromptEngineering for reduced pricing on certain VMs. I can connect you to my contact there if you need it for enterprise usage. Happy to help.
@MrDenisJoshua 7 months ago
@@engineerprompt No... it's not for enterprise usage, just for a hobby :-) Thanks for the answer.
@zxc15zxc 7 months ago
Very informative. Just curious, why not use LangGraph to manage this?
@engineerprompt 7 months ago
I think they are building their own framework. You could certainly use LangGraph with something like this. Personally, I am not a big fan of frameworks: they add a lot of abstraction and bloat that isn't needed. You are better off writing custom code for your application if you have the time and skills to do that.
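To illustrate the framework-free approach, here is a hedged sketch of a bare tool-calling loop against an OpenAI-compatible endpoint. The base URL, model name, and get_weather tool are all placeholders for illustration, not anything from the video.

```python
# Sketch: a tool-calling round trip with no framework, just the OpenAI-compatible
# chat API that many local servers (vLLM, Ollama, TGI) also expose.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub tool for demonstration

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
response = client.chat.completions.create(
    model="qwen2-7b-instruct", messages=messages, tools=tools
)
msg = response.choices[0].message
if msg.tool_calls:  # the model chose to call a tool
    call = msg.tool_calls[0]
    result = get_weather(**json.loads(call.function.arguments))
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": result}]
    final = client.chat.completions.create(model="qwen2-7b-instruct", messages=messages)
    print(final.choices[0].message.content)
```

About thirty lines replaces the part of a framework most apps actually use, which is the point being made above.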
@jeffdavis5196 7 months ago
I've found other models performing much better in agentic workflows across many use cases. The results from Qwen2 are often nonsense, so they need to be filtered out, but the 128k context length is nice.
@MeinDeutschkurs 7 months ago
On Ollama we have Q4. How do I pull another quantized model?
@mihaitanita 7 months ago
In the dropdown selector on the left there's a link that says "97 tags". Click on it and you get all the available flavors, q2 to q6.
@engineerprompt 7 months ago
If you scroll down the model card on the Ollama website, they generally list the pull commands for the other quantization levels as well.
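For example, a sketch using the ollama Python client. The exact tag name below is an assumption; check the model's tags page on ollama.com for what is actually available.

```python
# Sketch: pulling a specific quantization tag instead of the default q4.
# CLI equivalent: ollama pull qwen2:7b-instruct-q8_0
import ollama

ollama.pull("qwen2:7b-instruct-q8_0")  # assumed 8-bit tag, for illustration

response = ollama.chat(
    model="qwen2:7b-instruct-q8_0",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response["message"]["content"])
```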
@MeinDeutschkurs 7 months ago
@@engineerprompt I'll have to check this again.
@MeinDeutschkurs 7 months ago
@@engineerprompt I found it under the tags section. Thanks!
@snuwan 7 months ago
How can you run a small model on a phone?
@timhopkins23 7 months ago
There are apps like Ai on Device or Private LLM that allow you to download and run small models.
@yngeneer 7 months ago
So even an IQ2_XXS quant of a 70B model is still far better than an fp16 7B?
@jackgaleras 7 months ago
Thanks
@engineerprompt 7 months ago
:)
@Automan-AI 6 months ago
Now Groq has built a Llama 3 variant with function calling built in.
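As a rough sketch of what that looks like with the Groq Python SDK. The model name and tool schema here are assumptions; check Groq's docs for the currently available tool-use models.

```python
# Sketch: asking Groq's Llama 3 tool-use model to call a function.
import json
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two numbers",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"],
        },
    },
}]

resp = client.chat.completions.create(
    model="llama3-groq-70b-8192-tool-use-preview",  # assumed model name
    messages=[{"role": "user", "content": "What is 2 + 3?"}],
    tools=tools,
)
msg = resp.choices[0].message
if msg.tool_calls:  # the model may also answer directly
    call = msg.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
```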