CodeAct: Code As Action Space of LLM Agents - Pros and Cons


John Tan Chong Min


Using code as the action space for LLM agents improves success rate by up to 20% compared with the JSON or text actions used in modular function calling approaches!
Code allows intermediate state to be cached, and draws on training data for native Pythonic structures (e.g. for loops, min, max) that can solve the problem more easily.
Code also allows multiple actions to be carried out together in one step, which can work around potential planning issues of LLMs.
Code also allows native error correction by iterating on the error feedback the environment gives.
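To make the points above concrete, here is a minimal sketch of a code action loop. It is not the CodeAct or TaskGen implementation: `get_price`, `PRICES`, and `run_code_action` are hypothetical names made up for illustration, and the "LLM output" is hard-coded as strings. It shows one code action composing several tool calls with a comprehension and `min`, state persisting between actions, and an error being turned into feedback text.

```python
import io
import traceback
from contextlib import redirect_stdout

# Hypothetical tool data and tool function (assumptions, not from the paper or TaskGen).
PRICES = {"apple": 1.20, "banana": 0.50, "cherry": 3.00}

def get_price(item: str) -> float:
    """Look up the price of one item; raises KeyError for unknown items."""
    return PRICES[item]

def run_code_action(code: str, state: dict) -> str:
    """Execute one code action and return the observation shown to the LLM.

    `state` persists across calls, so variables created by earlier actions
    remain available to later ones (intermediate state caching).
    """
    buffer = io.StringIO()
    try:
        with redirect_stdout(buffer):
            exec(code, state)
    except Exception:
        # The last traceback line is the error feedback the model can iterate on.
        return "Error: " + traceback.format_exc().strip().splitlines()[-1]
    return buffer.getvalue().strip()

# Shared namespace: the tools are exposed to the generated code as plain functions.
state = {"get_price": get_price}

# One code action stands in for three separate JSON tool calls plus a comparison step.
action_1 = """
prices = {item: get_price(item) for item in ["apple", "banana", "cherry"]}
cheapest = min(prices, key=prices.get)
print(cheapest)
"""
obs_1 = run_code_action(action_1, state)

# A later action reuses `prices` from the cached state, but asks for an unknown item,
# so the observation carries the error message instead of a result.
action_2 = "print(prices['durian'])"
obs_2 = run_code_action(action_2, state)

print(obs_1)
print(obs_2)
```

In a real agent the observation string would be appended to the conversation so the model can emit a corrected action. Running model-generated code through `exec` like this is unsafe outside a sandbox; the sketch skips isolation to stay short.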
My take: using code is a good way to bypass the currently inadequate planning abilities of LLM agents, but it is not a long-term solution. We need to figure out how to plan better, so that LLM agents can be much better and more responsive. Code is only good for tasks within the training set, where a variant of a known code structure can easily be adapted to the new task.
My slides: github.com/tanchongmin/Tensor...
Github: github.com/xingyaoww/code-act
Paper: arxiv.org/abs/2402.01030
~~~
Related links:
TaskGen Repo: github.com/simbianai/taskgen
LLMs as a System of Multiple Expert Agents to solve the ARC Challenge: • LLMs as a System of Mu...
~~~
0:00 Introduction and TaskGen implementation
6:23 Main highlight of paper - Code Action space has higher success rate compared to Text/JSON
9:31 Other examples of Code as Actions
12:05 Recap: Reasoning and Acting (ReAct) Framework
17:00 CodeAct: ReAct with Code as Action
27:17 CodeAct Agent’s 4 steps
30:51 CodeAct Prompt
35:20 CodeAct Expert Feedback prompt
37:12 Three kinds of Action Formats
47:15 Text/JSON vs Code (Part 1)
55:30 Text/JSON vs Code (Part 2)
1:02:30 Why is code better?
1:10:48 Recap: ARC Challenge
1:14:19 CodeAct even works better for some models over Text/JSON for atomic API calls
1:17:44 Can CodeAct self-learn?
1:24:36 Discussion
1:34:27 Additional Slides
~~~
AI and ML enthusiast. Likes to think about the essences behind breakthroughs of AI and explain it in a simple and relatable way. Also, I am an avid game creator.
Discord: / discord
LinkedIn: / chong-min-tan-94652288
Online AI blog: delvingintotech.wordpress.com/
Twitter: / johntanchongmin
Try out my games here: simmer.io/@chongmin

Comments: 4
@johntanchongmin 15 days ago
Update: this is live in TaskGen 4.2.1. Refer to Tutorial 6 for an example of how to do Code Actions. My take: the generated code can be quite flimsy and harder to error-correct than modular function calling. I think this method of using code is not very robust XD.
@johntanchongmin 1 month ago
Github: github.com/xingyaoww/code-act Paper: arxiv.org/abs/2402.01030
@johntanchongmin 1 month ago
My slides: github.com/tanchongmin/TensorFlow-Implementations/blob/main/Paper_Reviews/Code%20as%20Actions.pdf
@johntanchongmin 1 month ago
TaskGen repo: github.com/simbianai/taskgen