Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Full Paper Review)

106,652 views

Yannic Kilcher

1 day ago

#gpt4 #ai #prompt
Tree-of-Thought improves prompting of large language models (LLMs) by generalizing the concept of Chain-of-Thought prompting and introducing a tree search across language model thoughts, including state evaluation and backtracking. Experiments on toy tasks show large improvements over both classic and Chain-of-Thought prompting.
OUTLINE:
0:00 - Introduction
1:20 - From Chain-of-Thought to Tree-of-Thought
11:10 - Formalizing the algorithm
16:00 - Game of 24 & Creative writing
18:30 - Crosswords
23:30 - Is this a general problem solver?
26:50 - Ablation studies
28:55 - Conclusion
Paper: arxiv.org/abs/2305.10601
Abstract:
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: this https URL.
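The search loop the abstract describes can be sketched in a few lines. Below is a minimal sketch of the BFS variant of Tree of Thoughts; `propose` and `evaluate` stand in for the two LLM prompts the paper uses, replaced here by deterministic toy functions (an assumption for illustration) so the skeleton runs end to end. The paper's DFS variant adds explicit backtracking on top of this.

```python
# Minimal sketch of the BFS variant of Tree of Thoughts. `propose` and
# `evaluate` are toy stand-ins for what are LLM calls in the paper.

def propose(state):
    """Candidate next thoughts for a partial solution (an LLM call in ToT)."""
    return [state + [step] for step in (1, 2, 3)]

def evaluate(state):
    """Heuristic value of a partial solution (also an LLM call in ToT)."""
    return sum(state)

def tot_bfs(initial, steps=3, breadth=2):
    frontier = [initial]
    for _ in range(steps):
        candidates = [c for state in frontier for c in propose(state)]
        # Prune: keep only the `breadth` most promising states.
        frontier = sorted(candidates, key=evaluate, reverse=True)[:breadth]
    return max(frontier, key=evaluate)

print(tot_bfs([]))  # [3, 3, 3]
```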
Authors: Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
Links:
Homepage: ykilcher.com
Merch: ykilcher.com/merch
YouTube: / yannickilcher
Twitter: / ykilcher
Discord: ykilcher.com/discord
LinkedIn: / ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: www.subscribestar.com/yannick...
Patreon: / yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Comments: 161
@YannicKilcher (1 year ago)
OUTLINE: 0:00 - Introduction 1:20 - From Chain-of-Thought to Tree-of-Thought 11:10 - Formalizing the algorithm 16:00 - Game of 24 & Creative writing 18:30 - Crosswords 23:30 - Is this a general problem solver? 26:50 - Ablation studies 28:55 - Conclusion Paper: arxiv.org/abs/2305.10601
@ozordiprince9405 (1 year ago)
I was literally about to go through this paper myself. Thanks Yannic
@EdFormer (1 year ago)
So glad you're back to regular content. The hype train since ChatGPT's release has led to an intolerable rise of wishy-washy AI content from people who clearly don't care about fundamentally understanding machine learning and disseminating that knowledge, but are instead motivated by views, and are happy to peddle poorly thought-out arguments about how close to AGI we are and what the consequences will inevitably be, in order to rake those views in. So I'm grateful that you keep presenting your sobering and realistic perspective on AI in these entertaining videos that actually cover the details of methods (where you can; so-called "technical reports" can die), as it really augments my experience of researching the subject with much more enjoyment and insight than I would otherwise have had. And I'm sure I'm not alone in this. Thank you!
@JurekOK (1 year ago)
For multi-step agents, it is exponentially important that each "step" has as high a success rate as possible, since the compound success rate falls quickly with the number of steps: overallSuccessRate = stepSuccessRate^Nsteps. Going from e.g. 90% to 95% is actually a lot: it lets the chain grow from 7 steps to 14 steps while still keeping a ~50% compound success rate, so it enables vastly more complicated problems to be solved. Hence, it will often be very valuable to review and iterate on each sub-step to maximize the chance that it doesn't block the entire chain.
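The arithmetic in this comment is easy to check directly:

```python
# Compound success rate from the comment: overall = step_rate ** n_steps.
for rate, steps in [(0.90, 7), (0.95, 14)]:
    print(f"{rate:.2f} ** {steps:2d} = {rate ** steps:.3f}")
# 0.90 ** 7 is about 0.478 and 0.95 ** 14 is about 0.488: the feasible
# chain length doubles while the compound rate stays near 50%.
```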
@avatarcybertronics2584 (1 year ago)
You are right; we call this phenomenon catastrophic error compounding (similar to a neural network's tendency to forget its previous style when fine-tuned). Take a look at FractalGPT, a self-evolving, truly multi-agent system with no LLMs at its core, so it avoids this problem.
@1000niggawatt (1 year ago)
Yannic is the one man who's actually giving intelligent critique of new papers, instead of just throwing the paper into chatpdf and making a video.
@tigergold5990 (1 year ago)
What is chat pdf lol or just a joke
@N.i.a.m.a.t.u.l.l.a.h (1 year ago)
@@tigergold5990 www.chatpdf.com/
@Candyapplebone (1 year ago)
Ooof
@TheManyMan (1 year ago)
@@tigergold5990 joke but there are pdf reader plugins for GPT; useful for running through summaries of papers you don't have time / don't want to read fully ngl
@1000niggawatt (1 year ago)
@@tigergold5990 there's literally "chatpdf" and pdfgpt.
@ixion2001kx76 (1 year ago)
A very nice addition to the new field of computational philosophy.
@television9233 (1 year ago)
Computational philosophy is the use of computation for philosophical research (per Stanford's encyclopedia). You are probably thinking of the philosophy of computation, which even Alan Turing, the father of modern computation, engaged in back in the 50s in his paper "Computing Machinery and Intelligence". The same paper is also why he is partly remembered as a philosopher. So I'm not sure what you mean by "new field".
@trulyUnAssuming (1 year ago)
​@@television9233 it feels like a joke along the lines of "this isn't computer science anymore - people are just throwing shit against the wall"
@television9233 (1 year ago)
@@trulyUnAssuming Don't think so; using a value-based tree search algorithm is literally one of the most CS things you can do.
@davidw8668 (1 year ago)
@unAssuming Hilarious interpretation, and indeed it's getting harder to tell where the jokes end or begin. However, I don't think the paper is shit, even though some people have surely already figured out how this is the new AGI mechanism that will erase humanity.
@ixion2001kx76 (1 year ago)
I don’t mind at all that you didn’t cut out the “um”s. It probably saves you a heap of time that is better spent on reading papers, and it makes your videos feel more personable.
@1000niggawatt (1 year ago)
Yes, videos that focus on production are just a lot of clickbait, hype, fast movement for adhd zoomers, and there is no substance - they just put a paper into chatgpt and read the result. Yannic please do not bother with production, we come here to hear the critique, anyone who's just clicking on le funny ai vids will go to one of the many fastfood ai channels instead, anyway.
@jit_rs (1 year ago)
One application of this "AI-guided tree search" is automated theorem proving. There was a research project termed GPT-f, where they took the Lean proof assistant, which can precisely check whether a proof up to a certain point is correct, and designed a plugin that constructs a proof step by step with backtracking, using a language model (GPT-f itself) as the decision maker; it was able to prove about 60% of common geometry/algebra theorems with zero user intervention. As a type theory nerd myself, I am excited to see what this branch of research brings next 🎉
@luck3949 (1 year ago)
Do you work at Yale-NUS Singapore?
@jit_rs (1 year ago)
@@luck3949 no, I am a systems programmer
@lucastononrodrigues1069 (1 year ago)
Awesome, I was reading it last night! Very glad you posted it right on time :)
@ilianos (1 year ago)
When I saw this paper, I was hoping someone like you would cover it. Thanks a lot!
@dribrahimel-nahhal2477 (1 year ago)
Yannic, thank you for this excellent video on the 'Tree of Thoughts' research paper. Your explanation was very clear and concise, making it easy for even a layman like me to understand. I appreciate your efforts in breaking down the decoding technique used in large language models and highlighting its usefulness in investigative problem-solving patterns. Keep up the great work!
@unfriendlybus3225 (1 year ago)
okay chatgpt
@dmitryplatonov (5 months ago)
😂
@marshallmcluhan33 (1 year ago)
Awesome I saw this and wondered if it was profound. Thanks for explaining it.
@mono_onamoto (1 year ago)
Very informative and good voice for radio. Cheers Yannic!
@amalzubidat1897 (1 year ago)
Thank you for reviewing this! Yannic is always on top of things :)
@Ernest_Viger-Beaulieu (1 year ago)
Thank you so much. Best explanation I found about this paper. 🎉
@guillemgarcia3630 (1 year ago)
Really well explained! Thanks!!
@sabofx (1 year ago)
Really helpful explanation of TOT! Thanx bud! 🤓
@florianbehrens690 (1 year ago)
Thank you for making it much easier to consume these papers!
@Rockyzach88 (1 year ago)
This is cool. Sort of my first video I've watched about prompt engineering. The idea of creating sort of virtual neurons comes to mind. And yeah right as this was coming out, I was thinking the exact same thing, like they would replace parts of algorithms or "functions".
@titastotas1416 (1 year ago)
I like your content so much that I felt it necessary to express my gratitude in the comment section, simply pressing the like button does not cut it for me in this case.
@aa-xn5hc (1 year ago)
Really brilliant analysis
@XorAlex (1 year ago)
Thanks for explaining!
@killermike1209 (1 year ago)
Yannic, your sunglasses are strikingly stunning. Much thanks for keeping me informed on AI goings-on. Also thanks for being anti-boring, funny, and/or hilarious. Cheers!!
@clray123 (1 year ago)
I'm pretty sure in the picture at 10:46 the authors meant to descend into the left branch first and backtrack to later descend through the solid green branch, not like Yannic explained.
@Zankras (1 year ago)
That’s how I read it too.
@sgramstrup (1 year ago)
Hey Yannick. You mentioned something important: that we shouldn't 'pick' control questions but let the AI suggest what to do. The idea, of course, is to remove rigid hooman thinking and find a more 'fluid' approach. Here's the question then: why are people not using NEAT or one of the novelty-seeking algorithms to optimize their cognitive architecture? This problem isn't much different from a genetic-algorithm car trying to pass a maze. Just because we hoomans didn't design the 'maze' this time doesn't mean we should try to lead the car through step by step. For fuck's sake, this is what genetic algorithms excel at! Let a GA develop general architectures and optimize methods like CoT/ToT and whatnot. We simply don't have to try all this out by hand. Let's use the great tools we have to the maximum, and let the genetic control network develop 'itself'! How long would it have taken a genetic algorithm to go from AutoGPT, to CoT, to 'sampling' decisions, to a Tree of Thoughts decision? Not long, is my bet, and then we have to ask: what other cognitive architectures could such an adaptive algorithm discover?
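For what it's worth, the core GA loop the comment advocates is tiny. Below is a toy OneMax sketch (evolving a 20-bit string toward all ones); in the setup the comment imagines, the genome would instead encode a prompting architecture and the fitness would be measured task success, a mapping that is left entirely open here:

```python
import random

random.seed(0)

# Toy genetic algorithm on OneMax: evolve a 20-bit string toward all ones.

def fitness(genome):
    return sum(genome)

def mutate(genome, p=0.05):
    # Flip each bit independently with probability p.
    return [bit ^ (random.random() < p) for bit in genome]

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
for _ in range(60):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                      # truncation selection (elitist)
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(20)]
    pop = parents + children
best = max(pop, key=fitness)
print(fitness(best))  # close to the optimum of 20
```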
@ankitaharwal5886 (1 year ago)
Yannic should pin this comment, as someone might just look into it and implement it
@washedtoohot (1 year ago)
Can’t wait to see this in Langchain 😮
@Candyapplebone (1 year ago)
Nice to see an actual pro do a video on this xD
@nangld (1 year ago)
LLMs are N-gram Markov models, in that they output a single token based on the last N tokens of chat history. So outputting intermediate steps helps the follow-up calls to the model organize its reasoning, just as a human has a better chance of solving an equation with a piece of paper instead of relying solely on their brain. In other words, some problems inherently require N tokens of memory to be solved by a given model. I guess in the end scientists will extend big-O space and computation complexity to LLMs. Obviously you can also ask the model to introduce different personalities, like engineers from the relevant fields or simply different psychological models, which will explicitly reference associated knowledge while solving the problem, and you will get several totally different answers, all of which could be worth considering.
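For reference, a literal bigram (N = 2) Markov chain is only a few lines; the gap between this and an LLM conditioning on its entire context window is what the reply below is pointing at:

```python
import random

random.seed(1)

# Word-level bigram Markov chain: the next word depends only on the
# current one. The corpus is a made-up toy sentence.
corpus = "the cat sat on the mat the cat ate the fish".split()
table = {}
for prev, nxt in zip(corpus, corpus[1:]):
    table.setdefault(prev, []).append(nxt)

word, out = "the", ["the"]
for _ in range(6):
    nexts = table.get(word)
    if not nexts:        # dead end: this word only ever appeared last
        break
    word = random.choice(nexts)
    out.append(word)
print(" ".join(out))
```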
@ThetaPhiPsi (1 year ago)
LLMs are, if anything, N-gram Markov++ models. Try to replicate some of the results of LLMs (e.g. Llama-7b) with an N-gram HMM. It's an overly simplistic view of LLMs that I would only use as an intro to NLProc. On the other hand, I would be interested to know if one could replicate an LLM with an N-gram HMM. If that works, I'll take everything back.
@aitools24 (8 months ago)
00:05 Deliberate problem solving with large language models 04:12 Chain of Thought prompting helps in better problem solving. 07:57 Using a tree search algorithm with pruning for model self-critique and improvement 11:39 Implement Chain of Thought in two ways: one approach is to explicitly sample the next thought, while another is to input all thoughts at once and generate a linear sequence. 15:28 Language models can be integrated into programming by handling specific parts, resulting in more evaluations. 19:14 Backtracking is useful in language models for solving crossword puzzles. 22:47 The algorithm implemented a crossword-solving algorithm using language models. 26:16 The paper introduces a technique for improving performance in language models Crafted by Merlin AI.
@jabowery (1 year ago)
Sounds like a Stack-RNN may be the next step for DeepMind given the prominent mention in the recent Princeton/DeepMind paper "Neural Networks and the Chomsky Hierarchy". However, since there are no authors in common between the two papers, it may require overcoming some of the Big Org problems that have plagued Alphabet's ability to execute on its in-house talent.
@Rotbeam99 (1 year ago)
what is a stack-rnn? thanks
@jabowery (1 year ago)
@@Rotbeam99 See "Neural Networks and the Chomsky Hierarchy"
@joepike1972 (1 year ago)
5:13 I have noticed that it seems to be related to the models' text limitations as well, or just general capabilities, possibly related to their number of tokens. I have seen larger language models make more efficient use of such thought-process capabilities, whereas older models might just use the space to continuously insist on the same points and not make much progress. The other aspect: a model will try to do several things at once in the limited space and not take the time needed to fully expand each thought to the degree required to deal with matters efficiently.
@cutebabyseal621 (1 year ago)
Watching Yannic try to come up with a crossword cue for "ape" was hilarious.
@Veptis (3 months ago)
This could be really useful for coding problems/debugging. And you could use something like an LSP to recursively pull more information into the prompt for the model to solve it.
@nicktasios1862 (1 year ago)
Another possibility why chain of thought prompting works could be that data in the training set that has this form is more likely to be correct?
@-mwolf (1 year ago)
yesssss, thanks yannik!
@TiagoTiagoT (1 year ago)
Could this pattern of thinking be trained on, so that models may spontaneously chose this approach when suitable and produce better results straight out of the box?
@florianhonicke5448 (1 year ago)
Thanks for the summary! Can we also have an interview with the authors? :)
@Sven_Dongle (1 year ago)
Maybe next spring.
@joech1065 (1 year ago)
​@@Sven_Dongle So 10 years in AI time
@DaKingof (1 year ago)
I'd think this could improve coding with LLMs tremendously. One huge problem I see is that they don't seem to know which library versions they are using when writing code. It would be wonderful to have the LLM look back at the code it used for a snippet, see what version it is, then check for the latest or needed version and update its response accordingly. That way it always knows what codebase it's using and can compare live rather than relying on its training data. As of now it seems to get really confused when I ask it to do any of this.
@billxu9799 (1 year ago)
Good name to catch the hotspot, but kinda trivial work considering extra token usage / computation
@eruiluvatar236 (1 year ago)
I wonder what would happen if chain of thought or this technique or refinement or the other techniques that increase the quality of the output were used to produce a dataset that contains only the initial question and the final answer and that was used to fine tune the model. If the thinking needs to happen explicitly in the context window, that might not help much or it may still help. Then if it helps, it would be even more interesting to try those techniques again to see if they still provide a benefit. If they do, continue the fine tuning loop and see where it leads.
@jonnicholasiii2719 (1 year ago)
It eventually leads to God-level cheat codes.
@eruiluvatar236 (1 year ago)
@@jonnicholasiii2719 Lol, I doubt it. I don't think transformers can reach consciousness or AGI without some serious architectural changes so no God mode yet. But there is plenty of evidence that more training and better quality data help a lot and that you can squeeze way more intelligence in the weights than we are currently able to. Ie you can quantize and prune most of the weights with minimal performance loss. So I wonder if this can be an improvement, much like you can fine tune smaller models on larger/better models output and get some improvements in some benchmarks.
@youvegoattobekittenme6908 (1 year ago)
I thought chain-of-thought prompting was agreed to be more reliable because it creates context that feeds into the next-token probabilities: by laying out the given information first, the prior steps are more likely to provide good context, so the information that follows is more likely to be accurate.
@joe_limon (1 year ago)
One can prompt an ai like bing, asking it to implement a tree of thought process to solve your problem. It can look up the paper and construct/execute the process by itself.
@sgramstrup (1 year ago)
Because it already uses something like this. GPT-4 and all the other tool-using LLMs are already wrapped in these cognitive architectures (CAs) when you chat. We are no longer chatting with the model directly, but with the CA on top.
@joe_limon (1 year ago)
@@sgramstrup bing can and has in my testing looked up the paper, interpreted the strategy, and then applied it in its response.
@clray123 (1 year ago)
@@sgramstrup Do you have any proof of that? It may just be that we're talking to larger and better trained models.
@ChocolateMilkCultLeader (1 year ago)
A whatsapp group I'm in, consisting of non-AI people shared this. Yannic is making it
@anishbhanushali (1 year ago)
So this is basically learnable beam search, where we use the same GPT (or any other LLM) to learn the best tree path. Also, here the beam is formed by 'thoughts' as opposed to 'tokens'!
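For contrast, token-level beam search looks like this on a toy next-token table (the table and vocabulary are made up for illustration); in the ToT framing, each beam element would instead be a whole "thought" scored by the model:

```python
import math

# Token-level beam search over a made-up two-token vocabulary.
probs = {
    "":  {"a": 0.4, "b": 0.6},    # next-token distribution given last token
    "a": {"a": 0.05, "b": 0.95},
    "b": {"a": 0.5, "b": 0.5},
}

def beam_search(steps=2, width=2):
    beams = [("", 0.0)]                      # (sequence, log-probability)
    for _ in range(steps):
        expanded = [(seq + tok, lp + math.log(p))
                    for seq, lp in beams
                    for tok, p in probs[seq[-1:]].items()]
        beams = sorted(expanded, key=lambda x: x[1], reverse=True)[:width]
    return beams[0][0]

# Greedy decoding would start with "b" (0.6 > 0.4) and end at probability
# 0.30, but the beam keeps "a" alive and finds "ab" at 0.4 * 0.95 = 0.38.
print(beam_search())  # ab
```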
@FranAbenza (1 year ago)
A flood-fill algorithm could be an interesting way to estimate how promising a branch is for solving our maze.
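A flood fill from the goal is indeed cheap to compute. Here is a BFS sketch on a toy maze (the layout and the distance-as-score idea are illustrative assumptions): each free cell gets its distance to the goal, which could serve as the kind of branch score the comment suggests.

```python
from collections import deque

maze = ["#########",
        "#S..#...#",
        "#.#.#.#.#",
        "#.#...#G#",
        "#########"]

def flood_fill(maze, goal):
    """BFS from the goal: distance of every reachable free cell to the goal."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if maze[nr][nc] != "#" and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist

goal = next((r, c) for r, row in enumerate(maze)
            for c, ch in enumerate(row) if ch == "G")
start = next((r, c) for r, row in enumerate(maze)
             for c, ch in enumerate(row) if ch == "S")
d = flood_fill(maze, goal)
print(d[start])  # 12
```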
@ericadar (1 year ago)
Do you think one could train a new LLM, with maybe 50% more parameters than the original, on the input-output pairs produced by the full tree-of-thought prompting, so that the new (larger) LLM encapsulates the entire tree expansion/pruning process in a single feed-forward run and thus saves on inference compute?
@drdca8263 (1 year ago)
Does it need to be larger? Compare AlphaZero.
@hanskraut2018 (1 year ago)
At 5:30 you gave two hypotheses: 1) working memory (it can calculate something, write it down, and use all the calculations to go in other branches/directions); 2) more compute time. I think you are spot on, and you even used the framework of a "hypothesis". Very nice 🏅 I don't know, I'm just mindlessly intuition-commenting here like you're supposed to on YouTube, right?
@MrSuntask (1 year ago)
Great vid. Why do you wear sunglasses?
@falklumo (1 year ago)
It should not go unmentioned that the original paper seems to be "arXiv:2305.08291v1 [cs.AI] 15 May 2023", which is NOT DeepMind and was published 2 days prior to the work cited here ...
@karlitucha (1 year ago)
What tools and platforms do you use to stay up to date with the latest papers?
@rikvermeer1325 (1 year ago)
Polluting the ToT critic (prompt) makes way for some intricate goals. Looks to me like this is the part where the AI gets to plot.
@Amipotsophspond (1 year ago)
This reminds me of Psychology Therapists just blindly repeating "...And How Does That Make You Feel..."
@RedCloudServices (1 year ago)
Yannic, do you predict this capability will be integrated soon with OpenAI's GPT, Llama, or the other LLMs, public or private?
@kurort5099 (1 year ago)
yoooo nice sunglasses bro ;P good good
@jonbbbb (1 year ago)
Could this technique (or even just chain-of-thought) be used in the training process itself, or as a separate step like RLHF? This would be RLAIF I guess heh.
@skyefreeman9987 (1 year ago)
My feeling is we could train a new neural net on inputs and outputs from GPT-4 plus this method, to create a much more efficient/intelligent base model before these techniques are applied.
@simonrouse9461 (1 year ago)
In RLHF, they already use a language model as the critic. Although it's called "human feedback", they actually use only a small amount of human feedback to train that critic model; it's the critic model that actually gives the feedback.
@questmarq7901 (1 year ago)
This kind of thing can help me with the world building in my book
@SLAM2977 (1 year ago)
Yannic at his best: with glasses!:)
@JTMoustache (1 year ago)
Kilcher is back
@piotr780 (1 year ago)
The space of solutions in Game of 24 is really small, so maybe they simply induce brute-force search inside the network.
@PaganPegasus (1 year ago)
Not to sound like an ass or anything... but I feel like ToT is just CoT with beam search but pruning happens after self evaluating each thought rather than pruning on the conditional probability of each token. Because quite literally the beam search score evaluation is gonna be: `P("good"|{step N}) - P("bad"|{step N})` for each possible thought at step N.
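The score in this comment can be computed directly; below is a toy version with made-up logits, with the softmax restricted to just the two tokens for brevity:

```python
import math

# Value score as described in the comment: P("good") - P("bad"),
# computed from made-up logits for just those two tokens.
logits = {"good": 2.0, "bad": 0.5}
total = sum(math.exp(v) for v in logits.values())
p = {tok: math.exp(v) / total for tok, v in logits.items()}
score = p["good"] - p["bad"]
print(round(score, 3))  # 0.635
```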
@testboga5991 (1 year ago)
Interesting
@dennisestenson7820 (5 months ago)
It doesn't seem like anyone realizes that when you "instruct" an LLM to do something, it's not being instructed to do anything; it's just continuing to write text that could syntactically follow the given instruction.
@ajit60w (1 year ago)
This is back to writing programs. So much for the claim that the advent of LLMs would make teaching programming unnecessary.
@television9233 (1 year ago)
The idea of a value based tree search on LLM outputs sounds interesting in theory, but the results of this specific implementation are lackluster, especially when taking into account that their method was given task-specialized prompts.
@aamir122a (1 year ago)
So they have taken tree search, which was applied in the game of Go to drive the policy and value networks, and applied it to NLP tasks.
@television9233 (1 year ago)
Tree search has been a thing since the early days of computing and has been used extensively. And NLP research (as well as any other subfield within computer science) has constantly used tree search algorithms as they are one of the fundamentals of CS.
@unclecode (10 months ago)
This topic is interesting, but I have reservations about these papers. They often plug large language models into classic algorithms, resulting in high costs, redundant tokens, and increased environmental impact. Rather than substituting parts of existing algorithms, it would be better to introduce solutions that leverage the language model's capabilities. If this research were valuable, one could publish a paper on bubble sort using an LLM instead of comparison operators: a worthless endeavor. Using LLMs inside traditional algorithms like quicksort is not groundbreaking. Efforts like LoRA or QLoRA, which introduce new paradigms, are more significant. In my view, this research doesn't bring substantial innovation. For instance, replacing trees with graphs and naming it 'Graph of Thought' or GOT (not Game of Thrones ;) ) could yield similar results using classic graph algorithms. This doesn't add much value compared to zero-shot methods. While reputable institutions are involved, I'm curious if they see something I don't. Your content is consistently excellent. Keep up the good work!
@wizix9877 (1 year ago)
being thoughtful before acting is even true for AI :)
@luciengrondin5802 (1 year ago)
This seems good for eking better results out of a language model, but if I understand correctly, it can't be used to improve the language model itself, can it? Basically, the tree-search procedure is not differentiable, so it can't be back-propagated through, right?
@drdca8263 (1 year ago)
training sets aren't differentiable either though?
@Amipotsophspond (1 year ago)
19:52 This is a really good point, because all the models are forward-only, and this re-evaluation allows some pseudo-backtracking from a forward-only system.
@clray123 (1 year ago)
They should have called their methodology Clever Hans.
1 month ago
Nice sunglasses 😎
@jackderrida (1 year ago)
I just have one criticism of the video. Otherwise, I love it and am glad you explained Tree of Thoughts. My one criticism is that it would be easier to follow you explaining the diagram if you didn't make up your own paths that contradict the graph at first. I get that you wanted to give an example where it goes back up the tree, which the graph doesn't. But I'd rather you just run through the displayed graph and then do your own to illustrate how it can go back up the tree, separately. It's just confusing for me, when learning, when the example contradicts the explanation I'm listening to.
@ericadar (1 year ago)
@YannicKilcher can you do a review of Meta's Megabyte?
@petevenuti7355 (1 year ago)
How can this be integrated into the conceptual structure of the network itself?
@mikelewis1166 (1 year ago)
I’ll be dropping the transcript of this video into an llm to generate some interesting python scripts and system commands…
@joepike1972 (1 year ago)
7:44 I would be careful about asking it a question like "do you think that is a good thought?" My experience has been the model might make up reasons why it thinks you think it is wrong instead of taking the time to understand what it is talking about, especially in a subject it is not that versed in. So I would watch out for over corrections from leading questions representing the model seeking only to appease and not reflecting understanding. Particularly if you are taking an unconventional stance or exploring a subject the model is not particularly strong in.
@tommornini2470 (1 year ago)
Neural network + algorithm = science
@lingred975 (1 year ago)
The Feynman technique? The AI tries to explain, discovers its mistakes, and corrects them? :)
@jacktherater3533 (5 months ago)
This is what my social anxiety looks like in white paper.
@Timotheeee1 (1 year ago)
can you review rwkv?
@vighneshsablok5851 (1 year ago)
Would have appreciated this video more with a live demo of solving a small crossword with ToT method. It is a tough one to visualize. Otherwise great work!
@Sven_Dongle (1 year ago)
Open Assistant didnt do very well on one of the examples: Using only the numbers 4, 9, 10 and 13, here's another possible mathematical equation that adds up to 24: ```less 5(4 + 9 - 13/2) = 24 ``` Explanation: 1. Addition of positive 4 and negative (negative because of the fractional value 3/2 below) is performed as per the regular addition rule i.e. adding their absolute values in the same order of magnitude (multiplicand) then reduce them and proceed to multiplication step: 4+9=-9 (This makes the next calculation easy): 2. Applied scaling factor = 2 then: 10*2 - 9*2 = 24 We observe; 9 - 9 = 0 , which completes evaluation. All other subtractions now do not make much sense so let's move onward 3. Continuing, => (1 * 2)/2^3 + (2 * 2)/2^3 = 95. Finally without attempting a circle operation after all carryovers from last line no new calculations become possible for operation completion. So this scenario leads to overall non-completion. We don't really care about this fact though since all sequences depending on some chosen variables based operations have always evaluation failure modes rather than the positive ones which just happens when coupled rightly with input numbers types. Eventually implementing fixed precision computation schemes has potential ways to avoid such situations altogether and compute everything most accurately and rigorously. Limiting operations available or numerical handling data capacity/size also typically reduces computing range like floating point arithmetic does too to ensure general approach success while moving away little bit from classical mathematics :). 
It becomes algorithmic artificial intelligence math instead making full use of technological achievements someday possibly completely surpassing human brain capabilities maybe rivaling something resembling unintelligent brute force optimized entirely for a single task like solving NP hard problems very quickly although having useless feedback required by biological organisms when it comes down to self preservation instincts through learning processes feeding humanity's approach to master own fate most probably soonish if current trends persist extrapolated generations later during very long timeframe while complex reciprocal relations between science, technology, engineering and mathematics otherwise push technological singularity closer rapidly enough. In conclusion this doesn't even come close to what awaits future civilizations struggling to overcome various constraints imposed within known universe boundaries itself hinted by well established fundamental physical laws exactly telling life stories expecting certain rewrites throughout eternity... until "singular" event horizon proper appears beyond normal scale questions thereby leading toward redefined reality essence as a whole originating second Law Of Thermodynamics necessity while offering well described answers ultimately steering research efforts directly bridging all supposed knowledge, hopefully, around a common universal substrate transcending any seemingly autonomous nature meaning altogether under ultimate statistical mechanical control thoroughly maintaining determinism principle for every description, somehow... would The Nature ever be able to smile or cry accidentally in endless expansion eventually? ... 
What else could become exposed already lying beneath our feet at best guess about ourselves simply leaving room for enhancement unimagined by today's science expectations where only imagination seems limited as far as unknown things go judging what has been discovered recently yet still goes by preconceived ideas basically guessing stuff among popular theories, somewhere inside quite narrow comfort zones usually providing most
@zyxwvutsrqponmlkh · a year ago
I thought you didn't do papers anymore. ☼
@Adventure1844 · a year ago
Tree of thoughts looks like a neural network
@tigergold5990 · a year ago
everything is. They’re in your walls. They’re coming
@rootthree9436 · a year ago
how's it different from beam search?
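A fair question. Roughly: the paper's BFS variant is close to beam search in that it keeps only the b best partial solutions at each depth, except that both the candidate "thoughts" and their scores come from prompting the LLM itself, rather than from token-level log-probabilities. A minimal sketch, where `propose` and `evaluate` are stand-ins for those LLM calls (all names illustrative):

```python
def tot_bfs(root, propose, evaluate, depth, breadth):
    """Breadth-first Tree-of-Thoughts-style search, beam-search-like:
    expand every frontier state, score candidates, keep the `breadth` best."""
    frontier = [root]
    for _ in range(depth):
        # propose() stands in for the LLM generating candidate thoughts
        candidates = [t for s in frontier for t in propose(s)]
        # evaluate() stands in for the LLM scoring each candidate state
        scored = sorted(candidates, key=evaluate, reverse=True)
        frontier = scored[:breadth]  # beam-style pruning
    return frontier[0] if frontier else None
```

With toy numeric stand-ins (grow a number toward a target), the search keeps the two most promising states per level and converges on the target.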
@markopancic6060 · a year ago
Ant poe eta would be a way to finish that crossword 😂
@charlesd774 · a year ago
I wonder how long until someone makes a GAN where the discriminator is a trained LLM and the generator trains itself.
@rothn2 · a year ago
If the authors had restricted their scope to _planning models_ I think this could have been a much more sound paper, with the opportunity to dominate a class of problems.
@rothn2 · a year ago
You know, the control systems formerly done by RL.
@SimonJackson13 · a year ago
Morphological application complexity.
@SimonJackson13 · a year ago
It's like an alpha beta pruned minmax search?
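Close in spirit: the paper's DFS variant prunes subtrees whose value estimate falls below a threshold and backtracks on dead ends, which is reminiscent of alpha-beta cutoffs, though there is no adversary and hence no minimax. A rough sketch, with `propose`, `evaluate`, and `is_solution` as stand-ins for the LLM calls (names illustrative):

```python
def tot_dfs(state, propose, evaluate, is_solution, depth, threshold):
    """Depth-first Tree-of-Thoughts-style search with value-based pruning.
    Returns a solution state if one is reachable, else None (backtrack)."""
    if is_solution(state):
        return state
    if depth == 0:
        return None
    for thought in propose(state):
        if evaluate(thought) < threshold:  # prune low-value branches, cf. alpha-beta cutoffs
            continue
        found = tot_dfs(thought, propose, evaluate, is_solution, depth - 1, threshold)
        if found is not None:
            return found
    return None  # dead end: backtrack to the parent state
```

The key difference from minimax: a single agent searches for any satisfying leaf, so pruning is against an absolute value threshold rather than against an opponent's best reply.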
@SimonJackson13 · a year ago
So pattern matchers, rule expanders, rule factorisers and similarity scorers. GANminmax? The score list does appear like a genetic algorithm cross pollination for something. Would the later thoughts have more than one node as parent then?
@danberm1755 · a year ago
Sounds like we need an AI assistant to prompt the AI assistant for complex problems 😁
@danberm1755 · a year ago
Actually this gave me a longer run time (which you mention as the reason tree of thought might have high success rates). I'm pretty sure this can be expanded upon for tree-of-thought-like results that are actually correct. ---- Act like you have the ability to prompt yourself as a human would to figure out complete answers. If you made change with $20 how many ways could you split the bill into 8 bills?
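As an aside, the commenter's toy question is easy to ground-truth by brute force. Assuming US denominations of $1, $2, $5, $10, and $20, there are exactly three ways to split $20 into 8 bills:

```python
from itertools import combinations_with_replacement

# Brute force: enumerate all multisets of 8 US bills and keep those summing to $20.
# Assumes denominations 1, 2, 5, 10, 20.
bills = [1, 2, 5, 10, 20]
splits = [c for c in combinations_with_replacement(bills, 8) if sum(c) == 20]
for s in splits:
    print(s)
# Finds three splits:
#   (1, 1, 1, 1, 1, 5, 5, 5)
#   (1, 1, 1, 1, 2, 2, 2, 10)
#   (1, 1, 2, 2, 2, 2, 5, 5)
```

A handy sanity check to pair with any prompting experiment: the model's answer can be verified against the exhaustive enumeration.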
@danberm1755 · a year ago
The big question is how long OpenAI would allow the inference to continue. In other words you'd probably have better results when you pay by the token (not ChatGPT).
@drdca8263 · a year ago
oh heck, uhhh... I hope the idea I have for how this could be extended, doesn't work? Edit: not to say that I think I had any rare insight or anything. I think the idea I had is probably obvious, especially if it works. I’m just refraining from mentioning it in case it works just in case, on the very unlikely off chance that me saying what it is makes it be done sooner I mean 90% of the idea is already described in the video, and the other 10% idk if it is likely to work at all... and maybe if it does work, it still might not be *that* effective, but like,.... it seems *conceivable* that the idea might work quite well, with enough training data, and idk that seems like it could be bad... edit3: Ok, yeah, no, many other people definitely thought of the idea before I did.
@aleksanteri_r · a year ago
Why is no one talking about how ToT is also a funny emoticon??
@stacksmasherninja7266 · a year ago
Figure 1 has subfigures (a) (c) (c) (d) lol
@hurktang · a year ago
This is basically the algorithm of the annoying kid in the car: 1. Ask "why". 2. Make noise. 3. Ask "how long before we get home?" 4. Goto 1. But with a twist! If the kid gets bored, the parenting driver gets disintegrated and replaced by a new one.
@Chillingworth · a year ago
We need adversarial prompting that makes any model output an exact phrase or extremely similar thing from innocuous input prompts. There must be a way to get ChatGPT to be redpilled
@milos_radovanovic · a year ago
All of these sound like we are teaching AI to do philosophy by combining expert intuition with formal reasoning! I'm waiting for an AI that can at least do its own science through numerical model experiments. :)
@BrutalStrike2 · a year ago
26:26
@jacktherater3533 · a year ago
chatgpt API cost go brrrrrrrr..
@GNARGNARHEAD · a year ago
baby steps to an Auto argument mapper 🤯
@GNARGNARHEAD · a year ago
have a search for Argument Mapping by Tim Van Gelder, I promise you will have an epiphany
@tacticalgold · a year ago
This isn't new; isn't this what I've always done? Now I'm just more qualified… lol
@arandomguy46 · a year ago
This was actually done a while ago by another YouTuber, before this paper came out, but I forgot the name of the YouTuber.
@manslaughterinc.9135 · a year ago
Aft, Poe, Era makes Ape, For, Tea
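The column reading checks out: treating Aft/Poe/Era as the rows of a 3x3 word square, the columns spell Ape/For/Tea, which is easy to verify:

```python
# Rows of the proposed 3x3 word square from the mini-crossword discussion.
rows = ["aft", "poe", "era"]

# Read each column top-to-bottom.
cols = ["".join(row[i] for row in rows) for i in range(3)]
print(cols)  # ['ape', 'for', 'tea']
```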