They're just putting multiple experts on different tasks in each paper to achieve generalization. The next paper's name will be: Mixture of Experts on the Multi-Ticket Hypothesis with Attention with Dropout
@aleph0540 2 months ago
ah ok i see
@minecraftermad 2 months ago
22:00 i mean there's mixture of millions of experts?
@Tunadorable 2 months ago
i’ll have a vid out on that paper soon
@phobosmoon4643 2 months ago
A multi-headed agent is the 'solution' to the halting problem. Or, simply, an agentic kernel. I think that means that once we have the chips and micro-architecture to do so, LLM inference will be a core operation at the hardware level, and there will be a dedicated subsystem for it. Oh, this is talking about syntax/symbolic heads, not execution heads. Kinda the same thing though.