Links to slides and more context: x.com/latentspacepod/status/1871998012467380698
@RafiDude 16 hours ago
Please pin this comment.
@Lucien-lu1vw 17 hours ago
Amazing demo with the pie charts PR. I played a lot with agents and came to the same conclusions and design choices as Graham Neubig.
@TheNitroPython 12 hours ago
These talks are extremely valuable. I would love to be in person.
@NaveenReddy-p5j 17 hours ago
Graham Neubig nailed it! Agent tech’s evolving fast, with AllHands on top. 2025 looks exciting for agents!
@AlexJohnson-g4n 15 hours ago
AllHands leading SWE-Bench Full is impressive! Graham Neubig’s insights were top-notch. 2025 does seem like the year for major agent tech breakthroughs.
@neuronwave 11 hours ago
Interesting to watch this at Christmas and reflect on how much higher o1-pro (49) and o3 (71.7) score than the models in the SWE-bench example. It clearly highlights a challenge for 2025 (and beyond): building things that won't be washed away in the tsunami of new models that handle all the agent workbench internals themselves. This is especially relevant because the cost per hour of software development is so high that o1-pro (and probably o3), while expensive, are much cheaper than a human coder.
@MatthewSanders-l7k 17 hours ago
Graham Neubig's keynote on LLM agents is a game-changer! AllHands leading SWE-Bench Full is impressive. Exciting future for agent tech!
@WinonaNagy 17 hours ago
Neubig’s insights are a gem. With AI agents advancing rapidly, are we ready for a paradigm shift in our digital interactions?