I tried telling it to write a 400-line game of Snake. It generated a plan (the beginning was partially a story), then output some simple code, then a plan for more features, then more code, and this repeated 4 or 5 times. The end result before it got timed out was about 100 lines of okay-ish Python code. I know it wasn't made for this, but I just wanted to share.
@paullopez_ai · 4 months ago
Great video, Sam! I appreciate the clarity and consistency of your videos.
@unclecode · 4 months ago
I love how they generated the data; it's similar to how I used to create long content, starting with an outline, then building section by section, with summaries to keep things on track. What's fascinating is their explanation, showing how you can train a model for any output. I'd love to see a comparison between this method and using a model to quickly generate and merge content for each part. Which one gives better quality and integrity? That's an interesting topic to explore.
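A minimal sketch of that outline-then-expand pattern (illustrative only, not the paper's actual AgentWrite pipeline; `chat` stands in for whatever LLM client you use):

```python
from typing import Callable

Chat = Callable[[str], str]  # any function that sends a prompt and returns text

def write_long(chat: Chat, topic: str, n_sections: int = 8) -> str:
    # Step 1: plan an outline up front.
    outline = chat(f"Write a numbered outline with {n_sections} sections "
                   f"for a long article about: {topic}")
    sections, running_summary = [], "nothing yet"
    for i in range(1, n_sections + 1):
        # Step 2: expand one section at a time, feeding back a running
        # summary so each new section stays consistent with what came before.
        section = chat(f"Outline:\n{outline}\n\n"
                       f"Summary of what is written so far: {running_summary}\n\n"
                       f"Write section {i} in full.")
        sections.append(section)
        running_summary = chat(f"Summarize in 3 sentences:\n{section}")
    return "\n\n".join(sections)
```

A fair comparison with the quick generate-and-merge approach could keep the same outline and just drop the running summary from the loop.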
@novantha1 · 4 months ago
I'm not sure if this would be in your wheelhouse, but the paper "Automated Design of Agentic Systems" discusses how to automatically (!) produce agentic workflows for a variety of capabilities (they showed solving MMLU, for example, I believe): an agentic system designs an agentic system to solve your problem. I think that would be quite a fun video and a good follow-up if you were to show an example of it being deployed to generate synthetic data on a topic. I'm pretty sure something like that will be important in the future, and I could even imagine designing an agentic system to "grade" the outputs of the automatically produced agent against "soft targets" that don't have explicit successes or failures, like writing "interesting" articles, or writing "well-written" fiction and so on.
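That grading loop could look something like this toy sketch; to be clear, this is a guess at the shape of it, not the actual ADAS algorithm, and `chat` is a placeholder for any model call:

```python
import json
from typing import Callable

Chat = Callable[[str], str]

def search_agent_designs(chat: Chat, task: str, rounds: int = 5) -> str:
    """Toy meta-agent loop: propose an agent design, run it, grade the
    output on a soft target, and keep the best-scoring design."""
    best_design, best_score, feedback = "", -1.0, "none yet"
    for _ in range(rounds):
        design = chat(f"Propose an agentic workflow, written as a prompt, "
                      f"for this task: {task}\n"
                      f"Feedback on the last attempt: {feedback}")
        output = chat(f"Follow this workflow and produce a result:\n{design}")
        verdict = chat('Grade this output for how interesting and well-written '
                       'it is. Reply as JSON like {"score": 7, "why": "..."}:\n'
                       + output)
        try:
            graded = json.loads(verdict)
        except ValueError:
            continue  # model didn't return clean JSON; skip this round
        score = float(graded.get("score", 0))
        if score > best_score:
            best_score, best_design = score, design
        feedback = graded.get("why", feedback)
    return best_design
```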
@andydataguy · 4 months ago
This sounds epic, please do this 🙌🏾
@samwitteveenai · 3 months ago
This sounds interesting; let me check it out. I did make an agent that can create CrewAI bots pretty easily. I just Googled it and see it cites AI Scientist, which I have also worked on, making a LangGraph version of it. Thanks for the tip, much appreciated. I will give the paper a read.
@Arcticwhir · 4 months ago
Tsinghua University consistently has amazing AI research papers, such a strong research department!
@samwitteveenai · 3 months ago
totally agree, lots of cool stuff coming from there
@MeinDeutschkurs · 4 months ago
You nailed it! 🎉
@reza2kn · 4 months ago
Wonderful video! 😍 I had seen the project but was too lazy to play with it myself. I'm reading up on knowledge distillation, specifically for translation tasks, distilling from the new Flash model into tiny ones to develop tiny MT models that beat SOTA. This idea of long generations might come in handy there as well.
@andydataguy · 4 months ago
This is awesome!
@finlay422 · 4 months ago
Interested to see how this fine-tuning can be applied to summarization. Maybe it can solve the issue of large documents being condensed into just a few sentences.
@bseddonmusic1 · 4 months ago
Interesting, but what's the use case for explicitly generating long documents? Are there people who are going to sit with an LLM and choose to spend hours reading long output? Maybe book publishers wanting to get away from unreliable authors? Maybe authors who have a deadline and need to get some copy to their publisher? I can understand the benefit of long output in the context of coding, where real-world applications are tens of thousands of lines long, but is code output possible from a model like this?
@vanshpundir6648 · 4 months ago
Actually, this is helpful when you want to generate a lot of data in a single LLM call. Otherwise you need to make multiple LLM calls.
@samwitteveenai · 4 months ago
Lots of people ask for long outputs for SEO and plenty of other uses. Personally, I'm interested in long outputs for editing long-form text. Imagine something like Grammarly, but able to completely change elements of the content, not just correct grammar or spelling. Imagine requests like "please change this character from male to female and correct all references", as an obvious example, or "rewrite this doc removing all political bias and make it neutral". Currently these are very hard for anything beyond 1,000 words or so. I totally agree about long-form code, btw; you could train a model to do that in a similar way.
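Until models reliably emit outputs that long, one workaround is chunked editing; a rough sketch follows (purely illustrative, and exactly the kind of stitching a true long-output model would make unnecessary):

```python
from typing import Callable

Chat = Callable[[str], str]

def edit_long_doc(chat: Chat, doc: str, instruction: str,
                  chunk_chars: int = 4000) -> str:
    """Naive chunked editing: apply one instruction (e.g. 'change this
    character from male to female and fix all references') piece by piece,
    carrying a summary of the edits forward for consistency."""
    chunks = [doc[i:i + chunk_chars] for i in range(0, len(doc), chunk_chars)]
    edited, edits_so_far = [], "none yet"
    for chunk in chunks:
        out = chat(f"Instruction: {instruction}\n"
                   f"Edits made in earlier chunks: {edits_so_far}\n"
                   f"Rewrite this passage accordingly; "
                   f"output only the rewrite:\n{chunk}")
        edited.append(out)
        edits_so_far = chat(f"List the key edits made here in 2 sentences:\n{out}")
    return "".join(edited)
```

The weak point is consistency across chunk boundaries, which is why a model trained for genuinely long outputs is the more interesting solution.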
@sammcj2000 · 4 months ago
It can be useful for coding, refactoring large code bases, etc., if your available tooling doesn't handle chunking requests.
@cemtekesin9033 · 4 months ago
I had a project where I had to categorize thousands of risk titles, first grouping them together and then rewriting them so the risk groups were captured consistently (a data-ingestion quality issue). I was surprised to see how much of a limiting factor the output token limit can be.
@rishabnandi9593 · 4 months ago
Company FAQs and CS training docs are often over 10k, and someone somewhere in the ladder has to read them.
@TechnoMageCreator · 4 months ago
In my experience it's all about context. Kinda like reverse dominoes. I've been able to do it with ChatGPT-4o since it has memory. First you add the important information to memory, generate a structure and add some details, generate chapter by chapter, and then have it put everything together. The most I generated was about 20 minutes without user interaction and about 10,000 lines of code, pressing "continue generating" about 20 times. With every generation it starts to run much, much slower. Since I generated like this in manual mode for months, I got to write down my entire process step by step. I'm now trying to emulate it with software and multiple AIs, with file read/update/edit/delete. The idea is to have the user input a text, the AI searches the existing documents, and it always creates an additional task list in order to access the next file.
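That "press continue 20 times" step is easy to automate. A rough sketch, assuming an OpenAI-style chat API where you keep appending the partial output (the [DONE] marker is just a convention you ask the model to follow, not anything built in):

```python
from typing import Callable

Chat = Callable[[list], str]  # takes a chat-message list, returns the reply text

def generate_with_continue(chat: Chat, prompt: str, max_rounds: int = 20) -> str:
    """Rough sketch of automating the 'continue generating' button: keep
    appending the partial output and asking the model to carry on."""
    # Ask up front for an explicit end marker so we know when to stop.
    messages = [{"role": "user",
                 "content": prompt + "\nWrite [DONE] when finished."}]
    full_output = ""
    for _ in range(max_rounds):
        part = chat(messages)
        full_output += part
        if "[DONE]" in part:
            break
        messages.append({"role": "assistant", "content": part})
        messages.append({"role": "user",
                         "content": "Continue exactly where you left off."})
    return full_output.replace("[DONE]", "").rstrip()
```

The slowdown you noticed is expected: each continuation resends the whole transcript, so the context grows with every round.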
@NetZeroEarth · 4 months ago
🔥 🔥 🔥
@micbab-vg2mu · 4 months ago
Nice :)
@anishbhanushali · 4 months ago
Ah... they could have named the model "Wan Shi Tong" (he who knows 10,000 things).
@geronimotutusaus · 4 months ago
Wow! This is an amazing way to create synthetic data! Today I saw an eye-opening video from David Shapiro (kzbin.info/www/bejne/fHu3i4Ntj8mEnJI) based on the research "Textbooks Are All You Need", and it looks like LongWriter could be the solution to mastering specialized LLMs.
@claytoncarroll2309 · 3 months ago
poor video quality
@JNET_Reloaded · 4 months ago
How do I use this with Ollama locally?
@samwitteveenai · 4 months ago
It should be possible to convert it to GGUF for Ollama. You will need quite a bit of RAM, though, to generate the long outputs.
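Roughly, the steps look like this; the script name, model path, and parameters below are approximate and depend on your llama.cpp and Ollama versions (and on llama.cpp supporting the model's architecture):

```
# 1) Convert the HF checkpoint to GGUF with llama.cpp
#    (the script name varies across llama.cpp versions):
python convert_hf_to_gguf.py ./LongWriter-glm4-9b --outfile longwriter.gguf

# 2) Write a minimal Modelfile pointing at the file:
#      FROM ./longwriter.gguf
#      PARAMETER num_ctx 32768

# 3) Build and run it in Ollama:
ollama create longwriter -f Modelfile
ollama run longwriter
```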