Excellent! We need to go deeper; this could be a game changer for reasoning, since it makes more sense to reason over a graph than over the next token.
@bensimonjoules4402 · 1 month ago
It's interesting to see "attention" on graph structures again. I think that in the future a more structured knowledge representation may play a role in improving reasoning, as we could leverage logic and rule engines on top of it, much like compilers aid code generation.
@bharanij6130 · 1 month ago
Thank you for this video, Letitia! Amazing as always :=) Side note: loved the silent goodbye :)))
@jonclement · 1 month ago
Interesting. It's almost like two types of tokens: nodes and edges, which can each be compressed to a feature vector. But yes, for positional encoding you're left with something like "random walk with restart" or a traversal depth. Or one could sum node_vector + edge_vector as an approximation of positional distance. Either way, more graph solutions coming in the future.
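To make the "traversal depth as position" idea concrete, here is a minimal sketch (my own illustration, not the paper's implementation): node tokens get relative positions from their hop distance in the graph rather than from a 1-D token index. The toy graph and helper names are assumptions for the example.

```python
# Sketch: graph-relative positions from traversal depth (BFS hop count).
from collections import deque

def hop_distances(adjacency: dict, start: str) -> dict:
    """BFS traversal depth from `start`; used as a graph-relative position."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Toy graph: node tokens connected by (implicit) edge tokens.
graph = {
    "black poodle": ["dog"],
    "dog": ["black poodle", "animal"],
    "animal": ["dog"],
}

# Relative position of every node pair = hop distance, not a linear token index.
relative_position = {
    (u, v): d for u in graph for v, d in hop_distances(graph, u).items()
}
print(relative_position[("black poodle", "animal")])  # 2 hops
```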
@vladimirtchuiev2218 · 1 month ago
I'm even more interested in the generative side: generating large graphs with text contained in them from a prompt. That could be useful for modalities that are represented by large graphs. I've yet to see anyone doing this. While you can prompt LLMs to generate small graphs, for larger graphs you see significant performance drops.
@sonOfLiberty100 · 1 month ago
It would be interesting to know how much computation this needs.
@AICoffeeBreak · 1 month ago
Do you mean for training or inference? Training is a fine-tuning setting, and you can see performance curves in Figure 4 of the paper: arxiv.org/pdf/2401.07105. Inference costs as much as the base LLM.
@sonOfLiberty100 · 1 month ago
@AICoffeeBreak Both, the overall computation. Thank you, I will take a look.
@MoritzPlenz · 1 month ago
Hi, I am Moritz (one of the authors). I don't have much to add to Letitia's reply, but here is another relevant part of the paper, taken from Section 4: "Being transformers, GLMs have the same computational complexity as their respective LM. For sparse graphs the lGLM could make use of sparse matrix multiplication, making it more efficient than a corresponding LM or gGLM. However, for our experiments this was not necessary."
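As a rough sketch of why sparsity helps (my own illustration under assumed names, not the authors' code): if local attention is restricted to graph edges, most of the attention matrix is zero and could be computed with sparse matrix multiplication, while a dense (global) variant matches the base LM's cost.

```python
# Sketch: edge-masked ("local") vs. dense ("global") attention over graph tokens.
import numpy as np

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention; positions where mask is False are blocked."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)  # block non-edges
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

n_tokens, d_model = 5, 8
rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(n_tokens, d_model))

adjacency = np.eye(n_tokens, dtype=bool)  # every token attends to itself
adjacency[0, 1] = adjacency[1, 0] = True  # plus its graph neighbours
adjacency[1, 2] = adjacency[2, 1] = True

local_out = masked_attention(q, k, v, adjacency)                 # sparse, edge-restricted
global_out = masked_attention(q, k, v, np.ones_like(adjacency))  # dense, same cost as the base LM
```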
@yorailevi6747 · 1 month ago
I need to read this more deeply; I don't understand why just grafting the parameters willy-nilly works.
@keeperofthelight9681 · 1 month ago
Deep learning is more alchemy than anything else; an in-depth, thought-out plan may not work, and sometimes a hacky workaround works a lot better.