Excellent! We need to go deeper; this could be a game changer for reasoning, since it makes more sense to reason over a graph than over the next token.
@bensimonjoules4402 · 1 month ago
It's interesting to see "attention" on graph structures again. I think that in the future a more structured knowledge representation may play a role in improving reasoning, as we could leverage logic and rule engines on top of it, much like compilers aid code generation.
@bharanij6130 · 1 month ago
Thank you for this video, Letitia! Amazing as always :=) Side note: loved the silent goodbye :)))
@jonclement · 1 month ago
Interesting. It's almost like two types of tokens: nodes and edges, which can each be compressed to a feature vector. But yes, for positional encoding you're left with something like "random walk with restart" or a traversal depth. Or one could sum node_vector + edge_vector as an approximation of positional distance. Either way, more graph solutions coming in the future.
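To make the "traversal depth as position" idea concrete, here is a minimal sketch (my own illustration, not the paper's implementation): node tokens get relative positions from their hop distance in the graph rather than from a 1-D token index. The toy graph and helper names are assumptions for the example.

```python
# Sketch: graph-relative positions from traversal depth (BFS hop count).
from collections import deque

def hop_distances(adjacency: dict, start: str) -> dict:
    """BFS traversal depth from `start`; used as a graph-relative position."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Toy graph: node tokens connected by (implicit) edge tokens.
graph = {
    "black poodle": ["dog"],
    "dog": ["black poodle", "animal"],
    "animal": ["dog"],
}

# Relative position of every node pair = hop distance, not a linear token index.
relative_position = {
    (u, v): d for u in graph for v, d in hop_distances(graph, u).items()
}
print(relative_position[("black poodle", "animal")])  # 2 hops
```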
@vladimirtchuiev2218 · 1 month ago
I'm even more interested in the generative side: generating large graphs with text contained in them from a prompt. That could be useful for modalities that are represented by large graphs. I've yet to see anyone doing this. While you can prompt LLMs to generate small graphs, for larger graphs you see significant performance drops.
@sonOfLiberty100 · 1 month ago
It would be interesting to know how much computation this needs.
@AICoffeeBreak · 1 month ago
Do you mean for training or inference? Training is a fine-tuning setting, and you can see performance curves in Figure 4 of the paper: arxiv.org/pdf/2401.07105. Inference costs as much as the base LLM.
@sonOfLiberty100 · 1 month ago
@AICoffeeBreak Both, the overall computation. Thank you, I will take a look.
@MoritzPlenz · 1 month ago
Hi, I am Moritz (one of the authors). I don't have much to add to Letitia's reply, but here is another relevant part of the paper, taken from Section 4: "Being transformers, GLMs have the same computational complexity as their respective LM. For sparse graphs the lGLM could make use of sparse matrix multiplication, making it more efficient than a corresponding LM or gGLM. However, for our experiments this was not necessary."
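As a rough sketch of why sparsity helps (my own illustration under assumed names, not the authors' code): if local attention is restricted to graph edges, most of the attention matrix is zero and could be computed with sparse matrix multiplication, while a dense (global) variant matches the base LM's cost.

```python
# Sketch: edge-masked ("local") vs. dense ("global") attention over graph tokens.
import numpy as np

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention; positions where mask is False are blocked."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)  # block non-edges
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

n_tokens, d_model = 5, 8
rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(n_tokens, d_model))

adjacency = np.eye(n_tokens, dtype=bool)  # every token attends to itself
adjacency[0, 1] = adjacency[1, 0] = True  # plus its graph neighbours
adjacency[1, 2] = adjacency[2, 1] = True

local_out = masked_attention(q, k, v, adjacency)                 # sparse, edge-restricted
global_out = masked_attention(q, k, v, np.ones_like(adjacency))  # dense, same cost as the base LM
```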
@yorailevi6747 · 1 month ago
I need to read this more deeply; I don't understand why just grafting the parameters willy-nilly works.
@keeperofthelight9681 · 1 month ago
Deep learning is more alchemy than anything else; an in-depth, thought-out plan may not work, and sometimes a hacky workaround works a lot better.