Approximately halfway through this series, but I had to take a pause and congratulate you on the amazing content! The visuals and your commentary throughout are exceptional. Thank you for sharing your work! Please keep it up!
@zijiali8349 a year ago
At 10:42, could you explain why you multiplied latent_embedding_size by 2 as the input dim for the decoder layers?
@danieladebi7665 2 years ago
@12:17 Shouldn't we be taking the square root of the std variable to get the actual standard deviation, assuming we're getting the log of the variance then exponentiating that? Or does this not actually matter here? Also great series!
@DeepFindr 2 years ago
Hi! This is actually not required, because log(sigma ** 2) = 2 * log(sigma). Since 2 is just a constant, we can simply use the standard deviation directly. :) Hope this makes sense!
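For illustration, a minimal sketch (hypothetical variable names, not the exact code from the video) showing that the two parameterizations give the same standard deviation, since the constant factor of 2 can simply be absorbed by the network:

```python
import torch

# If the encoder head is read as log-variance, i.e. log(sigma ** 2):
logvar = torch.randn(4, 8)                  # dummy encoder output
std_from_logvar = torch.exp(0.5 * logvar)   # sigma = exp(0.5 * log(sigma**2))

# If the same numbers are instead read as log-std, i.e. log(sigma):
logstd = 0.5 * logvar                       # log(sigma ** 2) = 2 * log(sigma)
std_from_logstd = torch.exp(logstd)

assert torch.allclose(std_from_logvar, std_from_logstd)

# The reparameterization trick uses sigma either way:
mu = torch.zeros(4, 8)
z = mu + std_from_logstd * torch.randn_like(mu)
```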
@danieladebi7665 2 years ago
@@DeepFindr Ok this makes sense, thank you!
@Ee-ki7cn a year ago
If I use the InnerProduct as the decoder, must the shape of the adjacency matrix be identical for every input graph? I.e., I have a training set of graphs with a varying number of nodes. Can I just take the graphs as they are and use the InnerProduct?
@Ee-ki7cn a year ago
And how can batching be handled alternatively in the training function, without using BatchNorm in the model? Thank you
@DhananjaySarkar-y3i 2 months ago
One of the best videos I have ever seen
@kianpu6242 3 years ago
@16:32 What is the reason for having three repeated decoder layers? I noticed you didn't pass x to the second dense layer after the first dense layer. Instead, you passed in inputs.
@DeepFindr 3 years ago
Hi! Yes, this error is already fixed in the code, but it had no impact on the result. There is no specific reason for using three layers; it's just to give the model more depth and parameters. Cheers
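As an aside, a minimal sketch (with hypothetical layer names, not the repository's exact code) of the corrected chaining, where each dense layer receives the previous layer's output rather than the original inputs:

```python
import torch
import torch.nn as nn

class DecoderHead(nn.Module):
    """Small MLP decoder head: three stacked dense layers."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.dense1 = nn.Linear(in_dim, hidden_dim)
        self.dense2 = nn.Linear(hidden_dim, hidden_dim)
        self.dense3 = nn.Linear(hidden_dim, out_dim)

    def forward(self, inputs):
        x = torch.relu(self.dense1(inputs))
        x = torch.relu(self.dense2(x))   # pass x here, not inputs
        return self.dense3(x)            # plain logits, no final activation
```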
@nicolasf1219 5 months ago
Would this also work on large graphs?
@변진영-v7j 2 years ago
The best video I've ever seen!
@artistworking7755 2 years ago
Hi! Thanks for sharing. May I ask why there are no activation layers in the decoder functions (i.e. self.edge_decode(z))? I have also seen that in the original GVAE the decoding is done by sigmoid(z@z.t()), where z is the latent representation of the graph. I am trying to understand these reconstruction losses and to figure out which one is best for reconstructing a non-binary adjacency matrix.
@DeepFindr 2 years ago
Hi! The final projections have no activation because they should output plain logit values that are then fed into the argmax. It would maybe make sense to also put a softmax in between to sharpen the distribution. The shared layers before apply ReLU and allow for nonlinearity.
@DeepFindr 2 years ago
The other way to decode is called the "dot product decoder" and is just another option. That approach, however, creates similar embeddings for nodes that should be connected. This puts more emphasis on similar neighborhoods, which doesn't necessarily need to hold for connections. I think the projection head adds more flexibility, but I have not empirically evaluated this.
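For reference, a minimal sketch of the dot product decoder mentioned above, as used in the original VGAE formulation (names are illustrative):

```python
import torch

def inner_product_decoder(z: torch.Tensor) -> torch.Tensor:
    """z: node embeddings of shape [num_nodes, dim].
    Returns a [num_nodes, num_nodes] matrix of edge probabilities.
    Nodes with similar embeddings get a high connection probability,
    which is exactly the inductive bias discussed above."""
    return torch.sigmoid(z @ z.t())

# Usage sketch: z = encoder(x, edge_index); adj_pred = inner_product_decoder(z)
```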
@lennarth.3270 2 years ago
Hey, I am currently trying to rebuild the system. Unfortunately, the website you linked on your GitHub to get the dataset looks a bit fishy and redirects me to some other websites. I was wondering if I can get the data somewhere else? I would really appreciate it :)
@DeepFindr 2 years ago
Hi! Was it this MoleculeNet dataset? moleculenet.org/ Thanks for this info!
@DeepFindr 2 years ago
Yep that link is outdated. I updated it, thanks!
@lennarth.3270 2 years ago
@@DeepFindr Thank you : )
@stevenrodrig14 2 years ago
Great video! Just wondering why you chose MLP layers of size 128? Is 128 the total number of possible combinations in the adjacency matrix?
@DeepFindr 2 years ago
Hi! Thanks :) This is just the embedding dimension. The adjacency matrix is predicted by taking all possible combinations of these embeddings per node. That number, however, is dynamic: one graph has 10 nodes, another one has 15. So the size of the adjacency matrix changes from graph to graph.
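A minimal sketch (hypothetical names, assuming concatenated pairs of node embeddings feed an MLP edge head, which would also explain the factor of 2 asked about above) of how a fixed embedding size still yields an adjacency matrix whose size depends on the graph:

```python
import torch
import torch.nn as nn

embedding_dim = 128                      # fixed, independent of graph size
edge_head = nn.Sequential(
    nn.Linear(2 * embedding_dim, 128),   # node pairs are concatenated, hence the factor 2
    nn.ReLU(),
    nn.Linear(128, 1),
)

def predict_adjacency(node_emb: torch.Tensor) -> torch.Tensor:
    """node_emb: [num_nodes, embedding_dim] -> [num_nodes, num_nodes] edge logits."""
    n = node_emb.size(0)                          # 10 for one graph, 15 for another
    src = node_emb.unsqueeze(1).expand(n, n, -1)  # [n, n, d]
    dst = node_emb.unsqueeze(0).expand(n, n, -1)  # [n, n, d]
    pairs = torch.cat([src, dst], dim=-1)         # [n, n, 2d]: all node combinations
    return edge_head(pairs).squeeze(-1)           # adjacency logits; size depends on n
```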
@liongkhaijiet5014 3 years ago
The concepts and reasons are well explained. But I have one question: do we just take the whole matrix instead of the upper triangular matrix if our graph is directed?
@DeepFindr 3 years ago
Yes. Also, you can ignore the diagonal if you don't have self-loops.
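A minimal sketch (illustrative only) of which entries of a reconstructed adjacency matrix would be used as targets in each case:

```python
import torch

adj_logits = torch.randn(5, 5)    # predicted adjacency logits for a 5-node graph

# Undirected graph: the upper triangle (excluding the diagonal) is enough
iu = torch.triu_indices(5, 5, offset=1)
undirected_entries = adj_logits[iu[0], iu[1]]

# Directed graph: use the full matrix, optionally masking the diagonal (no self-loops)
mask = ~torch.eye(5, dtype=torch.bool)
directed_entries = adj_logits[mask]
```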
3 years ago
It's a great series; I learned a lot from you, thanks. Would it be possible to have a GNN tutorial for text classification? :)