Approximately halfway through this series, but I had to take a pause and congratulate you on the amazing content! The visuals and your commentary throughout are exceptional. Thank you for sharing your work! Please keep it up!
@zijiali8349 a year ago
At 10:42, could you explain why you multiplied latent_embedding_size by 2 as the input dim for the decoder layers?
@danieladebi7665 2 years ago
@12:17 Shouldn't we be taking the square root of the std variable to get the actual standard deviation, assuming we're getting the log of the variance then exponentiating that? Or does this not actually matter here? Also great series!
@DeepFindr 2 years ago
Hi! This is actually not required, because log(sigma ** 2) = 2 * log(sigma). Since 2 is just a constant, we can simply use the standard deviation directly. :) Hope this makes sense!
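For illustration, a minimal sketch (hypothetical variable names, not the exact code from the video) showing that the two parameterizations give the same standard deviation, since the constant factor of 2 can simply be absorbed by the network:

```python
import torch

# If the encoder head is read as log-variance, i.e. log(sigma ** 2):
logvar = torch.randn(4, 8)                  # dummy encoder output
std_from_logvar = torch.exp(0.5 * logvar)   # sigma = exp(0.5 * log(sigma**2))

# If the same numbers are instead read as log-std, i.e. log(sigma):
logstd = 0.5 * logvar                       # log(sigma ** 2) = 2 * log(sigma)
std_from_logstd = torch.exp(logstd)

assert torch.allclose(std_from_logvar, std_from_logstd)

# The reparameterization trick uses sigma either way:
mu = torch.zeros(4, 8)
z = mu + std_from_logstd * torch.randn_like(mu)
```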
@danieladebi7665 2 years ago
@@DeepFindr Ok this makes sense, thank you!
@Ee-ki7cn a year ago
If I use the InnerProduct as the decoder, must the shape of the adjacency matrix be identical for every input graph? I.e., I have a training set of graphs with a varying number of nodes. Can I just take the graphs as they are and use the InnerProduct?
@Ee-ki7cn a year ago
And how can batching be handled alternatively in the training function, without using BatchNorm in the model? Thank you
@DhananjaySarkar-y3i 2 months ago
One of the best videos I have ever seen
@kianpu6242 3 years ago
@16:32 What is the reason for having three repeated decoder layers? I noticed you didn't pass x to the second dense layer after the first dense layer. Instead, you passed in inputs.
@DeepFindr 3 years ago
Hi! Yes, this error is already fixed in the code, but it had no impact on the result. There is no specific reason for using three layers; it's just to give the model more depth and parameters. Cheers
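As an aside, a minimal sketch (with hypothetical layer names, not the repository's exact code) of the corrected chaining, where each dense layer receives the previous layer's output rather than the original inputs:

```python
import torch
import torch.nn as nn

class DecoderHead(nn.Module):
    """Small MLP decoder head: three stacked dense layers."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.dense1 = nn.Linear(in_dim, hidden_dim)
        self.dense2 = nn.Linear(hidden_dim, hidden_dim)
        self.dense3 = nn.Linear(hidden_dim, out_dim)

    def forward(self, inputs):
        x = torch.relu(self.dense1(inputs))
        x = torch.relu(self.dense2(x))   # pass x here, not inputs
        return self.dense3(x)            # plain logits, no final activation
```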
@nicolasf1219 5 months ago
Would this also work on large graphs?
@변진영-v7j 2 years ago
The best video I've ever seen!
@artistworking7755 2 years ago
Hi! Thanks for sharing. May I ask why there are no activation layers in the decoder functions (i.e. self.edge_decode(z))? I have also seen that in the original GVAE the decoding is done by sigmoid(z@z.t()), where z is the latent representation of the graph. I am trying to understand these reconstruction losses and to figure out which one is best for reconstructing a non-binary adjacency matrix.
@DeepFindr 2 years ago
Hi! The final projections have no activation because they should output plain logit values that are then fed into the argmax. It would maybe make sense to also put a softmax in between to sharpen the distribution. The shared layers before apply ReLU and allow for nonlinearity.
@DeepFindr 2 years ago
The other way to decode is called the "dot product decoder" and is just another option. That approach, however, creates similar embeddings for nodes that should be connected. This puts more emphasis on similar neighborhoods, which doesn't necessarily need to hold for connections. I think the projection head adds more flexibility, but I have not empirically evaluated this.
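For reference, a minimal sketch of the dot product decoder mentioned above, as used in the original VGAE formulation (names are illustrative):

```python
import torch

def inner_product_decoder(z: torch.Tensor) -> torch.Tensor:
    """z: node embeddings of shape [num_nodes, dim].
    Returns a [num_nodes, num_nodes] matrix of edge probabilities.
    Nodes with similar embeddings get a high connection probability,
    which is exactly the inductive bias discussed above."""
    return torch.sigmoid(z @ z.t())

# Usage sketch: z = encoder(x, edge_index); adj_pred = inner_product_decoder(z)
```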
@lennarth.3270 2 years ago
Hey, I am currently trying to rebuild the system. Unfortunately, the website you linked on your GitHub to get the dataset looks a bit fishy and redirects me to some other websites. I was wondering if I can get the data somewhere else? I would really appreciate it :)
@DeepFindr 2 years ago
Hi! Was it this MoleculeNet dataset? moleculenet.org/ Thanks for this info!
@DeepFindr 2 years ago
Yep that link is outdated. I updated it, thanks!
@lennarth.3270 2 years ago
@@DeepFindr Thank you : )
@stevenrodrig14 2 years ago
Great video! Just wondering why you chose MLP layers of size 128? Is 128 the total number of possible combinations in the adjacency matrix?
@DeepFindr 2 years ago
Hi! Thanks :) This is just the embedding dimension. The adjacency matrix is predicted by taking all possible combinations of these embeddings per node. That number, however, is dynamic: one graph has 10 nodes, another one has 15. So the size of the adjacency matrix changes from graph to graph.
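A minimal sketch (hypothetical names, assuming concatenated pairs of node embeddings feed an MLP edge head, which would also explain the factor of 2 asked about above) of how a fixed embedding size still yields an adjacency matrix whose size depends on the graph:

```python
import torch
import torch.nn as nn

embedding_dim = 128                      # fixed, independent of graph size
edge_head = nn.Sequential(
    nn.Linear(2 * embedding_dim, 128),   # node pairs are concatenated, hence the factor 2
    nn.ReLU(),
    nn.Linear(128, 1),
)

def predict_adjacency(node_emb: torch.Tensor) -> torch.Tensor:
    """node_emb: [num_nodes, embedding_dim] -> [num_nodes, num_nodes] edge logits."""
    n = node_emb.size(0)                          # 10 for one graph, 15 for another
    src = node_emb.unsqueeze(1).expand(n, n, -1)  # [n, n, d]
    dst = node_emb.unsqueeze(0).expand(n, n, -1)  # [n, n, d]
    pairs = torch.cat([src, dst], dim=-1)         # [n, n, 2d]: all node combinations
    return edge_head(pairs).squeeze(-1)           # adjacency logits; size depends on n
```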
@liongkhaijiet5014 3 years ago
The concepts and reasons are well explained. But I have one question: do we just take the whole matrix instead of the upper triangular matrix if our graph is directed?
@DeepFindr 3 years ago
Yes. Also, you can ignore the diagonal if you don't have self-loops.
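A minimal sketch (illustrative only) of which entries of a reconstructed adjacency matrix would be used as targets in each case:

```python
import torch

adj_logits = torch.randn(5, 5)    # predicted adjacency logits for a 5-node graph

# Undirected graph: the upper triangle (excluding the diagonal) is enough
iu = torch.triu_indices(5, 5, offset=1)
undirected_entries = adj_logits[iu[0], iu[1]]

# Directed graph: use the full matrix, optionally masking the diagonal (no self-loops)
mask = ~torch.eye(5, dtype=torch.bool)
directed_entries = adj_logits[mask]
```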
3 years ago
It's a great series; I learned a lot from you, thanks. Would it be possible to have a GNN tutorial for text classification? :)