Understanding Graph Neural Networks | Part 3/3 - Pytorch Geometric and Molecule Data using RDKit

Views: 63,544

DeepFindr

A day ago

Comments: 198

@ruxiz2007 2 years ago

Thank you for the excellent videos! They saved me from a panic attack when facing this topic.

@edinjelacic2132 6 months ago

Awesome presentation, you should really be proud of this series! Learned quite a lot, and also learned how to transfer knowledge effectively. All the best!

@shashanksistla5400 3 years ago

Absolutely phenomenal explanation! Thank you so much for the video!

@uxirofkgor 1 year ago

Easy explanation and straight to code. I love it

@galmasrati 2 years ago

One of the best GNN tutorials I encountered! Well done!

@sharifulislam8357 3 years ago

Thanks a lot for this series! I am learning GNN for my UG research in computational biology. It will be very helpful for me!!

@DeepFindr 3 years ago

Great! Soon I will upload new videos for my series with molecule data. Maybe it is helpful for you :)

@김성원-j8k 11 months ago

This series was so helpful. Thank you so much!!!!

@LuoxiangPan 4 months ago

The best GNN explanation video also answered many questions of mine

@itslogannye 2 years ago

This series has been a fantastic intro to GNNs. Subscribed for any future tutorials like this! :)

@DeepFindr 2 years ago

Thanks!

@鄭秉松 2 years ago

Absolutely a great explanation of GNN. And I really like those quick recaps of deep learning methods while training.

@quynhnlp5086 3 years ago

Really helpful, clear and great explanation. thank you a lot :)

@ethanshen4162 3 years ago

Your work is really excellent! Please keep an update, I will continue to pay attention. Thank you!

@DeepFindr 3 years ago

Thank you!

@peterstrom8522 4 years ago

Really well done! In 3/3 you apply GCN to distinct graphs (molecules). Would be much appreciated if you could do a similar tutorial for node classification on a single (large) graph.

@DeepFindr 4 years ago

Hi, thanks I really appreciate the feedback. Sure I can have a look at this, I'll notify you once I got something prepared :)

@peterstrom8522 4 years ago

@@DeepFindr Fantastic. Looking forward!

@subhavmittal5099 1 year ago

Great video sir! I wanted to understand GNNs for protein representation learning and this series has been very helpful to understand stuff from the base up

@DeepFindr 1 year ago

Glad you liked it! Interesting topic you are working on!

@MGRVE 3 years ago

Thank you very much for this three-part series. Very helpful indeed!

@DeepFindr 3 years ago

Happy that you found it useful :)

@MGRVE 2 years ago

@@DeepFindr Hi, is there a chance to get in touch? (I am based in Germany, doing Bioinformatics)

@DeepFindr 2 years ago

You can always write me an email if you have questions / requests :) deepfindr@gmail.com

@jianxianghuang1275 3 years ago

Thanks a ton. It's REALLY helpful.

@TheAnna1101 1 year ago

thank you! really appreciate you covering the details with explanation

@pedroviniciuspereirajunho7244 1 year ago

Thanks! Can't wait to apply this on new solutions here :D

@JesusRenero 1 year ago

Great series!! Thanks for such a clear way of explaining it.

@mojtabadezvarei5646 2 years ago

Thanks for the excellent explanation and your highly profitable channel. I have a question about your example that might be a general one. How should we determine embedding size? In your example, it was chosen 64. Assuming a value, let's say 'M,' how is it possible to get M when we have each node's 'n' feature? I mean, is there any formula/procedure to get the embedding size from node features and the number of nodes? Thank you in advance.

@DeepFindr 2 years ago

Hello! The embedding size is a hyperparameter and can be determined using trial and error. I am not aware of any formula for the ideal size, though there is some literature on this (e.g. arxiv.org/abs/2105.03178). Typically, if you have larger node feature vectors, the embedding size should also be larger to hold more information. But this is just a general rule of thumb. Typical choices are 32 - 512. You can just try out what works best for your problem :)

@amiralizadeh6621 1 year ago

Thank you for the nice tutorial. What are you trying to predict in the graph level? you wanna know if a molecule graph is Ibuprofen?

@pierrebedu7760 3 years ago

Hi, great work. I'm new to GNNs and i really appreciate. There's only one part that's unclear for me (when i read the gmp/gap documentation) : why do those pooling operations need to get the batch_index??

@DeepFindr 3 years ago

Hi! Thanks :) You want to pool the nodes graph-wise, that's why you need to be able to map the nodes to each of the graphs (=batch index). In PyG and most other graph libs the batching works by creating a big disconnected graph, as also mentioned in the video at ~15min. With gmp / gap you don't want to pool all nodes of this batch, but just all nodes per graph, that's why you need the index :) best regards
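
To make the role of the batch index concrete, here is a small library-free sketch of per-graph mean and max pooling. It is not the PyG implementation of gmp / gap; the helper names `mean_pool_per_graph` and `max_pool_per_graph` and the toy numbers are made up for illustration, and plain Python lists stand in for tensors.

```python
def mean_pool_per_graph(x, batch):
    """Average the node feature vectors separately for every graph id in `batch`."""
    sums, counts = {}, {}
    for features, graph_id in zip(x, batch):
        if graph_id not in sums:
            sums[graph_id] = [0.0] * len(features)
            counts[graph_id] = 0
        sums[graph_id] = [s + f for s, f in zip(sums[graph_id], features)]
        counts[graph_id] += 1
    return [[s / counts[g] for s in sums[g]] for g in sorted(sums)]


def max_pool_per_graph(x, batch):
    """Take the feature-wise maximum separately for every graph id in `batch`."""
    best = {}
    for features, graph_id in zip(x, batch):
        if graph_id not in best:
            best[graph_id] = list(features)
        else:
            best[graph_id] = [max(b, f) for b, f in zip(best[graph_id], features)]
    return [best[g] for g in sorted(best)]


# Two graphs batched into one big disconnected graph:
# nodes 0-2 belong to graph 0, nodes 3-4 belong to graph 1.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [10.0, 0.0], [20.0, 2.0]]
batch = [0, 0, 0, 1, 1]

mean_per_graph = mean_pool_per_graph(x, batch)  # one vector per graph
max_per_graph = max_pool_per_graph(x, batch)    # one vector per graph
```

Without `batch`, all five nodes would be pooled into a single vector, which would mix the two molecules together.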

@cat-cu1cx 2 years ago

Great Series. Thank you so much! When we define conv2 and conv3 layers of the GCN, can we assume that message passing is explored up to 3 degrees from any node?

@DeepFindr 2 years ago

Hi, yes the number of layers corresponds to how deep the aggregation goes. Three layers = maximum 3 hops away from a node
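
The "layers = hops" statement can be checked on a toy graph. The sketch below is a library-free illustration: the function name `receptive_field` and the six-node path graph are assumptions made for this example. It tracks which nodes can influence a target node after a given number of message-passing rounds.

```python
def receptive_field(neighbors, node, num_layers):
    """Return the set of nodes whose features can reach `node` after `num_layers` rounds."""
    reached = {node}
    for _ in range(num_layers):
        # Each layer lets information travel one more hop along the edges.
        reached = reached | {n for r in reached for n in neighbors[r]}
    return reached


# Path graph 0 - 1 - 2 - 3 - 4 - 5, stored as an adjacency list.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

after_one_layer = receptive_field(neighbors, 0, 1)
after_three_layers = receptive_field(neighbors, 0, 3)
```

With three layers, node 0 sees nodes up to three hops away, while nodes 4 and 5 stay invisible to it.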

@drewrohskopf2881 3 years ago

Excellent video. Do you know what the 9 features in the ESOL dataset represent? I assume data[0].x represents the 32 atoms for the first data point, but what are the 9 features?

@DeepFindr 3 years ago

Thank you! The features are: H-bond acceptor count, H-bond donor count, non-carbon proportion, aromatic proportion, rotatable bonds, molecular weight and clogP. However, I don't really know what additional transformations were applied inside Pytorch Geometric as it seems like all the node features are integer values. Hope that helps anyways!

@DeepFindr 3 years ago

I think those features I mentioned are rather on a graph basis but not on an atom basis. I did some research but to be honest there is no transparency about what atom features were used. From personal projects I can say that typical node features for molecules are: atomic number, valence electrons, aromaticity, hybridization (one-hot encoded), if the atom is part of a ring, the 2D/3D coordinates of the conformation, Gasteiger charge, van der Waals forces,... All of those can be calculated through RDKit.

@drewrohskopf2881 3 years ago

@@DeepFindr Thanks! Seems like these features are a choice, I was wondering if those numbers were supposed to make sense. Do you know of any ways to featurize geometry? I guess you could pass distance r_ij between nodes, but it would be useful to get more direction dependent information.

@DeepFindr 3 years ago

@@drewrohskopf2881 By calculating the conformation you can get 2D/3D coordinates and add them to each atom. And additionally you could add the distance between atoms in that space as edge features. Is that what you mean? For a project I also added such information, in my case however it didn't improve the performance significantly (but might be different in other cases).

@drewrohskopf2881 3 years ago

@@DeepFindr yeah I suppose positions would do it. I'm specifically interested in graph-level predictions of energy, where the graph is the molecule geometry. So I wanna capture the relative geometry between atoms, but in a translationally and rotationally invariant way.
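
On the invariance point: raw coordinates change when a molecule is moved or rotated, but interatomic distances do not, which is one reason distances are a common choice for edge features. The sketch below is only a numerical check of that property. The three toy coordinates and the helper names are assumptions for illustration, not part of the video's code.

```python
import math


def pairwise_distances(coords):
    """Matrix of Euclidean distances between all pairs of points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def rotate_z_and_translate(coords, angle, shift):
    """Rotate points around the z axis by `angle` (radians), then shift them."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


# Toy "molecule" with three atoms.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
moved = rotate_z_and_translate(coords, angle=0.7, shift=(5.0, -3.0, 1.5))

before = pairwise_distances(coords)
after = pairwise_distances(moved)

# Largest change in any pairwise distance; only floating point noise remains.
max_difference = max(
    abs(a - b) for row_a, row_b in zip(before, after) for a, b in zip(row_a, row_b)
)
```

Distances alone do not capture direction-dependent information such as angles, so this covers only part of what the question asks for.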

@kimrowoon7660 1 year ago

Thank you for the great content. I had a doubt regarding the definition of the model and the passing of hidden layers in the forward method. How does the dimension of edge_index change to embedding_size=64?

@moarshy 3 years ago

thanks a lot!! n looking fwd to more GNN videos :)

@chongtang7778 3 years ago

Thank you so much for the great explanation. One question I'm really interested in is that how to build our own graph? For example, we have some skeletal data or networking data that is collected by ourselves. How can we make them nice graph data step by step?

@DeepFindr 3 years ago

Hi! I have uploaded a video for that in my recent GNN series. Just search for "custom dataset" in my videos. It's for molecule data, but the process is the same :)

@chongtang7778 3 years ago

@@DeepFindr Thanks! I found it~

@juanete69 1 month ago

Could you explain more in depth why we have 32 nodes and 9 features for this problem, please?

@doyleBellamy03 9 months ago

Another perfect video. Thanks a lot!!

@chandrasutrisno 2 years ago

Thank you for providing this video. Just a simple question, how do the models for node, edge, and graph level prediction differ?

@DeepFindr 2 years ago

Hi! The main difference lies in the last layers of the network. For Graph-level you need to apply pooling (either simple things like mean/max pooling on the node embeddings or more advanced pooling methods). For node-level you can simply use the per-node embeddings to perform predictions. For edge-level you typically take all pairs of node-level embeddings and predict if there is a connection between the nodes. Hope that helps :)
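
A rough sketch of the three prediction heads described above, using plain Python instead of PyG. The node embeddings, the weight vector `w`, and the dot-product edge score are all assumptions chosen for illustration; real models would use learned dense layers.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Pretend these are the node embeddings produced by the GNN layers (3 nodes, 2 dims).
node_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [0.5, -0.5]  # hypothetical weights of a final dense layer

# Node level: one prediction per node, no pooling.
node_predictions = [dot(h, w) for h in node_embeddings]

# Graph level: pool the node embeddings first (mean pooling here), then predict once.
graph_embedding = [sum(col) / len(node_embeddings) for col in zip(*node_embeddings)]
graph_prediction = dot(graph_embedding, w)

# Edge level: score every pair of nodes, here with a dot product squashed to (0, 1).
edge_probability = {
    (i, j): sigmoid(dot(node_embeddings[i], node_embeddings[j]))
    for i in range(len(node_embeddings))
    for j in range(i + 1, len(node_embeddings))
}
```

The GNN layers that produce `node_embeddings` can be identical in all three cases; only the part after them changes.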

@emreipek4485 10 months ago

Hello sir. Before asking my question, I would like to thank you so much for your precious video series about GNNs. I guess you can't imagine how valuable your videos and the resources you provide are for me. Thank you so much, sir. My question is about the GNN model design and training part. In this video, we apply a global pooling operation both for reducing the node embeddings into one per graph and for reducing the graph embeddings into one to get a single loss score for each mini-batch, am I wrong? I just want to clear it up for myself. This confused me a little bit. In image or tabular data based problems, do we apply optimization by pooling the loss values of each data point in a mini-batch? Also I want to mention that I cannot find torch version 1.6.0 in my Colab notebook. I wonder whether this is about the Python version or not? My Python version is 3.10.7. I hope I explained my question clearly. Many thanks again for everything you contribute. Best regards.

@chri_pierma 2 years ago

First of all, great series of videos, you got a new subscriber. Second, one question: in a graph neural network, what are the weights that are updated through the training of the GCN model? And where are they (or rather, in which layer are they)? In the Message Passing layer or between an MP layer and an "activation function" layer? Thank you very much!

@DeepFindr 2 years ago

Thanks! Each of the node feature vectors is transformed BEFORE aggregation and therefore also before activation. I have some better visualizations of that in either the video "Graph Attention Networks" or "How to use edge features in GNNs" that might be helpful :)
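
As a rough picture of where the weights sit, here is a simplified, library-free layer in the order described: transform, then aggregate, then activate. It is a sketch, not PyG's GCNConv. It leaves out the degree normalization that GCN uses, and the function names and toy numbers are made up for this example.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_like_layer(x, adjacency_with_self_loops, weight):
    # 1) Learnable part: every node feature vector is multiplied by the same weight matrix.
    transformed = matmul(x, weight)
    # 2) Aggregation: each node sums the transformed vectors of itself and its neighbors.
    aggregated = matmul(adjacency_with_self_loops, transformed)
    # 3) Activation (ReLU), which has no trainable parameters.
    return [[max(0.0, v) for v in row] for row in aggregated]


# Path graph 0 - 1 - 2 with self loops, one input feature per node.
adjacency = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
x = [[1.0], [2.0], [3.0]]
weight = [[1.0, -1.0]]  # maps 1 input feature to 2 output features; updated by training

out = gcn_like_layer(x, adjacency, weight)
```

In this sketch `weight` is the only thing training would update, and it is shared by all nodes, which is also why the same layer works for graphs of any size.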

@keshavraghuwanshi1242 1 year ago

pooling also has to apply when we work with the node classification task? and how pooling works in node-level prediction? please help me with this.🙌

@DeepFindr 1 year ago

Hi, no pooling is only necessary for Graph-level predictions :) In the node level case you can simply use each node embedding for predictions, e.g. Using a dense layer

@keshavraghuwanshi1242 1 year ago

@@DeepFindr thank you 👍

@florianhonicke5448 3 years ago

Keep up your great work. I'm happy i found this channel

@DeepFindr 3 years ago

Thanks for the feedback! Is there something specific you are interested in (like GNNs) or do you like any deep learning content? Thanks!

@easyguide9391 3 years ago

You can make videos on the variants of GNN and its implementation in pytorch

@DeepFindr 3 years ago

Hi ok I'll note it down. Soon more GNN videos will be uploaded. :)

@panditrishabh9813 3 years ago

Thanks for the GNN series. Can you upload a video on how GNN works on 3D mesh data ?

@wspie6043 1 year ago

Pretty nice tutorial! I have a problem that I have struggled with a lot. I have a large dataset with geometry info that can form boundary edge indices, and I want to model them in batches. I split it into train, val, and test sets. For training, do I need to get the edge indices before distributing them into batches or after? If before, can each batch access info from other batches? I tried several times but failed. After that, I decided to split the map into smaller cells without batching and then implement GCN without using global pooling for each cell, but the performance was pretty bad. Can you give me some advice if you see this? I'd appreciate it!

@AI_ML_DL_LLM 3 years ago

Thank you for the video, in the 1st example, how did you handle the different size of the molecules?

@DeepFindr 3 years ago

Hi :) that's exactly the beauty of GNNs - they can handle different sizes automatically. Maybe have a look at my video on how to use edge features in GNNs, there I have a better visualization of how it works :)

@AI_ML_DL_LLM 3 years ago

@@DeepFindr thanks again for your reply, I think different sizes are handled in the "pooling part" as explained in your video.

@DeepFindr 3 years ago

Ah ok so you refer to graph-level predictions :) I also have a video about that in my GNN project. It's also called "graph level predictions" :)

@AI_ML_DL_LLM 3 years ago

@@DeepFindr thank you sir, I came back to this video after a month and it synched with me very well. one more thing, where the "learning parts" are happening? (p.s. "whatsoever" means "never", i think you meant "so on so forth" or "et cetera" :) )

@DeepFindr 3 years ago

The learning happens in the matrices that are multiplied with the adjacency matrix and node feature matrix. I have a video on graph attention networks, where I have visualized this :) hope this helps

@basitakram 1 year ago

I am trying to run the Colab Notebook which has been linked in the description. I am facing issues while running it.

@carloshu5529 2 years ago

was ESOL dataset changed since then? i have 734 as dataset target

@DeepFindr 2 years ago

Yes, several people have reported this. Simply manually set it to 1. I think only the num_classes value is invalid.

@praveenbenedict8551 2 years ago

@@DeepFindr How do I manually set it to 1?

@srinivasarepalli6080 11 months ago

I am working on graph classification on heterogeneous data. Is there any good reference or example?

@deadliftform4920 6 months ago

best explanation present out there, i would love to connect with you to gain some knowledge about the ML/AI journey i wanna go through, please tell me how can i connect with you?

@prakaashsukhwal1984 3 years ago

wonderfully explained..thank you! 😊 ..request some similar videos for knowledge graphs

@DeepFindr 3 years ago

Thank you! I have another video called "node classification...", where I use PyG on one single graph. If that is what you look for it might be the right place :) but it's not really a knowledge graph, but just one single graph.

@prakaashsukhwal1984 3 years ago

@@DeepFindr yes, the one you mentioned is good as all your videos but actually looking for a KG example..any plans to create those with embeddings like TransE or query2box embedding framework.. :)

@DeepFindr 3 years ago

Hi! I can add it to my list but there are many other videos I wanted to create. So probably not in the near future I think :(

@awadelrahman 3 years ago

Thank you, a great video! Another video could be implementing some of the GCN MessagePassing layers themselves, i.e. implementing the Update and Aggregate functions, so we can implement our custom layers. Also, mostly node classification is done for a giant single graph using masks by learning from some part of the graph to label other nodes. My question is: if we have many graphs with labeled nodes, can we learn a node classification task to classify the nodes of new graphs?

@DeepFindr 3 years ago

Hi, regarding custom layers there is already something in the documentation: pytorch-geometric.readthedocs.io/en/latest/notes/create_gnn.html But generally would be a nice video idea yes, thanks :) For node classification I have a dedicated video on one large graph, just search for node classification on my channel :) Hope this helps

@naevan1 2 years ago

Thank you so much , i learned so much stuff here. However one suggestion, since I cannot see a cursor or something specific, and i just hear your voice, I saw all 3 videos on 0.8 speed, otherwise I went back and forth all the time , since I found it a bit fast. But could be just me !

@DeepFindr 2 years ago

Yes the resolution is a bit small here. I improved this in my more recent videos :)

@naevan1 2 years ago

@@DeepFindr With much more knowledge after watching multiple of your videos and reading tons of papers , I have to ask : Do you have any plans of implementing Continuous Dynamic Temporal Graphs ? Like the TGN paper. Also, I have issues with installing pygeom and cuda in windows.. I made it installing it somehow but without cuda. such a hassle Lastly, again, thanks for your help, your videos help TONS!

@chavo004 4 years ago

Great presentation, thank you. 1.) Using your example I tried swapping the dataset to tox21 with 12 classes for each of the assays, but without success. Do you have any ideas how to get that to work? 2.) With a dataset and some set of initial features, is the GNN using those features as a seed and then honing in on better latent features as it converges to a final prediction layer? Thank you.

@DeepFindr 4 years ago

Hi! Thanks for the feedback :) 1) I didn't fully understand what you changed? What is not working anymore? 2) yes the initial node features are converted to latent features in the final layer which are used for predictions :)

@hyperhypochondriac3378 1 year ago

When I was trying to install the dataset, this error popped up ("PytorchStreamReader failed locating file data.pkl: file not found"). How to fix it?

@alexistremblay1076 4 years ago

Very comprehensive series of videos. I was hoping to see how to include edge features. PyTorch Geometric seems to be able to include that, but I can't seem to wrap my mind around it. Any chance you can point me in the right direction? In your example, how would you include the bond type?

@DeepFindr 4 years ago

Hi! Thanks for the feedback! I'll probably soon make a video on how to include edge features. However, I also commented on that in the second part of this series (someone asked a similar question). Maybe that helps already. Besides that I can recommend to search the github issues for "edge features" (in the pytorch geometric repository).

@DeepFindr 4 years ago

I plan to make this video mid of January next year :)

@DeepFindr 3 years ago

Hi! I have uploaded a video about this now :) let me know if it helps or if something is missing

@alexistremblay1076 3 years ago

@@DeepFindr wow thank you so much for making the video and coming back to update me. Class act.

@DeepFindr 3 years ago

@@alexistremblay1076 sure :) a couple of months ago I was also wondering how to do that. So probably there are also more people besides us with this question :)

@sumitkumar-el3kc 1 year ago

Hi, I'm back with another doubt. I am wondering how the ESOL dataset, or any dataset for GCN training, was made? I tried to convert a different dataset which is in list-within-list format for node and edge features, with the edge index in COO format. I couldn't create PyTorch tensors from them since the size of each molecule is different. I was wondering, if I do padding to create uniform data, wouldn't it affect the performance of the model? In short, how can I create my own custom dataset compatible with GCN training? Thank you.

@DeepFindr 1 year ago

Hi, not sure if you watched my video on how to do that yet (it's called tabular to graph dataset), but essentially you put each graph into a separate object. No need to do padding, because you can simply store the information which node belongs to which graph (in Pytorch geometric done in the "batch" variable)
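
To illustrate why no padding is needed, here is a library-free sketch of how graphs of different sizes can be merged into one big disconnected graph with a batch vector. It only mimics the idea behind PyG's batching; the function `collate_graphs` and the dictionary layout are assumptions for this example.

```python
def collate_graphs(graphs):
    """Merge graphs into one disconnected graph.

    Each graph is a dict with "x" (list of node feature lists) and
    "edge_index" ([sources, targets] in COO format, node ids local to the graph).
    """
    x, sources, targets, batch = [], [], [], []
    offset = 0
    for graph_id, graph in enumerate(graphs):
        x.extend(graph["x"])
        # Shift local node ids so they stay unique inside the merged graph.
        sources.extend(s + offset for s in graph["edge_index"][0])
        targets.extend(t + offset for t in graph["edge_index"][1])
        # Remember which graph every node came from.
        batch.extend([graph_id] * len(graph["x"]))
        offset += len(graph["x"])
    return {"x": x, "edge_index": [sources, targets], "batch": batch}


small = {"x": [[1.0], [2.0]], "edge_index": [[0, 1], [1, 0]]}
large = {"x": [[3.0], [4.0], [5.0]], "edge_index": [[0, 1, 1, 2], [1, 0, 2, 1]]}

merged = collate_graphs([small, large])
```

No edges connect the two graphs after merging, so message passing never mixes them, and the `batch` list is what later allows pooling per graph.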

@sumitkumar-el3kc 1 year ago

@@DeepFindr thanks again, batch in pytorch geometric worked.

@AhmedIsam 3 years ago

In the dataset description, there were 3 features per edge, never used in the model. Could you please explain why we ignored them, and how to incorporate them?

@DeepFindr 3 years ago

Hi! Oh, I didn't realize they are available. I uploaded another video about edge features for GNNs - hope that is what you are looking for. So generally it would've been easy to incorporate them into the layers.

@DeepFindr 3 years ago

Wow, yes you are right. I really overlooked them :D didn't know that dataset had edge features. Thanks for the remark!

@AhmedIsam 3 years ago

@@DeepFindr Yep. I found the video about edges. Thanks. Top quality content BTW.

@DeepFindr 3 years ago

Thanks!!

@iDenyTalent 2 years ago

So after testing the model, we can compare the predicted values vs the correct values for the batch. However, is this the batch with the best performance? Also, wouldn't we want to compare the results for the entire dataset rather than just one batch of 64 molecules? Video was also very helpful!

@DeepFindr 2 years ago

Yes sure, in practice you would have a full test set. This was just an example and I didn't do the full procedure :)

@sushmadevraj4389 3 years ago

Hi, Great explanation !!! I have a quick question, How did you choose to take x2 from neighboring node not x3 or x4?

@DeepFindr 3 years ago

Hi! Thanks :) To which section of the video are you referring to?

@sushmadevraj4389 3 years ago

@@DeepFindr It’s in part 2/3. Example problem first step

@DeepFindr 3 years ago

Ah. We take all three - x2, x3 and x4. Do you ask this because there is only a message icon for x2? Actually just look at the red arrows, not the icon. :)

@sushmadevraj4389 3 years ago

@@DeepFindr Thankyou

@S4Kyoto 3 years ago

Great video! Your videos really helped me to get started with GCNs. I'm currently working on a model to predict some outcome properties of a reaction between a reactant and a catalyst. I tried to make a model, similar to the one you showed here, with just input molecules and it worked fine by using the DataPair class (Advanced mini-batching, from PyTorch). But for my case it would be important to have some sort of 3-dimensional information. Do you know if there would be a way to make a 3D-GCN (or use it in torch_geometric)? And if so, there is no easy way to generate this 3D information as node or edge features from RDKit, I assume? Something like shown in this paper: "Three-Dimensionally Embedded Graph Convolutional Network (3DGCN) for Molecule Interpretation", but I couldn't find a torch implementation.

@DeepFindr 3 years ago

Hi! I've had a look at the paper - essentially they have an additional matrix with the 3d info, right? A trivial approach I can think of is to calculate the 3d coordinates of molecules and add them to the node feature vectors. Rdkit provides an algorithm that calculates the 3d structure (conformer) of a molecule. You can then get the 3d coordinates of the atoms. Have you tried that?

@S4Kyoto 3 years ago

@@DeepFindr I tried that now but it seems like my model is not learning much, maybe I have to tweak around with some parameters before I can even see a little bit of improvement. But thank you for your idea and ofc for all the videos you make, which are always helpful. :)

@DeepFindr 3 years ago

Thanks and good luck with your project!

@DeepFindr 3 years ago

Hey! I also recently discovered that Rdkit also offers several hand crafted features that can be easily obtained. For instance have a look at "CalcWHIM" in Rdkit. It calculates 114 3d structural descriptors. Maybe it is helpful for you and you can add it to the GNN to improve the learning

@S4Kyoto 3 years ago

@@DeepFindr Hey, thank you for pointing that out, I’m going to check that out.

@FreeBMXcrew 3 years ago

Thank you very much for the video series, really great!!

@giannismanousaridis4010 4 years ago

Nice work. I was thinking about the following problem. If we have a similar dataset with graphs which represents molecules and in each node, we have more features. Would I be able to train my GCN so that if I give a molecule as input, it would give me info about the nodes (which elements are). In general, I will have small graphs and I will try to do node classification on small graphs. It's like a combination of this video and the one with the Cora dataset.

@DeepFindr 4 years ago

Hi! Thanks! Sure that is possible. The only thing that would change (regarding this video) is that you don't have to summarize the embeddings with this (max / mean pooling) and use the train and test mask for the labels. Then your input is the molecule and the output are probabilities for the classes for each node. I would say this is the default example for node classification. There is probably also an example on Pytorch github. Best regards!

@DeepFindr 4 years ago

Take a look here: github.com/rusty1s/pytorch_geometric/tree/master/examples

@giannismanousaridis4010 4 years ago

Thanks for your fast reply! I will take a look on them.

@DeepFindr 4 years ago

@@giannismanousaridis4010 I think the introduction example might also be interesting for you: pytorch-geometric.readthedocs.io/en/latest/notes/introduction.html They use 600 graphs and 6 classes there :)

@ANAND02120 1 year ago

The command the notebook is not working, too many errors, any update?

@meylyssa3666 2 years ago

The video is great, but why don't you increase the size of the Colab notebook? The code just takes 40% of the screen and is very small and difficult to read. And 60% of the screen space is just blank.

@DeepFindr 2 years ago

Yes that's right. I've done that in more recent videos :)

@soufien7354 2 years ago

Hi, how can we compare two tubes with a GNN? (Each point of a tube has x, y, z coordinates and a radius.)

@DeepFindr 2 years ago

Some questions... What is your goal of this comparison? Is this not possible without deep learning? How many tube data points do you have?

@soufien7354 2 years ago

@@DeepFindr The goal is to predict the percentage of resemblance between the two tubes (for example, the deep learning model returns that the tube has 80% resemblance to the reference tube). I have one point cloud of x, y, z coordinates of the reference tube and a point cloud of x, y, z coordinates of the tube that will be tested. It's not possible without deep learning because the point cloud of the tube to test depends on its position (it doesn't have the same position every time, so the x, y, z coordinates change every time).

@DeepFindr 2 years ago

@@soufien7354 You could have a look at PairData for this: pytorch-geometric.readthedocs.io/en/latest/notes/batching.html#pairs-of-graphs It allows you to iterate over pairs of graphs and apply GNNs to it. The question is - do you have labels for your tube-pairs? Because you need to train the model somehow. If you don't have that the only thing I could think of is to train an auto-encoder and then somehow compare the latent vectors with regards to similarity.

@waleedrafi7977 3 years ago

Amazing video, waiting for a video on Fake news detection using GNNs

@DeepFindr 3 years ago

I've noted it down, but the list is very loooong :D

@waleedrafi7977 3 years ago

@@DeepFindr Push it to the top and try to use Stack not Queue :) kindly make a video on FND as soon as possible

@DeepFindr 3 years ago

Have you seen this repo? github.com/safe-graph/GNN-FakeNews

@minalpatil564 2 years ago

Hi, I am trying to run this notebook on google colab , but when I try to import dataset MoleculeNet, there is an error for module torch sparse. I tried to install it. but shows some error. Please guide

@DeepFindr 2 years ago

Hi! Which pytorch version are you using in the notebook? For some torch versions I experienced errors with torch sparse. You can also find this in the Github issues of PyG.

@sumitkumar-el3kc 1 year ago

Hi, I'm using a server with CUDA 12.0 in it. (Deep Graph Library) DGL is not yet available for CUDA 12.0. Is there any other alternative for DGL?

@DeepFindr 1 year ago

I have a video with different GNN libraries, maybe that's what you are looking for

@sumitkumar-el3kc 1 year ago

@@DeepFindr I'll look forward to it. Thank you.

@shubhamverma8610 3 years ago

Thanks for such a great video...I have one question hoping to get some idea, I have n classes plus some random data belonging to some other class. Random data is not included in the training set. I trained the network for n classes and it is performing well. Now if an unknown class object comes in for prediction, the GCNconv predicts it as any of the n classes Which is clearly misclassified. The confidence also comes nearby 0.98, which makes it difficult to filter out. And also known class object of n trained classes is also classified at the same confidence level. One idea that comes to mind after checking online is to try keeping a class that does not include any feature set of n classes i.e. sort of negative sampled class as an unknown class. But I think it can confuse the model. This I am still going through. So is there a way to tackle it that it doesn't fall into any known class classification? Any help is appreciated Thanx.

@DeepFindr 3 years ago

Hi! Thanks for the feedback! Generally it is difficult to teach a model something that has not been seen during training. Why is it not possible to include some of the random samples into the train data? They could then be classified as "other". Also, how does the distribution of the random class's features look like? You would need to know that before you start to generate random samples for the new class.

@shubhamverma8610 3 years ago

@@DeepFindr Actually I am using GCNconv for cad feature recognition like say for example nut, screw, and washer are three classes and if I am slicing the washer in half and giving it to the model still it is predicting it to the washer so if I add this to another class it might create an issue right for the actual washer class and my accuracy could decrease I think. Overall what I was hoping to reject anything which is not a part of training classes and also predict the class label for the model which is similar to the training class. But based on your answer it seems that it's not possible to do it as the model has not seen the random data so I think if I want to achieve that either I should be able to filter out the random data before feeding it to the model otherwise it won't work. Ok, I will try to find something else to do something about it. Thank you very much for your time really appreciate it

@DeepFindr 3 years ago

Well in that case couldn't you simply give random slices of the objects to your model as well? So just slice the screw, the nut etc. and put all of them into another class "other" which is used in the training dataset. Other than that I could only think of using the model confidence - once it falls below a certain threshold the model predicts the other class. But as you said the model is pretty confident even for the slices, which makes sense because some of the washer features also appear in the sliced washer.
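
The confidence-threshold idea can be sketched in a few lines of plain Python. The function names, the logits, and the 0.9 threshold are assumptions chosen for illustration. As noted in this thread, a model that is overconfident on unseen inputs will pass this check anyway, so this is a fallback and not a fix.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_rejection(logits, class_names, threshold=0.9):
    """Return the most likely class, or "other" if the model is not confident enough."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "other"
    return class_names[best]


classes = ["nut", "screw", "washer"]

confident = predict_with_rejection([0.1, 0.2, 6.0], classes)  # one score dominates
uncertain = predict_with_rejection([1.0, 1.2, 1.1], classes)  # scores are nearly equal
```

The threshold itself would have to be tuned on held-out data that contains examples of the unknown class.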

@shubhamverma8610 3 years ago

@@DeepFindr ok I will try that let see how it is performing. Once again thank you for your time and these great videos........Keep up the good work...:-)

@DeepFindr 3 years ago

Sure! Good luck with your work :)

@koolgal722 1 year ago

Can you guide me code for ADR prediction task from molecular structures of the drugs

@odev6764 3 years ago

Do you know something about DGL ? If know: Could you please make a video ? And other question: how can I build my own dataset ? I want to build a stocks dataset but I don't know how

@DeepFindr 3 years ago

Hi! I just uploaded a video about GNN libraries and quickly talk about DGL. But that's all - I haven't used dgl yet. Typically there is a section in the documentation how to build custom datasets. Otherwise you can have a look at another video from me called "GNN project creating a custom dataset" :) hope this helps

@sadenb 4 years ago

Both part 2 and part 3 are examples of GCN. Can you make a note where the difference is between GCN and the original Graph Neural Network paper. ???

@DeepFindr 4 жыл бұрын

Hi! Nice question! Actually, I also thought about this for a while. I feel that GCN and GNN are used interchangeably in most papers. However, there is a slight difference in how the aggregation works. GCN was officially proposed by Kipf et al. with the settings I quickly mentioned in part 2 (variants). However, the basic idea of message passing can be found in all GNNs. To summarize, I would say a GCN is just the approach from Kipf et al., which is slightly different from, for example, Graph Attention Networks or Gated GNNs, but all approaches share the same idea. Hope this helps.
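
To make the "slight difference in aggregation" concrete, here is a minimal sketch in plain Python that compares the symmetric degree normalization from the GCN propagation rule (each neighbor j of node i weighted by 1/sqrt(d_i * d_j), with self-loops added) against a plain mean over the same neighborhood. It deliberately leaves out the learnable weight matrix and the nonlinearity, and the function names and the toy graph are assumptions made for illustration only.

```python
import math

def build_neighbors(edges, num_nodes):
    """Undirected neighborhoods with a self-loop on every node."""
    neighbors = {i: {i} for i in range(num_nodes)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return neighbors

def gcn_aggregate(edges, features):
    """GCN-style aggregation: sum_j x_j / sqrt(deg_i * deg_j)."""
    nbrs = build_neighbors(edges, len(features))
    deg = {i: len(nbrs[i]) for i in nbrs}  # degree including the self-loop
    dim = len(features[0])
    out = []
    for i in range(len(features)):
        agg = [0.0] * dim
        for j in nbrs[i]:
            w = 1.0 / math.sqrt(deg[i] * deg[j])
            for k in range(dim):
                agg[k] += w * features[j][k]
        out.append(agg)
    return out

def mean_aggregate(edges, features):
    """Plain message passing: average over the node and its neighbors."""
    nbrs = build_neighbors(edges, len(features))
    dim = len(features[0])
    return [
        [sum(features[j][k] for j in nbrs[i]) / len(nbrs[i]) for k in range(dim)]
        for i in range(len(features))
    ]

# Toy path graph 0 - 1 - 2 with one feature per node.
edges = [(0, 1), (1, 2)]
x = [[1.0], [2.0], [3.0]]
gcn_out = gcn_aggregate(edges, x)
mean_out = mean_aggregate(edges, x)
print(gcn_out, mean_out)
```

Both functions follow the same message-passing pattern (collect neighbor features, combine them); only the weighting of each neighbor differs, which is the point made in the reply.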

@emindurmus993 7 ай бұрын

This is really amazing content, but there is a problem on Colab: this code does not work anymore.

@naveedmazhar7260 3 жыл бұрын

Hi, I am a deep learning and data science enthusiast. I work with a different kind of data, LiDAR data, where 3D meshes are compared. But I cannot apply GCNConv to them; it gives me a dimension error saying that all dimensions should be the same except dim 0. Can you guide me?

@DeepFindr 3 жыл бұрын

Hi naveed, can you send me a screenshot of your code and error to deepfindr@gmail.com? It's hard to debug without seeing what's going on. Thanks

@naveedmazhar7260 3 жыл бұрын

@@DeepFindr thanks for just an amazing response and work on the video. I will send you the code in a moment.

@mohamadabdulkarem206 3 жыл бұрын

Could you please tell me what to do if I want to use GraphSAGE (SAGEConv)?

@DeepFindr 3 жыл бұрын

Hi, sure, these are also supported by PyTorch Geometric. Just look at the documentation; everything you need is there. Alternatively, check out the PyTorch Geometric GitHub page. The examples folder also contains examples for GraphSAGE.

@DeepFindr 3 жыл бұрын

For example here: github.com/rusty1s/pytorch_geometric/blob/master/examples/graph_sage_unsup.py

@mohamadabdulkarem206 3 жыл бұрын

@@DeepFindr Thank you so much for your help and your support, I am so grateful for your help

@alexvass Жыл бұрын

you need HD for code presentation, it is hard to read on big screen views

@DeepFindr Жыл бұрын

Yes, I've fixed this for future videos :)

@lucaslau8379 3 жыл бұрын

Hi, is it possible to predict a single node feature? Thanks!

@DeepFindr 3 жыл бұрын

Hi! Do you mean a feature vector for the whole graph? Or do you mean a feature vector with length 1?

@lucaslau8379 3 жыл бұрын

@@DeepFindr Say I have a set of graphs with 4 nodes, and each node has 2 node features [a, b]. Can I predict the node features of each node given a new graph?

@DeepFindr 3 жыл бұрын

Sure, there is nothing against training with the graphs you mentioned. You can simply send them through one or more GNN layers and create node embeddings. For new graphs you can then "predict" these embeddings.

@lucaslau8379 3 жыл бұрын

@@DeepFindr Can I contact you privately? I would really like to ask you about some details of predicting node features. Thanks!

@DeepFindr 3 жыл бұрын

Yes. Deepfindr@gmail.com :)

@grzana805 3 жыл бұрын

Consider making font bigger, code is barely visible on mobile devices

@DeepFindr 3 жыл бұрын

Hi, yep, it's bigger in the more recent videos :)

@yanlu914 2 жыл бұрын

When I print data.num_classes, I get 734, not 1. I don't know why. Does anybody have the same problem?

@DeepFindr 2 жыл бұрын

Yes, something is wrong with this property. Simply set your output dimension to 1 manually, as it's a regression problem. Several people have reported this :)

@yanlu914 2 жыл бұрын

@@DeepFindr Thank you for your reply! I have set it to 1, and it works.

@AliRashidi97 2 жыл бұрын

It was GREAT!!! Tnx a lot

@ShahxadAkram 2 жыл бұрын

Could you please explain why is this noisy? Actually, I tried this tutorial with different other datasets and tweaked different parameters but got the same noisy graph. PS: I'm a beginner to GNNs and watching your series for the 4th time and it's still awesome.

@DeepFindr 2 жыл бұрын

Hi! Thanks :) Which noisy graph are you referring to? The loss?

@ShahxadAkram 2 жыл бұрын

@@DeepFindr Yup, that loss graph. I mean, it fluctuates too much. Is that normal?

@ShahxadAkram 2 жыл бұрын

@@DeepFindr Actually, I'm comparing it with normal neural networks, e.g. in the MNIST example we get a very smooth decrease in loss. Again, I'm a noob, so you can also suggest some material for me to look at before coming here.

@DeepFindr 2 жыл бұрын

There can be many reasons for fluctuations in the loss. The most likely ones are:
- Learning rate (make it smaller, so that the model doesn't jump that much)
- Batch size (make it bigger, so that the update is smoother, e.g. 512)
- The network is too big for the dataset (reduce the size)
Let me know if nothing helped :)
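
The batch-size point can be illustrated with a small simulation in plain Python. This is only a sketch under assumptions: the per-sample "losses" are random numbers from an exponential distribution rather than outputs of a real model, and `batch_loss_spread` is a hypothetical helper name. It shows that the mean loss of a large batch varies less from batch to batch than the mean loss of a small batch.

```python
import random
import statistics

def batch_loss_spread(per_sample_losses, batch_size, rng):
    """Standard deviation of the per-batch mean loss for a given batch size."""
    shuffled = per_sample_losses[:]
    rng.shuffle(shuffled)
    batch_means = [
        statistics.mean(shuffled[i:i + batch_size])
        for i in range(0, len(shuffled) - batch_size + 1, batch_size)
    ]
    return statistics.pstdev(batch_means)

rng = random.Random(0)  # fixed seed so the run is reproducible
# Stand-in for per-sample losses at one point in training (made-up data).
losses = [rng.expovariate(1.0) for _ in range(8192)]

spread_small = batch_loss_spread(losses, 16, rng)
spread_large = batch_loss_spread(losses, 512, rng)
print(f"batch size 16:  spread {spread_small:.3f}")
print(f"batch size 512: spread {spread_large:.3f}")
```

The same averaging effect applies to the gradient estimate, which is why a larger batch tends to give a smoother loss curve, at the cost of fewer updates per epoch.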

@ShahxadAkram 2 жыл бұрын

@@DeepFindr Hi, thanks for the suggestions. After tweaking a lot of stuff, I removed the square root from this line `loss = torch.sqrt(loss_fn(pred, batch.y))`, i.e. only the loss_fn is left, and this resulted in a much smoother graph. I think you used this square root because of the MSELoss but still, you can describe it well.
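
For context on the square root: taking `sqrt` of the MSE gives the RMSE, which is in the same units as the target. By the chain rule, the gradient of sqrt(L) equals the gradient of L multiplied by 1 / (2 * sqrt(L)), so the RMSE gradient is rescaled by a factor that grows as the loss gets small. Whether this explains the smoother curve reported above is an assumption, not something confirmed in the thread. The plain-Python sketch below (helper names are hypothetical) just shows the relationship on made-up numbers.

```python
import math

def mse(pred, target):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def rmse(pred, target):
    """Root mean squared error, i.e. sqrt of the MSE."""
    return math.sqrt(mse(pred, target))

def rmse_grad_scale(pred, target):
    """Chain-rule factor d sqrt(L) / dL = 1 / (2 * sqrt(L)).

    Multiplying the MSE gradient by this factor gives the RMSE gradient.
    """
    return 1.0 / (2.0 * rmse(pred, target))

target = [1.0, 2.0]
far = [2.0, 4.0]     # poor fit: large loss
near = [1.01, 2.02]  # good fit: small loss

print(mse(far, target), rmse(far, target), rmse_grad_scale(far, target))
print(mse(near, target), rmse(near, target), rmse_grad_scale(near, target))
```

A common compromise is to train on the MSE and report the RMSE only for logging, so the plotted number stays in the target's units while the optimizer sees the unscaled gradient.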

@mohamadabdulkarem206 3 жыл бұрын

Great explanation, thank you so much! But could you help me? I tried to run the code and got this error: OSError: /usr/local/lib/python3.7/dist-packages/torch_sparse/_convert_cuda.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIdEEPKNS_6detail12TypeMetaDataEv

@DeepFindr 3 жыл бұрын

Hi! The reason for this is that the python version changed in the colab Notebooks. I updated the notebook with a fix but it seems like it's not fully stable yet. Are you running it locally or on Google colab?

@DeepFindr 3 жыл бұрын

Hi man. I looked into it, and the PyTorch version in the Colab notebooks was updated. I adjusted the code to be more generic; now it will always install the libraries for the right version. It should work now. Please let me know if you can confirm :)

@mohamadabdulkarem206 3 жыл бұрын

@@DeepFindr Thank you so much, it is working now. I am really happy and so grateful for your help. Do you give private courses or private lessons? I would be very happy if you did, because I need to understand some things about this topic. Thanks again.

@DeepFindr 3 жыл бұрын

@@mohamadabdulkarem206 sure no problem :) I don't really have time for private courses, but you can always send me a mail with your questions to deepfindr@gmail.com. Best regards

@mohamadabdulkarem206 3 жыл бұрын

@@DeepFindr Thank you so much for your help and your attention. I am really happy. Thanks again.

@jagdishmeerchandani7700 3 жыл бұрын

great tutorials

@SadeghShahmohammadi Жыл бұрын

Great. Thanks

@josueairtonlopezcabrejos9807 3 жыл бұрын

Thank you very much, my friend

@waqassheikh1469 3 жыл бұрын

The given notebook is not working

@DeepFindr 3 жыл бұрын

I updated it to enforce PyTorch version 1.8.0. Let me know if it works for you. The problem with Colab is that they change the machines: different CUDA versions, PyTorch versions, etc.

@veerasaidurga8502 6 ай бұрын

I had the worst experience with your channel: there are very few deep learning course videos, and the channel has the worst audio clarity.

@DeepFindr 6 ай бұрын

Pro tip: don't watch the channel :)

@xxXXCarbon6XXxx 3 жыл бұрын

How does the GNN cope with enantiomers and stereoisomers? Dextro & Laevo glucose are both glucose but have different properties.

@DeepFindr 3 жыл бұрын

Hi! Molecule graphs that are nearly identical (isomorphic) are always difficult for a GNN to handle, because no node ordering is considered. Out of the box, the GNN will probably not be able to distinguish them. There is an extension called position-aware Graph Neural Networks that might be able to capture such mirrored molecules. Another idea I could think of is to add the 3D coordinates of a molecule to the node feature vectors. Maybe this way the two structures will lead to different embeddings. However, I have the feeling that GNNs are not the right choice for such problems, and it might be better to fall back on "hand-crafted" features that are able to spot these differences (e.g. using a graph isomorphism test).

@oladipupoadekoya1559 2 жыл бұрын

Hi, do you offer online crash training on GNNs? I am working on an optimisation problem with geospatial datasets and some objective functions, and I intend to use a GNN for the optimisation and later for prediction. I would appreciate it if we could discuss this via email.

@DeepFindr 2 жыл бұрын

Hi! Aren't my videos "crashy" enough? :P I think if you watch 5 different videos you should be good to go :) If you need more information, feel free to contact me at deepfindr@gmail.com.