This is the visualization of the process of gradient descent finding a 5-coloring of a graph.
Suppose that (V,E) is a graph and A is a set with |A|=k. Then a k-coloring of the graph (V,E) with color set A is a function f from V to A such that if {u,v} is an edge in E, then f(u) is different from f(v). In other words, a k-coloring is an assignment of colors to the vertices of the graph in which adjacent vertices receive different colors.
In our visualization, the graph is of the form (V,E) where V is the union of 5 disjoint sets V_1,...,V_5, no edge {u,v} joins two members of the same set V_i, and the graph (V,E) is otherwise random. I have calibrated the size |E| of E so that the gradient descent algorithm is able to find the 5-coloring in which each set V_i receives its own single color; if we decreased |E|, gradient descent would no longer find this coloring.
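The random multipartite graph described above can be sketched as follows. The group sizes, the edge probability, and the helper name `random_multipartite_edges` are illustrative assumptions of mine, not settings taken from the video.

```python
import random

def random_multipartite_edges(group_size=4, num_groups=5, p=0.5, seed=0):
    """Sample a random graph on num_groups disjoint groups of vertices,
    with no edges inside a group (so the partition itself is a valid
    coloring) and each cross-group edge included with probability p."""
    rng = random.Random(seed)
    n = group_size * num_groups
    group = [v // group_size for v in range(n)]  # group index of each vertex
    edges = [(u, v)
             for u in range(n) for v in range(u + 1, n)
             if group[u] != group[v] and rng.random() < p]
    return n, edges

n, edges = random_multipartite_edges()
print(n, len(edges))
```

Raising or lowering `p` is one way to experiment with the calibration of |E| mentioned above.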
Each vertex v shall be represented as a real vector x_v of length 5. The loss of the collection of all vectors x_v is a composition of two functions of (x_v)_v.
Let f be the function defined by
f(x_1,...,x_5) = (x_1^2,...,x_5^2)/(x_1^2+...+x_5^2),
so f maps each vector x_v to a vector of nonnegative entries that sum to 1.
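The map f is just the entrywise square followed by normalization, so it can be sketched in a few lines of numpy; the example vector below is my own.

```python
import numpy as np

def f(x):
    """f(x_1,...,x_5) = (x_1^2,...,x_5^2) / (x_1^2 + ... + x_5^2):
    square each entry, then normalize so the entries sum to 1."""
    sq = x ** 2
    return sq / sq.sum()

x = np.array([1.0, 2.0, 0.0, 0.0, 1.0])
print(f(x))  # squares are (1, 4, 0, 0, 1) with sum 6 -> [1/6, 2/3, 0, 0, 1/6]
```

Since the output entries are nonnegative and sum to 1, each f(x_v) can be read as a probability distribution over the 5 colors.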
Let g((y_v)_v) = \sum_{\{u,v\} \in E} \langle y_u, y_v \rangle, where \langle \cdot, \cdot \rangle denotes the standard dot product on a Euclidean space. Then the loss of (x_v)_v is g((f(x_v))_v). We use gradient descent to minimize this loss.
The vectors x_v were initialized randomly; this randomness is needed to break the symmetry between the colors.
We observe that the gradient descent process eventually reaches a loss of approximately 0. At that point, gradient descent has produced a 5-coloring of the graph (V,E).
The animation shows the values f(x_v) for all vertices v. In the animation, the vertices are sorted so that if i is less than j, then the vertices in V_i appear to the left of the vertices in V_j. Since each vector f(x_v) has length 5, the animation represents f(x_v) as 5 dots of different colors.
The notions of gradient descent and graph coloring are not my own. The specific loss function that I use to produce a graph coloring is my own, although I would not be surprised if someone else has already used it.
The problem of finding a k-coloring of a graph (V,E) is a well-known example of an NP-complete problem. As with many NP-complete problems, there are special instances that are difficult, but when a k-coloring exists, it is typically not too difficult to find one. I have made other visualizations of gradient descent being used to solve NP-complete problems such as the clique problem (for the clique problem, I used LSRDRs) and 3-satisfiability.
Unless otherwise stated, all algorithms featured on this channel are my own. You can go to github.com/sponsors/jvanname to support my research on machine learning algorithms. I am also available to consult on the use of safe and interpretable AI for your business. I am designing machine learning algorithms for AI safety such as LSRDRs. In particular, my algorithms are designed to be more predictable and understandable to humans than other machine learning algorithms, and my algorithms can be used to interpret more complex AI systems such as neural networks. With more understandable AI, we can ensure that AI systems will be used responsibly and that we will avoid catastrophic AI scenarios. There is currently nobody else who is working on LSRDRs, so your support will ensure a unique approach to AI safety.