A fitness function for reducing tuples of symmetric matrices to vectors that has few local maxima

85 views

Joseph Van Name

A day ago

This is a visualization of the fitness levels reached by gradient ascent optimization when we run the gradient ascent 128 times.
Suppose that n is a natural number and A_1,...,A_r are real n by n symmetric matrices. Then define the fitness level F(x) of a real vector x=(x_1,...,x_r) to be
rho(x_1A_1+...+x_rA_r)/norm(x) - (norm(x)-1)^2, where rho(X) denotes the spectral radius of X. Since the first term is invariant under rescaling x, the penalty term forces norm(x)=1 at any maximum, and in this case the fitness level is just the spectral radius rho(x_1A_1+...+x_rA_r). When the fitness level F(x) is maximized, the vector x is an approximation (a vector-valued reduction) of the tuple (A_1,...,A_r).
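As a sanity check of the definition, F can be evaluated numerically. The sketch below is my own minimal reimplementation (the function and variable names are not from the video), assuming the penalty enters with a minus sign so that maximizers satisfy norm(x)=1:

```python
import numpy as np

def fitness(x, mats):
    """F(x) = rho(x_1*A_1 + ... + x_r*A_r)/norm(x) - (norm(x) - 1)^2."""
    # Weighted sum of the symmetric matrices A_1, ..., A_r.
    M = sum(xi * A for xi, A in zip(x, mats))
    # For a symmetric matrix the spectral radius is the largest
    # absolute eigenvalue; eigvalsh exploits the symmetry.
    rho = np.max(np.abs(np.linalg.eigvalsh(M)))
    nx = np.linalg.norm(x)
    return rho / nx - (nx - 1.0) ** 2
```

Note that the first term is unchanged if x is rescaled, so only the penalty term depends on norm(x); at norm(x)=1 the penalty vanishes and F(x) is exactly the spectral radius.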
For the visualization, we maximize the fitness level F(x) of a vector x 128 times where r=n and where n ranges from 10 to 15 and where A_1,...,A_r are independent random real symmetric matrices. We use a version of gradient ascent to maximize the fitness levels, and we show all 128 fitness levels in the visualization.
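The multi-restart procedure can be sketched roughly as follows. This is my own toy version, not the author's code: a smaller problem (n = r = 4, 16 restarts instead of 128), plain gradient ascent with a finite-difference gradient rather than whatever gradient-ascent variant the video uses. Clusters of nearly equal final values among the sorted fitness levels correspond to repeated local maxima.

```python
import numpy as np

def fitness(x, mats):
    # F(x) = rho(sum_i x_i A_i)/norm(x) - (norm(x) - 1)^2
    M = sum(xi * A for xi, A in zip(x, mats))
    nx = np.linalg.norm(x)
    return np.max(np.abs(np.linalg.eigvalsh(M))) / nx - (nx - 1.0) ** 2

def ascend(mats, steps=500, lr=0.05, eps=1e-6, rng=None):
    # One gradient-ascent run from a random start; the gradient is
    # approximated by forward differences for simplicity.
    rng = np.random.default_rng() if rng is None else rng
    r = len(mats)
    x = rng.standard_normal(r)
    for _ in range(steps):
        f0 = fitness(x, mats)
        g = np.array([(fitness(x + eps * np.eye(r)[i], mats) - f0) / eps
                      for i in range(r)])
        x = x + lr * g
    return fitness(x, mats)

# n = r = 4 independent random real symmetric matrices.
rng = np.random.default_rng(0)
n = 4
mats = []
for _ in range(n):
    B = rng.standard_normal((n, n))
    mats.append((B + B.T) / 2)

# 16 restarts; clustered values in `levels` indicate repeated local maxima.
levels = sorted(ascend(mats, rng=rng) for _ in range(16))
```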
In the visualization, we observe that the fitness function F tends to have multiple local maxima, but in all my computer experiments, F never had more than n local maxima (recall that r=n here).
The fitness function F is my own. I am making this visualization to showcase some of its properties. Since F has few local maxima, it can be regarded as a safer and more interpretable machine learning algorithm. On the other hand, while F is safer, it lacks the depth of deep neural networks, so we need to develop such safer machine learning algorithms further so that they can compete with, or at least complement, deep neural networks.
Unless otherwise stated, all algorithms featured on this channel are my own. You can go to github.com/spo... to support my research on machine learning algorithms. I am also available to consult on the use of safe and interpretable AI for your business. I am designing machine learning algorithms for AI safety such as LSRDRs. In particular, my algorithms are designed to be more predictable and understandable to humans than other machine learning algorithms, and my algorithms can be used to interpret more complex AI systems such as neural networks. With more understandable AI, we can ensure that AI systems will be used responsibly and that we will avoid catastrophic AI scenarios. There is currently nobody else who is working on LSRDRs, so your support will ensure a unique approach to AI safety.

Comments: 2
@aaravyadav9906 · a month ago
What level of math would I need to know in order to understand what this is trying to do? I have only taken first-year linear algebra and multivariable calculus.
@josephvanname3377 · a month ago
More math is better here. The fitness function is a linear-algebraic function, so one needs a couple of good courses on linear algebra covering topics such as vector spaces, inner product spaces, the trace, eigenvalues, the Jordan decomposition, the spectral theorem, singular value decomposition, and matrix norms (such as Schatten norms). Functional analysis and Hilbert spaces would also be helpful.

For gradient ascent, multivariable calculus is good, but instead of dealing with just 3 variables, one needs to deal with many variables and even matrices, so one should go over some matrix calculus to see what the gradient actually is.

To get a better idea of why the fitness function has only a few local maxima (and I do not have a complete understanding of why this is the case) and a better understanding of the problem, one should go over some quantum information theory (quantum channels in particular). After all, the tuple of Hermitian matrices (A_1,...,A_r) is best understood as a real completely positive superoperator, which is essentially a real quantum channel (and here it does not make much of a difference whether we use the real or the complex numbers).

Added later: It takes a bit of experience training these fitness functions to determine what conditions are needed for the fitness function to apparently have one or a few local maxima, so one should train many fitness functions (for this experience, it is better to train many small models than a few large ones, since it is the number of models that gives the experience rather than their size). My visualizations should give an indication of some conditions under which these fitness functions tend to have just one local maximum.
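The correspondence mentioned at the end of the reply can be written down directly: a tuple (A_1,...,A_r) acts on matrices as the completely positive map Phi(X) = A_1 X A_1^T + ... + A_r X A_r^T, with the A_i playing the role of Kraus operators. A minimal illustration, with names of my own choosing:

```python
import numpy as np

def cp_map(mats, X):
    # Completely positive superoperator with Kraus operators A_1, ..., A_r:
    # Phi(X) = sum_i A_i X A_i^T.  For symmetric A_i we have A_i^T = A_i.
    return sum(A @ X @ A.T for A in mats)
```

With a single Kraus operator equal to the identity, Phi is the identity map, and in general Phi sends positive semidefinite matrices to positive semidefinite matrices, which is the defining property of a (real) quantum channel up to trace normalization.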