Machine learning recovering handwritten digits from the discrete Fourier transform magnitudes

Рет қаралды 420

Ай бұрын

This is an experiment to determine how easily images can be recovered from the absolute value of the discrete Fourier transform.
Given an MNIST digit A, we use gradient descent to minimize the distance squared between abs(F(sigmoid(X)))^2 and abs(F(A))^2 where F denotes the Fourier transform and the operations sigmoid,abs, and v^2 are applied entrywise.
The Fourier transform is invertible, but the absolute value operation is not invertible, so the problem of finding a real matrix Z where abs(F(Z))^2=B is a non-trivial problem to solve.
We use the sigmoid function so that sigmoid(X) is a matrix with entries in the interval (0,1); this is useful since the entries in the MNIST digit A are all in the interval [0,1].
We observe that the machine learning algorithm is often capable of accurately recovering the MNIST digit even though the algorithm sometimes fails to recover the MNIST digit.
The notion of gradient descent is not my own, but this particular loss function is my own. This loss function gives an example of where gradient descent can be used to accurately solve a problem and where if we run the gradient descent multiple times, we will often get nearly the same solution (up to translation invariance and flipping the image).
The absolute value of the Fourier transform is a useful invariant of an image since this invariant is unchanged by translations of the original image and by replacing the image with an upside down version of itself. And since the original data can be recovered from the absolute value of the Fourier transform (well at least for MNIST data) and the positional information, we conclude that machine learning models can be trained on the magnitudes of the Fourier transform alone.
Unless otherwise stated, all algorithms featured on this channel are my own. You can go to github.com/sponsors/jvanname to support my research on machine learning algorithms. I am also available to consult on the use of safe and interpretable AI for your business. I am designing machine learning algorithms for AI safety such as LSRDRs. In particular, my algorithms are designed to be more predictable and understandable to humans than other machine learning algorithms, and my algorithms can be used to interpret more complex AI systems such as neural networks. With more understandable AI, we can ensure that AI systems will be used responsibly and that we will avoid catastrophic AI scenarios. There is currently nobody else who is working on LSRDRs, so your support will ensure a unique approach to AI safety.

Пікірлер: 4

@justman017 8 күн бұрын

How it recovery? Redraw or just edit the image?

@josephvanname3377 8 күн бұрын

It recovers the original image by minimizing the loss level by moving in the direction opposite to the gradient. The loss level is the distance of the magnitudes of the discrete Fourier transform. I gave more details in the description.

@justman017 8 күн бұрын

@@josephvanname3377 owh thanks for the explanation! Now it crystal clear for me😆

@josephvanname3377 8 күн бұрын

@@justman017 You're welcome. I am glad that it is now clear.