In the beginning you described the diffusion process and mentioned that the model is trained to remove all the noise at each timestep. The way I understood it, only the new noise added between t=n-1 and t=n was predicted. Or has the methodology changed to predict all the noise? (Or maybe the objective was always to predict all the noise and I misunderstood diffusion this whole time lol)
@gabrielmongaras 6 months ago
Usually you train the model to predict all the noise. This way, you can change the number of steps you take to create the image. During inference, you can do a one-step solve, but the result will be quite bad because the model predicts with some error and the resulting space isn't flat; it has curvature, which is why diffusion models use multiple steps to solve for the final image. To use the noise prediction, you can go from x_t to the predicted x_0 and then add some noise back to get to x_{t-1} (like DDPM). Or you can just take a small step in the direction of x_0, going from x_t to x_{t-1} (like DDIM).
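In case it helps, here is a rough NumPy sketch of the two ways of using the noise prediction. It is only an illustration under assumptions that are not from the video: a linear beta schedule, the standard DDPM posterior for the stochastic step, and the deterministic (eta = 0) DDIM update. The function names are made up for this example, and `eps_pred` stands in for whatever the trained model would output.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    # Assumed linear beta schedule; alpha_bar[t] is the cumulative signal fraction.
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, eps, alpha_bar_t):
    # Forward process: x_t holds ALL the noise accumulated up to step t.
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

def predict_x0(x_t, eps_pred, alpha_bar_t):
    # Invert the forward process using the predicted total noise.
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    # Deterministic step: move toward the predicted x_0, reusing the same noise.
    x0_pred = predict_x0(x_t, eps_pred, alpha_bar_t)
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred

def ddpm_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev, rng):
    # Stochastic step: sample x_{t-1} from the posterior q(x_{t-1} | x_t, x0_pred).
    x0_pred = predict_x0(x_t, eps_pred, alpha_bar_t)
    alpha_t = alpha_bar_t / alpha_bar_prev
    beta_t = 1.0 - alpha_t
    coef_x0 = np.sqrt(alpha_bar_prev) * beta_t / (1.0 - alpha_bar_t)
    coef_xt = np.sqrt(alpha_t) * (1.0 - alpha_bar_prev) / (1.0 - alpha_bar_t)
    var = (1.0 - alpha_bar_prev) / (1.0 - alpha_bar_t) * beta_t
    mean = coef_x0 * x0_pred + coef_xt * x_t
    return mean + np.sqrt(var) * rng.standard_normal(x_t.shape)
```

With a perfect noise prediction, `predict_x0` recovers x_0 in one shot, which is why a one-step solve is possible in principle. With a real model, `eps_pred` is only approximate, so taking many smaller `ddim_step` or `ddpm_step` updates and re-predicting the noise each time usually gives a much better result.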