Lecture Recording: The Role of Padding Tokens in T2I Diffusion Models
In this lecture, we explore how padding tokens influence Text-to-Image (T2I) diffusion models. Although padding prompts to a fixed length is standard practice in modern T2I models, its impact on image generation has been largely overlooked.
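For viewers unfamiliar with the practice, here is a minimal sketch of what padding a prompt to a fixed length looks like. The token IDs, the `PAD_ID` value, and the `pad_to_length` helper are hypothetical and chosen only for illustration; the length of 77 matches the context length of CLIP text encoders, but other encoders use other lengths.

```python
PAD_ID = 0        # hypothetical padding token ID (real tokenizers define their own)
MAX_LENGTH = 77   # assumed fixed length; matches CLIP text encoders

def pad_to_length(token_ids, max_length=MAX_LENGTH, pad_id=PAD_ID):
    """Truncate or right-pad token_ids to max_length.

    Returns the padded IDs and a mask with 1 for real tokens, 0 for padding.
    """
    ids = list(token_ids)[:max_length]
    n_real = len(ids)
    n_pad = max_length - n_real
    padded = ids + [pad_id] * n_pad
    mask = [1] * n_real + [0] * n_pad
    return padded, mask

# Made-up token IDs standing in for a short tokenized prompt.
prompt_ids = [101, 2054, 2003, 102]
padded, mask = pad_to_length(prompt_ids)
print(len(padded), sum(mask))  # prints: 77 4
```

With a short prompt like this one, most of the sequence passed to the text encoder consists of padding tokens, which is why their effect is worth examining.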
We present two causal analysis techniques that examine how padding tokens affect the output at two stages, text encoding and the diffusion process, and that identify cases where the model ignores them altogether. Our findings link these effects to model architecture (cross-attention vs. self-attention) and training method (frozen vs. trained text encoder), offering insights for better T2I design and training.
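As a rough illustration of what a causal intervention on padding can look like, the sketch below swaps the representations at padding positions for those of a reference input, so that outputs generated with and without the swap can be compared. This is a generic patching idea written under assumed names (`patch_padding`, toy two-dimensional vectors), not necessarily the exact procedure presented in the lecture.

```python
def patch_padding(prompt_states, reference_states, mask):
    """Replace the states at padding positions (mask == 0) with reference states.

    prompt_states / reference_states: per-token vectors (lists of floats).
    mask: 1 for real prompt tokens, 0 for padding tokens.
    """
    if not (len(prompt_states) == len(reference_states) == len(mask)):
        raise ValueError("all inputs must have the same sequence length")
    return [
        p if m == 1 else r
        for p, r, m in zip(prompt_states, reference_states, mask)
    ]

# Toy data: 2 real tokens followed by 2 padding tokens.
mask = [1, 1, 0, 0]
prompt_states = [[0.9, 0.1], [0.4, 0.6], [0.3, 0.3], [0.2, 0.2]]
reference_states = [[0.0, 0.0]] * 4  # e.g. states from a neutral prompt (assumed)

patched = patch_padding(prompt_states, reference_states, mask)
print(patched)
```

If images generated from `patched` differ from those generated from `prompt_states`, the padding positions were carrying information the model uses; if they do not differ, the model is effectively ignoring them.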
#TextToImage #DiffusionModels #AIResearch