摘要
Text-to-image diffusion models can create stunning images from natural language
descriptions that rival the work of professional artists and photographers.
However, these models are large, with complex network architectures and tens of denoising
However, these models are large, with complex network architectures and tens of denoising
iterations, making them computationally expensive and slow to run.
As a result, high-end GPUs and cloud-based inference are required to run diffusion models at
As a result, high-end GPUs and cloud-based inference are required to run diffusion models at
scale.
This is costly and has privacy implications, especially when user data is sent
This is costly and has privacy implications, especially when user data is sent