Stability AI announced the open-source release of Stable Diffusion Version 2. According to Stability AI, the original Stable Diffusion V1, led by CompVis, changed the nature of open-source AI models and spawned hundreds of other models and innovations worldwide. It had one of the fastest climbs to 10K GitHub stars of any software, going on to pass 33K stars in under two months.
Stable Diffusion 2.0 delivers a number of major improvements and features over the original V1 release. Features and improvements include:
New Text-to-Image Diffusion Models
The Stable Diffusion 2.0 release includes robust text-to-image models trained using a new text encoder (OpenCLIP), developed by LAION with support from Stability AI, which greatly improves the quality of the generated images compared to the earlier V1 releases. The text-to-image models in this release can generate images at default resolutions of both 512×512 pixels and 768×768 pixels.
These models are trained on an aesthetic subset of the LAION-5B dataset created by the DeepFloyd team at Stability AI, which is then filtered to remove adult content using LAION’s NSFW filter.
Super-resolution Upscaler Diffusion Models
Stable Diffusion 2.0 also includes an Upscaler Diffusion model that enhances the resolution of images by a factor of four. Combined with the text-to-image models, Stable Diffusion 2.0 can now generate images with resolutions of 2048×2048 or even higher.
Other new features and improvements include a Depth-to-Image Diffusion Model and an updated Inpainting Diffusion Model.
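To make the upscaler arithmetic above concrete, here is a minimal Python sketch. The resolution math is runnable as-is; the commented-out lines show one possible way to load the released checkpoints via Hugging Face's `diffusers` library, which is an assumption on my part — the article itself does not name a library, and running those calls requires installing `diffusers` and downloading multi-gigabyte model weights.

```python
# Minimal sketch: the 4x Upscaler Diffusion model multiplies each image
# dimension by a fixed factor of 4, so a 512x512 text-to-image output
# becomes 2048x2048, matching the resolution quoted in the release notes.

def upscaled_size(width: int, height: int, factor: int = 4) -> tuple:
    """Resolution after applying the upscaler's fixed scale factor."""
    return (width * factor, height * factor)

# One way to wire the released checkpoints together (assumed API; the
# model IDs are the ones Stability AI published on Hugging Face, but
# these calls download large weights and need a GPU, so they are left
# commented out in this sketch):
#
# from diffusers import StableDiffusionPipeline, StableDiffusionUpscalePipeline
# txt2img = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-base")
# upscaler = StableDiffusionUpscalePipeline.from_pretrained("stabilityai/stable-diffusion-x4-upscaler")
# low_res = txt2img(prompt="a photo of an astronaut riding a horse").images[0]
# high_res = upscaler(prompt="a photo of an astronaut riding a horse", image=low_res).images[0]

print(upscaled_size(512, 512))  # (2048, 2048)
print(upscaled_size(768, 768))  # (3072, 3072)
```

The helper makes the "2048×2048 – or even higher" claim explicit: 512×512 output upscales to 2048×2048, and the 768×768 default goes even further.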
The Verge reported that users of the AI image generator Stable Diffusion are angry about an update to the software that “nerfs” its ability to generate NSFW output and pictures in the style of specific artists.
Stability AI, the company that funds and disseminates the software, announced the update, which re-engineers key components of the model and improves certain features such as upscaling (the ability to increase the resolution of images) and inpainting (context-aware editing). But the changes also make it harder for Stable Diffusion to generate certain types of images that have attracted both controversy and criticism. These include nude and pornographic output, photorealistic pictures of celebrities, and images that mimic the artwork of specific artists.
According to The Verge, unlike rival models such as OpenAI’s DALL-E, Stable Diffusion is open source. This allows the community to quickly improve on the tool and lets developers integrate it into their products free of charge. But it also means Stable Diffusion has fewer constraints on how it’s used and, as a consequence, has attracted criticism. Some artists are upset that an AI was trained on their artwork without their consent and can now reproduce their styles.
The Verge also reported that nude and pornographic images have been removed from Stable Diffusion’s training data. AI image generators are already being used to generate NSFW output, including both photorealistic and anime-style pictures. However, these models can also be used to generate NSFW imagery resembling specific individuals (known as non-consensual pornography) and images of child abuse.
CNET reported that AI-art technology creates images from text prompts, feeding them into a program designed to recognize patterns in immense quantities of real-world data. The result is upending the art and tech worlds, where AI-generated imagery and video have raised questions about what constitutes art and who should own the copyright to it.
Overall, I think AI-generated art is bad for artists. It means people will go straight to one of the many AI-art models, type in what they are looking for, and use whichever generated image they prefer. Many will take the lazy way out and grab something from an AI when they could be getting better work from real human artists, who deserve payment for their work.