Announcing Animagine XL 3.1

Thank you for using Animagine XL 3.0 and becoming a pioneer of anime-themed open-source text-to-image models with us. We are truly glad that the model received overwhelmingly positive feedback from users on different platforms. We cannot thank all of you enough for the feedback, support, and excitement for what’s to come next.

To celebrate the success of Animagine XL 3.0, we are happy to introduce Animagine XL 3.1 today, the next iteration of our opinionated open-source anime text-to-image model and a direct continuation of the Animagine XL V3 series. With enhanced knowledge, an all-new configuration to address overexposure, and powerful new aesthetic tags, Animagine XL 3.1 represents a major leap forward in open anime image generation.

In this iteration, we set out to improve the model’s capabilities and fix the issues found in the previous iteration. This blog post breaks down the details.

Built on Top of Animagine XL 3.0

Animagine XL 3.1 is a direct continuation of Animagine XL 3.0. We received many suggestions and much feedback on how to make the model suitable for everyone, so we adopted incremental learning, which lets us update the model almost every month.

Starting from the base version of Animagine XL 3.0, we trained on 2x A100 80GB GPUs at Runpod. The model was pretrained on a data-rich collection of 870k ordered and tagged images for 15 days in the second half of February, totaling over 350 GPU hours. In this iteration, we focused on three things:

  1. To increase model knowledge with new data,
  2. To give the model aesthetic appeal while keeping its focus on concept understanding, and
  3. To find out the root of the overexposure problem and how to avoid it.

We are truly grateful to SeaArt for funding our model training process via Runpod Credits. Thank you for supporting the open-source community.

Tag Ordering

Inspired by what NovelAI did in their anime text-to-image model last year, NovelAI Diffusion V3, we still build our datasets with tag ordering, meaning that prompt order is crucial to get what you want.

For optimal results, it’s recommended to follow this structured prompt template:

1boy/1girl, what character, from which series, everything else in random order
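The template above can be sketched as a small helper that joins tags in the recommended order. This is only an illustration: the function name `build_prompt`, its parameters, and the example character and tag strings are assumptions made for this sketch and are not part of the model or its tooling.

```python
def build_prompt(subject, character=None, series=None, other_tags=()):
    """Join tags in the recommended order: subject, character, series, then the rest.

    Hypothetical helper for illustration only; the model itself just receives
    the final comma-separated string.
    """
    ordered = [subject]
    if character:
        ordered.append(character)
    if series:
        ordered.append(series)
    # Everything else can follow in any order.
    ordered.extend(other_tags)
    # Strip whitespace and drop empty entries before joining.
    return ", ".join(tag.strip() for tag in ordered if tag and tag.strip())


# Example tags are illustrative assumptions, not an official tag list.
prompt = build_prompt(
    "1girl",
    character="souryuu asuka langley",
    series="neon genesis evangelion",
    other_tags=["upper body", "looking at viewer", "outdoors"],
)
print(prompt)
```

Keeping the subject, character, and series as separate arguments makes it harder to put them in the wrong position by accident, while the remaining tags stay free-form.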

Better Knowledge

In Animagine XL 3.0, we primarily focused on adding characters from popular gacha games. In this iteration, we are integrating numerous well-known anime franchises into our dataset, and this expansion of the training data significantly enhances Animagine XL 3.1’s knowledge base.

The model now understands a vast range of anime more deeply: from the legendary Neon Genesis Evangelion to the newly aired Kusuriya no Hitorigoto. It spans from the oldest anime art styles to the most modern art styles. This development doesn’t cover everything but significantly broadens Animagine XL’s capabilities in generating and recognizing characters, themes, and styles from a wide spectrum of anime history, catering to fans of various genres.

Improved Special Tags

Animagine XL 3.1 utilizes a refined set of special tags to guide the model towards generating images with specific qualities, ratings, creation dates, and aesthetics. While not strictly necessary, these tags are powerful tools for achieving your desired results.

Aesthetic Tags

A major addition in Animagine XL 3.1 is the set of aesthetic tags, which categorize content based on visual appeal. These tags are derived from a specialized Vision Transformer (ViT) model, shadowlilac/aesthetic-shadow-v2. Combined with the quality tags, they can be used to guide the model toward better results. Below is the list of aesthetic tags, sorted from best to worst:

  • very aesthetic
  • aesthetic
  • displeasing
  • very displeasing

Quality Tags

The quality tags in Animagine XL 3.1 have been updated to consider both scores and post ratings, ensuring a more balanced distribution of quality in the generated images. We’ve also made the labels clearer, such as changing ‘high quality’ to ‘great quality’. Below is the list of quality tags, sorted from best to worst:

  • masterpiece
  • best quality
  • great quality
  • good quality
  • normal quality
  • low quality
  • worst quality

Year Tags

The year range tags have been redefined to more accurately represent specific modern and vintage anime art styles. This simplified range focuses on distinct eras relevant to current and past anime aesthetics. Below is the list of year tags, sorted from newest to oldest:

  • newest
  • recent
  • mid
  • early
  • oldest
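As a rough sketch of how the three tag lists above might be used together, the helper below validates the chosen special tags and appends them to an existing prompt. The function name `with_special_tags` and the choice to append the tags at the end of the prompt are assumptions made for illustration; only the tag names themselves come from the lists above.

```python
# Tag vocabularies copied from the lists in this post.
AESTHETIC_TAGS = ["very aesthetic", "aesthetic", "displeasing", "very displeasing"]
QUALITY_TAGS = [
    "masterpiece", "best quality", "great quality", "good quality",
    "normal quality", "low quality", "worst quality",
]
YEAR_TAGS = ["newest", "recent", "mid", "early", "oldest"]


def with_special_tags(prompt, quality=None, aesthetic=None, year=None):
    """Append validated special tags to a prompt (hypothetical helper).

    Placing the special tags after the rest of the prompt is an assumption
    of this sketch, not a rule stated in the post.
    """
    extras = []
    for kind, value, allowed in (
        ("quality", quality, QUALITY_TAGS),
        ("aesthetic", aesthetic, AESTHETIC_TAGS),
        ("year", year, YEAR_TAGS),
    ):
        if value is None:
            continue
        if value not in allowed:
            raise ValueError(f"unknown {kind} tag: {value!r}")
        extras.append(value)
    return ", ".join([prompt, *extras]) if extras else prompt


print(with_special_tags("1girl, solo", quality="masterpiece",
                        aesthetic="very aesthetic", year="newest"))
```

Validating against the lists catches outdated labels early; for example, passing the retired ‘high quality’ label raises an error instead of silently producing a tag the model was not trained on.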

Addressing the Overexposure Issue

In Animagine XL 3.0, we encountered several problems, including an unbalanced distribution of quality tags and gradients that failed to synchronize because of DDP issues when training on multiple GPUs. These factors made the results overly sensitive and explicit, even with safe prompts, and also produced more artifacts and poor anatomy.

To address these issues in Animagine XL 3.1, we implemented the following changes:

  • Optimized the quality tag distribution by considering post ratings in addition to scores, for a more balanced dataset.
  • Updated to newer commits of the training scripts after the DDP issue was fixed, ensuring proper gradient synchronization across multiple GPUs.
  • Experimented with cosine annealing with warm restarts as the learning rate scheduler, allowing the model to explore different learning rates.
  • Decayed the learning rate by a factor of 0.9 every cycle to gradually fine-tune the model.
  • Used the AdamW optimizer with weight decay set to 0.1 to prevent overexposure and improve generalization.
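To make the scheduler description more concrete, here is a minimal pure-Python sketch of cosine annealing with warm restarts in which the learning rate at the start of each cycle is multiplied by 0.9. Only the 0.9 decay factor comes from this post; the function name `lr_at_step`, the base learning rate, the cycle length, and the minimum learning rate are placeholder assumptions, and applying the decay to the peak of each cycle is one plausible reading rather than a description of the actual training script.

```python
import math


def lr_at_step(step, base_lr=1e-5, cycle_steps=1000, decay=0.9, min_lr=0.0):
    """Learning rate for cosine annealing with warm restarts and per-cycle decay.

    base_lr, cycle_steps, and min_lr are illustrative placeholders;
    only decay=0.9 is taken from the post.
    """
    # Which restart cycle we are in, and how far into it.
    cycle, pos = divmod(step, cycle_steps)
    # Each restart begins from a peak that is `decay` times the previous one.
    peak = base_lr * decay ** cycle
    # Cosine curve from 1 (start of cycle) down toward 0 (end of cycle).
    cosine = 0.5 * (1.0 + math.cos(math.pi * pos / cycle_steps))
    return min_lr + (peak - min_lr) * cosine


# Print the learning rate at the start, middle, and end of the first cycles.
for step in (0, 500, 999, 1000, 1500, 2000):
    print(step, lr_at_step(step))
```

Within a cycle the rate falls along a cosine curve, then jumps back up at the restart, but to a peak 10% lower than the previous one. In PyTorch, `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts` provides the restarts but has no per-cycle decay parameter, so a function like this would need to be wrapped in a custom scheduler, alongside `torch.optim.AdamW(..., weight_decay=0.1)` for the optimizer.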

Get Started With Animagine XL 3.1

There are several ways to get started with this model:

  • Animagine XL 3.1 is available as an early release on SeaArt and Hugging Face.
  • Animagine XL 3.1 is live on Hugging Face Spaces, powered by ZeroGPU on Nvidia A100 hardware.
  • Animagine XL 3.1 will also be released on other platforms at a later date.

About the License

Because it is based on Animagine XL 3.0, Animagine XL 3.1 falls under the Fair AI Public License 1.0-SD, which is compatible with the Stable Diffusion models’ license. Key points:

  1. Modification Sharing: If you modify Animagine XL 3.1, you must share both your changes and the original license.
  2. Source Code Accessibility: If your modified version is network-accessible, provide a way (like a download link) for others to get the source code. This applies to derived models too.
  3. Distribution Terms: Any distribution must be under this license or another with similar rules.
  4. Compliance: Non-compliance must be fixed within 30 days to avoid license termination, emphasizing transparency and adherence to open-source values.

We chose this license to keep the Animagine XL series and its derivatives (including this iteration) open and modifiable, in line with the spirit of the open-source community. It protects contributors and users, encouraging a collaborative, ethical open-source community. This ensures the model not only benefits from communal input but also respects open-source development freedoms.

Announcing Animagine XL 3.1
https://cagliostrolab.net/posts/animagine-xl-v31-release/
Author
Cagliostro Research Lab
Published at
2024-03-18