
How to Use Dreambooth In Stable Diffusion



What Is Dreambooth?

In 2022, the Google research team introduced Dreambooth, a technique to fine-tune diffusion models, specifically Stable Diffusion, by incorporating a custom subject into the model. The name “Dreambooth” is inspired by the photo booth concept, where once the subject is captured, it can be synthesized in various dreamlike settings.

dreambooth-showcase-image
In Dreambooth, you train the AI art model with a few input images and can then synthesize the subject in various “dream-like” settings. Image credits.

Check out: How to use Stable Diffusion

How Does Dreambooth Work?

how-does-dreambooth-work-example-image
Images of a dog [V] are used as input, and via the Dreambooth method and a unique identifier, the dog [V] can be placed in different settings.

The Dreambooth method uses a few input images and a corresponding class name to fine-tune a text-to-image model. The model can then generate diverse instances of the subject by implanting a unique identifier in different sentences.

The approach involves two steps:

  • Fine-tuning the low-resolution model with input images and a text prompt containing the class name.
  • Fine-tuning the super-resolution components using pairs of low-resolution and high-resolution images.

The method achieves high fidelity and realistic interaction between the subject and the scene.

Using a few input images (typically 3-5) and a text prompt containing a unique identifier and the class name (e.g., “A photo of a [V] dog”), the Dreambooth method creates a personalized text-to-image model that encodes the unique identifier ([V] in this case) for the subject.

This identifier can be implanted in different sentences during inference to synthesize subjects in diverse contexts.

The core concept of Dreambooth:

  • It uses a unique identifier for the new subject (the [V] dog could, for example, be named XYK-3) so that the identifier has no prior meaning inside the model.
  • The corresponding class refers to the subject at hand. In our case, the dog represents the class, so Dreambooth is fine-tuned so that XYK-3 is used when generating an image of a dog (the class); see the short sketch below.
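
To make the naming concrete, here is a tiny, purely illustrative Python sketch of how the instance prompt, class prompt, and inference prompt relate. The identifier xyk3 and the helper functions are invented for illustration; they are not part of any Dreambooth API.

```python
UNIQUE_ID = "xyk3"    # the [V] token: a rare word with no prior meaning in the model
CLASS_NAME = "dog"    # the class the subject belongs to

def instance_prompt() -> str:
    # Paired with your 3-5 subject photos during fine-tuning.
    return f"a photo of a {UNIQUE_ID} {CLASS_NAME}"

def class_prompt() -> str:
    # Paired with regularization images so the model keeps its general idea of a dog.
    return f"a photo of a {CLASS_NAME}"

def inference_prompt(context: str) -> str:
    # After training, the identifier can be dropped into new sentences.
    return f"a photo of a {UNIQUE_ID} {CLASS_NAME} {context}"

print(instance_prompt())                # a photo of a xyk3 dog
print(inference_prompt("on the moon"))  # a photo of a xyk3 dog on the moon
```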

Textual inversion works in a similar way, as you also have a trigger word, or in this case, a unique identifier. However, textual inversion fine-tunes only the text embedding part of the model, whereas Dreambooth fine-tunes the whole model.
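
As a rough illustration of that difference, the sketch below marks which components would receive gradient updates in each approach. The text_encoder and unet variables are placeholder modules standing in for the corresponding Stable Diffusion components, not the extension's actual code.

```python
import torch.nn as nn

def set_trainable(module: nn.Module, trainable: bool) -> None:
    for p in module.parameters():
        p.requires_grad = trainable

def configure_textual_inversion(text_encoder: nn.Module, unet: nn.Module) -> None:
    # Textual inversion: the diffusion model stays frozen; only a new token
    # embedding inside the text encoder is learned (in practice, just that
    # single embedding row is left trainable).
    set_trainable(unet, False)
    set_trainable(text_encoder, False)

def configure_dreambooth(text_encoder: nn.Module, unet: nn.Module) -> None:
    # Dreambooth: the diffusion model itself is fine-tuned, often together
    # with the text encoder.
    set_trainable(unet, True)
    set_trainable(text_encoder, True)

# Tiny stand-in modules just to show the effect:
text_encoder, unet = nn.Linear(4, 4), nn.Linear(4, 4)
configure_dreambooth(text_encoder, unet)
print(all(p.requires_grad for p in unet.parameters()))  # True
```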

How to Do Dreambooth Training In Stable Diffusion?

To start the training process, you need the following things:

  • Custom images
  • A unique identifier
  • A class name

Step 1. Gather custom images

The higher the quality of the images you use for training, the better. Suppose you want to inject a specific subject into the image generation process (a dog, for example). In that case, you should have images covering the subject from multiple angles.

If possible, the subject should have multiple different backgrounds so that the model can differentiate the subject from the background.

Step 2. Bulk resize and crop images

birme-image-exporting-saving-files
Remember to use a 512 x 512 px image size.

I used Birme (a free bulk image resizing tool) to resize and crop my artwork for the Dreambooth training. When you have made sure that the images are correctly framed and the main subject is within the square, save the files to your computer.

dreambooth-training-data
Training data to be used with Dreambooth.
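
If you prefer to script this step instead of using Birme, here is a small Python sketch (assuming the Pillow library is installed) that center-crops each image to a square and resizes it to 512 x 512 pixels. The folder names are just examples; adjust them to match your setup.

```python
# A scripted alternative to Birme: center-crop to a square, then resize to 512x512.
# Folder names are examples; TRAINING-DATA-CROPPED matches the folder used later.
from pathlib import Path

from PIL import Image

SRC = Path("TRAINING-DATA-RAW")        # your original images
DST = Path("TRAINING-DATA-CROPPED")    # the folder Dreambooth will read from
DST.mkdir(exist_ok=True)

for path in SRC.iterdir():
    if path.suffix.lower() not in {".jpg", ".jpeg", ".png", ".webp"}:
        continue
    img = Image.open(path).convert("RGB")
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((512, 512), Image.LANCZOS)
    img.save(DST / f"{path.stem}.png")
```

Note that an automatic center-crop cannot judge composition, so check the results and recrop any image where the subject got cut off; that manual check is exactly what Birme's preview helps with.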

Step 3. Get the Dreambooth extension for Stable Diffusion

When you’ve launched the AUTOMATIC1111 WebUI, go to Extensions -> Available and click Load from. Find the Dreambooth extension (training) in the list and select Install.

After the installation, you should see it in the Installed tab:

dreambooth-stable-diffusion-installation
The ‘Installed’ tab shows the extensions you’ve installed to your AUTOMATIC1111 WebUI.

Click Apply and restart UI. If the Dreambooth tab does not appear in the WebUI, you might need to restart the AUTOMATIC1111 WebUI completely.

You can check Dreambooth’s GitHub repository for known issues and further help.

Step 4. Create a model

creating-dreambooth-model-in-stable-diffusion
You can create a Dreambooth model by selecting ‘Create model.’

When you have the Dreambooth tab open, select the Create tab. Name your model and select the Source checkpoint model to be used. Depending on your computer, the model creation can take several minutes to complete.

When the model creation is complete, you can find your model in the following folder: *\stable-diffusion-webui\models\dreambooth

Step 5. Dreambooth model settings and parameters

dreambooth-training-settings
You have two different tabs you need to adjust, ‘Settings’ and ‘Concepts’.

Settings tab

  • Training steps per image (Epochs): 1000 (see the rough step calculation below)
  • Learning rate: 0.000002
  • Mixed precision: fp16
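
To get a feel for how long the run will be, here is a rough calculation. It assumes the extension performs one optimizer step per training image per epoch, which is how ‘Training steps per image (Epochs)’ is commonly interpreted; treat the result as an estimate, not a guarantee.

```python
# Rough estimate of total optimizer steps (an assumption about how the
# extension counts steps, not a guarantee).
num_images = 10           # example: number of cropped training images
epochs = 1000             # "Training steps per image (Epochs)" from the Settings tab
batch_size = 1

total_steps = num_images * epochs // batch_size
print(total_steps)        # 10000
```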

Concepts tab

As I’m creating a style-based model, I will set my instance prompt to “images of okuha style”. I also chose the ‘Training Wizard (Object/style)’ option.

If you were creating an animal- or person-based model, you could use the following examples as your instance prompt:

  • photo of xxxx dog
  • images of xxxx object
  • photo of xxxx person

The main thing is that the instance prompt includes the class of your subject.

The class prompt could be the following:

  • photo of a dog
  • images of an object
  • photo of a person
  • images of cel shading style

The dataset directory is where your training data (cropped images from Birme) is. For me, it’s in the following folder:

C:\Users\****\****\TRAINING-DATA-CROPPED

Paste this address to the Dataset directory section.

You should use regularization images as part of your training data; however, I didn’t use any this time. Regularization images can improve the output quality of your model. You would put the regularization images in the Classification Dataset Directory field.

If you have ten custom images, you should have 100 regularization images.

Regularization images:

In Dreambooth, regularization images help preserve the model’s prior knowledge of the class (so it does not overfit to your few custom images), encourage smooth, predictable predictions, and improve the quality and consistency of the output images.
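
If you want to generate your own regularization (class) images instead of collecting them, here is a hedged sketch using the Hugging Face diffusers library rather than the WebUI extension. The model id, prompt, and folder name are example values.

```python
# Hedged sketch: generate class/regularization images with diffusers.
# The model id, prompt, and output folder are example values.
from pathlib import Path

import torch
from diffusers import StableDiffusionPipeline

out_dir = Path("REGULARIZATION-DOG")
out_dir.mkdir(exist_ok=True)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example base model id
    torch_dtype=torch.float16,
).to("cuda")

class_prompt = "photo of a dog"   # should match the class prompt used in training
num_images = 100                  # roughly 10x the number of custom images

for i in range(num_images):
    image = pipe(class_prompt, num_inference_steps=30).images[0]
    image.save(out_dir / f"dog_{i:03d}.png")
```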

Step 6. Start training

Click the Train button to start the training. You can follow the process either through the cmd window (Windows OS) or the Output window next to the Input.

Note: you might need to click the ‘Train’ button a few times to get the training going.

dreambooth-training-output-progress-bar
You can follow the progress from the ‘Output’ window.

Step 7. Using the custom checkpoint model

dreambooth-training-results-custom-ai-art-model

When you’ve trained your custom Dreambooth checkpoint model, you can use it by copy-pasting it to the following folder: *\stable-diffusion-webui\models\Stable-diffusion
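
If you want to sanity-check the trained checkpoint outside the WebUI, the following is a hedged sketch using diffusers' single-file loader; the checkpoint file name is just an example and should match the model you trained.

```python
# Hedged sketch: load the trained checkpoint with diffusers' single-file loader.
# The checkpoint file name is an example; point it at the model you trained.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "stable-diffusion-webui/models/Stable-diffusion/okuha-style.ckpt",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("images of okuha style, a quiet village street at dusk").images[0]
image.save("dreambooth-test.png")
```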

Troubleshooting Dreambooth training

  • I had some issues with my Dreambooth installation (torch and torchvision were outdated). I deleted the venv folder: STABLE-DIFFUSION\stable-diffusion-webui\venv
    • This made the WebUI download the latest versions of both.
    • This will also install a whole bunch of other software, so finishing might take a while.
  • I also renamed my STABLE DIFFUSION folder to STABLE-DIFFUSION.
  • Installed xformers by editing webui-user.bat and adding the flag: set COMMANDLINE_ARGS= --xformers
  • My full command-line settings: COMMANDLINE_ARGS= --api --xformers --precision full --no-half --lowvram


Minimum System Requirements For Dreambooth

Below are the minimum hardware requirements for using Stable Diffusion and for training Dreambooth. Training Dreambooth with 6 GB of VRAM can be challenging even with optimized parameters and LoRA settings.

  • CPU: Any AMD or Intel processor.
  • RAM: At least 16GB, preferably the latest DDR memory available.
  • GPU: Any GeForce RTX GPU that has at least 8GB memory.
  • Storage: Preferably any SSD drive with at least 200GB of storage space.
  • OS: Mac, Windows, or Linux
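
Before committing to a long training run, you can quickly check how much VRAM your GPU reports. This small snippet assumes PyTorch with CUDA support is installed (it already is inside a working AUTOMATIC1111 venv).

```python
# Quick check of reported GPU memory before attempting Dreambooth training.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA GPU detected - Dreambooth training will not be practical.")
```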

Check out: Hardware and software requirements for Stable Diffusion

Running Dreambooth with 6GB VRAM

Settings:

  • Dataset (images): 10-20
  • Use LoRA: checked (you can also try unchecked)
  • Training Steps Per Image (Epochs): 200-300
  • Save Model Frequency (Epochs): 20-30
  • Save Preview(s) Frequency (Epochs): 0
  • Batch size: 1
  • Gradient Accumulation Steps: 1
  • Class Batch Size: 1
  • Set Gradients to None When Zeroing: checked
  • Gradient Checkpointing: checked
  • Lora UNET Learning Rate: 0.0009 (if LoRA unchecked: Learning rate: 0.000001)
  • Lora Text Encoder Learning Rate: 0.00005 (if LoRA unchecked: Learning rate: 0.000002)
  • Learning Rate Scheduler: cosine or polynomial
  • Learning Rate Warmup Steps: 0
  • Image processing – Resolution: 512
  • Optimizer: 8bit AdamW
  • Mixed precision: fp16
  • Memory attention: xformers
  • Cache latents: checked

Depending on your hardware, you might be unable to train a custom Dreambooth AI art model.

Dreambooth Training On Cloud

RunDiffusion, Getimg, and Leonardo AI offer options to train a Dreambooth model without worrying about hardware or software requirements. The training takes place on their servers, and you only need to gather the input images.
