Why Our Combined LoRA and Textual Inversion Approach is the Best for Spitz Fur Inpainting


For the last couple of months, we’ve been refining our inpainting technique for black-and-white Spitz fur. Our research involved comparing Stable Diffusion models, fine-tuning with LoRA, and experimenting with Textual Inversion.

In this article, we reveal how combining LoRA and Textual Inversion has proven to be the most effective approach for our specific needs. We’ll also discuss the metric we used to measure our success and explain why this combined method stands out.

[Figure: Original image and its evaluation score]

[Figure: Original image for our first upscale iteration and its evaluation score]

The combined approach: LoRA and Textual Inversion

By combining LoRA and Textual Inversion, we leverage the strengths of both: 

  • LoRA enables us to fine-tune the model’s attention mechanisms to better capture the intricate details of Spitz fur.
  • Textual Inversion allows us to introduce a custom token that captures the unique style of black-and-white fur.

This synergy results in inpainting outputs that are not only structurally accurate but also stylistically consistent with our target aesthetic.

LoRA (Low-Rank Adaptation) keeps the base model's weights frozen and trains small low-rank matrices that are added to the projections in the attention layers, injecting domain-specific knowledge where it matters most. This is crucial for adapting the model to the unique characteristics of Spitz fur. Textual Inversion, on the other hand, leaves the model weights untouched and teaches the model a new concept by learning the embedding vector of a custom token.

By introducing a token like <dog_fur_style>, we guide the model to generate or inpaint fur in a way that’s consistent with our desired style. Together, these methods provide a comprehensive solution for our inpainting challenges.
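To make the two mechanisms concrete, here is a toy NumPy sketch. It is not the training code used for these experiments: the layer sizes, rank, scaling factor, and the tiny vocabulary are all hypothetical, and a real setup would apply the same ideas inside the diffusion model's attention layers and text encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- LoRA: frozen weight W plus a trainable low-rank update B @ A ---
d_model, rank, alpha = 16, 2, 4.0                 # toy sizes (hypothetical)
W = rng.normal(size=(d_model, d_model))           # frozen attention projection
A = rng.normal(scale=0.01, size=(rank, d_model))  # trainable, small random init
B = np.zeros((d_model, rank))                     # trainable, zero init

def lora_forward(x, W, A, B, alpha, rank):
    """Project x with the adapted weight W + (alpha / rank) * B @ A."""
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T

# --- Textual Inversion: one new row in the token embedding table ---
vocab = {"a": 0, "photo": 1, "of": 2, "dog": 3}
embeddings = rng.normal(size=(len(vocab), d_model))  # frozen embedding table

def add_token(vocab, embeddings, token, init_from):
    """Register a new token whose vector starts as a copy of an existing one."""
    new_row = embeddings[vocab[init_from]].copy()
    new_vocab = {**vocab, token: len(vocab)}
    return new_vocab, np.vstack([embeddings, new_row])

vocab, embeddings = add_token(vocab, embeddings, "<dog_fur_style>", "dog")

# During training only A, B, and the new embedding row would be updated.
x = embeddings[vocab["<dog_fur_style>"]][None, :]
out = lora_forward(x, W, A, B, alpha, rank)

full_params = W.size            # parameters in the frozen projection
lora_params = A.size + B.size   # parameters LoRA trains instead
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the frozen one, and the new token starts out meaning the same as the token it was copied from. Training then moves both away from that starting point.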

[Figure: Image version with just SDXL (without LoRA or Textual Inversion)]

Measuring success: Our custom metric

To objectively evaluate the performance of our combined approach, we developed a custom machine learning-based metric that integrates a comprehensive set of image quality assessments. 

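Rather than relying on a single traditional metric such as SSIM or a perceptual similarity score, our approach combines multiple indicators into a unified score: variance of the Laplacian, image entropy, gradient contrast, PSNR (peak signal-to-noise ratio), SSIM (structural similarity index), VIF (visual information fidelity), LPIPS (learned perceptual image patch similarity), and more.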
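Three of these indicators need no reference image and are simple enough to sketch in NumPy. The definitions below are common textbook forms and are an assumption on our part for illustration: in particular, "gradient contrast" is taken here to mean the mean gradient magnitude, and the exact formulas in the production metric may differ.

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian; higher values suggest sharper detail."""
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def shannon_entropy(gray_u8):
    """Shannon entropy in bits of the 8-bit intensity histogram."""
    hist = np.bincount(gray_u8.ravel(), minlength=256).astype(np.float64)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def gradient_contrast(gray):
    """Mean gradient magnitude computed from finite differences."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return float(np.hypot(gx, gy).mean())

# A flat patch versus a noisy patch as stand-ins for smooth and textured fur.
rng = np.random.default_rng(0)
flat = np.full((32, 32), 128, dtype=np.uint8)
textured = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)

for name, patch in (("flat", flat), ("textured", textured)):
    print(name, laplacian_variance(patch), shannon_entropy(patch),
          gradient_contrast(patch))
```

All three values are zero for the flat patch and positive for the textured one, which is the behaviour a sharpness-oriented indicator should have.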

This custom metric checks two things:

  • How accurately the AI recreates the fur’s structure – sharp edges and detailed texture
  • How naturally the new fur blends in – smooth transitions and fewer errors

It’s designed specifically to evaluate how well we’re inpainting Spitz dog fur.

Our evaluation pipeline preprocesses images into multiple formats – integer-based for sharpness metrics and float-based for reference comparisons – before computing the individual metrics. 
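A minimal sketch of that preprocessing step, assuming 8-bit RGB input: the function names, the Rec. 601 grayscale weights, and the [0, 1] float range are illustrative choices rather than a description of the actual pipeline. PSNR is included as one example of a reference comparison on the float version.

```python
import numpy as np

def preprocess(rgb_u8):
    """Return (gray_u8, rgb_float) versions of an HxWx3 uint8 image."""
    # Integer grayscale (Rec. 601 luma weights) for sharpness metrics.
    weights = np.array([0.299, 0.587, 0.114])
    gray_u8 = np.clip(np.rint(rgb_u8 @ weights), 0, 255).astype(np.uint8)
    # Float copy scaled to [0, 1] for reference-based metrics.
    rgb_float = rgb_u8.astype(np.float64) / 255.0
    return gray_u8, rgb_float

def psnr(reference, candidate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two float images."""
    mse = np.mean((reference - candidate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

# Random pixels stand in for a real photo here.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
gray_u8, original_f = preprocess(original)
```

Keeping the two representations separate avoids a common pitfall: subtracting `uint8` arrays directly wraps around on negative differences, which silently corrupts any error-based metric.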

These results are then aggregated into a single, weighted score that reflects the overall quality of the inpainting. This holistic approach ensures our assessment captures the full spectrum of fur restoration quality, from technical precision to aesthetic appeal.
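The aggregation idea can be sketched as a weighted average of normalised metrics. Everything specific below is a placeholder: the metric subset, the weights, and the normalisation ranges are hypothetical, and fixed weights stand in for whatever weighting the actual metric uses.

```python
def normalise(value, lo, hi, higher_is_better=True):
    """Map a raw metric onto [0, 1], clipping values outside [lo, hi]."""
    scaled = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    return scaled if higher_is_better else 1.0 - scaled

def weighted_score(raw, spec):
    """Weighted mean of normalised metrics; the result lies in [0, 1]."""
    total_weight = sum(s["weight"] for s in spec.values())
    total = sum(
        s["weight"] * normalise(raw[name], s["lo"], s["hi"], s["higher_is_better"])
        for name, s in spec.items()
    )
    return total / total_weight

# Hypothetical weights and ranges, for illustration only.
SPEC = {
    "ssim":  {"weight": 2.0, "lo": 0.0,  "hi": 1.0,  "higher_is_better": True},
    "psnr":  {"weight": 1.0, "lo": 10.0, "hi": 50.0, "higher_is_better": True},
    "lpips": {"weight": 2.0, "lo": 0.0,  "hi": 1.0,  "higher_is_better": False},
}

score = weighted_score({"ssim": 0.5, "psnr": 30.0, "lpips": 0.5}, SPEC)
```

Normalising first matters because the raw metrics live on different scales and point in different directions: PSNR is measured in decibels, SSIM lies between 0 and 1, and for LPIPS a lower value is better.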

[Figure: Image edited with LoRA and its evaluation score]

Results and analysis: Superior performance

Our experiments showed that the combined LoRA and Textual Inversion approach significantly outperformed standalone methods. Using our custom metric, we recorded an average quality score improvement of 4% over LoRA alone and 7% over Textual Inversion alone. 
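For readers who want to reproduce this kind of comparison on their own data, the calculation is sketched below under the assumption that the percentages are relative improvements in the mean score. The scores in the example are placeholders chosen to make the arithmetic easy to follow, not our measurements.

```python
def relative_improvement(baseline, candidate):
    """Percentage change of candidate over baseline."""
    return (candidate - baseline) / baseline * 100.0

# Placeholder mean scores, for illustration only.
baseline_score = 0.75
candidate_score = 0.78

gain = relative_improvement(baseline_score, candidate_score)
print(f"{gain:.1f}% relative improvement")
```

A relative gain is not the same as a difference in percentage points: in this example the score rises by 0.03 in absolute terms, which is 4% of the baseline.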

In blind visual assessments, the combined method was favored in 87% of cases, with evaluators praising the crisp fur details, seamless blending, and authentic texture. These findings confirm that our combined approach achieves superior inpainting quality for our specific domain.

[Figure: Image edited with Textual Inversion and its evaluation score]

Conclusion: The optimal solution

Our tests show that using LoRA and Textual Inversion together works best for adding realistic black-and-white Spitz fur to images.

[Figure: This is how the image looks with our combined method]

This approach delivers an unmatched balance of structural accuracy and stylistic fidelity, making it our definitive choice for high-quality, domain-specific inpainting. With this powerful stack finalized, we’re confident in its ability to handle even the most demanding fur restoration tasks.
