The Stable Diffusion model is a popular and effective model for image generation, but the human hands it generates are sometimes non-standard, for example with fewer or more than five fingers. This research addresses this issue by introducing a comprehensive pipeline that not only detects these inaccuracies but also restores the affected hands so that they closely resemble real-world hand images, termed standard hands. Building upon the foundational HaGRID dataset, we curated our own dataset tailored to the specific challenges of non-standard hand representations. Our methodology incorporates a detection phase using a fine-tuned YOLO model, which identifies and categorizes hand types across diverse datasets: images generated by Stable Diffusion, real photographs, and redrawn samples from the HaGRID dataset. Following detection, our multi-phase restoration process involves body pose estimation, control image generation, and subsequent inpainting, transforming non-standard hands into their standard counterparts. The conducted experiments validate the robustness and efficacy of our approach, marking a significant advancement in enhancing the Stable Diffusion model's capabilities in hand image generation. For quick and easy use, we have encapsulated our methodology into an interactive web application. This platform enables users to quickly upload images and get immediate restoration feedback.
@article{zhang2023detecting,
title={Detecting and Restoring Non-Standard Hands in Stable Diffusion Generated Images},
author={Zhang, Yiqun and Qin, Zhenyue and Liu, Yang and Campbell, Dylan},
journal={arXiv preprint arXiv:2312.04236},
year={2023}
}