Bringing Diversity from Diffusion Models to Semantic-Guided Face Asset Generation
High-quality 3D face asset creation remains costly due to reliance on controlled capture setups and manual processing, limiting scalability and diversity. We introduce a fully automated, semantically controllable framework for generating PBR-ready 3D facial assets without requiring dedicated scans. Our pipeline begins with a diffusion-based data synthesis stage, where 2D portrait samples from a pre-trained diffusion model are converted into 44K textured 3D face reconstructions via our proposed geometry recovery and texture normalization algorithm, which aligns arbitrarily shaded outputs into clean albedo space. Using this dataset, we train a disentangled adversarial generator that maps semantic attributes (age, gender, ethnicity) to UV-space geometry and albedo, enabling both direct sampling and continuous latent editing while preserving identity. A refinement stage further produces PBR materials and secondary assets (eyeballs, teeth, gums). The resulting system supports controllable face generation and post-editing in real time and exports directly to standard rendering and animation pipelines. We evaluate each component extensively and provide a web-based interactive interface to showcase practical deployment.
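The semantic editing described above can be illustrated with a toy sketch. It is not the paper's generator: `ToyConditionalGenerator`, `FaceSample`, `edit_attribute`, the vector sizes, and the linear interpolation of a single attribute are all hypothetical stand-ins, chosen only to show how an identity code can stay fixed while one semantic attribute (for example age) is varied continuously.

```python
import random
from dataclasses import dataclass


@dataclass
class FaceSample:
    # Toy stand-ins for UV-space maps, flattened to short vectors.
    geometry_uv: list
    albedo_uv: list


class ToyConditionalGenerator:
    """Hypothetical placeholder for a trained generator.

    Uses fixed random linear maps so the example runs without any
    trained weights; a real model would be a neural network.
    """

    def __init__(self, id_dim=8, attr_dim=3, out_dim=16, seed=0):
        rng = random.Random(seed)
        in_dim = id_dim + attr_dim
        self.w_geo = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]
        self.w_alb = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]

    def __call__(self, identity, attrs):
        # Concatenate the identity code with the semantic attributes.
        x = list(identity) + list(attrs)
        geo = [sum(w * v for w, v in zip(row, x)) for row in self.w_geo]
        alb = [sum(w * v for w, v in zip(row, x)) for row in self.w_alb]
        return FaceSample(geo, alb)


def edit_attribute(gen, identity, attrs, index, target, steps=5):
    """Sweep one attribute linearly toward `target`, identity held fixed.

    Assumes steps >= 2. Returns one FaceSample per interpolation step.
    """
    frames = []
    for k in range(steps):
        t = k / (steps - 1)
        edited = list(attrs)
        edited[index] = (1 - t) * attrs[index] + t * target
        frames.append(gen(identity, edited))
    return frames


if __name__ == "__main__":
    gen = ToyConditionalGenerator(seed=0)
    rng = random.Random(1)
    identity = [rng.gauss(0, 1) for _ in range(8)]
    attrs = [0.2, 0.0, 0.5]  # assumed order: (age, gender, ethnicity), normalized
    frames = edit_attribute(gen, identity, attrs, index=0, target=0.9)
    print(len(frames), "edited samples, identity code unchanged")
```

Keeping the identity vector separate from the attribute vector is the design choice that makes the edit "identity-preserving" in this sketch; in a trained model that separation would have to be learned rather than assumed.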