Pain in 3D: Generating Controllable Synthetic Faces for Automated Pain Assessment

Xin Lei Lin1, Soroush Mehraban1, Abhishek Moturu1, Babak Taati1,2
1University of Toronto, 2KITE Research Institute, University Health Network
Pain in 3D: Sample synthetic pain faces showing controllable AU-driven expressions across diverse identities.

Pain in 3D generates controllable synthetic pain faces with paired neutral references, AU annotations, and clinically grounded PSPI scores across diverse identities.

Abstract

Automated pain assessment from facial expressions is crucial for non-communicative patients, such as those with dementia. Progress has been limited by two challenges: (i) existing datasets exhibit severe demographic and label imbalance due to ethical constraints, and (ii) current generative models cannot precisely control facial action units (AUs), facial structure, or clinically validated pain levels.

We present 3DPain, a large-scale synthetic dataset specifically designed for automated pain assessment, featuring unprecedented annotation richness and demographic diversity. Our three-stage framework generates diverse 3D meshes, textures them with diffusion models, and applies AU-driven face rigging to synthesize multi-view faces with paired neutral and pain images, AU configurations, PSPI scores, and the first dataset-level annotations of pain-region heatmaps. The dataset comprises 82,500 samples across 25,000 pain expression heatmaps and 2,500 synthetic identities balanced by age, gender, and ethnicity.

We further introduce ViTPain, a Vision Transformer-based cross-modal distillation framework in which a heatmap-trained teacher guides a student trained on RGB images, enhancing accuracy, interpretability, and clinical reliability. Together, 3DPain and ViTPain establish a controllable, diverse, and clinically grounded foundation for generalizable automated pain assessment.

Data Generation Pipeline

3DPain three-stage data generation pipeline: mesh generation, texture diffusion, and AU-driven rigging.

Our three-stage pipeline generates diverse, controllable synthetic pain faces:

  1. 3D Mesh Generation: We create diverse facial geometries with the FLAME parametric 3D morphable model, sampling across age, gender, and ethnicity to ensure demographic balance.
  2. Texture Synthesis: Diffusion models generate realistic skin textures for each mesh, yielding photo-realistic facial appearances.
  3. AU-Driven Rigging: Pain expressions are applied through Action Unit (AU) configurations mapped to clinically validated PSPI scores, generating paired neutral and pain images with precise control over expression intensity.
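
To make the AU-to-pain mapping concrete, below is a minimal Python sketch that computes the Prkachin and Solomon Pain Intensity (PSPI) from the six AUs used in this work: PSPI = AU4 + max(AU6, AU7) + max(AU9, AU10) + AU43, where the AU4–AU10 intensities range over 0–5 and AU43 (eyes closed) is binary, giving the 0–16 scale. The random sampler is purely illustrative and is not the pipeline's actual sampling scheme.

import random

# AUs used for PSPI scoring: brow lowerer (AU4), cheek raiser (AU6),
# lid tightener (AU7), nose wrinkler (AU9), upper lip raiser (AU10),
# and eyes closed (AU43).
PSPI_AUS = ("AU4", "AU6", "AU7", "AU9", "AU10", "AU43")

def pspi(au):
    """PSPI = AU4 + max(AU6, AU7) + max(AU9, AU10) + AU43; range 0-16."""
    return (au["AU4"]
            + max(au["AU6"], au["AU7"])
            + max(au["AU9"], au["AU10"])
            + au["AU43"])

def sample_au_config(rng):
    """Illustrative AU sampler (not the dataset's actual scheme):
    intensities 0-5 for AU4-AU10, binary for AU43."""
    cfg = {name: rng.randint(0, 5) for name in PSPI_AUS[:-1]}
    cfg["AU43"] = rng.randint(0, 1)
    return cfg

rng = random.Random(0)
cfg = sample_au_config(rng)
print(cfg, "-> PSPI =", pspi(cfg))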

ViTPain Architecture

ViTPain model architecture: DINOv3 backbone with LoRA adapters, neutral reference cross-attention, and AU query head.

ViTPain is a reference-guided Vision Transformer designed for automated pain assessment:

  • DINOv3 Backbone with LoRA adapters for efficient fine-tuning
  • Neutral Reference Module: Cross-attention mechanism comparing the pain image against a subject-specific neutral reference to isolate pain-related changes
  • AU Query Head: Learnable queries attending to visual features for Action Unit intensity prediction
  • Multi-task Learning: Joint prediction of PSPI pain score (0–16) and six Action Unit intensities (AU4, AU6, AU7, AU9, AU10, AU43)
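
A minimal PyTorch sketch of these components follows. The backbone is left abstract (any ViT-style encoder returning patch tokens; the LoRA-adapted DINOv3 weights are omitted), and the dimensions, single-layer depth, and module names are illustrative assumptions rather than the actual ViTPain implementation.

import torch
import torch.nn as nn

NUM_AUS = 6  # AU4, AU6, AU7, AU9, AU10, AU43

class ViTPainSketch(nn.Module):
    """Illustrative reference-guided head: cross-attention against a
    neutral reference plus learnable AU queries (dims are assumptions)."""

    def __init__(self, backbone, dim=768, heads=8):
        super().__init__()
        self.backbone = backbone  # any ViT encoder: image -> (B, N, dim) tokens
        # Neutral Reference Module: pain tokens attend to neutral tokens,
        # and the attended features are fused back as a residual so the
        # heads can isolate expression changes from identity.
        self.ref_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # AU Query Head: one learnable query per Action Unit.
        self.au_queries = nn.Parameter(torch.randn(1, NUM_AUS, dim))
        self.au_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.au_out = nn.Linear(dim, 1)    # AU intensity per query
        self.pspi_out = nn.Linear(dim, 1)  # PSPI regression (0-16)

    def forward(self, pain_img, neutral_img):
        pain = self.backbone(pain_img)        # (B, N, dim)
        neutral = self.backbone(neutral_img)  # (B, N, dim)
        # Cross-attention: query = pain tokens, key/value = neutral tokens.
        ref, _ = self.ref_attn(pain, neutral, neutral)
        feats = pain + ref
        # Learnable AU queries attend to the fused visual features.
        q = self.au_queries.expand(feats.size(0), -1, -1)
        au_feats, _ = self.au_attn(q, feats, feats)
        au_intensities = self.au_out(au_feats).squeeze(-1)       # (B, 6)
        pspi_score = self.pspi_out(au_feats.mean(dim=1)).squeeze(-1)  # (B,)
        return au_intensities, pspi_score

In training, both outputs would be supervised jointly, with regression losses on the PSPI score and on the six AU intensities, matching the multi-task setup described above.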

Resources

📊 3DPain Dataset

  • 82,500 synthetic face images
  • 2,500 unique identities
  • 25,000 pain expression heatmaps
  • 3 viewpoints per expression
  • Paired neutral & pain images
  • AU + PSPI annotations
  • Balanced by age, gender, ethnicity
🤗 Download Dataset
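
If the dataset is published on the Hugging Face Hub, loading it could look like the sketch below; the repository id and column names are placeholders, so consult the dataset card for the actual ones.

from datasets import load_dataset

# "username/3DPain" and the column names are hypothetical placeholders.
ds = load_dataset("username/3DPain", split="train")

sample = ds[0]
pain_img = sample["pain_image"]        # assumed column: pain rendering
neutral_img = sample["neutral_image"]  # assumed column: paired neutral reference
print(sample["pspi"], sample["aus"])   # assumed PSPI score and AU annotations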

🧠 ViTPain Checkpoint

  • DINOv3-Large backbone
  • LoRA rank=8, alpha=16
  • Pretrained on 3DPain (150 epochs)
  • Best epoch: 141 (MAE 1.859)
  • Ready for fine-tuning
  • MIT Licensed
🤗 Download Model
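
For reference, attaching LoRA adapters with the hyperparameters listed above via Hugging Face PEFT might look like the sketch below. The timm backbone here is a DINOv2 ViT-L stand-in and the target module names are assumptions; swap in the actual DINOv3-Large encoder and adapter placement used by the released checkpoint.

import timm
from peft import LoraConfig, get_peft_model

# Stand-in ViT-L backbone (assumption); replace with the DINOv3-Large
# encoder that the released checkpoint was trained with.
backbone = timm.create_model("vit_large_patch14_dinov2.lvd142m", pretrained=True)

# rank=8, alpha=16 as listed above; targeting the attention qkv
# projections is an assumption about where the adapters are attached.
lora_cfg = LoraConfig(r=8, lora_alpha=16, target_modules=["qkv"])
model = get_peft_model(backbone, lora_cfg)
model.print_trainable_parameters()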

Dataset Samples

The 3DPain dataset contains 2,500 unique synthetic identities, each with 10 pain expression variants plus a paired neutral, rendered from 3 viewpoints (2,500 × 11 × 3 = 82,500 images). Below are sample neutral–pain pairs across diverse identities, demonstrating the controllable AU-driven expression synthesis.

Neutral → Pain Expression Pairs

[Figure: paired neutral and pain renderings for five sample identities (Identity 1–5).]

Top row: neutral  |  Bottom row: pain (PSPI ≈ 9)

Multi-View Rendering

[Figure: the same pain expression rendered from three viewpoints, including a frontal view.]

Each pain expression is rendered from 3 different viewpoints for the same identity.

BibTeX

@article{lin2025pain,
  title={Pain in 3D: Generating Controllable Synthetic Faces for Automated Pain Assessment},
  author={Lin, Xin Lei and Mehraban, Soroush and Moturu, Abhishek and Taati, Babak},
  journal={arXiv preprint arXiv:2509.16727},
  year={2025}
}

Acknowledgements

This work was supported by the KITE Research Institute at the University Health Network and the University of Toronto. We thank the members of the Taati Lab for their valuable feedback and discussions.

The UNBC-McMaster Shoulder Pain Expression Archive Database was used for evaluation in this work. We gratefully acknowledge the original dataset creators for making it available to the research community.