Keyframing has long been the cornerstone of standard character animation pipelines, offering precise control over detailed postures and dynamics. However, this approach is labor-intensive, demanding significant manual effort. Automating it while balancing the trade-off between minimizing manual input and retaining full motion control has therefore been a central research challenge. In this work, we introduce AutoKeyframe, a novel framework that simultaneously accepts dense and sparse control signals for motion generation by generating keyframes directly. Dense signals govern the overall motion trajectory, while sparse signals define critical key postures at specific timings. This approach substantially reduces manual input while preserving precise control over the motion, and the generated keyframes can be easily edited to serve as detailed control signals. AutoKeyframe operates by automatically generating keyframes from dense root positions, which can be determined through arc-length parameterization of the trajectory curve. This process is powered by an autoregressive diffusion model that generates the keyframes and incorporates a skeleton-based gradient guidance technique for sparse spatial constraints and frame editing. Extensive experiments demonstrate the efficacy of AutoKeyframe, achieving high-quality motion synthesis with precise and intuitive control.
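The dense root positions mentioned above come from arc-length parameterization of the user-drawn trajectory curve: positions are sampled at equal distances along the curve rather than at equal parameter steps. A minimal sketch of this resampling, assuming the trajectory is given as a polyline of 3D root positions (the function name and signature are illustrative, not from the paper's code):

```python
import numpy as np

def resample_by_arc_length(trajectory, num_samples):
    """Resample a root trajectory at equal arc-length intervals.

    trajectory: (N, 3) array of root positions along the curve.
    Returns a (num_samples, 3) array of positions evenly spaced
    along the curve's length.
    """
    # Cumulative arc length along the polyline, starting at 0.
    seg = np.linalg.norm(np.diff(trajectory, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])
    # Target arc lengths, evenly spaced from start to end.
    targets = np.linspace(0.0, s[-1], num_samples)
    # Interpolate each coordinate against arc length.
    return np.stack(
        [np.interp(targets, s, trajectory[:, d]) for d in range(3)],
        axis=1,
    )
```

Equal arc-length spacing makes the sampled root positions independent of how densely the input curve happens to be drawn, which is why it is a natural choice for deriving dense control signals from a sketched trajectory.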
Given a complete root trajectory of length L, an action label, and sparse spatial constraints as control input, our method generates a sequence of motion keyframes, with each frame located at a user-specified point on the trajectory. This keyframe sequence can be further completed into high-quality motion and serves as a solid foundation for artists to edit. To accomplish this, we train an autoregressive keyframe diffusion model, which takes as input the previous keyframe, the action label, and various control signals derived from the trajectory, and learns the conditional distribution of the next keyframe. To facilitate accurate control and precise editing of the motion, we propose a skeleton-based gradient guidance approach that enables the keyframe generation to adhere to flexible spatial constraints. To further improve the generation quality, we construct a motion keyframe dataset using an adaptive keyframe selection method based on deep reinforcement learning.
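The gradient guidance above follows the familiar classifier-guidance pattern: at each denoising step, the gradient of a constraint error on the predicted keyframe nudges the sample toward the sparse joint targets. The sketch below illustrates this idea under simplifying assumptions (an identity chain-rule approximation through the denoiser, and hypothetical function names; the paper's skeleton-based formulation propagates constraints along the kinematic chain):

```python
import numpy as np

def guided_denoise_step(denoise_fn, x_t, t, constraints, guidance_scale=0.25):
    """One denoising step with gradient guidance toward sparse joint
    constraints (illustrative sketch, not the paper's implementation).

    denoise_fn: predicts the clean keyframe x0 from the noisy sample x_t.
    constraints: {joint_index: target xyz position}.
    """
    x0 = denoise_fn(x_t, t)
    joints = x0.reshape(-1, 3)               # assume a per-joint 3D layout
    grad = np.zeros_like(joints)
    for j, target in constraints.items():
        # Gradient of the squared error ||joint_j - target||^2.
        grad[j] = 2.0 * (joints[j] - np.asarray(target, dtype=joints.dtype))
    # The chain rule through denoise_fn is approximated by the identity here,
    # a common simplification; a full implementation backpropagates through
    # the model to get the gradient w.r.t. x_t.
    return x_t - guidance_scale * grad.reshape(x_t.shape)
```

With a well-chosen scale, each guided step moves the constrained joints closer to their targets while the diffusion prior keeps the rest of the pose plausible, which is what allows a handful of sparse constraints to edit a keyframe without respecifying the whole pose.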
We generate keyframe sequences and complete them into full motion sequences using existing motion completion methods. We first generate examples using only the root trajectory (yellow spheres) as control input (shown on the left). Then we impose additional sparse constraints (green spheres) to edit the motion (shown on the right). The modified part of the motion is highlighted in blue.
In this example, we impose two spatial constraints on the foot and wrist to make the character perform an expressive kick and punch.
First generated
Edited
In this example, we fix the penetration and floating issues in the first generated result, allowing the character to interact with the environment in complex ways by imposing only four spatial constraints.
First generated
Edited
We compare our method with MDM, HGHOI and OmniControl for motion generation under trajectory control.
MDM
HGHOI
OmniControl
Ours
MDM
HGHOI
OmniControl
Ours
MDM
HGHOI
OmniControl
Ours
We compare our method with OmniControl for motion generation under mixed control of a dense root trajectory and sparse spatial constraints on specific joints. We found that, although OmniControl accepts mixed control signals as input, its generated results often neglect the sparse control and prioritize the trajectory control, while our method accommodates both.
OmniControl
Ours
OmniControl
Ours
@inproceedings{autokeyframe_sig25,
author = {Zheng, Bowen and Chen, Ke and Yao, Yuxin and Zeng, Zijiao and Jiang, Xinwei and Wang, He and Lasenby, Joan and Jin, Xiaogang},
title = {AutoKeyframe: Autoregressive Keyframe Generation for Human Motion Synthesis and Editing},
year = {2025},
isbn = {9798400715402},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3721238.3730680},
doi = {10.1145/3721238.3730680},
booktitle = {ACM SIGGRAPH 2025 Conference Proceedings},
numpages = {12},
location = {Vancouver, BC, Canada},
series = {SIGGRAPH '25}
}