Keywords: Animation Interpolation, Tweening, Transformers, Sequence-to-sequence modeling, Character animation, Long-range dependencies
TL;DR: Richer training data and better feature representations can compensate for simpler model architectures.
Abstract: Motion in-betweening is a crucial tool for animators, enabling intricate control over pose-level details in each keyframe. Recent machine learning solutions for motion in-betweening rely on complex models that incorporate skeleton-aware architectures or require multiple modules and training stages. In this work, we introduce a simple yet effective Transformer-based framework that employs a single Transformer encoder to synthesize realistic motions for motion in-betweening tasks.
We find that data modeling choices play a significant role in in-betweening performance. Among other findings, we show that increasing the volume of training data can yield transitions that match or surpass those of more complex models, that the choice of pose representation is vital for achieving high-quality results, and that incorporating velocity input features further improves the synthesized motion. These findings challenge the assumption that model complexity is the primary determinant of animation quality and point toward a more data-centric approach to motion interpolation. Additional videos and supplementary material are available at \url{https://silk-paper.github.io}.
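To make the "single Transformer encoder" framing concrete, here is a minimal sketch in PyTorch. This is not the authors' code: the class name, dimensions, and masking scheme are illustrative assumptions about how a keyframed clip with masked gap frames could be passed through one encoder that predicts every pose.

```python
# Minimal sketch (not the paper's implementation): one Transformer encoder
# that fills in missing frames between known keyframes. All names and
# hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn

class InbetweenTransformer(nn.Module):
    def __init__(self, pose_dim=132, d_model=256, nhead=8, num_layers=6, max_len=128):
        super().__init__()
        self.input_proj = nn.Linear(pose_dim, d_model)   # pose features -> model width
        self.pos_emb = nn.Embedding(max_len, d_model)    # learned positional embedding
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.output_proj = nn.Linear(d_model, pose_dim)  # back to pose space

    def forward(self, frames, known_mask):
        # frames: (batch, time, pose_dim); gap frames may be zero-filled.
        # known_mask: (batch, time, 1), 1 for keyframes/context, 0 for the gap.
        x = self.input_proj(frames * known_mask)
        t = torch.arange(frames.size(1), device=frames.device)
        x = x + self.pos_emb(t)
        h = self.encoder(x)           # full self-attention over the whole window
        return self.output_proj(h)    # predicted poses for every frame

# Usage: mask out the middle of a clip and reconstruct it.
model = InbetweenTransformer()
clip = torch.randn(2, 64, 132)
mask = torch.ones(2, 64, 1)
mask[:, 10:54] = 0.0                  # frames to in-between
pred = model(clip, mask)              # (2, 64, 132)
```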
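The velocity input features mentioned in the abstract can be read as first-order finite differences of the pose stream, concatenated onto the pose representation. The snippet below is a guess at that preprocessing step under those assumptions; the paper's exact representation may differ.

```python
# Sketch of a velocity input feature: frame-to-frame finite differences
# appended to the pose representation. Layout here is an assumption.
import torch

def add_velocity_features(positions):
    # positions: (batch, time, joints * 3) root-relative joint positions
    vel = positions[:, 1:] - positions[:, :-1]                   # first-order difference
    vel = torch.cat([torch.zeros_like(vel[:, :1]), vel], dim=1)  # pad first frame
    return torch.cat([positions, vel], dim=-1)                   # (batch, time, 2 * joints * 3)
```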
Submission Number: 8