Gaussian See, Gaussian Do:
3D Semantic Motion Transfer

SIGGRAPH Asia 2025
1 Technion     2 NVIDIA     3 University of Toronto     4 Vector Institute
*Indicates Equal Contribution

Our method extracts the semantic motion from a multi-view source video and applies it to a static target shape in a semantically meaningful way. For example, it can transfer the motion of a flying bird to a cartoon elephant:

Source (View 1, View 2) + Static 3DGS = Dynamic 3DGS

Abstract

We present Gaussian See, Gaussian Do, a novel approach for semantic 3D motion transfer from multiview video. Our method enables rig-free, cross-category motion transfer between objects with semantically meaningful correspondence. Building on implicit motion transfer techniques, we extract motion embeddings from source videos via condition inversion, apply them to rendered frames of static target shapes, and use the resulting videos to supervise dynamic 3D Gaussian Splatting reconstruction. Our approach introduces an anchor-based view-aware motion embedding mechanism, ensuring cross-view consistency and accelerating convergence, along with a robust 4D reconstruction pipeline that consolidates noisy supervision videos. We establish the first benchmark for semantic 3D motion transfer and demonstrate superior motion fidelity and structural consistency compared to adapted baselines.

[Videos: Source, Target]

How does it work?

(1) Structured Multiview Motion Inversion. We first extract motion embeddings from the multi-view source video via inversion, using multiple anchor embeddings so that information is shared across viewing angles:

Method explanation GIF
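To make the idea of anchor embeddings more concrete, here is a minimal sketch of one way a per-view motion embedding could be derived from a small set of shared anchors. Everything in it is an assumption made for illustration: the azimuth-only view parameterization, the softmax blending rule, the `temperature_deg` value, the tensor shapes, and the function names `anchor_weights` and `view_embedding` are hypothetical and are not taken from the paper.

```python
import numpy as np

def anchor_weights(view_azimuth_deg, anchor_azimuths_deg, temperature_deg=30.0):
    """Softmax weights over anchors, based on wrapped angular distance.

    Hypothetical blending rule: anchors closer in azimuth to the view
    receive larger weights.
    """
    anchors = np.asarray(anchor_azimuths_deg, dtype=float)
    # Wrapped angular distance in [0, 180] degrees.
    diff = np.abs((view_azimuth_deg - anchors + 180.0) % 360.0 - 180.0)
    logits = -diff / temperature_deg
    logits -= logits.max()  # numerical stability
    w = np.exp(logits)
    return w / w.sum()

def view_embedding(view_azimuth_deg, anchor_azimuths_deg, anchor_embeddings):
    """Blend anchor embeddings of shape (anchors, frames, channels) into one
    per-view embedding of shape (frames, channels)."""
    w = anchor_weights(view_azimuth_deg, anchor_azimuths_deg)
    return np.tensordot(w, anchor_embeddings, axes=1)

# Illustrative setup: 4 anchors, 16 frames, 8 channels (all assumed values).
rng = np.random.default_rng(0)
anchor_azimuths = [0.0, 90.0, 180.0, 270.0]
anchor_embeddings = rng.normal(size=(4, 16, 8))
emb = view_embedding(45.0, anchor_azimuths, anchor_embeddings)
```

Because every view draws on the same anchors, a gradient from any single view would update parameters that nearby views also use, which is one plausible route to the cross-view information sharing described above.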

(2) View-aware Semantic Motion Transfer. We then apply the learned motion embeddings to multiple rendered views of the target object, which yields a sparse, mutually inconsistent set of 2D videos.

Method explanation GIF

(3) 4D Consolidation. Our final step is a consolidation process that transforms the generated …supervisions into a 4D representation. We use a control-point mechanism together with a novel rotation constraint, resulting in a smooth, temporally coherent dynamic 3D model.

Method explanation GIF
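As a rough illustration of what a control-point mechanism can look like, the sketch below moves a set of 3D points (standing in for Gaussian centers) by blending rigid transforms attached to a few control points. It is a generic stand-in, not the paper's formulation: the radial-basis weights, the `sigma` value, and the helper names `blend_weights`, `deform`, and `rotation_smoothness` are all hypothetical, and the smoothness penalty shown is an ordinary temporal regularizer rather than the novel rotation constraint mentioned above.

```python
import numpy as np

def blend_weights(points, controls, sigma=0.5):
    """Radial-basis weights of each point w.r.t. each control point.

    Returns an (N, M) array whose rows sum to 1. Assumes points lie
    reasonably close to at least one control point.
    """
    d2 = ((points[:, None, :] - controls[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma**2))
    return w / w.sum(axis=1, keepdims=True)

def deform(points, controls, rotations, translations, sigma=0.5):
    """Blend per-control rigid transforms to move each point.

    rotations: (M, 3, 3) rotation matrices, translations: (M, 3).
    """
    w = blend_weights(points, controls, sigma)           # (N, M)
    local = points[:, None, :] - controls[None, :, :]    # (N, M, 3)
    # rotated[n, m] = rotations[m] @ local[n, m]
    rotated = np.einsum("mij,nmj->nmi", rotations, local)
    moved = rotated + controls[None] + translations[None]
    return (w[..., None] * moved).sum(axis=1)            # (N, 3)

def rotation_smoothness(rotations_t, rotations_t1):
    """Mean squared Frobenius distance between rotations at consecutive frames."""
    return float(((rotations_t1 - rotations_t) ** 2).sum(axis=(1, 2)).mean())

# Illustrative data: 100 points, 8 control points (assumed sizes).
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(100, 3))
controls = rng.uniform(-1.0, 1.0, size=(8, 3))
identity = np.tile(np.eye(3), (8, 1, 1))
shift = np.tile(np.array([0.1, 0.0, 0.0]), (8, 1))
moved = deform(points, controls, identity, shift)
```

Driving many points with a handful of control points keeps the number of free motion parameters small, which is one reason such schemes tend to tolerate noisy 2D supervision better than per-point motion.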

Results

Semantic Transfer

Source Motion

Extra Horse Rigged

Animated Target

Extra Horse Rigged

Source Motion

Semantic - Brian Standing Jump

Animated Targets

Semantic - Brian Standing Jump

Source Motion

Semantic - Falling Tree

Animated Targets

Semantic - Falling Tree

Real-world Scenes

Source Motion

Human to Other - Brian Breakdance

Animated Target

Human to Other - Brian Breakdance

Source Motion

Semantic - Flag

Animated Target

Semantic - Flag

Animal Motion Transfer

Source Motion

Animal Motions - Black Panther

Animated Targets

Animal Motions - Black Panther

Source Motion

Animal Motions - Polar Bear

Animated Targets

Animal Motions - Polar Bear

Source Motion

Semantic - Common Dolphin

Animated Targets

Semantic - Common Dolphin

Source Motion

Horse Rigged Game Ready

Animated Targets

Horse Rigged Game Ready

Human Motion Transfer

Source Motion

Human to Human Like - Brian Breakdance

Animated Targets

Human to Human Like - Brian Breakdance

Source Motion

Human to Human - Brian Breakdance

Animated Targets

Human to Human - Brian Breakdance

Source Motion

Human to Human - Brian Walking

Animated Targets

Human to Human - Brian Walking

Source Motion

Human to Other - Brian Breakdance

Animated Targets

Human to Other - Brian Breakdance

Source Motion

Human to Other - Brian Breakdance

Animated Targets

Human to Other - Brian Breakdance
