Gaussian See, Gaussian Do: 3D Semantic Motion Transfer

Our method extracts the semantic motion from a multiview source video and applies it to a static target shape in a semantically meaningful way. For example, it can transfer the motion of a flying bird to a cartoon elephant:

[Figure panels: Source, View 1, View 2, Static 3DGS, Dynamic 3DGS]

Abstract

We present Gaussian See, Gaussian Do, a novel approach for semantic 3D motion transfer from multiview video. Our method enables rig-free, cross-category motion transfer between objects with semantically meaningful correspondence. Building on implicit motion transfer techniques, we extract motion embeddings from source videos via condition inversion, apply them to rendered frames of static target shapes, and use the resulting videos to supervise dynamic 3D Gaussian Splatting reconstruction. Our approach introduces an anchor-based view-aware motion embedding mechanism, ensuring cross-view consistency and accelerating convergence, along with a robust 4D reconstruction pipeline that consolidates noisy supervision videos. We establish the first benchmark for semantic 3D motion transfer and demonstrate superior motion fidelity and structural consistency compared to adapted baselines.
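The idea behind condition inversion can be illustrated with a toy sketch: the generative model stays frozen, and only the conditioning embedding is optimized until the model's output matches the source. The example below is not the paper's implementation. It assumes a fixed linear map as a stand-in for the frozen video model, and the names `invert_condition` and `frozen_model` are hypothetical.

```python
def invert_condition(frozen_model, target, dim, steps=500, lr=0.1):
    """Toy condition inversion: optimize the embedding, never the model.

    frozen_model: list of rows of a fixed matrix W (stand-in for a frozen
                  generative model; an assumption made for illustration).
    target:       the observation the embedding should reproduce.
    """
    emb = [0.0] * dim
    for _ in range(steps):
        # Forward pass through the frozen model: pred = W @ emb
        pred = [sum(w * e for w, e in zip(row, emb)) for row in frozen_model]
        resid = [p - t for p, t in zip(pred, target)]
        # Gradient of 0.5 * ||W emb - target||^2 w.r.t. emb is W^T resid
        grad = [
            sum(frozen_model[i][j] * resid[i] for i in range(len(resid)))
            for j in range(dim)
        ]
        # Only the embedding is updated; frozen_model is left untouched
        emb = [e - lr * g for e, g in zip(emb, grad)]
    return emb


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 2.0]]
    print(invert_condition(W, [3.0, 4.0], dim=2))
```

In the real setting the reconstruction loss would come from a video generation model rather than a linear map, but the division of roles is the same: frozen weights, learnable condition.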

How does it work?

(1) Structured Multiview Motion Inversion. We first extract motion embeddings from the source video via condition inversion, using multiple anchor embeddings to share information across viewing angles.
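One way to picture how anchor embeddings could share information between angles is to derive each view's embedding from a small set of anchors. The sketch below is a minimal illustration, not the paper's mechanism: it assumes anchors placed at evenly spaced azimuths and a linear blend of the two nearest anchors, and the function name `view_embedding` is hypothetical.

```python
import math


def view_embedding(anchors, azimuth_deg):
    """Blend the two anchor embeddings nearest to a view's azimuth.

    anchors: list of K embedding vectors (lists of floats), assumed to sit at
             evenly spaced azimuths 0, 360/K, 2*360/K, ... degrees.
    """
    k = len(anchors)
    step = 360.0 / k
    pos = (azimuth_deg % 360.0) / step  # fractional anchor index
    lo = int(math.floor(pos)) % k       # anchor just below the view angle
    hi = (lo + 1) % k                   # next anchor, wrapping around 360
    w = pos - math.floor(pos)           # blend weight toward the upper anchor
    return [(1.0 - w) * a + w * b for a, b in zip(anchors[lo], anchors[hi])]


if __name__ == "__main__":
    anchors = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
    print(view_embedding(anchors, 45.0))
```

Because every view reads from the same few anchors, a gradient step for one view also moves the embeddings of neighboring views, which is one plausible route to cross-view consistency.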

(2) View-aware Semantic Motion Transfer. We then apply the learned motion embeddings to multiple rendered views of the target object, resulting in a sparse, inconsistent set of 2D videos.
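To make "multiple rendered views" concrete, the sketch below places a handful of cameras on an orbit around the target and records each camera's azimuth, which is the quantity a view-aware embedding would be looked up by. The orbit layout, radius, elevation, and the name `orbit_cameras` are assumptions for illustration; the page does not state how the views are chosen.

```python
import math


def orbit_cameras(n_views, radius=2.0, elevation_deg=15.0):
    """Return (azimuth_deg, (x, y, z)) for cameras evenly spaced on an orbit.

    The object is assumed to sit at the origin with the z axis pointing up.
    """
    elev = math.radians(elevation_deg)
    cams = []
    for i in range(n_views):
        az_deg = 360.0 * i / n_views
        az = math.radians(az_deg)
        pos = (
            radius * math.cos(elev) * math.cos(az),
            radius * math.cos(elev) * math.sin(az),
            radius * math.sin(elev),
        )
        cams.append((az_deg, pos))
    return cams


if __name__ == "__main__":
    for az_deg, pos in orbit_cameras(4):
        print(az_deg, pos)
```

Each rendered view would then be animated independently in 2D, which is why the resulting videos are not guaranteed to agree with one another.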

(3) 4D Consolidation. Our final step is a consolidation process that transforms the generated …supervisions into a 4D representation. We use a control-point mechanism together with a novel rotation constraint, resulting in a smooth, temporally coherent dynamic 3D model.
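A control-point deformation can be sketched as follows: a sparse set of control points carries the motion, and every Gaussian center moves by a distance-weighted blend of nearby control-point motions. This is a generic illustration under stated assumptions (Gaussian RBF weights, translations only), not the paper's formulation. In particular, `rotation_smoothness` is a generic stand-in regularizer that penalizes abrupt frame-to-frame rotation changes; the page does not specify the form of its rotation constraint. All function names are hypothetical.

```python
import math


def skinning_weights(point, control_points, sigma=0.5):
    """Normalized Gaussian (RBF) weights of one point w.r.t. each control point."""
    raw = []
    for cp in control_points:
        sq_dist = sum((p - c) ** 2 for p, c in zip(point, cp))
        raw.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(raw)
    return [r / total for r in raw]


def deform_point(point, control_points, translations, sigma=0.5):
    """Move a point by the weighted average of control-point translations."""
    weights = skinning_weights(point, control_points, sigma)
    moved = []
    for axis, coord in enumerate(point):
        offset = sum(w * t[axis] for w, t in zip(weights, translations))
        moved.append(coord + offset)
    return moved


def rotation_smoothness(angles_per_frame):
    """Stand-in regularizer: mean squared frame-to-frame change in rotation angle.

    angles_per_frame[t][k] is the (unwrapped) rotation angle in radians of
    control point k at frame t.
    """
    total, count = 0.0, 0
    for prev, curr in zip(angles_per_frame, angles_per_frame[1:]):
        for a, b in zip(prev, curr):
            total += (b - a) ** 2
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    cps = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
    shifts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    print(deform_point([0.0, 0.0, 0.0], cps, shifts))
    print(rotation_smoothness([[0.0, 0.0], [0.1, 0.0]]))
```

Driving many Gaussians from a few control points limits the degrees of freedom, which helps when the 2D supervision videos disagree with each other.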

Results

Semantic Transfer

[Video gallery: 3 source motions, each paired with its animated target(s)]

Real-world Scenes

[Video gallery: 2 source motions, each paired with its animated target]

Animal Motion Transfer

[Video gallery: 4 source motions, each paired with its animated targets]

Human Motion Transfer

[Video gallery: 5 source motions, each paired with its animated targets]

BibTeX

@article{bekor2025gaussian,
  title={Gaussian See, Gaussian Do: Semantic 3D Motion Transfer from Multiview Video},
  author={Bekor, Yarin and Harari, Gal Michael and Perel, Or and Litany, Or},
  journal={arXiv preprint arXiv:2511.14848},
  year={2025}
}
