Learning Physics-Guided Residual Dynamics for Deformable Object Simulation

Shivansh Patel1,2, Kaifeng Zhang2,3, Sanjay Pokkali1, Svetlana Lazebnik1, Yunzhu Li2,3
1 University of Illinois Urbana-Champaign   2 SceniX Inc.   3 Columbia University

Abstract

Simulating deformable objects is essential for a wide range of robotic manipulation applications. Yet, accurately predicting their dynamics remains challenging. Physics-based simulations are interpretable, but they require a known model class and precise identification of physical parameters, which are often difficult to obtain. Learning-based approaches can be more expressive in capturing dynamics, but they often generalize poorly outside the training distribution, require substantial training data, and lack physical consistency. We propose Physics-Guided Residual Dynamics (PReD), a hybrid simulation framework that combines an optimizable spring–mass simulator as a backbone with a learned neural network that predicts residual corrections to compensate for discrepancies between physics-based predictions and reality. Our velocity-based formulation ensures stable simulation, while a sliding-window transformer captures temporal dependencies. We validate our approach on real-world deformable objects, demonstrating that PReD outperforms both purely physics-based simulators and learning-based methods. We further demonstrate the practical utility of our framework in action-conditioned 3D video prediction using 3D Gaussian Splatting and in Model Predictive Control for manipulation planning on challenging tasks such as cable rerouting, where purely physics-based simulation fails.


Overview

We combine a tuned spring-mass simulator with a learned residual dynamics network. The simulator predicts the next state from current observations and actions, while the network predicts per-particle velocity corrections that are time-integrated to obtain final positions. This hybrid design leverages physics priors for improved generalization while capturing complex phenomena that are challenging to model analytically.


Method

Physics Backbone

We use a spring-mass simulator as the physics backbone. Its physical parameters, such as spring stiffness and damping, are optimized with CMA-ES.
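To make the backbone concrete, here is a minimal sketch of one spring-mass step and of a rollout objective that a black-box optimizer such as CMA-ES could minimize over the parameters. It is an illustration under stated assumptions, not the simulator used in this work: the function names, the semi-implicit Euler integrator, the linear velocity drag, the unit mass, and the mean-squared position objective are all hypothetical choices.

```python
import numpy as np

def spring_mass_step(x, v, springs, rest_len, stiffness, damping, dt=0.01, mass=1.0):
    """One semi-implicit Euler step of a spring-mass system (illustrative).

    x, v:      (N, 3) particle positions and velocities
    springs:   (S, 2) integer index pairs (i, j) of connected particles
    rest_len:  (S,) spring rest lengths
    """
    i, j = springs[:, 0], springs[:, 1]
    d = x[j] - x[i]
    length = np.linalg.norm(d, axis=1, keepdims=True)
    direction = d / np.maximum(length, 1e-9)  # avoid division by zero
    # Hooke's law: a stretched spring pulls particle i toward particle j.
    f_spring = stiffness * (length - rest_len[:, None]) * direction
    force = np.zeros_like(x)
    np.add.at(force, i, f_spring)    # accumulate over springs sharing a particle
    np.add.at(force, j, -f_spring)   # equal and opposite reaction
    force -= damping * v             # assumed damping model: linear velocity drag
    v_next = v + dt * force / mass
    x_next = x + dt * v_next         # position update uses the new velocity
    return x_next, v_next

def fit_objective(params, x0, v0, springs, rest_len, target_traj, dt=0.01):
    """Mean squared position error of a rollout against observed positions.

    params = (stiffness, damping). This scalar is the kind of objective a
    black-box optimizer could minimize; the exact objective is an assumption.
    """
    stiffness, damping = params
    x, v = x0, v0
    err = 0.0
    for target in target_traj:
        x, v = spring_mass_step(x, v, springs, rest_len, stiffness, damping, dt)
        err += np.mean(np.sum((x - target) ** 2, axis=1))
    return err / len(target_traj)
```

An optimizer would repeatedly call fit_objective with candidate (stiffness, damping) pairs and keep the best-scoring ones; being derivative-free, it needs no gradients through the simulator.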

Residual Dynamics

The neural network predicts per-particle residual velocities, which are added to the simulator output and integrated over time to compute corrected positions. Training on multi-step rollouts reduces error accumulation and improves stability over long horizons.
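The correction and the rollout objective can be sketched as follows. This is a minimal illustration rather than the actual implementation: sim_step and residual_fn are hypothetical callables standing in for the physics backbone and the learned network, and both the explicit Euler integration of the residual and the mean-squared position loss are assumptions.

```python
import numpy as np

def corrected_step(x, v, action, sim_step, residual_fn, dt):
    """Apply a learned residual velocity on top of one simulator step.

    sim_step(x, v, action)        -> (x_sim, v_sim)  physics prediction
    residual_fn(x, v_sim, action) -> dv, shape (N, 3), per-particle correction
    Both callables are placeholders (assumptions) for this sketch.
    """
    x_sim, v_sim = sim_step(x, v, action)
    dv = residual_fn(x, v_sim, action)
    v_corr = v_sim + dv        # residual is added in velocity space
    x_corr = x_sim + dt * dv   # integrating the residual shifts the positions
    return x_corr, v_corr

def rollout_loss(x0, v0, actions, targets, sim_step, residual_fn, dt):
    """Average position error over a multi-step rollout.

    Each step consumes the model's own previous prediction, so errors that
    would compound at test time are already penalized during training.
    """
    x, v = x0, v0
    total = 0.0
    for action, target in zip(actions, targets):
        x, v = corrected_step(x, v, action, sim_step, residual_fn, dt)
        total += np.mean(np.sum((x - target) ** 2, axis=-1))
    return total / len(targets)
```

With a residual of zero the corrected step reduces to the simulator's own prediction, so the network only has to model what the physics misses.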

Network Architecture

We use a Point Transformer V3 encoder, a NeRF-style decoder to predict residual velocities, and a sliding-window transformer with gating to refine corrections using temporal history.


Results

PReD achieves the lowest error across all tracking metrics and improves visual quality in action-conditioned video prediction. Qualitative comparisons show improved structural consistency and more realistic deformations across all objects.


Planning with PReD

PReD supports manipulation planning with Model Predictive Path Integral (MPPI) control, achieving 8/10 successes on cable rerouting versus 2/10 for the physics-only baseline.
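For readers unfamiliar with MPPI, a generic single-iteration sketch is given below. It is not the planner configuration used in these experiments: dynamics and cost are hypothetical callables (in this setting the dynamics would be the learned hybrid model), and the horizon, sample count, noise scale, and temperature are placeholder values.

```python
import numpy as np

def mppi_plan(state, dynamics, cost, horizon=10, num_samples=64, action_dim=3,
              noise_std=0.1, temperature=1.0, mean=None, rng=None):
    """One MPPI update: sample action sequences, roll out, reweight by cost.

    dynamics(state, action) -> next_state   (placeholder for the learned model)
    cost(state)             -> float        (placeholder task cost)
    Returns an updated (horizon, action_dim) action sequence.
    """
    rng = np.random.default_rng() if rng is None else rng
    mean = np.zeros((horizon, action_dim)) if mean is None else mean
    # Perturb the nominal action sequence with Gaussian noise.
    noise = rng.normal(0.0, noise_std, size=(num_samples, horizon, action_dim))
    candidates = mean[None] + noise
    costs = np.zeros(num_samples)
    for k in range(num_samples):
        s = state
        for t in range(horizon):
            s = dynamics(s, candidates[k, t])
            costs[k] += cost(s)
    # Softmin weights; subtracting the minimum cost keeps exp() well scaled.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    # Cost-weighted average of the sampled action sequences.
    return np.einsum("k,kta->ta", weights, candidates)
```

In a receding-horizon loop, the first action of the returned sequence is executed, the state is re-observed, and the remaining sequence is passed back in as mean to warm-start the next update.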


Interactive Simulation with PReD and 3DGS

Combined with 3D Gaussian Splatting (3DGS), PReD enables action-conditioned 3D video prediction for interactive simulation.