GaussianProperty

Integrating Physical Properties to 3D Gaussians with LMMs

Xinli Xu1*, Wenhang Ge1*, Dicong Qiu1*, ZhiFei Chen1, Dongyu Yan1, Zhuoyun LIU1, Haoyu Zhao1, HanFeng Zhao3, Shunsi Zhang3, Junwei Liang1,2, Ying-Cong Chen1,2†
1HKUST(GZ) 2HKUST 3Quwan

*Equal Contribution, †Corresponding Author


GaussianProperty can estimate, zero-shot, the lower and upper bounds of grasping forces, along with their optimal values, for grasping multi-part objects.


GaussianProperty can reconstruct multi-part objects from the real world, estimate their physical properties zero-shot, and generate the corresponding deformation dynamics in simulation.

Abstract

Estimating physical properties for visual data is a crucial task in computer vision, graphics, and robotics, underpinning applications such as augmented reality, physical simulation, and robotic grasping. However, this area remains under-explored due to the inherent ambiguities in physical property estimation. To address these challenges, we introduce GaussianProperty, a training-free framework that assigns physical properties of materials to 3D Gaussians. Specifically, we integrate the segmentation capability of SAM with the recognition capability of GPT-4V(ision) to formulate a global-local physical property reasoning module for 2D images. Then we project the physical properties from multi-view 2D images to 3D Gaussians using a voting strategy. We demonstrate that 3D Gaussians with physical property annotations enable applications in physics-based dynamic simulation and robotic grasping. For physics-based dynamic simulation, we leverage the Material Point Method (MPM) for realistic dynamic simulation. For robot grasping, we develop a grasping force prediction strategy that estimates a safe force range required for object grasping based on the estimated physical properties. Extensive experiments on material segmentation, physics-based dynamic simulation, and robotic grasping validate the effectiveness of our proposed method, highlighting its crucial role in understanding physical properties from visual data.

GaussianProperty Architecture

GaussianProperty first leverages SAM to obtain a segmentation map of the object. The original images and the masks are then sent to a foundation model such as GPT-4V(ision), which is queried with the material candidates to obtain the corresponding physical properties. After acquiring physical properties from 2D images, we use a multi-view approach and a voting strategy to add physical properties to the reconstructed 3D Gaussians.

Part-Level Segmentation

Given a well-reconstructed 3D Gaussian representation, our objective is to attribute physical properties to each Gaussian. Understanding an object's physical properties requires delving into the characteristics of its individual parts, as each part may present unique attributes. Considering this, we utilize SAM for image segmentation, which adeptly predicts masks with precise boundaries that capture whole, part, and subpart levels, thereby reflecting the object's hierarchical semantic structure. In this work, we emphasize the significance of part-level information, which enables us to dissect an object into its constituent parts. This approach facilitates a more accurate and exhaustive comprehension of the physical properties of visual data. Our method not only harnesses the semantic stratification provided by SAM but also actively integrates it to remedy the ambiguity arising from objects possessing multiple physical attributes.

Physics Property Matching

After achieving precise part-level semantic segmentation, we match the segmented parts with their corresponding physical properties. We establish a material candidate list comprising 15 ubiquitous material families and more than 600 materials that are integral to everyday objects and structures. A global-to-local chain-of-thought (CoT) is used to assist GPT-4V in recognizing the material properties of the object, followed by Gradual Prompt Guidance, which helps the model progressively build up an understanding of the entire object and discern the association between its parts and the whole.
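The global-to-local flow described above might be organized along the lines of the following minimal sketch. The prompt wording, the function name `build_prompts`, and the toy candidate dictionary are all hypothetical; the actual prompts and the full candidate list are not given on this page, and no model API is called here.

```python
from typing import Dict, List


def build_prompts(candidates: Dict[str, List[str]], num_parts: int) -> List[str]:
    """Build one global prompt followed by one local prompt per segmented part.

    candidates: hypothetical mapping from material family to its materials.
    num_parts:  number of part-level masks produced by the segmentation step.
    """
    families = ", ".join(sorted(candidates))
    # Global step: ask about the object as a whole before looking at parts.
    prompts = [
        "Global step: describe the object in the image and what it is "
        "typically made of."
    ]
    # Local steps: one question per part, referring back to the global answer.
    for part in range(1, num_parts + 1):
        prompts.append(
            f"Local step, part {part} of {num_parts}: considering the whole "
            f"object described above, look at the region highlighted by mask "
            f"{part}. First choose a material family from: {families}. "
            "Then name the most likely material within that family."
        )
    return prompts
```

Asking the whole-object question first gives each part-level question shared context, which is one plausible reading of the part-to-whole association the text describes.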

Downstream Applications

Generative Dynamics

Physical simulation is a crucial application of our method because it allows us to directly add all predicted physical properties to the Gaussian points without the need for manual querying and annotation. This integration speeds up dynamic rendering significantly. We present a potential downstream task for 3D Gaussians with physical properties: generative dynamics. When a force is imposed, the 3D Gaussians generate the corresponding motion. For example, in the first row, we apply a top-down force and the chair moves in response to it.
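MPM solvers for elastic materials are commonly parameterized by the Lamé parameters rather than by the quantities one would look up for a material. As a small illustration, and assuming the predicted properties include a Young's modulus and a Poisson's ratio (an assumption; this page does not list the exact properties), the standard conversion can be sketched as follows. The function name is chosen for illustration only.

```python
from typing import Tuple


def lame_parameters(youngs_modulus: float, poisson_ratio: float) -> Tuple[float, float]:
    """Convert Young's modulus E and Poisson's ratio nu to Lame parameters.

    Returns (mu, lam), where mu is the shear modulus and lam is Lame's
    first parameter, using the standard isotropic linear-elasticity relations.
    """
    if not -1.0 < poisson_ratio < 0.5:
        raise ValueError("Poisson's ratio must lie in (-1, 0.5)")
    mu = youngs_modulus / (2.0 * (1.0 + poisson_ratio))
    lam = (youngs_modulus * poisson_ratio) / (
        (1.0 + poisson_ratio) * (1.0 - 2.0 * poisson_ratio)
    )
    return mu, lam
```

In a per-Gaussian setting, the same conversion would simply be applied to each Gaussian's assigned material values before they are handed to the simulator.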

Robot Grasping

Robot grasping is a downstream application of GaussianProperty. We propose to use GaussianProperty to estimate an optimal grasping force for the robotic gripper: one that is sufficient to lift the target object without slipping while remaining below a threshold to prevent damage or deformation. To evaluate the effectiveness and performance of our proposed method, we collected 16 objects composed of diverse materials and implemented three robot grasping baselines with fixed grasping forces, which are widely adopted force-sensitive grasping strategies in robotics.
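To make the idea of a safe force range concrete, here is a minimal sketch under simplifying assumptions that are not taken from this page: a two-finger parallel gripper, Coulomb friction at both contacts, a lower bound set by the force needed to hold the object's weight, and an upper bound set by a maximum allowable contact pressure. All parameter names, the safety factor, and the upper-bound model are hypothetical.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration in m/s^2


@dataclass
class GraspForceRange:
    lower: float  # minimum normal force per finger to avoid slipping, in N
    upper: float  # maximum normal force before assumed damage, in N

    @property
    def feasible(self) -> bool:
        return self.lower <= self.upper


def estimate_force_range(
    density: float,         # kg/m^3, e.g. from the estimated material
    volume: float,          # m^3, e.g. from the reconstructed geometry
    friction_coeff: float,  # dimensionless, gripper pad against material
    max_pressure: float,    # Pa, assumed allowable contact pressure
    contact_area: float,    # m^2, contact patch of one finger
    safety_factor: float = 1.5,
) -> GraspForceRange:
    if friction_coeff <= 0.0:
        raise ValueError("friction coefficient must be positive")
    mass = density * volume
    # Two contacts share the load: 2 * mu * F >= m * g.
    lower = safety_factor * mass * G / (2.0 * friction_coeff)
    upper = max_pressure * contact_area
    return GraspForceRange(lower=lower, upper=upper)
```

If `feasible` is false under these assumptions, no single normal force both holds the object and stays below the damage threshold, which would call for a different grasp point or gripper pad.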

Experiment Results

Baseline Comparison

We compare our method with NeRF2Physics on material segmentation across different categories on the ABO and the MVImgNet datasets. Our method achieves a more comprehensive and accurate understanding of the object, as well as a more precise material segmentation.

Ablation on Frequency-based Voting

Frequency-based voting ensures consistency and reliability in the predicted properties by effectively aggregating information from different viewpoints, minimizing errors, and enhancing overall prediction accuracy.
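As a rough illustration of frequency-based voting, the sketch below assumes that each Gaussian has already been projected into several views and has received one material label per view in which it is visible, with -1 marking views where it is occluded or falls outside the object mask. The array layout and the function name `vote_labels` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def vote_labels(per_view_labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Pick the most frequent label per Gaussian across views.

    per_view_labels: (num_views, num_gaussians) int array; -1 means "not visible".
    Returns a (num_gaussians,) int array; -1 where a Gaussian got no votes.
    """
    _, num_gaussians = per_view_labels.shape
    counts = np.zeros((num_gaussians, num_classes), dtype=np.int64)
    for view in per_view_labels:
        idx = np.nonzero(view >= 0)[0]
        # Each visible Gaussian casts one vote for its label in this view.
        counts[idx, view[idx]] += 1
    winners = counts.argmax(axis=1)  # ties resolve to the lowest class index
    winners[counts.sum(axis=1) == 0] = -1
    return winners
```

A label that appears in only one noisy view is outvoted by the label seen in the remaining views, which is the consistency effect the ablation refers to.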

Hardness Estimation

We qualitatively compare our method with NeRF2Physics on hardness estimation. Our method provides more accurate hardness prediction with clear boundaries.

BibTeX

@misc{xu2024gaussianproperty,
      title={GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs}, 
      author={Xinli Xu and Wenhang Ge and Dicong Qiu and ZhiFei Chen and Dongyu Yan and Zhuoyun LIU and Haoyu Zhao and HanFeng Zhao and Shunsi Zhang and Junwei Liang and Ying-Cong Chen},
      year={2024},
}