Neural Invertible Warp for NeRF

Shin-Fang Chng     Ravi Garg     Hemanth Saratchandran     Simon Lucey    
The University of Adelaide        Australian Institute for Machine Learning       

ECCV 2024

Abstract

This paper tackles the simultaneous optimization of pose and Neural Radiance Fields (NeRF).

Departing from the conventional practice of using explicit global representations for camera pose, we propose a novel overparameterized representation that models camera poses as learnable rigid warp functions. We establish that modeling rigid warps in this way must be tightly coupled with the constraints and regularization imposed. Specifically, we highlight the critical importance of enforcing invertibility when learning rigid warp functions via a neural network, and propose the use of an Invertible Neural Network (INN) coupled with a geometry-informed constraint for this purpose.

We present results on synthetic and real-world datasets, and demonstrate that our approach outperforms existing baselines in pose estimation and high-fidelity reconstruction, owing to enhanced optimization convergence.



Method


We illustrate our method using two images defined in their respective local camera coordinate systems (C). We model a rigid warp function for each pixel in the camera coordinates (C) as an individual ray. Our proposed INN takes the pixel coordinates and camera center defined in each camera's coordinates (C), together with a frame-dependent latent code, and outputs their corresponding coordinates in the world coordinates (W). After this warping operation, the colors and volume densities along each ray are computed by a Neural Radiance Field (NeRF) and optimized through a photometric loss function.
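To make the invertibility-by-construction idea concrete, the sketch below shows an additive coupling layer, the standard INN building block: one half of the input is shifted by a function of the other half (and a frame latent), so the inverse is exact by subtraction. This is a minimal NumPy illustration under our own assumptions (a tiny random shift network, 6-D inputs, an 8-D latent), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class AdditiveCoupling:
    """One additive coupling layer: split the input, shift the second half
    by a function of the first half (plus a frame latent). Inversion is
    exact by construction, with no approximation or learned inverse."""

    def __init__(self, dim, latent_dim, hidden=32):
        self.d = dim // 2
        # A tiny random MLP standing in for the learned shift network.
        self.W1 = rng.normal(0.0, 0.1, (self.d + latent_dim, hidden))
        self.W2 = rng.normal(0.0, 0.1, (hidden, dim - self.d))

    def _shift(self, x1, z):
        h = np.tanh(np.concatenate([x1, z], axis=-1) @ self.W1)
        return h @ self.W2

    def forward(self, x, z):
        # Warp camera-frame coordinates x given frame latent z.
        x1, x2 = x[..., :self.d], x[..., self.d:]
        return np.concatenate([x1, x2 + self._shift(x1, z)], axis=-1)

    def inverse(self, y, z):
        # Recover x exactly by subtracting the same shift.
        y1, y2 = y[..., :self.d], y[..., self.d:]
        return np.concatenate([y1, y2 - self._shift(y1, z)], axis=-1)
```

Stacking several such layers (alternating which half is shifted) gives an expressive yet exactly invertible map; this is what distinguishes an explicit-invertible design from an implicit one that only penalizes round-trip error.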




Results

BARF [1]
Naive
Implicit-Invertible MLP
Explicit-Invertible MLP
Groundtruth
Joint optimization of homography and neural field estimation: Overparameterization benefits the joint optimization of homography and neural field estimation. We highlight the importance of invertibility when modeling rigid warp functions with a neural network. By explicitly modeling the inversion through learning an INN (Explicit-Invertible MLP), we achieve superior pose convergence compared to the Implicit-Invertible MLP approach, which only enforces approximate invertibility. Please refer to our paper for more details and statistical runs.

Homeomorphism perspective: A qualitative analysis

L2G
Ours
Neural Invertible Warp vs. L2G: We present a qualitative analysis of single-view pose estimation that sheds light on the empirical effectiveness of our approach compared to L2G [2], an overparameterized method that uses an MLP to predict SE(3) transformations. We superimpose the intermediate rendered images on the Groundtruth (lighter visualization). We observe noticeable deformations (circled) in our approach. These deformations indicate that the INN predicts general homeomorphisms - continuous and invertible transformations - not limited to the rigid motions of the SE(3) group. This yields a flexible optimization trajectory that is less prone to suboptimal minima, offering an advantage over predicting within the SE(3) group.
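For context on the SE(3)-restricted baseline: methods like L2G map a predicted 6-vector in the Lie algebra se(3) to a rigid transform via the closed-form exponential map, so every intermediate pose along the optimization path is exactly rigid. The sketch below is our own NumPy illustration of that standard map (Rodrigues' formula for the rotation), not L2G's code.

```python
import numpy as np

def se3_exp(xi):
    """Map a 6-vector xi = (rho, omega) in se(3) to a 4x4 rigid transform
    T in SE(3) via the closed-form exponential map."""
    rho, omega = xi[:3], xi[3:]
    theta = np.linalg.norm(omega)
    # Skew-symmetric matrix of the rotation axis omega.
    K = np.array([[0.0, -omega[2], omega[1]],
                  [omega[2], 0.0, -omega[0]],
                  [-omega[1], omega[0], 0.0]])
    if theta < 1e-8:
        # Small-angle limit: identity rotation, V ~ I.
        R, V = np.eye(3), np.eye(3)
    else:
        # Rodrigues' rotation formula.
        R = (np.eye(3) + np.sin(theta) / theta * K
             + (1.0 - np.cos(theta)) / theta**2 * K @ K)
        # Left Jacobian V couples rho into the translation.
        V = (np.eye(3) + (1.0 - np.cos(theta)) / theta**2 * K
             + (theta - np.sin(theta)) / theta**3 * K @ K)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, V @ rho
    return T
```

Whatever 6-vector the MLP emits, the output transform is rigid by construction; the qualitative comparison above suggests that relaxing this constraint to general homeomorphisms (while keeping exact invertibility) gives the optimization more freedom early on.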

Demo: Qualitative Results

How to interpret this visualization: This visualization demonstrates that our proposed method renders novel view synthesis results closer to the Groundtruth (GT) than existing methods, as it converges to a better pose estimate. You can slide the cursor to compare the differences and observe misalignments caused by pose inaccuracies.

Qualitative Results on LLFF Scenes

GT LLFF-leaves L2G
GT LLFF-leaves Ours

GT LLFF-trex BARF
GT LLFF-trex Ours

Qualitative Results on Blender Scenes

GT Blender-mic L2G
GT Blender-mic Ours
GT Blender-materials L2G
GT Blender-materials Ours

Qualitative Results on DTU Scenes

GT DTU-Scan24 BARF
GT DTU-Scan24 Ours
GT DTU-Scan105 L2G
GT DTU-Scan105 Ours
GT DTU-Scan55 BARF
GT DTU-Scan55 Ours

BibTeX


      @article{
      }


References

[1] Lin, Chen-Hsuan, Wei-Chiu Ma, Antonio Torralba, and Simon Lucey. BARF: Bundle-Adjusting Neural Radiance Fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021.
[2] Chen, Yue, Xingyu Chen, Xuan Wang, Qi Zhang, Yu Guo, Ying Shan, and Fei Wang. Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.