This paper tackles the simultaneous optimization of camera pose and Neural Radiance Fields (NeRF).
Departing from the conventional practice of using explicit global representations for camera pose, we propose a novel overparameterized representation that models camera poses as learnable rigid warp functions. We establish that learning these rigid warps must be tightly coupled with appropriate constraints and regularization. Specifically, we highlight the critical importance of enforcing invertibility when learning rigid warp functions via a neural network, and propose an Invertible Neural Network (INN) coupled with a geometry-informed constraint for this purpose.
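To make the invertibility requirement concrete, the sketch below shows an additive coupling layer in the RealNVP style, a standard building block for invertible networks. This is a minimal, hypothetical illustration of why such architectures are exactly invertible by construction; the paper's actual INN architecture, conditioning, and geometry-informed constraint are not reproduced here.

```python
import numpy as np

class AdditiveCoupling:
    """Additive coupling layer: invertible by construction (hypothetical sketch)."""

    def __init__(self, rng, dim=3, hidden=16):
        # Split the input: the first part passes through unchanged and
        # conditions a learned shift applied to the remaining part.
        self.d = dim // 2
        self.W1 = rng.standard_normal((self.d, hidden)) * 0.1
        self.W2 = rng.standard_normal((hidden, dim - self.d)) * 0.1

    def _shift(self, xa):
        # Small MLP producing the shift; it need not be invertible itself.
        return np.tanh(xa @ self.W1) @ self.W2

    def forward(self, x):
        xa, xb = x[..., :self.d], x[..., self.d:]
        return np.concatenate([xa, xb + self._shift(xa)], axis=-1)

    def inverse(self, y):
        # The same shift is recomputed from the untouched part and subtracted,
        # so the inverse is exact up to floating-point error.
        ya, yb = y[..., :self.d], y[..., self.d:]
        return np.concatenate([ya, yb - self._shift(ya)], axis=-1)

rng = np.random.default_rng(0)
layer = AdditiveCoupling(rng)
pts = rng.standard_normal((5, 3))   # 3D points in camera coordinates
warped = layer.forward(pts)         # mapped toward world coordinates
recovered = layer.inverse(warped)   # invertibility holds exactly
assert np.allclose(pts, recovered)
```

Because the inverse is available in closed form, no cycle-consistency penalty is needed to keep the warp bijective, which is the property the abstract argues is critical for stable pose optimization.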
We present results on synthetic and real-world datasets and demonstrate that, owing to enhanced optimization convergence, our approach outperforms existing baselines in both pose estimation and high-fidelity reconstruction.
We illustrate our method using two images, each defined in its local camera coordinate system (C). We propose to model a rigid warp function per pixel, treating each pixel in the camera coordinates (C) as an individual ray. The proposed INN takes the pixel coordinates and the camera center, both defined in each camera's coordinates (C), together with a frame-dependent latent code, and outputs the corresponding coordinates in world coordinates (W). After this warping operation, the colors and volume densities along each ray are computed by querying a Neural Radiance Field (NeRF), and the model is optimized through a photometric loss function.
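The pipeline in the caption can be sketched end to end as follows. Points sampled along a camera-frame ray are warped into world coordinates (here by a stand-in rigid transform; the paper instead learns this warp with an INN conditioned on a per-frame latent code), a toy radiance field returns densities and colors, and standard NeRF volume rendering composites them into a pixel color for the photometric loss. The field, the warp, and the target color below are all illustrative placeholders, not the paper's trained components.

```python
import numpy as np

def volume_render(sigmas, colors, deltas):
    # Classic NeRF quadrature: alpha compositing along the ray.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)

def toy_field(pts):
    # Stand-in radiance field (not a trained NeRF MLP).
    sigmas = np.clip(pts[:, 2], 0.0, None)   # density grows with depth
    colors = 0.5 * (np.tanh(pts) + 1.0)      # RGB in [0, 1]
    return sigmas, colors

# Ray in the camera frame (C): origin at the camera center, unit direction.
t_vals = np.linspace(0.1, 2.0, 32)
ray_dir = np.array([0.0, 0.0, 1.0])
pts_cam = t_vals[:, None] * ray_dir

# Stand-in rigid warp C -> W: rotation about y plus a translation.
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
pts_world = pts_cam @ R.T + np.array([0.2, 0.0, 0.0])

# Query the field in world coordinates (W) and composite along the ray.
sigmas, colors = toy_field(pts_world)
deltas = np.diff(t_vals, append=t_vals[-1] + (t_vals[-1] - t_vals[-2]))
pixel = volume_render(sigmas, colors, deltas)

# Photometric loss against an (illustrative) observed pixel color.
target = np.array([0.5, 0.5, 0.8])
photometric_loss = ((pixel - target) ** 2).mean()
assert pixel.shape == (3,) and photometric_loss >= 0.0
```

In the paper the gradient of this photometric loss flows back through the volume rendering and through the warp parameters themselves, which is how pose and scene representation are optimized jointly.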