PF3plat: Pose-Free Feed-Forward
3D Gaussian Splatting

¹Korea University ²KAIST ³Microsoft Research Asia
*Equal Contribution, Corresponding Authors

PF3plat estimates multi-view consistent depth, accurate camera poses, and
photorealistic novel views from uncalibrated image collections.

Summary of Our Work

  • We address the challenging task of pose-free, feed-forward 3D reconstruction and novel view synthesis using 3DGS, and present PF3plat.
  • Our approach works well with wide-baseline images and requires no additional data beyond RGB images, relaxing many common assumptions of existing methods to improve practicality.
  • We identify and address the unstable training process arising from the use of pixel-aligned 3DGS, and further improve performance by introducing lightweight refinement modules.
  • PF3plat demonstrates state-of-the-art performance on the RealEstate10K, ACID, and DL3DV datasets for the pose-free novel view synthesis task.

Architecture

(Coarse Alignment) Given a set of unposed images, PF3plat first coarsely aligns the Gaussians with initial depth and pose estimated from pre-trained monocular depth estimation and visual correspondence models.
(Fine Alignment) Learnable modules then refine the initial depth and pose estimates to improve the quality of 3D reconstruction and novel view synthesis. The refined estimates are also used to estimate geometry confidence scores.
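
To make "pixel-aligned" concrete, the following is a minimal sketch of how a per-pixel depth map and a camera pose can place one Gaussian center per pixel. It is not PF3plat's implementation: the pinhole intrinsics (fx, fy, cx, cy), the camera-to-world convention, and all function names are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def unproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Lift pixel (u, v) with its depth to a 3D point in camera coordinates
    (pinhole model; the intrinsics are assumed known for this sketch)."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def camera_to_world(p_cam: Vec3, R: Sequence[Sequence[float]],
                    t: Sequence[float]) -> Vec3:
    """Apply a camera-to-world pose: X_world = R @ X_cam + t."""
    return tuple(
        sum(R[i][j] * p_cam[j] for j in range(3)) + t[i] for i in range(3)
    )


def gaussian_centers(depth_map: Sequence[Sequence[float]],
                     fx: float, fy: float, cx: float, cy: float,
                     R: Sequence[Sequence[float]],
                     t: Sequence[float]) -> List[Vec3]:
    """Return one 3D center per pixel, in row-major order."""
    centers = []
    for v, row in enumerate(depth_map):
        for u, d in enumerate(row):
            p_cam = unproject_pixel(u, v, d, fx, fy, cx, cy)
            centers.append(camera_to_world(p_cam, R, t))
    return centers


# Toy example: a 2x2 depth map, identity rotation, camera shifted along z.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
example_centers = gaussian_centers(
    [[2.0, 2.0], [2.0, 2.0]], 1.0, 1.0, 0.5, 0.5, IDENTITY, [0.0, 0.0, 1.0]
)
```

Because every center is a direct function of depth and pose, any error in either estimate moves the Gaussians themselves, which is why both alignment stages matter for rendering quality.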

Qualitative Results

Uncurated qualitative results of PF3plat on multiple datasets. Best viewed in landscape mode on mobile devices.

Qualitative Comparisons

Compared to previous state-of-the-art methods, PF3plat shows superior performance in photorealistic novel view synthesis across all datasets. Specifically, PF3plat enables accurate pose estimation even for scenes with complex geometry and large textureless regions, where previous works fail.

RealEstate10K
trained on RealEstate10K
ACID
trained on ACID
DL3DV
trained on DL3DV

Comparisons of Cross-dataset Generalization

Here, we present the superior cross-dataset generalization performance of PF3plat, evaluated in both the
DL3DV → RealEstate10K and RealEstate10K → DL3DV settings.

trained on DL3DV

Details of Fine-Alignment Module

Given the initial depth and pose from our coarse alignment module, PF3plat further refines the depth and pose estimates through learnable modules to improve the quality of 3D reconstruction and novel view synthesis.
The refined estimates are used to estimate geometry confidence scores, which assess the reliability of 3D Gaussian centers and condition the prediction of Gaussian parameters accordingly.
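
As an intuition for what a geometry confidence score can measure, here is a hand-crafted sketch based on cross-view depth consistency. PF3plat estimates its scores with a learnable module, so this is only a stand-in under stated assumptions: the pinhole intrinsics, the relative pose from view A to view B, the temperature tau, and every function name below are hypothetical.

```python
import math
from typing import Sequence, Tuple


def transform(p: Sequence[float], R: Sequence[Sequence[float]],
              t: Sequence[float]) -> Tuple[float, ...]:
    """Rigid transform: R @ p + t."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def depth_consistency_confidence(u: int, v: int, depth_a: float,
                                 depth_map_b: Sequence[Sequence[float]],
                                 K: Tuple[float, float, float, float],
                                 R_ab: Sequence[Sequence[float]],
                                 t_ab: Sequence[float],
                                 tau: float = 0.1) -> float:
    """Score in [0, 1] for how well pixel (u, v) of view A agrees with view B.

    The pixel is lifted to 3D with depth_a, moved into view B's camera frame,
    and its depth there is compared with view B's own depth estimate.
    Points behind camera B or outside its image get a score of 0.
    """
    fx, fy, cx, cy = K
    p_a = ((u - cx) / fx * depth_a, (v - cy) / fy * depth_a, depth_a)
    x, y, z = transform(p_a, R_ab, t_ab)
    if z <= 0.0:
        return 0.0
    col = round(fx * x / z + cx)
    row = round(fy * y / z + cy)
    if not (0 <= row < len(depth_map_b) and 0 <= col < len(depth_map_b[0])):
        return 0.0
    rel_err = abs(z - depth_map_b[row][col]) / z
    # Exponential falloff: perfect agreement gives 1, large errors approach 0.
    return math.exp(-rel_err / tau)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K_TOY = (1.0, 1.0, 0.5, 0.5)
```

A score like this could, for example, down-weight the opacity of Gaussians whose centers disagree across views; how the scores actually condition the Gaussian parameters in PF3plat is learned rather than fixed by a formula such as this one.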


BibTeX


@article{hong2024pf3plat,
  title   = {PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting},
  author  = {Sunghwan Hong and Jaewoo Jung and Heeseong Shin and Jisang Han and Jiaolong Yang and Chong Luo and Seungryong Kim},
  journal = {arXiv preprint arXiv:2410.22128},
  year    = {2024}
}