NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion Supplementary Material

NerfDiff: Single-image View Synthesis with
NeRF-guided Distillation from 3D-aware Diffusion

ICML 2023

Abstract

Novel view synthesis from a single image requires inferring occluded regions of objects and scenes whilst simultaneously maintaining semantic and physical consistency with the input. Existing approaches condition neural radiance fields (NeRF) on local image features, projecting points to the input image plane, and aggregating 2D features to perform volume rendering. However, under severe occlusion, this projection fails to resolve uncertainty, resulting in blurry renderings that lack details. In this work, we propose NerfDiff, which addresses this issue by distilling the knowledge of a 3D-aware conditional diffusion model (CDM) into NeRF through synthesizing and refining a set of virtual views at test-time. We further propose a novel NeRF-guided distillation algorithm that simultaneously generates 3D consistent virtual views from the CDM samples, and finetunes the NeRF based on the improved virtual views. Our approach significantly outperforms existing NeRF-based and geometry-free approaches on challenging datasets including ShapeNet, ABO, and Clevr3D.
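To make the ambiguity mentioned above concrete, here is a minimal sketch of image-conditioned feature lookup. It assumes a simple pinhole camera; the helper names `project` and `pixel_feature`, the nearest-pixel lookup, and the dummy feature map are illustrative assumptions, not part of NerfDiff. The sketch shows that two points on the same camera ray, one hidden behind the other, receive the same 2D feature, so projection alone cannot tell them apart.

```python
def project(point_cam, focal, cx, cy):
    """Project a 3D point in camera coordinates (z > 0) to pixel coordinates."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (focal * x / z + cx, focal * y / z + cy)


def pixel_feature(feature_map, point_cam, focal, cx, cy):
    """Fetch the feature stored at the pixel nearest to the projected point.

    Hypothetical helper: real systems usually interpolate bilinearly.
    """
    u, v = project(point_cam, focal, cx, cy)
    height, width = len(feature_map), len(feature_map[0])
    col = min(max(int(round(u)), 0), width - 1)
    row = min(max(int(round(v)), 0), height - 1)
    return feature_map[row][col]


# Dummy 8x8 "feature map": each pixel simply stores its own (row, col).
feature_map = [[(r, c) for c in range(8)] for r in range(8)]

# Two points on the same camera ray; `far` is hidden behind `near`.
near = (0.2, -0.1, 1.0)
far = tuple(3.0 * c for c in near)

f_near = pixel_feature(feature_map, near, focal=10.0, cx=4.0, cy=4.0)
f_far = pixel_feature(feature_map, far, focal=10.0, cx=4.0, cy=4.0)
```

Both lookups land on the same pixel, so the visible and the occluded point are conditioned on identical image evidence.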

Overview of Our Method

Pipeline

Our method consists of a training stage and a test-time fine-tuning stage. We first learn the single-image NeRF and a 2D CDM that is conditioned on the single-image NeRF renderings (left). At test time, we use the learned network parameters to predict an initial NeRF representation, which is then fine-tuned. The NeRF-guided denoised images from the frozen CDM in turn supervise the NeRF (right).

Architecture

Details of the training pipeline of the single-image NeRF for NerfDiff. Using a UNet, we first map an input image to a camera-aligned triplane-based NeRF representation. This triplane efficiently conditions volume rendering from a target view, resulting in an initial rendering. This rendering conditions the diffusion process so the CDM can consistently denoise at that target pose.
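The triplane lookup and volume rendering steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the planes are axis-aligned over the unit cube (the paper's triplane is camera-aligned), features from the three planes are summed, and `decode` is a hypothetical two-channel stand-in for the learned decoder. The compositing step is the standard NeRF quadrature.

```python
import math


def sample_plane(plane, u, v):
    """Bilinearly sample a grid of feature vectors at (u, v) in [0, 1]^2."""
    h, w = len(plane), len(plane[0])
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = min(int(x), w - 2), min(int(y), h - 2)
    fx, fy = x - x0, y - y0
    feats = []
    for c in range(len(plane[0][0])):
        top = plane[y0][x0][c] * (1 - fx) + plane[y0][x0 + 1][c] * fx
        bottom = plane[y0 + 1][x0][c] * (1 - fx) + plane[y0 + 1][x0 + 1][c] * fx
        feats.append(top * (1 - fy) + bottom * fy)
    return feats


def triplane_features(planes, point):
    """Sum the features sampled from the xy, xz and yz planes."""
    x, y, z = point
    samples = (sample_plane(planes["xy"], x, y),
               sample_plane(planes["xz"], x, z),
               sample_plane(planes["yz"], y, z))
    return [sum(channel) for channel in zip(*samples)]


def decode(features):
    """Hypothetical decoder: channel 0 -> density, channel 1 -> gray value."""
    density = math.log1p(math.exp(features[0]))   # softplus keeps density >= 0
    color = 1.0 / (1.0 + math.exp(-features[1]))  # sigmoid keeps color in (0, 1)
    return density, color


def composite(densities, colors, deltas):
    """NeRF quadrature: C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i."""
    transmittance, pixel = 1.0, 0.0
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        pixel += transmittance * alpha * color
        transmittance *= 1.0 - alpha
    return pixel, transmittance


def render_ray(planes, origin, direction, near, far, n_samples):
    """Render one ray by sampling interval midpoints between near and far."""
    delta = (far - near) / n_samples
    densities, colors = [], []
    for i in range(n_samples):
        t = near + (i + 0.5) * delta
        point = [min(max(o + t * d, 0.0), 1.0)  # clamp to the unit cube
                 for o, d in zip(origin, direction)]
        density, color = decode(triplane_features(planes, point))
        densities.append(density)
        colors.append(color)
    return composite(densities, colors, [delta] * n_samples)


# Usage: three constant 2x2 planes with two feature channels each.
flat = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
planes = {"xy": flat, "xz": flat, "yz": flat}
pixel, transmittance = render_ray(planes, (0.5, 0.5, 0.0), (0.0, 0.0, 1.0),
                                  near=0.0, far=1.0, n_samples=4)
```

In the described pipeline, a rendering produced this way for the target pose is what conditions the CDM.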

NeRF-Guided Distillation

The core algorithm of the proposed method is "NeRF-Guided Distillation", which distills the knowledge of a 3D-aware CDM into the single-image NeRF from multiple virtual views to generate high-quality images. Meanwhile, the multi-view diffusion process is guided by the NeRF representation to preserve the 3D consistency of the diffusion. The details of the algorithm are shown in the following figure.

[Figure: NeRF-guided distillation algorithm]
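The alternating structure described above can be illustrated with a toy sketch. It is not the authors' algorithm: `render` and `denoise` are hypothetical stand-ins for the NeRF and the frozen CDM, the guidance is assumed to be a simple blend weighted by `gamma`, and the sampler is a deterministic DDIM-style update. Images are short lists of floats, and the "NeRF" is a parameter vector shared by all views.

```python
import random


def render(theta, view):
    """Stand-in for NeRF rendering: every view reads the same shared
    parameters, so renderings are consistent across views by construction."""
    return list(theta)


def denoise(x_t, guess, sigma, tau=0.1):
    """Stand-in for the frozen CDM: posterior mean of the clean image when
    x_t = x_0 + sigma * noise and x_0 ~ N(guess, tau^2)."""
    w = sigma ** 2 / (sigma ** 2 + tau ** 2)
    return [(1 - w) * x + w * g for x, g in zip(x_t, guess)]


def distill(theta, guesses, sigmas, gamma=1.0, lr=0.5, seed=0):
    """Alternate NeRF-guided denoising of virtual views with NeRF updates."""
    rng = random.Random(seed)
    theta = list(theta)
    # Each virtual view starts from noise at the highest noise level.
    views = [[rng.gauss(0.0, sigmas[0]) for _ in theta] for _ in guesses]
    for i, sigma in enumerate(sigmas):
        next_sigma = sigmas[i + 1] if i + 1 < len(sigmas) else 0.0
        guided = []
        for v, x_t in enumerate(views):
            x0_hat = denoise(x_t, guesses[v], sigma)
            rendering = render(theta, v)
            # Assumed guidance: blend the CDM estimate with the rendering.
            guided.append([(a + gamma * b) / (1 + gamma)
                           for a, b in zip(x0_hat, rendering)])
        # Fine-tune the NeRF stand-in: one gradient step on half the mean
        # squared error between its renderings and the guided images.
        for d in range(len(theta)):
            grad = sum(theta[d] - g[d] for g in guided) / len(guided)
            theta[d] -= lr * grad
        # Deterministic (DDIM-style) move of each view to the next noise level.
        views = [[g_d + (next_sigma / sigma) * (x_d - g_d)
                  for g_d, x_d in zip(g, x_t)]
                 for g, x_t in zip(guided, views)]
    return theta, views


def spread(views):
    """Largest per-pixel disagreement between the first two views."""
    return max(abs(a - b) for a, b in zip(views[0], views[1]))


guesses = [[1.0, 0.2], [-1.0, 0.4]]  # mutually inconsistent per-view guesses
sigmas = [4.0, 2.0, 1.0, 0.5, 0.25]
_, unguided = distill([0.0, 0.0], guesses, sigmas, gamma=0.0)
theta, guided = distill([0.0, 0.0], guesses, sigmas, gamma=1.0)
```

With `gamma=0.0` each view drifts toward its own guess, so the two views disagree; with `gamma=1.0` both are pulled toward the shared rendering, and `spread(guided)` comes out smaller than `spread(unguided)`.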

@inproceedings{gu2023nerfdiff,
  title={NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion},
  author={Jiatao Gu and Alex Trevithick and Kai-En Lin and Josh Susskind and Christian Theobalt and Lingjie Liu and Ravi Ramamoorthi},
  year={2023},
  booktitle={International Conference on Machine Learning}
}

Jiatao Gu¹ Alex Trevithick² Kai-En Lin² Josh Susskind¹ Christian Theobalt³ Lingjie Liu³,⁴ Ravi Ramamoorthi²

¹Apple ²UC San Diego ³MPI ⁴UPenn

Paper PDF

arXiv

Code & Data (Coming soon)

Results on Various Datasets

Please click on the dataset names to see more video results.

ShapeNet Cars Dataset

ShapeNet Chairs Dataset

ABO Dataset

Clevr Dataset

Ablation Studies

Ablation on finetuning approaches

Citation
