A method that generates 3D objects from text without any 3D training data by using a pretrained 2D text-to-image diffusion model (Imagen) as a prior.
It replaces the CLIP-based loss used by earlier text-to-3D methods with a loss derived from distilling the 2D diffusion model. Combining this loss, Score Distillation Sampling (SDS, described below), with NeRF makes it possible to create a high-quality 3D object from only the user's prompt.
Score Distillation Sampling (SDS Loss)
A new loss function that extracts gradients from the diffusion model's score function to optimize randomly initialized NeRF parameters.
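The difference between the model's noise prediction (the denoising direction) and the noise actually added to the rendered image is backpropagated through the renderer to the NeRF parameters, skipping the diffusion model's own Jacobian. The result behaves like the gradient of a weighted MSE denoising loss and corresponds to a probability density distillation loss.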
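A minimal NumPy sketch of the SDS update on a toy problem may make the gradient concrete. Everything here is an assumption made for illustration and not DreamFusion's implementation: the "renderer" is a fixed 2x2 linear map instead of a NeRF, the "diffusion model" is a hypothetical closed-form noise predictor whose data distribution is a single target point (standing in for Imagen), and the noise schedule and the weighting w(t) = sigma^2 are chosen for convenience.

```python
import numpy as np

# Hypothetical toy setup: a linear "renderer" x = RENDER_MATRIX @ theta
# stands in for NeRF rendering, and TARGET stands in for the image the
# frozen diffusion model prefers for the prompt.
RENDER_MATRIX = np.array([[1.0, 0.5],
                          [0.0, 1.0]])
TARGET = np.array([1.0, -2.0])


def toy_noise_predictor(z_t, alpha, sigma, target=TARGET):
    """Stand-in for the frozen diffusion model's noise prediction.

    If the data distribution is a single point `target`, the optimal
    prediction for z_t = alpha * x + sigma * eps is (z_t - alpha * target) / sigma.
    """
    return (z_t - alpha * target) / sigma


def sds_grad(theta, rng, n_samples=64):
    """Monte Carlo estimate of E_{t,eps}[ w(t) * (eps_hat - eps) * dx/dtheta ]."""
    x = RENDER_MATRIX @ theta          # "rendered image"
    jac = RENDER_MATRIX                # dx/dtheta for the linear renderer
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        t = rng.uniform(0.02, 0.98)    # random noise level
        # Assumed variance-preserving schedule: alpha^2 + sigma^2 = 1.
        alpha, sigma = np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)
        eps = rng.standard_normal(x.shape)
        z_t = alpha * x + sigma * eps  # noised render
        eps_hat = toy_noise_predictor(z_t, alpha, sigma)
        weight = sigma ** 2            # assumed weighting w(t)
        # The predictor's own Jacobian is skipped: the residual is treated
        # as a constant and pushed only through the renderer.
        grad += weight * (eps_hat - eps) @ jac
    return grad / n_samples


def run_toy_sds(steps=300, lr=1.0, seed=0):
    """Optimize randomly initialized parameters with plain gradient descent."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(2)     # random initialization
    for _ in range(steps):
        theta = theta - lr * sds_grad(theta, rng)
    return theta


if __name__ == "__main__":
    theta = run_toy_sds()
    print("rendered:", RENDER_MATRIX @ theta, "target:", TARGET)
```

In this toy the residual eps_hat - eps reduces to alpha * (x - target) / sigma, so each step pulls the render toward the point the stand-in model considers likely. In the real method the same residual comes from a large pretrained network, and dx/dtheta comes from differentiating through volume rendering.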
DreamFusion: Text-to-3D using 2D Diffusion
Recent breakthroughs in text-to-image synthesis have been driven by diffusion models trained on billions of image-text pairs. Adapting this approach to 3D synthesis would require large-scale datasets of labeled 3D assets and efficient architectures for denoising 3D data, neither of which currently exist.
https://dreamfusion3d.github.io/


Seonglae Cho