💡 LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

¹UvA-Bosch Delta Lab   ²BCAI-Bosch   ³Toyota Technological Institute at Chicago

LumiNet transfers complex lighting conditions from a target image (a) to a source image (b), synthesizing a relit version of the source image (c) while preserving its geometry and albedo.

Abstract

We introduce LumiNet, a novel architecture that leverages generative models and latent intrinsic representations for effective lighting transfer.

Given a source image and a target lighting image, LumiNet synthesizes a relit version of the source scene that captures the target's lighting. Our approach makes two key contributions: a data curation strategy that uses a StyleGAN-based relighting model to generate our training data, and a modified diffusion-based ControlNet that processes both latent intrinsic properties from the source image and latent extrinsic properties from the target image. We further improve lighting transfer with a learned adaptor (MLP) that injects the target's latent extrinsic properties via cross-attention and fine-tuning.

Unlike a traditional ControlNet, which conditions generation on maps derived from a single scene, LumiNet processes latent representations from two different images, preserving geometry and albedo from the source while transferring lighting characteristics from the target. Experiments demonstrate that our method successfully transfers complex lighting phenomena, including specular highlights and indirect illumination, across scenes with varying spatial layouts and materials, outperforming existing approaches on challenging indoor scenes using only images as input.

Lighting Zoo

Given different target lights, LumiNet generates various lighting conditions in the same (real-world) scene.

[Interactive slider: original image (left) vs. relit image (right)]

Lighting Transfer Results

We present some of our relighting results here. Given a target light (top row) and a scene to be relit (second row), our method transfers complex indoor lighting phenomena, including direct illumination, specular highlights, cast shadows, inter-reflections, and other indirect effects, while maintaining scene geometry and albedo.

[Figure: target lights (top row), original images (second row), and relit images (third row)]

Visual Comparison

We compare LumiNet with IC-Light-v2, RGB-X, and Latent Intrinsic. Both RGB-X and IC-Light-v2 require text prompts to achieve relighting, so we use descriptions derived from the target lighting image (including actions like turning lights on/off, lamp placement, and scene type) as prompts.

Acknowledgements

We thank Pingping Song, Melis Öcal, and Alexander Timans for their insightful discussions. We are also grateful to David Forsyth for his suggestion to emphasize that light in scenes cannot simply appear, along with other valuable comments on our manuscript. Additionally, we thank Xiao Zhang for providing the code and model for Latent Intrinsic. Finally, we thank all our participants for their time in completing our user study. This project is financially supported by Bosch (Bosch Center for Artificial Intelligence), the University of Amsterdam, and the Top consortia for Knowledge and Innovation (TKIs) allowance from the Netherlands Ministry of Economic Affairs and Climate Policy.

BibTeX

@article{Xing2024LumiNet,
  author    = {Xing, Xiaoyan and Groh, Konrad and Karaoglu, Sezer and Gevers, Theo and Bhattad, Anand},
  title     = {LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting},
  year      = {2024},
}