LF-MultiDiffusion (Ours, 2m 6s)
Panorama
Look-around perspective video
ECCV 2026
¹The University of Tokyo, ²RIKEN Center for Advanced Intelligence Project
We propose LF-MultiDiffusion, a training-free panorama generation method that extends MultiDiffusion to support linear projections between target and reference image spaces. Our key idea is to reformulate latent aggregation as a regularized least-squares problem and solve it efficiently with a Krylov-based iterative solver inside the denoising loop. This formulation enables denser and more natural mappings than prior training-free methods, yielding more stable generation with far fewer perspective views. As a result, LF-MultiDiffusion reduces the number of image generator evaluations during denoising and significantly improves inference efficiency. Experiments show that LF-MultiDiffusion achieves better visual quality, text alignment, and panoramic consistency than the strongest training-free baseline, while providing a 15.36× speedup.
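To make the aggregation idea concrete, here is a minimal sketch of a regularized least-squares aggregation solved with conjugate gradient, one common Krylov method. It is an illustration under stated assumptions, not the paper's implementation: the objective `sum_i ||P_i x - y_i||^2 + lam * ||x - prior||^2`, the dense matrices `P_i`, the Tikhonov-style regularizer, and the names `aggregate_views`, `conjugate_gradient`, `prior`, and `lam` are all hypothetical choices made for this example.

```python
import numpy as np


def conjugate_gradient(apply_A, b, x0, tol=1e-10, max_iter=200):
    """Solve A x = b for a symmetric positive-definite operator A.

    apply_A is a function computing A @ v, so A is never formed explicitly.
    """
    x = np.array(x0, dtype=float)
    r = b - apply_A(x)          # initial residual
    p = r.copy()                # initial search direction
    rs = r @ r
    b_norm = max(np.linalg.norm(b), 1.0)
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol * b_norm:
            break
        Ap = apply_A(p)
        alpha = rs / (p @ Ap)   # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def aggregate_views(projections, view_latents, prior, lam=0.1):
    """Assumed objective: min_x sum_i ||P_i x - y_i||^2 + lam * ||x - prior||^2.

    The normal equations are
        (sum_i P_i^T P_i + lam * I) x = sum_i P_i^T y_i + lam * prior,
    which are symmetric positive definite for lam > 0.
    """
    def apply_A(v):
        out = lam * v
        for P in projections:
            out = out + P.T @ (P @ v)
        return out

    b = lam * prior
    for P, y in zip(projections, view_latents):
        b = b + P.T @ y
    return conjugate_gradient(apply_A, b, prior)


# Toy usage: two hypothetical linear view projections of a 32-dim latent.
rng = np.random.default_rng(0)
n, m = 32, 20
projections = [rng.standard_normal((m, n)) for _ in range(2)]
view_latents = [rng.standard_normal(m) for _ in range(2)]
prior = np.zeros(n)
x_cg = aggregate_views(projections, view_latents, prior, lam=0.1)
```

Because the solver only needs products with `P_i` and `P_i^T`, the same structure would apply if the projections were sparse or implemented as resampling operators rather than dense matrices.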
Each method is shown with its generated panorama image and a perspective-view video that looks around the scene. The time next to each method name denotes the wall-clock generation time reported in the paper. See Additional Demo Examples below for more details.