Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians

1 Shanghai Artificial Intelligence Laboratory, 2 Tongji University, 3 University of Science and Technology of China, 4 The Chinese University of Hong Kong

TL;DR: We introduce Octree-GS, featuring an LOD-structured 3D Gaussian approach supporting level-of-detail decomposition for scene representation that contributes to the final rendering results.

Our method can guarantee continuous real-time rendering while achieving better visual quality. (The white points show the distribution of neural Gaussians, reflecting how their number varies across views.)



Abstract

The recent 3D Gaussian splatting (3D-GS) has shown remarkable rendering fidelity and efficiency compared to NeRF-based neural scene representations. While demonstrating the potential for real-time rendering, 3D-GS encounters rendering bottlenecks in large scenes with complex details due to an excessive number of Gaussian primitives located within the viewing frustum. This limitation is particularly noticeable in zoom-out views and can lead to inconsistent rendering speeds in scenes with varying details. Moreover, it often struggles to capture the corresponding level of details at different scales with its heuristic density control operation. Inspired by the Level-of-Detail (LOD) techniques, we introduce Octree-GS, featuring an LOD-structured 3D Gaussian approach supporting level-of-detail decomposition for scene representation that contributes to the final rendering results. Our model dynamically selects the appropriate level from the set of multi-resolution anchor points, ensuring consistent rendering performance with adaptive LOD adjustments while maintaining high-fidelity rendering results.



Method Overview

Illustration of our proposed Octree-GS: Starting from a sparse point cloud, we construct an octree for the bounded 3D space. Each octree level provides a set of anchor Gaussians assigned to the corresponding LOD. Unlike conventional 3D-GS methods, which treat all Gaussians equally, our approach uses anchor Gaussians with different LODs. During novel view rendering, we determine the required LOD ℓ for each occupied anchor voxel within the octree, as seen from the observation center, and invoke all anchor Gaussians up to that level for final rendering. This process, shown in the middle, increases the level of detail by cumulatively fetching anchors from higher LODs. Our model is trained with a standard image reconstruction loss and an additional regularization loss, following the practice of Scaffold-GS.
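The pipeline above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the released implementation: the voxel size halving per level, the log-distance rule in `select_lod`, and the names `build_anchor_levels`, `visible_anchors`, `base_size`, and `d_max` are all hypothetical choices made for this example.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def build_anchor_levels(points: List[Point], origin: Point,
                        base_size: float, num_levels: int) -> List[Set[Voxel]]:
    """Voxelize the point cloud once per level.

    Assumption: level k uses voxels of edge length base_size / 2**k,
    so each level is one octree subdivision finer than the previous one.
    """
    levels = []
    for k in range(num_levels):
        size = base_size / (2 ** k)
        occupied = {
            tuple(math.floor((p[i] - origin[i]) / size) for i in range(3))
            for p in points
        }
        levels.append(occupied)
    return levels


def select_lod(distance: float, d_max: float, num_levels: int) -> int:
    """Assumed rule: one extra level each time the distance halves below d_max."""
    if distance <= 0.0:
        return num_levels - 1
    level = math.floor(math.log2(d_max / distance))
    return max(0, min(num_levels - 1, level))


def visible_anchors(levels: List[Set[Voxel]], origin: Point, base_size: float,
                    camera: Point, d_max: float) -> List[Tuple[int, Voxel]]:
    """Collect every anchor whose own level is at most the LOD required at its
    position, i.e. anchors are fetched cumulatively from level 0 upward."""
    selected = []
    num_levels = len(levels)
    for k, occupied in enumerate(levels):
        size = base_size / (2 ** k)
        for voxel in sorted(occupied):
            center = tuple(origin[i] + (voxel[i] + 0.5) * size for i in range(3))
            required = select_lod(math.dist(center, camera), d_max, num_levels)
            if k <= required:
                selected.append((k, voxel))
    return selected
```

With two points in a unit cube and two levels, a distant camera selects only the single level-0 anchor, while a camera inside the cube also pulls in the two level-1 anchors. That is the cumulative behaviour described above.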



Results

In large real-world scenes, Octree-GS can ensure continuous real-time rendering while maintaining fine rendering details. Compared with current SOTA methods, our method has significant advantages when rendering high-altitude views.

[Video comparisons: Scaffold-GS, Octree-GS, Mip-Splatting]

Comparison with SOTA methods

Compared to existing baselines, Octree-GS successfully captures very fine details present in the scene, particularly for objects with thin structures such as trees, light bulbs, and decorative text.


Performance at different resolutions

Thanks to our LOD-structured 3D Gaussian design, Octree-GS adapts to changes in footprint size and effectively addresses the aliasing issues inherent to 3D-GS and Scaffold-GS.


Effectiveness of Progressive Training

Visualization of anchor Gaussians in different LODs (several levels are omitted for visual brevity), displaying both anchor points and splatted 2D Gaussians in each image. Progressive training guides the coarse-to-fine reconstruction process and avoids overlap between different LODs. This strategy not only reduces the number of rendered neural Gaussians, but also improves the rendering accuracy of coarser LODs (e.g., LOD 0 and LOD 1).
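One way to picture a coarse-to-fine schedule is as a list of training steps at which each LOD becomes active. The sketch below is purely illustrative: the geometric `ratio` between stage lengths and the function names `level_unlock_steps` and `active_levels` are assumptions made for this example, not the schedule used to train Octree-GS.

```python
from typing import List


def level_unlock_steps(total_steps: int, num_levels: int,
                       ratio: float = 1.5) -> List[int]:
    """Return the training step at which each level becomes active.

    Assumption: the stage in which level k is the finest active level lasts
    `ratio` times longer than the stage for level k + 1, so coarse levels are
    optimized for longer before finer ones are added.
    """
    weights = [ratio ** (num_levels - 1 - k) for k in range(num_levels)]
    total = sum(weights)
    steps, cumulative = [], 0.0
    for w in weights:
        steps.append(int(round(total_steps * cumulative / total)))
        cumulative += w
    return steps


def active_levels(step: int, unlock_steps: List[int]) -> int:
    """Number of levels (counted from level 0) that are trained at `step`."""
    return sum(1 for s in unlock_steps if s <= step)
```

For example, with 700 steps, three levels, and `ratio=2.0`, level 0 is active from step 0, level 1 from step 400, and level 2 from step 600.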


Effectiveness of LOD Bias

The LOD bias is a learnable parameter for each anchor Gaussian that acts as a residual to its LOD. It effectively supplements high-frequency regions, such as the sharp edges of an object, so that they are rendered with more consistent detail during inference.
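As a rough sketch of how such a residual could act, the function below adds a per-anchor bias to a distance-based level before rounding down and clamping. Both the log-distance rule and the point where the bias is added are assumptions for illustration; `select_lod_with_bias`, `d_max`, and `lod_bias` are hypothetical names.

```python
import math


def select_lod_with_bias(distance: float, d_max: float, num_levels: int,
                         lod_bias: float = 0.0) -> int:
    """Distance-based LOD with a per-anchor residual (assumed formulation).

    A positive bias lets an anchor request finer levels earlier than distance
    alone would allow; a negative bias delays them.
    """
    if distance <= 0.0:
        return num_levels - 1
    raw_level = math.log2(d_max / distance) + lod_bias
    return max(0, min(num_levels - 1, math.floor(raw_level)))
```

Under these assumptions, an anchor at distance 3 with `d_max = 8` sits at level 1 without a bias, while a bias of 0.6 moves it to level 2. In this sketch, that is how an anchor on a sharp edge would receive extra detail.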


Visualization at different LODs

A clear division of roles is evident between different levels: LOD 0 captures most of the coarse scene content, and higher LODs gradually pick up the previously missed high-frequency details. The following is a hierarchical visualization of the rendering results on various types of scenes.

Real-Time Interactive Viewer

A Gaussian viewer for Octree-GS is available now! We provide pre-built binaries of the Real-Time Viewer for Windows here, and you can test it with the examples below.