MatrixCity: A Large-scale City Dataset
for City-scale Neural Rendering and Beyond

Yixuan Li*1     Lihan Jiang*2     Linning Xu1     Yuanbo Xiangli1     Zhenzhi Wang1     Dahua Lin1,2     Bo Dai2
*denotes equal contribution

Abstract


Neural radiance fields (NeRF) and their subsequent variants have led to remarkable progress in neural rendering. While most recent neural rendering work focuses on objects and small-scale scenes, developing neural rendering methods for city-scale scenes has great potential in many real-world applications. However, this line of research is impeded by the absence of a comprehensive and high-quality dataset, and collecting such a dataset over real city-scale scenes is costly, sensitive, and technically infeasible. To this end, we build a large-scale, comprehensive, and high-quality synthetic dataset for city-scale neural rendering research. Leveraging the Unreal Engine 5 City Sample project, we developed a pipeline to easily collect aerial and street city views with ground-truth camera poses, as well as a series of additional data modalities. Our pipeline also offers flexible control over environmental factors such as light, weather, and human and car crowds, supporting the needs of various tasks covering city-scale neural rendering and beyond. The resulting pilot dataset, MatrixCity, contains 60k aerial images and 350k street images from two city maps with a total size of 28 km². On top of MatrixCity, we also conduct a thorough benchmark, which not only reveals unique challenges of city-scale neural rendering, but also highlights potential improvements for future work.

Our Excellent MatrixCity Dataset!

Example of our MatrixCity dataset. The MatrixCity dataset contains various city environments captured from multiple viewing angles, as well as additional properties such as depth and surface normals.

Urban Roaming Experience

Example NeRF results on the MatrixCity dataset. The rendered novel views deliver an immersive experience for city roaming.

Comparison with Previous Datasets


Comparison of statistics and properties between our MatrixCity dataset and previous datasets. (a) High Quality. The dataset is rendered with movie-level quality, closely resembling real-world data. (b) Large Scale and Diversity. Our dataset encompasses two cities with extensive coverage, capturing a wide range of buildings, pedestrians, signs, vehicles, and lighting conditions. (c) Controllable Environments. We can control, in a fine-grained manner, the lighting angle and intensity, the density and height of fog, and the density of pedestrian and vehicle flow. (d) Multiple Properties. Our plugin can extract additional information, such as depth, surface normals, and decomposed reflectance components, at minimal additional cost.

Data Collection Method


We developed a plugin that automatically generates camera trajectories, reducing the need for manual annotation and increasing the efficiency of data collection. Camera trajectories generated by our plugin can be rendered in any Unreal Engine 5 scene. Here we illustrate the data collection for the small city in Unreal Engine 5. (a) Aerial block split for the entire small city; (b, c) aerial and street camera trajectories of block 4 (visualized in bird's-eye view) used in our plugin for data collection.
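To make the idea of automatically generated trajectories concrete, here is a minimal Python sketch of one plausible way to lay out aerial camera poses over a rectangular block. It is not the plugin's actual implementation. The `CameraPose` type, the `aerial_grid_trajectory` function, the serpentine sweep order, and all default values (pitch, yaw set) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CameraPose:
    """Hypothetical pose record: position in metres, orientation in degrees."""
    x: float
    y: float
    z: float
    yaw_deg: float
    pitch_deg: float


def aerial_grid_trajectory(
    x_min: float,
    x_max: float,
    y_min: float,
    y_max: float,
    height: float,
    spacing: float,
    pitch_deg: float = -45.0,  # assumed downward tilt, not taken from the source
    yaws: Sequence[float] = (0.0, 90.0, 180.0, 270.0),  # assumed headings
) -> List[CameraPose]:
    """Sweep a rectangular block on a regular grid at a fixed height.

    Rows are visited in alternating directions (serpentine order) so that
    consecutive poses stay spatially close to each other.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")

    nx = int((x_max - x_min) // spacing) + 1  # grid points along x
    ny = int((y_max - y_min) // spacing) + 1  # grid points along y

    poses: List[CameraPose] = []
    for j in range(ny):
        # Reverse the column order on every other row.
        columns = range(nx) if j % 2 == 0 else range(nx - 1, -1, -1)
        for i in columns:
            for yaw in yaws:
                poses.append(
                    CameraPose(
                        x=x_min + i * spacing,
                        y=y_min + j * spacing,
                        z=height,
                        yaw_deg=yaw,
                        pitch_deg=pitch_deg,
                    )
                )
    return poses


if __name__ == "__main__":
    block = aerial_grid_trajectory(0.0, 100.0, 0.0, 50.0, height=150.0, spacing=50.0)
    print(len(block), block[0], block[-1])
```

A street-level variant could follow the same pattern by sampling positions along road centre lines instead of a grid. How the real plugin chooses positions and orientations is not described on this page.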

Other Properties


The MatrixCity dataset provides controllable environmental factors and additional properties, such as illumination (a), fog density (b), and decomposed reflectance (c).

Benchmark


Benchmark for novel view synthesis. We present the performance of five state-of-the-art and representative methods on our dataset.
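This page does not state which metrics the benchmark reports. Novel view synthesis is commonly scored with peak signal-to-noise ratio (PSNR), so the sketch below shows how that metric is computed, on the assumption that it is relevant here. The function name `psnr` and the flat-list image representation are illustrative choices and are not part of the MatrixCity tooling.

```python
import math
from typing import Sequence


def psnr(rendered: Sequence[float], reference: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as flat sequences of pixel intensities in [0, max_value].
    """
    if len(rendered) != len(reference) or len(rendered) == 0:
        raise ValueError("images must be non-empty and have the same size")

    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    reference = [0.2, 0.4, 0.6, 0.8]
    rendered = [0.25, 0.35, 0.6, 0.8]
    print(f"PSNR: {psnr(rendered, reference):.2f} dB")
```

Higher PSNR means the rendered view is closer to the ground-truth image. In practice it is often reported alongside perceptual metrics.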

BibTeX

License

The MatrixCity dataset is copyrighted by us and published under the Creative Commons Attribution-NonCommercial 4.0 International License.