1 Literature Review
This section contains detailed reviews of key papers and methods related to Next-Best-View planning and foundation models.
1.1 Key Papers
1.1.1 Next-Best-View Planning
1.1.2 Foundation Models & Scene Understanding
- EFM3D/EVL: Egocentric foundation models for 3D perception and voxel lifting
- SceneScript: Structured scene language for semantic scene manipulation
1.2 Local Corpus
The full source of every referenced paper is mirrored under literature/tex-src/. Key entry points:
- arXiv-EFM3D/main.tex
- arXiv-GenNBV/main.tex
- arXiv-VIN-NBV/main.tex
- arXiv-project-aria/main.tex
- arXiv-scene-script/main.tex
Each directory contains the canonical figures (figures/), tables, and supplemental PDFs required to reproduce results or cite specific numbers in our documentation.
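As a quick sanity check that a local checkout actually has the mirrored sources, the entry points listed above can be verified with a short script. This is a minimal sketch, not part of the project tooling: the helper name `missing_entry_points` is hypothetical, and it assumes the script runs from the repository root so that `literature/tex-src` resolves correctly.

```python
from pathlib import Path

# Entry points listed in this section, relative to the corpus root.
ENTRY_POINTS = [
    "arXiv-EFM3D/main.tex",
    "arXiv-GenNBV/main.tex",
    "arXiv-VIN-NBV/main.tex",
    "arXiv-project-aria/main.tex",
    "arXiv-scene-script/main.tex",
]


def missing_entry_points(corpus_root="literature/tex-src"):
    """Return the listed entry points that are not files under corpus_root.

    Hypothetical helper: the default root assumes the current working
    directory is the repository root.
    """
    root = Path(corpus_root)
    return [rel for rel in ENTRY_POINTS if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_entry_points()
    if missing:
        print("Missing entry points:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All entry points found.")
```

Passing a different `corpus_root` lets the same check run against a copy of the corpus stored elsewhere.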
1.3 Research Context
These literature reviews form the theoretical foundation for integrating pre-trained egocentric models with quality-driven NBV planning. VIN-NBV selects views by optimizing RRI (relative reconstruction improvement), while EFM3D contributes pre-trained egocentric 3D perception. We expect this combination to generalize better to complex indoor environments than traditional coverage-based approaches.