---
title: "Literature Review"
phase: thesis
audience: public
status: current
owner: jan
format: html
---
# Literature Review
This section is a thesis-oriented distillation, not a general survey. Each page extracts what ARIA-NBV can adopt from the local paper corpus: {{< gls next-best-view >}} objectives, egocentric observation contracts, candidate proposal mechanisms, rollout/value-learning gates, and failure modes that affect target-conditioned reconstruction.
The current project direction is deliberately narrow:
- Keep {{< gls relative-reconstruction-improvement >}} and {{< gls target-specific-rri >}} as the authoritative utility signals.
- Use {{< gls project-aria >}}, {{< gls aria-synthetic-environments >}}, {{< gls egocentric-foundation-model-3d >}}, and {{< gls egocentric-voxel-lifting >}} as the actor-visible state substrate.
- Require target-conditioned fitted Double-Q / {{< gls finite-horizon-q-function >}} over trusted finite candidate sets.
- Keep IQL, continuous actor-critic, simulator-backed RL, and 3DGS control behind evidence gates.
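For orientation, the utility signal named above can be written out. This is a sketch that assumes the Chamfer-distance form of RRI associated with the VIN-NBV line of work. The symbols are illustrative notation: $\mathcal{R}_{\text{base}}$ is the current reconstruction, $\mathcal{R}_{\text{base} \cup q}$ is the reconstruction after adding candidate view $q$, $\mathcal{R}_{\text{GT}}$ is the ground-truth surface, and $\mathrm{CD}$ is the Chamfer distance. The target-restricted variant $\mathrm{CD}_T$ is an assumption about how the target-specific signal is scoped, not a definition taken from this page.

$$
\mathrm{RRI}(q) = \frac{\mathrm{CD}(\mathcal{R}_{\text{base}}, \mathcal{R}_{\text{GT}}) - \mathrm{CD}(\mathcal{R}_{\text{base} \cup q}, \mathcal{R}_{\text{GT}})}{\mathrm{CD}(\mathcal{R}_{\text{base}}, \mathcal{R}_{\text{GT}})},
\qquad
\mathrm{RRI}_T(q) = \frac{\mathrm{CD}_T(\mathcal{R}_{\text{base}}, \mathcal{R}_{\text{GT}}) - \mathrm{CD}_T(\mathcal{R}_{\text{base} \cup q}, \mathcal{R}_{\text{GT}})}{\mathrm{CD}_T(\mathcal{R}_{\text{base}}, \mathcal{R}_{\text{GT}})}
$$

Under this reading, a candidate that leaves the reconstruction error unchanged scores 0, and one that removes all remaining error scores 1.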
## Adoption-State Overview
| domain | papers | adoption state | core signal | do not adopt |
|---|---|---|---|---|
| {{< gls next-best-view >}} objective and candidate planning | [VIN-NBV](vin_nbv.qmd), [PB-NBV](pb_nbv.qmd), [GenNBV](gen_nbv.qmd), [Hestia](hestia.qmd), receding-horizon and shadowcasting NBV | thesis-core method for RRI ranking; proposal/diagnostic for projection and frontier heuristics; stretch/bridge for continuous policies | Mesh-supervised {{< gls oracle-rri >}}, finite {{< gls candidate-view >}} sets, efficient candidate shortlists, validity-aware motion constraints | Replacing {{< gls relative-reconstruction-improvement >}} with coverage, projected frontier area, or policy reward before calibration |
| ARIA ecosystem and actor-visible state | [Project Aria](project_aria.qmd), [EFM3D/EVL](efm3d.qmd), [EFM3D scene embeddings](../theory/efm3d_scene_embeddings.qmd), {{< gls aria-synthetic-environments >}}, {{< gls machine-perception-services >}} | core substrate | Calibrated egocentric streams, trajectories, online calibration, semi-dense points, DINO/EVL local evidence, predicted {{< gls oriented-bounding-box >}} support, semidense+DINO representation ablations | Leaking GT meshes or GT boxes into actor-visible selection/scoring |
| Rollout, value learning, and RL | [RL sources for rollout and Q_H](rl_planning.qmd), Trajectory Transformer, Gumbel-Top-k, Double DQN, IQL, soft Q-learning, PPO/SAC | thesis-core method for fitted Double-Q / {{< gls finite-horizon-q-function >}}; gated follow-up for IQL and actor-critic | Deterministic rollout traces first, stochastic rollout data second, masked target-conditioned fitted Double-Q over finite candidates third | Starting with continuous online RL before a trusted reward loop and support-aware offline store exist |
| Coverage/information utility channels | [SCONE and FisherRF](scone_fisherrf.qmd) | proposal/diagnostic | Target-local support, visibility, directional novelty, Fisher-style diminishing returns | Replacing target RRI with coverage or uncertainty reduction |
| {{< gls three-dimensional-gaussian-splatting >}} / radiance-field active reconstruction | [Active 3DGS and targeted NBV](active_3dgs_nbv.qmd), ActiveNeRF, FisherRF, Next Best Sense, dynamic/object-centric 3DGS, FOV-HPE | proposal/diagnostic and stretch/bridge | Uncertainty, Fisher information, target/object weighting, downstream-task view selection | Treating 3DGS uncertainty or human-pose reward as a substitute for ASE mesh-supervised target RRI |
| Semantic scene representations | [SceneScript](scene_script.qmd) | stretch/bridge | Structured scene language, editable entities, global layout priors, ASE-scale scene representation | Making SceneScript a thesis-core dependency before observed target contracts and target RRI are trusted |
Adoption-state labels used in the pages:
- **core substrate**: required observation, dataset, or representation contract.
- **thesis-core method**: needed for the current thesis claim.
- **proposal/diagnostic**: useful for candidate proposals, reports, or sanity checks.
- **gated follow-up**: useful only after prerequisite evidence exists.
- **stretch/bridge**: future direction beyond the required thesis result.
- **background**: context only.
## Domain Hierarchy
### 1. NBV Objective And Candidate Planning
- [VIN-NBV](vin_nbv.qmd): source-backed {{< gls relative-reconstruction-improvement >}} objective, {{< gls oracle-rri >}} labels, CORAL ordinal training, and greedy candidate ranking [@VIN-NBV-frahm2025].
- [PB-NBV](pb_nbv.qmd): projection/ellipsoid candidate shortlisting and frontier/occupied evidence separation [@PB-NBV-jia2025].
- [GenNBV](gen_nbv.qmd): continuous {{< gls five-degrees-of-freedom >}} PPO baseline with coverage-gain rewards, useful as a simulator-gated contrast [@GenNBV-chen2024].
- [Hestia](hestia.qmd): hierarchical look-at-then-fly control and directional voxel-face visibility, useful for continuous-policy bridge design [@Hestia-lu2026].
### 2. ARIA Ecosystem And Actor-Visible State
- [Project Aria](project_aria.qmd): calibrated egocentric device, VRS/tooling path, {{< gls machine-perception-services >}} trajectories, online calibration, and semi-dense maps [@projectaria-engel2023].
- [EFM3D/EVL](efm3d.qmd): local actor-visible DINO/voxel evidence, occupancy/head outputs, and {{< gls oriented-bounding-box >}} support, with broader scene memory delegated to semidense/fused point evidence [@EFM3D-straub2024].
### 3. Rollout, Value Learning, And RL
- [RL sources for rollout and Q_H](rl_planning.qmd): planning-as-sequence-decoding, stochastic beams, overestimation control, offline-RL support constraints, and the mandatory target-conditioned fitted Double-Q / {{< gls finite-horizon-q-function >}} gate.
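The fitted Double-Q gate named above can be illustrated with a minimal sketch of one backup target over a finite, masked candidate set. Everything in it is an assumption made for illustration: the function name `double_q_target`, the list-based Q-values, and the undiscounted finite-horizon convention are hypothetical and are not taken from the ARIA-NBV code. The Double DQN idea it shows is that the online estimate selects the next candidate while a separate target estimate evaluates it, which limits overestimation.

```python
from typing import Sequence


def double_q_target(
    reward: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    valid_mask: Sequence[bool],
    steps_remaining: int,
    discount: float = 1.0,
) -> float:
    """Double-Q backup target for one transition over a finite candidate set.

    All names are hypothetical. `reward` stands in for the per-step target RRI,
    and `valid_mask` marks candidates that pass validity checks.
    """
    if not (len(q_online_next) == len(q_target_next) == len(valid_mask)):
        raise ValueError("candidate arrays must have equal length")

    valid = [i for i, ok in enumerate(valid_mask) if ok]
    # Finite horizon: no bootstrap once the budget is spent or nothing is valid.
    if steps_remaining <= 0 or not valid:
        return reward

    # Selection uses the online estimate, restricted to valid candidates.
    best = max(valid, key=lambda i: q_online_next[i])
    # Evaluation uses the separate target estimate.
    return reward + discount * q_target_next[best]


# Candidate 1 has the highest online value but is masked out,
# so candidate 2 is selected and evaluated with the target estimate.
target = double_q_target(
    reward=0.25,
    q_online_next=[0.5, 2.0, 1.0],
    q_target_next=[0.5, 0.25, 0.75],
    valid_mask=[True, False, True],
    steps_remaining=2,
)
print(target)  # 1.0
```

Masking before the argmax keeps invalid candidates from influencing the target even when their online value is high, which is the property the finite-candidate gate relies on.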
### 4. Coverage/Information Utility And 3DGS / Radiance-Field Active Reconstruction
- [SCONE and FisherRF](scone_fisherrf.qmd): coverage-as-support and information-as-diminishing-returns channels for candidate tokens and diagnostics [@SCONE-guedon2022; @FisherRF-jiang2024].
- [Active 3DGS and targeted NBV](active_3dgs_nbv.qmd): uncertainty, Fisher information, object-centric utility, dynamic scenes, and task-specific view selection as proposal/diagnostic signals.
### 5. Semantic Scene Representations
- [SceneScript](scene_script.qmd): structured scene language and editable entity-level representation as stretch-only semantic/global planning context [@SceneScript-avetisyan2024].
## Current Synthesis
```{mermaid}
flowchart LR
A["Project Aria / ASE observed state"] --> B["EVL and semi-dense reconstruction proxy"]
B --> C["Scene + target oracle RRI labels"]
C --> D["One-step candidate scorer"]
D --> E["Trusted finite-candidate rollouts"]
E --> F["Target-conditioned fitted Double-Q Q_H"]
F -. "after evidence" .-> G["IQL / actor-critic / simulator bridge"]
```
The thesis should first show that deterministic, bounded oracle lookahead improves cumulative {{< gls target-specific-rri >}} over one-step greedy selection under an equal acquisition budget. It should then train a target-conditioned fitted Double-Q / {{< gls finite-horizon-q-function >}} model over finite candidate sets and require it to beat both one-step greedy selection and one-step model scoring on cumulative target RRI. IQL, actor-critic bridges, SB3/PPO/SAC, Habitat/Isaac, and 3DGS control remain gated follow-up or stretch work.
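The first gate, lookahead versus one-step greedy under an equal budget, can be sketched on a toy problem. The reachability graph, the gain values, and the names `REACHABLE`, `ORACLE_GAIN`, `best_return`, and `rollout` are all hypothetical. The gains are made-up stand-ins for oracle target RRI, not measured results. The sketch only shows the comparison protocol: the same budget, the same candidate sets, and different planning horizons.

```python
from typing import Dict, List, Tuple

# Hypothetical toy problem: candidates reachable from each pose, and the
# oracle gain (a stand-in for target RRI) of acquiring each candidate.
REACHABLE: Dict[str, List[str]] = {
    "start": ["a", "b"],
    "a": ["c"],
    "b": ["d"],
    "c": [],
    "d": [],
}
ORACLE_GAIN: Dict[str, float] = {"a": 0.5, "b": 0.25, "c": 0.0, "d": 0.75}


def best_return(pose: str, horizon: int) -> Tuple[float, List[str]]:
    """Exhaustive bounded lookahead: best cumulative gain within `horizon` steps."""
    if horizon == 0 or not REACHABLE[pose]:
        return 0.0, []
    best_value, best_plan = float("-inf"), []
    for cand in REACHABLE[pose]:
        future_value, future_plan = best_return(cand, horizon - 1)
        value = ORACLE_GAIN[cand] + future_value
        if value > best_value:
            best_value, best_plan = value, [cand] + future_plan
    return best_value, best_plan


def rollout(start: str, budget: int, horizon: int) -> Tuple[float, List[str]]:
    """Receding-horizon rollout: replan with `horizon`-step lookahead, execute one step."""
    pose, total, path = start, 0.0, []
    for step in range(budget):
        # Never look past the remaining acquisition budget.
        _, plan = best_return(pose, min(horizon, budget - step))
        if not plan:
            break
        pose = plan[0]
        total += ORACLE_GAIN[pose]
        path.append(pose)
    return total, path


greedy = rollout("start", budget=2, horizon=1)     # one-step greedy baseline
lookahead = rollout("start", budget=2, horizon=2)  # bounded oracle lookahead
print(greedy)     # (0.5, ['a', 'c'])
print(lookahead)  # (1.0, ['b', 'd'])
```

In this toy case greedy takes the larger immediate gain and ends at a pose with nothing useful in reach, while two-step lookahead accepts a smaller first gain to reach the better second view. Both spend the same budget of two acquisitions.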
## Local Corpus
The local source mirrors and paper manifest are tracked under `docs/literature/`.
- [`sources.jsonl`](../../literature/sources.jsonl): canonical paper manifest.
- [`tex-src/`](../../literature/tex-src/): local LaTeX mirrors when available.
Key local mirrors include VIN-NBV, GenNBV, Hestia, PB-NBV, EFM3D, Project Aria, SceneScript, Trajectory Transformer, Double DQN, IQL, Gumbel-Top-k, Deep Energy-Based Policies, SCONE, FisherRF, Dynamic 3DGS, Next Best Sense, and Instance/Object-centric NBV. FOV-HPE is tracked as DOI/PDF evidence in the local corpus, not as a local TeX mirror.
## Navigation
Return to [main documentation](../../index.qmd), [research questions](../thesis/questions.qmd), [roadmap](../thesis/roadmap.qmd), or [RRI theory](../theory/rri_theory.qmd).