---
title: aria_nbv Package (formerly oracle_rri)
---
> **Note**: This page is superseded by `aria_nbv_overview.qmd` for the consolidated package map. Keep this one for finer module notes.
# Name & Scope
- `aria_nbv` is the in-repo package that powers our NBV research stack. The on-disk module is still imported as `oracle_rri`; the docs adopt the new name ahead of the eventual code rename.
- Purpose: typed, torch-friendly ingestion of ASE/ATEK snippets, candidate-view generation, rendering, fusion, and RRI computation utilities.
# Module Stack (current code layout)
- `configs/`, `config.py`, `utils/`: config-as-factory base classes (`BaseConfig`, `SingletonConfig`), structured logging (`Console`).
- `data/`: typed snippet views (`EfmSnippetView`, cameras, trajectories, semidense points, OBBs).
- `data_handling/`: ATEK WebDataset loader, metadata resolver, downloader (CLI-capable).
- `pose_generation/`: Monte-Carlo candidate sampling with collision and free-space pruning (see the sampling sketch after this list).
- `rendering/`: PyTorch3D- and trimesh-based depth rendering for candidates + typed batch wrapper.
- `views/` & `viz/`: candidate point-cloud rendering helpers and mesh/trajectory viz.
- `analysis/`: depth debugger utilities.
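
For orientation, here is a minimal sketch of the free-space pruning idea behind `pose_generation/`. It is an assumption-level illustration only; the real `CandidateViewGenerator` also samples orientations, applies collision margins, and returns typed `PoseTW` batches.

```python
# Hedged sketch: uniform position sampling in the scene AABB plus a containment
# test. Not the pose_generation API -- orientations, collision margins, and
# PoseTW typing are omitted.
import numpy as np
import trimesh


def sample_free_positions(mesh: trimesh.Trimesh, n: int = 256, seed: int = 0) -> np.ndarray:
    """Sample candidate camera positions in the mesh AABB and keep free-space ones."""
    rng = np.random.default_rng(seed)
    lo, hi = mesh.bounds                      # (2, 3) axis-aligned scene extent
    candidates = rng.uniform(lo, hi, size=(n, 3))
    inside = mesh.contains(candidates)        # ray-based point-in-mesh test
    return candidates[~inside]
```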
# Core Modules & Classes
## Config & Logging
- **`utils.base_config.BaseConfig`**: Pydantic-powered factory base with TOML IO and `inspect()` rendering (Rich).
- **`configs.path_config.PathConfig`**: Singleton paths to data roots, ATEK/ASE URL manifests, mesh directories.
- **`utils.console.Console`**: Rich wrapper with prefixes and optional shared PL logger hook.
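
A hedged illustration of the config-as-factory pattern (the field names and `from_toml` helper below are placeholders, not the real `BaseConfig`/`PathConfig` schema):

```python
# Placeholder sketch of a config-as-factory class; the in-repo BaseConfig adds
# inspect() rendering via Rich and singleton handling on top of this idea.
from pathlib import Path
import tomllib  # stdlib TOML reader, Python >= 3.11

from pydantic import BaseModel


class SketchConfig(BaseModel):
    """Minimal stand-in for a BaseConfig subclass (fields are illustrative)."""

    data_root: Path = Path("/data/ase")
    max_snippets: int = 16

    @classmethod
    def from_toml(cls, path: Path) -> "SketchConfig":
        # Parse the TOML file and validate the payload through pydantic.
        with path.open("rb") as fh:
            return cls.model_validate(tomllib.load(fh))
```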
## Data Handling (`data_handling/`)
### Dataset loader (`dataset.py`)
- **External dependencies used**:
- `atek.data_loaders.load_atek_wds_dataset`, `select_and_remap_dict_keys` for WebDataset ingest.
- `efm3d.aria.PoseTW`, `CameraTW` for typed geometry conversion (`to_camera_tw`).
- `trimesh` for GT mesh loading/simplification.
- **Batch handling**: `_explode_batched_dict` splits WebDataset B-dim into per-sample dicts; `ase_collate` returns lists plus EFM-ready dicts.
- **Mesh pairing**: `scene_to_mesh` mapping + optional decimation + caching; tolerant when meshes absent unless `require_mesh=True`.
- **Key typing**: `CameraStream`, `Trajectory`, `SemiDensePoints` mirror ATEK dataclasses; preserve Aria frame (x-left, y-up, z-forward), `T_A_B` convention.
- **EFM remap**: `ASESample.to_efm_dict()` applies `EfmModelAdaptor.get_dict_key_mapping_all()` and can include `gt_mesh`.
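
A rough sketch of the batch-explosion step (the actual `_explode_batched_dict` may treat nested keys and non-tensor payloads differently):

```python
# Assumption-level sketch: split a WebDataset batch dict whose tensor values share
# a leading B dimension into per-sample dicts; non-tensor values are repeated.
from typing import Any

import torch


def explode_batched_dict(batch: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn {'depth': (B, ...), 'scene_id': str, ...} into B per-sample dicts."""
    sizes = [v.shape[0] for v in batch.values() if isinstance(v, torch.Tensor)]
    if not sizes:
        return [batch]
    return [
        {k: (v[i] if isinstance(v, torch.Tensor) else v) for k, v in batch.items()}
        for i in range(sizes[0])
    ]
```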
### Metadata & Download
- **`metadata.ASEMetadata`**: parses `ase_mesh_download_urls.json` and `AriaSyntheticEnvironment_ATEK_download_urls.json` → `SceneInfo(scene_id, mesh_url, mesh_sha, snippet_ids)`.
- **`downloader.ASEDownloaderConfig/ASEDownloader`**: orchestrates mesh + WDS downloads (uses ATEK `download_atek_wds_sequences`, `requests` for meshes, SHA1 verification, unzip). CLI-friendly via `pydantic_settings`.
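
The checksum/unpack step of the mesh download might look roughly like this (a sketch under the assumptions above; helper names are illustrative, and the real `ASEDownloader` may stream, retry, and unpack differently):

```python
# Sketch of the SHA1-verify-then-unzip step mentioned above; not the
# ASEDownloader API.
import hashlib
import zipfile
from pathlib import Path


def sha1_matches(path: Path, expected_sha1: str, chunk_size: int = 1 << 20) -> bool:
    """Stream the downloaded file through SHA1 and compare against the manifest value."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha1.lower()


def unzip_if_valid(archive: Path, expected_sha1: str, dest: Path) -> None:
    """Refuse to unpack a mesh archive whose checksum does not match."""
    if not sha1_matches(archive, expected_sha1):
        raise ValueError(f"SHA1 mismatch for {archive}")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
```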
## Analysis
- **`analysis/depth_debugger.py`**: utilities to inspect depth/point clouds; useful for validating candidate sampling and fusion (see TODOs).
## Visualisation
- **`viz/mesh_viz.py`**: rendering helpers (Trimesh + matplotlib) for meshes and point clouds; wire into Streamlit dashboard per TODOs.
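
An illustration-only overlay of a point cloud on a mesh via trimesh's viewer (not the `mesh_viz` API):

```python
# Illustrative only: show a reconstruction point cloud on top of the GT mesh using
# trimesh's built-in viewer (requires pyglet for the interactive window).
import numpy as np
import trimesh


def show_mesh_with_points(mesh: trimesh.Trimesh, points: np.ndarray) -> None:
    """Open an interactive scene containing the mesh and a red point-cloud overlay."""
    colors = np.tile([255, 0, 0, 255], (len(points), 1))  # one RGBA per point
    cloud = trimesh.PointCloud(points, colors=colors)
    trimesh.Scene([mesh, cloud]).show()
```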
# NBV Pipeline at a Glance
```{mermaid}
flowchart LR
A["ASE/ATEK shard<br/>+ GT mesh"] --> B("data_handling.dataset<br/>EfmSnippetView")
B --> C("pose_generation.<br/>CandidateViewGenerator")
C -->|PoseTW batch| D("rendering.<br/>CandidateDepthRenderer")
D -->|Depth maps| E["Fuse depth→PC<br/>(EFM3D pointcloud utils)"]
E --> F["Metrics: RRI / Chamfer"]
F --> G["Policy: pick NBV"]
G -->|update pose| C
```
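
A hedged stand-in for the fusion box above (the pipeline presumably relies on EFM3D's point-cloud/occupancy utilities; this concatenate-and-voxel-average sketch only illustrates the data flow):

```python
# Sketch only: concatenate current and candidate clouds, then average points per
# voxel so the fused cloud stays bounded. Real fusion likely goes through
# efm3d.utils.pointcloud / occupancy helpers instead.
import torch


def fuse_pointclouds(p_t: torch.Tensor, p_q: torch.Tensor, voxel: float = 0.05) -> torch.Tensor:
    """Merge (N, 3) and (M, 3) clouds into one averaged point per occupied voxel."""
    merged = torch.cat([p_t, p_q], dim=0)
    keys = torch.floor(merged / voxel).to(torch.int64)          # voxel index per point
    unique_keys, inverse = torch.unique(keys, dim=0, return_inverse=True)
    sums = torch.zeros(unique_keys.shape[0], 3, dtype=merged.dtype)
    sums.index_add_(0, inverse, merged)                         # per-voxel coordinate sums
    counts = torch.bincount(inverse, minlength=unique_keys.shape[0]).to(merged.dtype)
    return sums / counts.unsqueeze(1)
```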
# RRI Definition (used throughout docs/code)
- Let \(P_t\) be the current reconstruction point cloud, \(P_q\) the point cloud from a candidate view, and \(M\) the ground-truth mesh surface (sampled to points \(M_s\)).
- Bidirectional Chamfer distance:
$$
\mathrm{CD}(P, M) = \frac{1}{|P|}\sum_{p\in P}\min_{m\in M_s}\|p-m\|_2^2
+ \frac{1}{|M_s|}\sum_{m\in M_s}\min_{p\in P}\|m-p\|_2^2.
$$
- Relative Reconstruction Improvement (higher is better):
$$
\mathrm{RRI}(P_t, P_q, M) = \frac{\mathrm{CD}(P_t, M) - \mathrm{CD}(P_t \cup P_q, M)}{\mathrm{CD}(P_t, M) + \varepsilon},
$$
with a small \(\varepsilon\) for stability. Positive values indicate improvement after fusing the candidate view.
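
A reference sketch of these formulas in torch (`torch.cdist` is fine for small clouds; the package presumably uses EFM3D/PyTorch3D utilities at scale):

```python
# Direct transcription of the CD/RRI definitions above; brute-force pairwise
# distances, so intended for small point sets only.
import torch


def chamfer(p: torch.Tensor, m_s: torch.Tensor) -> torch.Tensor:
    """Bidirectional squared Chamfer distance between (N, 3) and (K, 3) point sets."""
    d2 = torch.cdist(p, m_s).pow(2)                 # (N, K) squared pairwise distances
    return d2.min(dim=1).values.mean() + d2.min(dim=0).values.mean()


def rri(p_t: torch.Tensor, p_q: torch.Tensor, m_s: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Relative reconstruction improvement of fusing candidate cloud p_q into p_t."""
    cd_before = chamfer(p_t, m_s)
    cd_after = chamfer(torch.cat([p_t, p_q], dim=0), m_s)
    return (cd_before - cd_after) / (cd_before + eps)
```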
# External Library Hooks (per component)
- **Trimesh**: mesh IO, surface sampling, ray–mesh intersections; optional quadric decimation when loading GT meshes.
- **EFM3D**:
- `PoseTW`, `CameraTW`: camera/pose typing + operations.
- `utils.ray` (`ray_grid`, `transform_rays`, `ray_obb_intersection`) for candidate depth rendering.
- `utils.pointcloud` (`dist_im_to_point_cloud_im`, `pointcloud_to_occupancy_snippet`) for depth → PC and occupancy.
- **ATEK**:
- `load_atek_wds_dataset`, `process_wds_sample` for streaming shards.
- Key mapping (`EfmModelAdaptor.get_dict_key_mapping_all`) when exporting to EFM schema.
- Download helpers (`atek_data_store_download.download_atek_wds_sequences`) via `ASEDownloader`.
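
For intuition on the depth → point-cloud step, a generic pinhole back-projection (conventional +z-forward frame and illustrative `fx/fy/cx/cy` intrinsics, not the Aria frame convention or the EFM3D call):

```python
# Generic pinhole back-projection sketch; the actual pipeline uses EFM3D camera
# models (CameraTW) and dist_im_to_point_cloud_im rather than this helper.
import torch


def depth_to_pointcloud(depth: torch.Tensor, fx: float, fy: float, cx: float, cy: float) -> torch.Tensor:
    """Back-project an (H, W) depth map into an (H*W, 3) camera-frame point cloud."""
    h, w = depth.shape
    v, u = torch.meshgrid(
        torch.arange(h, dtype=depth.dtype),
        torch.arange(w, dtype=depth.dtype),
        indexing="ij",
    )
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return torch.stack([x, y, depth], dim=-1).reshape(-1, 3)
```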
# Theoretical context (Wikipedia quick refs)
- **Active perception/vision**: NBV is an instance of active perception—moving the sensor to harvest more informative observations; active vision systems explicitly reorient cameras to reduce occlusion and improve depth estimates. [Wikipedia :: Active perception](https://en.wikipedia.org/wiki/Active_perception)
- **Point clouds as state**: Our reconstruction state \(P_t\) is a point cloud—an unordered set of 3D samples that approximates scene geometry. [Wikipedia :: Point cloud](https://en.wikipedia.org/wiki/Point_cloud)
- **Chamfer vs. Hausdorff**: The Chamfer distance we use for RRI is a smooth, averaged variant of the symmetric Hausdorff distance between two point sets, trading strict worst-case guarantees for robustness to outliers. [Wikipedia :: Chamfer distance](https://en.wikipedia.org/wiki/Chamfer_distance); [Wikipedia :: Hausdorff distance](https://en.wikipedia.org/wiki/Hausdorff_distance)
```{mermaid}
sequenceDiagram
participant Agent
participant CandidateGen
participant Renderer
participant Metric
Agent->>CandidateGen: last pose, mesh, extent
CandidateGen-->>Agent: PoseTW candidates + masks
Agent->>Renderer: valid poses + mesh + camera
Renderer-->>Metric: depth maps → point clouds
Metric-->>Agent: RRI scores (Chamfer/occupancy)
Agent->>Agent: select next-best view (argmax)
```
# Suggested Mermaid for planned processing layer
```{mermaid}
classDiagram
class CandidateViewGenerator{
+generate(last_pose, mesh, extent) CandidateSamplingResult
}
class CandidateDepthRenderer{
+render(sample, candidates) CandidateDepthBatch
}
class PointCloudFusion{
+merge(P_t, P_q) P_fused
}
class OracleRRI{
+score(P_t, P_q, mesh) float
}
CandidateViewGenerator --> CandidateDepthRenderer : PoseTW batch
CandidateDepthRenderer --> PointCloudFusion : depth→PC
PointCloudFusion --> OracleRRI : fused PC
```
Use this layout to guide module additions under `aria_nbv/` (import path `oracle_rri` for now) while keeping the current data-handling and config patterns.
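
A hypothetical greedy driver for this planned layer (the callables mirror the diagram, but their signatures are assumptions, not committed APIs):

```python
# Hypothetical NBV driver loop; the callables stand in for the planned
# CandidateViewGenerator / CandidateDepthRenderer / OracleRRI interfaces, whose
# concrete signatures are still to be defined.
from typing import Callable, Sequence

import torch

GenerateFn = Callable[[torch.Tensor], Sequence[torch.Tensor]]   # last pose -> candidate poses
RenderFn = Callable[[torch.Tensor], torch.Tensor]               # candidate pose -> candidate cloud (M, 3)
ScoreFn = Callable[[torch.Tensor, torch.Tensor], float]         # (P_t, P_q) -> RRI score


def select_next_best_view(
    p_t: torch.Tensor,
    last_pose: torch.Tensor,
    generate: GenerateFn,
    render: RenderFn,
    score: ScoreFn,
) -> tuple[torch.Tensor, float]:
    """Greedy NBV step: render and score every candidate, return the argmax pose."""
    scored = [(pose, score(p_t, render(pose))) for pose in generate(last_pose)]
    return max(scored, key=lambda item: item[1])
```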