aria_nbv Implementation Overview

Note: High-level package navigation now lives in aria_nbv_overview.qmd. This page keeps the detailed function/library pointers for reference.

1 EFM3D & ATEK Function Library Overview

This document provides a comprehensive overview of the EFM3D and ATEK functions relevant to implementing the aria_nbv (formerly oracle_rri) Relative Reconstruction Improvement (RRI) computation for Next-Best-View planning.

1.1 EFM3D Core Utilities

1.1.1 Ray Operations (efm3d.utils.ray)

Purpose: Generate and transform camera rays for 3D reconstruction and novel view synthesis.

  • grid_ray(pixel_grid: torch.Tensor, camera: CameraTW) -> tuple[torch.Tensor, torch.Tensor]
    • Description: Unprojects pixel grid coordinates to 3D ray directions
    • Usage: Converting image coordinates to world-space rays for rendering candidate views
    • Theory: Essential for computing what each camera pixel “sees” in 3D space
  • ray_grid(cam: CameraTW) -> tuple[torch.Tensor, torch.Tensor]
    • Description: Generates rays for all pixels in a camera’s image grid
    • Usage: Batch ray generation for efficient candidate view rendering
    • Theory: Creates the complete ray bundle for a virtual camera
  • transform_rays(rays_old: torch.Tensor, T_new_old: PoseTW) -> torch.Tensor
    • Description: Transforms rays between coordinate systems using pose transformations
    • Usage: Moving rays from candidate camera poses to world coordinates
    • Theory: Enables coordinate system alignment for multi-view geometry
  • ray_obb_intersection(rays_v: torch.Tensor, voxel_extent: torch.Tensor, ...) -> torch.Tensor | tuple[...]
    • Description: Computes ray intersections with oriented bounding boxes
    • Usage: Determining which voxels are intersected by candidate view rays
    • Theory: Critical for efficient ray-voxel intersection tests in 3D reconstruction
  • sample_depths_in_grid(rays_v: torch.Tensor, ds_max: torch.Tensor, ...) -> tuple[...]
    • Description: Samples depth values along rays within voxel grids
    • Usage: Generating sample points for volume rendering and reconstruction
    • Theory: Implements stratified sampling for neural radiance field-style rendering
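
To make the intended workflow concrete, here is a minimal sketch that builds a ray bundle for a candidate camera and moves it into world coordinates. It assumes the signatures listed above; the packing of origins and directions into a single ray tensor is an assumption, not a documented layout.

import torch
from efm3d.utils.ray import ray_grid, transform_rays

def candidate_rays_world(cam, T_world_cam):
    """Generate the full ray bundle for `cam` and express it in the world frame."""
    origins_c, dirs_c = ray_grid(cam)  # per-pixel ray origins and directions
    # Packing convention is an assumption; adjust to the actual ray layout.
    rays_c = torch.cat([origins_c, dirs_c], dim=-1)
    return transform_rays(rays_c, T_world_cam)  # apply T_new_old to the bundle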

1.1.2 Point Cloud Processing (efm3d.utils.pointcloud)

Purpose: Handle point cloud operations for 3D reconstruction and occupancy mapping.

  • get_points_world(batch: dict, batch_idx: int | None = None, ...) -> tuple[torch.Tensor, torch.Tensor]
    • Description: Extracts world-coordinate point clouds from ASE dataset batches
    • Usage: Converting depth images and semi-dense SLAM points to 3D coordinates
    • Theory: Foundation for creating the current reconstruction P_t
  • collapse_pointcloud_time(pc_w: torch.Tensor) -> torch.Tensor
    • Description: Merges point clouds across time, removing duplicates and NaN values
    • Usage: Combining temporal observations into a single reconstruction
    • Theory: Temporal fusion for more complete scene representations
  • pointcloud_to_voxel_ids(pc_v: torch.Tensor, vW: int, vH: int, vD: int, voxel_extent: torch.Tensor) -> tuple[...]
    • Description: Maps 3D points to voxel grid indices with validity checking
    • Usage: Converting continuous point clouds to discrete voxel representations
    • Theory: Spatial discretization for efficient 3D processing
  • pointcloud_occupancy_samples(p3s_w: torch.Tensor, Ts_wc: torch.Tensor, ...) -> tuple[...]
    • Description: Samples occupied, surface, and free space points from point clouds
    • Usage: Creating training data for occupancy field learning
    • Theory: Generates diverse 3D samples for learning scene geometry
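
A minimal sketch of how P_t could be assembled from an ASE batch with the first two functions above; treating the second return value of get_points_world as a validity mask is an assumption, since the tuple contents are not spelled out here.

import torch
from efm3d.utils.pointcloud import get_points_world, collapse_pointcloud_time

def build_current_reconstruction(batch: dict) -> torch.Tensor:
    """Fuse per-frame world points into a single reconstruction P_t."""
    pc_w, _valid = get_points_world(batch)  # per-frame points in world frame
    return collapse_pointcloud_time(pc_w)   # merge over time, drop NaNs/duplicates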

1.1.3 Voxel Operations (efm3d.utils.voxel)

Purpose: Handle 3D voxel grid operations and coordinate transformations.

  • tensor_wrap_voxel_extent(voxel_extent, B=None, device="cpu") -> torch.Tensor
    • Description: Normalizes voxel extent representations across batches
    • Usage: Ensuring consistent voxel coordinate systems
    • Theory: Standardizes 3D bounding box representations
  • create_voxel_grid(vW: int, vH: int, vD: int, voxel_extent, device="cpu") -> torch.Tensor
    • Description: Creates 3D coordinate grids for voxel centers
    • Usage: Generating sampling coordinates for 3D reconstruction
    • Theory: Establishes regular 3D sampling patterns
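
A short sketch of grid construction; the element layout of voxel_extent (min/max per axis) is an assumption to be checked against tensor_wrap_voxel_extent.

import torch
from efm3d.utils.voxel import tensor_wrap_voxel_extent, create_voxel_grid

vW, vH, vD = 64, 64, 64
# A 4 m cube around the origin; the extent layout is assumed, not documented here.
voxel_extent = torch.tensor([-2.0, 2.0, -2.0, 2.0, -2.0, 2.0])
extent = tensor_wrap_voxel_extent(voxel_extent, B=1)
grid_v = create_voxel_grid(vW, vH, vD, extent)  # coordinates of all voxel centers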

1.1.4 Voxel Sampling (efm3d.utils.voxel_sampling)

Purpose: Sample from 3D voxel grids with interpolation and coordinate conversion.

  • pc_to_vox(pc_v: torch.Tensor, vW: int, vH: int, vD: int, voxel_extent) -> tuple[...]
    • Description: Converts point cloud coordinates to voxel grid coordinates
    • Usage: Mapping between continuous and discrete 3D representations
    • Theory: Essential for sampling from learned 3D representations
  • sample_voxels(feat3d: torch.Tensor, pts_v: torch.Tensor, differentiable=False) -> tuple[...]
    • Description: Samples features from 3D voxel grids using trilinear interpolation
    • Usage: Querying learned 3D features at arbitrary points
    • Theory: Enables continuous sampling from discrete 3D representations
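
A sketch of feature querying under assumed shape conventions (a (B, C, D, H, W) feature volume and (B, N, 3) query points); treating the returned tuple as (features, validity) is likewise an assumption.

import torch
from efm3d.utils.voxel_sampling import sample_voxels

feat3d = torch.randn(1, 16, 64, 64, 64)  # (B, C, D, H, W) layout assumed
pts_v = torch.rand(1, 1024, 3) * 63      # query points in voxel coordinates
feats, valid = sample_voxels(feat3d, pts_v, differentiable=True)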

1.1.5 Depth Processing (efm3d.utils.depth)

Purpose: Convert between depth representations and 3D point clouds.

  • dist_im_to_point_cloud_im(dist_m: torch.Tensor, cams: CameraTW) -> tuple[...]
    • Description: Converts distance images to 3D point clouds using camera calibration
    • Usage: Processing ASE dataset depth maps into 3D points
    • Theory: Fundamental unprojection operation for 3D reconstruction
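
A sketch of the unprojection step, assuming the returned tuple is (point-cloud image, validity mask); the inputs would come from an ASE batch.

from efm3d.utils.depth import dist_im_to_point_cloud_im

def unproject_batch_depth(dist_m, cams):
    """dist_m: distance image(s); cams: matching CameraTW calibrations."""
    pc_im, valid = dist_im_to_point_cloud_im(dist_m, cams)  # per-pixel 3D points
    return pc_im[valid]                                     # drop invalid pixels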

1.1.6 Reconstruction Utilities (efm3d.utils.reconstruction)

Purpose: Core functions for 3D scene reconstruction and occupancy field learning.

  • build_gt_occupancy(occ, visible, p3s_w, Ts_wc, cams, T_wv, voxel_extent) -> tuple[...]
    • Description: Creates ground truth occupancy grids from point clouds
    • Usage: Generating supervision signals for 3D reconstruction
    • Theory: Converts sparse points to dense occupancy representations
  • compute_occupancy_loss_subvoxel(occ, visible, p3s_w_all, ...) -> torch.Tensor
    • Description: Computes reconstruction loss using subvoxel sampling
    • Usage: Training occupancy networks with point cloud supervision
    • Theory: Implements differentiable 3D reconstruction loss
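
Since the full argument lists are elided above, here is a plain-torch toy that illustrates the underlying idea of build_gt_occupancy (mark voxels containing at least one point as occupied); it is not the library's actual implementation.

import torch

def occupancy_from_points(pts_v: torch.Tensor, vW: int, vH: int, vD: int) -> torch.Tensor:
    """Toy GT occupancy: 1 where a voxel contains a point, 0 elsewhere.
    pts_v holds points already in voxel coordinates (see pc_to_vox above)."""
    occ = torch.zeros(vD, vH, vW)
    idx = pts_v.long()
    in_bounds = ((idx >= 0) & (idx < torch.tensor([vW, vH, vD]))).all(dim=-1)
    idx = idx[in_bounds]
    occ[idx[:, 2], idx[:, 1], idx[:, 0]] = 1.0  # index as (D, H, W)
    return occ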

1.1.7 Mesh Processing (efm3d.utils.mesh_utils)

Purpose: Evaluate mesh reconstruction quality and compute geometric distances.

  • eval_mesh_to_mesh(pred: str | trimesh.Trimesh, gt: str | trimesh.Trimesh, ...) -> tuple[...]
    • Description: Computes accuracy/completeness metrics between predicted and ground truth meshes
    • Usage: CRITICAL for RRI computation - this is the Chamfer Distance calculation we need (see the sketch after this list)
    • Theory: Implements bidirectional distance evaluation for mesh quality assessment
  • compute_pts_to_mesh_dist(pts: torch.Tensor, faces: torch.Tensor, verts: torch.Tensor, step: int) -> np.ndarray
    • Description: Computes distances from point cloud to mesh surface
    • Usage: Core component of Chamfer Distance computation
    • Theory: Point-to-surface distance for reconstruction evaluation
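
A minimal call sketch; both arguments accept file paths or trimesh.Trimesh objects per the signature above, and the composition of the returned tuple (accuracy/completeness metrics) should be checked against the function's docstring.

from efm3d.utils.mesh_utils import eval_mesh_to_mesh

# Bidirectional accuracy/completeness between predicted and GT meshes.
metrics = eval_mesh_to_mesh("pred_mesh.ply", "gt_mesh.ply")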

1.2 Camera and Pose Systems (efm3d.aria)

1.2.1 Camera Calibration (efm3d.aria.camera)

Purpose: Handle camera calibration, projection, and coordinate transformations.

  • CameraTW class: Wrapper for camera calibrations with projection/unprojection methods
    • project(self, p3d: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]: 3D to 2D projection
    • unproject(self, p2d: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]: 2D to 3D ray generation
    • Usage: Converting between 2D image coordinates and 3D world coordinates
    • Theory: Essential for multi-view geometry and novel view synthesis
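
A round-trip sketch of the two methods; the camera-frame point convention is assumed, and the validity-mask returns follow the signatures above.

import torch

def project_unproject_roundtrip(cam, p3d_cam: torch.Tensor):
    """cam: CameraTW instance; p3d_cam: points in the camera frame (assumed)."""
    p2d, valid = cam.project(p3d_cam)   # 3D -> 2D pixel coordinates
    rays, valid2 = cam.unproject(p2d)   # 2D -> 3D ray directions
    return p2d[valid], rays[valid2]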

1.2.2 Pose Transformations (efm3d.aria.pose)

Purpose: Handle SE(3) pose transformations and coordinate system conversions.

  • PoseTW class: SE(3) pose transformations with composition and inversion
    • transform(self, p3d: torch.Tensor) -> torch.Tensor: Apply pose transformation to 3D points
    • compose(self, other) -> PoseTW: Chain pose transformations
    • inverse(self) -> PoseTW: Invert pose transformation
    • Usage: Managing coordinate transformations between camera poses and world coordinates
    • Theory: Foundation for multi-view geometry and coordinate system alignment
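
A small sketch of pose algebra with these methods, assuming T_ab.compose(T_bc) yields T_ac (the composition order is not documented above).

def relative_pose(T_wa, T_wb):
    """Pose of frame b expressed in frame a: T_ab = T_aw composed with T_wb."""
    return T_wa.inverse().compose(T_wb)

def points_to_world(T_wc, p3d_c):
    """Move camera-frame points into the world frame."""
    return T_wc.transform(p3d_c)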

1.3 ATEK Evaluation Framework

1.3.1 Surface Reconstruction Metrics

Purpose: Standardized evaluation of 3D reconstruction quality.

  • evaluate_single_mesh_pair(pred_mesh_filename, gt_mesh_filename, ...) -> tuple[...]
    • Description: CORE FUNCTION for RRI - computes Chamfer Distance and reconstruction metrics
    • Usage: This is exactly what we need for computing RRI = Chamfer(P_{t∪q}, GT) - Chamfer(P_t, GT); a call sketch follows this list
    • Theory: Implements the standard surface reconstruction evaluation protocol
  • evaluate_mesh_over_a_dataset(input_folder, pred_mesh_filename, gt_mesh_filename, ...) -> dict
    • Description: Batch evaluation across multiple scenes
    • Usage: Evaluating NBV performance across the ASE validation set
    • Theory: Dataset-level performance assessment
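
The oracle RRI then reduces to two calls, sketched below; the import path and the structure of the returned metrics tuple are assumptions to verify against ATEK.

# Import path assumed; locate evaluate_single_mesh_pair in ATEK's
# surface-reconstruction evaluation module.
from atek.evaluation import evaluate_single_mesh_pair

m_t  = evaluate_single_mesh_pair("mesh_P_t.ply", "gt_mesh.ply")
m_tq = evaluate_single_mesh_pair("mesh_P_t_union_q.ply", "gt_mesh.ply")
# RRI = Chamfer(P_{t∪q}, GT) - Chamfer(P_t, GT); extracting the Chamfer value
# from each returned tuple depends on ATEK's return structure.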

1.4 Implementation Plan

1.4.1 Core Classes to Implement

1.4.1.1 OracleRRI Class

import torch

class OracleRRI:
    def __init__(self, gt_mesh_path: str, voxel_extent: torch.Tensor, device: str):
        """Initialize with ground truth mesh for oracle computation"""

    def compute_rri(self, current_pointcloud: torch.Tensor, candidate_pointcloud: torch.Tensor) -> float:
        """
        Compute RRI = Chamfer(P_{t∪q}, GT) - Chamfer(P_t, GT)
        Uses ATEK's evaluate_single_mesh_pair internally
        """

    def batch_compute_rri(self, current_pc: torch.Tensor, candidate_pcs: list[torch.Tensor]) -> torch.Tensor:
        """Efficiently compute RRI for multiple candidate views"""

1.4.1.2 CandidateViewGenerator Class

# Import paths assumed from the module names in sections 1.2.1 and 1.2.2.
from efm3d.aria.camera import CameraTW
from efm3d.aria.pose import PoseTW

class CandidateViewGenerator:
    def __init__(self, camera_calibration: CameraTW, sampling_strategy: str):
        """Generate candidate camera poses around current position"""

    def generate_spherical_candidates(self, center_pose: PoseTW, radius: float, n_samples: int) -> list[PoseTW]:
        """Sample poses on sphere around current position"""

    def generate_hemisphere_candidates(self, center_pose: PoseTW, radius: float, n_samples: int) -> list[PoseTW]:
        """Sample poses on hemisphere (avoiding ground/ceiling)"""

1.4.1.3 CandidateViewRenderer Class

import torch  # CameraTW / PoseTW imports as in the previous block

class CandidateViewRenderer:
    def __init__(self, voxel_extent: torch.Tensor, resolution: tuple[int, int, int]):
        """Render synthetic observations from candidate poses"""

    def render_depth_from_pose(self, pose: PoseTW, camera: CameraTW, current_reconstruction: torch.Tensor) -> torch.Tensor:
        """
        Generate synthetic depth image from candidate viewpoint
        Uses EFM3D ray casting and voxel sampling
        """

    def depth_to_pointcloud(self, depth: torch.Tensor, pose: PoseTW, camera: CameraTW) -> torch.Tensor:
        """Convert rendered depth to world-coordinate point cloud"""

1.4.2 Integration with EFM3D Pipeline

1.4.2.1 Memory-Efficient Implementation

  • Use sample_voxels for efficient point cloud querying
  • Batch candidate view processing to avoid memory crashes
  • Implement progressive sampling for large scenes

1.4.2.2 Data Flow

  1. Input: ASE dataset batch with GT depth and camera poses
  2. Current Reconstruction: Use get_points_world + collapse_pointcloud_time
  3. Candidate Generation: CandidateViewGenerator around latest pose
  4. Synthetic Rendering: Use EFM3D ray casting to simulate candidate observations
  5. Point Cloud Fusion: Merge current + candidate point clouds
  6. RRI Computation: ATEK evaluate_single_mesh_pair for Chamfer Distance
  7. Best View Selection: argmin(RRI scores) for next-best-view - the most negative RRI marks the largest reconstruction improvement
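
Putting the seven steps together, a condensed sketch of the loop (all names refer to the classes planned in section 1.4.1, so this is illustrative rather than runnable against a fixed API):

import torch

p_t = build_current_reconstruction(batch)                   # steps 1-2
candidates = generator.generate_spherical_candidates(
    center_pose=pose_t, radius=1.0, n_samples=100)          # step 3
rri = []
for q in candidates:
    depth_q = renderer.render_depth_from_pose(q, cam, p_t)  # step 4
    p_q = renderer.depth_to_pointcloud(depth_q, q, cam)     # step 5
    rri.append(oracle.compute_rri(p_t, p_q))                # step 6
best_view = candidates[int(torch.tensor(rri).argmin())]     # step 7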

1.4.3 Theoretical Foundation

The core insight is that RRI measures reconstruction improvement:

  • P_t: Current point cloud reconstruction from first t views
  • P_{t∪q}: Enhanced reconstruction after adding candidate view q
  • RRI(q) = Chamfer(P_{t∪q}, GT) - Chamfer(P_t, GT)
  • Negative RRI = improvement (lower Chamfer distance is better)

The challenge is ensuring consistent point cloud sampling between P_t and P_{t∪q} for valid Chamfer Distance comparison. This requires:

  1. Unified voxel discretization using EFM3D utilities
  2. Consistent density sampling to avoid bias
  3. Memory-efficient batching for large scenes
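
Requirements 1 and 2 can be prototyped with a shared voxel-grid downsampling applied to both P_t and P_{t∪q}; the sketch below uses plain torch, though pointcloud_to_voxel_ids (section 1.1.2) could serve the same role within the unified discretization.

import torch

def voxel_downsample(pts: torch.Tensor, voxel_size: float) -> torch.Tensor:
    """Keep one representative point per occupied voxel so that P_t and
    P_{t∪q} enter the Chamfer comparison at comparable densities."""
    keys = torch.floor(pts / voxel_size).long()
    uniq, inv = torch.unique(keys, dim=0, return_inverse=True)
    order = torch.arange(pts.shape[0])
    first = torch.full((uniq.shape[0],), pts.shape[0], dtype=torch.long)
    first = first.scatter_reduce(0, inv, order, reduce="amin", include_self=True)
    return pts[first]  # first point encountered in each voxel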

1.4.4 Key Technical Challenges

  1. Point Cloud Density Consistency: Ensure P_t and P_{t∪q} have comparable point densities
  2. Memory Management: Avoid GPU crashes during large-scale ray casting
  3. Coordinate System Alignment: Proper use of PoseTW transformations
  4. Synthetic View Realism: Generate plausible depth observations from candidate poses

1.4.5 Success Metrics

  • Correctness: RRI correlates with actual reconstruction improvement
  • Efficiency: Can process 100+ candidate views per scene within memory limits
  • Integration: Works end-to-end with ASE dataset and EFM3D inference pipeline
  • Validation: Produces sensible next-best-view selections on validation scenes

This implementation will leverage the mature EFM3D and ATEK libraries while focusing on the specific challenge of consistent point cloud sampling for valid RRI computation.