Project Roadmap

1 Project Roadmap

1.1 Seminar Phase (6 ECTS, ~180 hours)

1.1.1 Phase 1: Foundation & Familiarization

Goal:

  • Get familiar with the next-best-view (NBV) problem domain and read the relevant literature
  • Understand the Aria ecosystem: ASE data, SceneScript, Project Aria Tools, ATEK, and EFM3D/EVL

1.1.2 Phase 2: Oracle RRI

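Goal: Implement an “oracle” Relative Reconstruction Improvement (RRI) computation based on the ground-truth (GT) meshes and either the semi-dense point clouds or dense point clouds computed from the GT depth maps. Caveat: the GT depth maps cover only the recorded trajectory, so they cannot directly supply observations for candidate views off that trajectory.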
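
A minimal sketch of how such an oracle could be computed, under stated assumptions: the roadmap does not fix a formula, so RRI is taken here as the relative drop in Chamfer distance to the GT surface when the points seen from a candidate view are merged into the current reconstruction. The function names, the brute-force distance computation, and the use of points sampled from the GT mesh (gt_points) are all illustrative choices, not part of the project.

```python
import numpy as np


def chamfer_distance(points_a: np.ndarray, points_b: np.ndarray) -> float:
    """Symmetric Chamfer distance between an (N, 3) and an (M, 3) point set."""
    # Pairwise Euclidean distances, shape (N, M). Brute force: fine for a
    # sketch, a KD-tree would be needed for realistic point counts.
    diff = points_a[:, None, :] - points_b[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Mean nearest-neighbour distance in both directions.
    return float(dist.min(axis=1).mean() + dist.min(axis=0).mean())


def oracle_rri(current_pc, candidate_pc, gt_points, eps=1e-12):
    """Assumed definition: (error_before - error_after) / error_before.

    current_pc   : points reconstructed so far, shape (N, 3)
    candidate_pc : points the candidate view would add, shape (K, 3)
    gt_points    : points sampled from the GT mesh surface, shape (M, 3)
    """
    error_before = chamfer_distance(current_pc, gt_points)
    merged = np.concatenate([current_pc, candidate_pc], axis=0)
    error_after = chamfer_distance(merged, gt_points)
    return (error_before - error_after) / max(error_before, eps)


# Toy scene: a flat 5x5 grid stands in for points sampled from a GT mesh.
xs = np.linspace(0.0, 1.0, 5)
grid = np.array([[x, y, 0.0] for x in xs for y in xs])
left = grid[grid[:, 0] < 0.5]    # what has been observed so far
right = grid[grid[:, 0] >= 0.5]  # what a useful candidate view would add

rri_useful = oracle_rri(left, right, grid)
rri_redundant = oracle_rri(left, left, grid)
```

In this toy case the candidate that completes the grid removes all of the remaining error, while a candidate that only re-observes known points yields no improvement. Rendering candidate_pc for views that lie off the recorded trajectory is the open problem noted in the goal and is not addressed by this sketch.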

1.1.3 Phase 3: RRI Prediction Network with EVL or SceneScript Backbone

Goal:

  • Use a pre-trained SceneScript or EFM3D/EVL model to predict the RRI of candidate views from previous observations (partial point cloud, previous poses, and calibrated video frames).
  • Define training procedures and loss functions.

1.1.3.1 Architecture

  • Encoder: SceneScript encoder or EVL backbone (frozen or fine-tuned)
  • Candidate Encoder: View pose and frustum features, plus the projection of the partial point cloud into the candidate view, fed to a learnable encoder
  • Predictor: MLP regressing RRI score
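
The data flow through these three components can be sketched as a forward pass. This is an illustration only: the weights are random, nothing is trained, the scene embedding is a stand-in for the output of a frozen SceneScript or EVL backbone, and all dimensions, class names, and the 10-dimensional candidate feature vector are hypothetical. A real implementation would use a deep learning framework rather than NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_linear(in_dim, out_dim):
    """Random weights and zero bias for one dense layer (illustrative only)."""
    weights = rng.normal(0.0, in_dim ** -0.5, size=(in_dim, out_dim))
    return weights, np.zeros(out_dim)


def relu(x):
    return np.maximum(x, 0.0)


class RRIPredictor:
    """Candidate encoder followed by an MLP head that regresses one score."""

    def __init__(self, scene_dim=64, cand_dim=10, hidden=32):
        self.cand_enc = init_linear(cand_dim, hidden)         # learnable encoder
        self.fc1 = init_linear(scene_dim + hidden, hidden)    # MLP layer 1
        self.fc2 = init_linear(hidden, 1)                     # MLP output layer

    def __call__(self, scene_embedding, candidate_features):
        # scene_embedding    : (scene_dim,) from the (assumed frozen) backbone
        # candidate_features : (K, cand_dim), one row per candidate view
        w, b = self.cand_enc
        cand = relu(candidate_features @ w + b)               # (K, hidden)
        # Every candidate is scored against the same scene embedding.
        scene = np.broadcast_to(
            scene_embedding, (cand.shape[0], scene_embedding.shape[0])
        )
        x = np.concatenate([scene, cand], axis=1)             # (K, scene_dim + hidden)
        w, b = self.fc1
        x = relu(x @ w + b)
        w, b = self.fc2
        return (x @ w + b)[:, 0]                              # (K,) predicted RRI


model = RRIPredictor()
scene_embedding = rng.normal(size=64)        # placeholder backbone output
candidates = rng.normal(size=(5, 10))        # placeholder pose + frustum features
scores = model(scene_embedding, candidates)
best_view = int(np.argmax(scores))           # candidate with highest predicted RRI
```

Scoring all candidates in one batched call keeps the expensive backbone pass out of the per-candidate loop, which matters if the backbone stays frozen and only the candidate encoder and predictor are trained.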

1.2 Master Thesis Phase (30 ECTS, ~900 hours)

1.2.1 Phase 4: Entity-Aware RRI

1.2.2 Phase 5: View Synthesis Integration

1.2.3 Phase 6: Learning-Based NBV Prediction

1.2.4 Phase 7: Fine-Tune EFM for NBV on Target Platform

  • Project Aria Glasses: High-fidelity egocentric data
  • Meta Quest 3: Inside-out tracking, depth sensor
  • iPhone LiDAR: Portable, high-quality depth

1.2.5 Phase 8: Human-in-the-Loop System

Goal: Develop an interactive AR interface for real-time NBV guidance

1.2.5.1 Components

  1. Entity Selection UI
    • Tap entities in AR view
    • Set importance weights
  2. Real-Time NBV Computation
    • Streaming point cloud input
    • Incremental updates of the scene representation
    • Fast RRI and pose prediction
  3. View Guidance Overlay
    • AR arrows/markers showing optimal viewpoints
    • Distance and quality indicators
    • Real-time voice-to-voice feedback
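
One possible way the importance weights from the Entity Selection UI could feed into view selection is sketched below. It assumes, hypothetically, that the system can produce a predicted RRI per entity for each candidate view; the function name, the dictionary layout, and the weighted-average scoring rule are illustrative assumptions, not a design the roadmap commits to.

```python
def select_next_view(per_entity_rri, importance):
    """Pick the candidate view with the highest importance-weighted RRI.

    per_entity_rri : {view_id: {entity_name: predicted RRI}}  (assumed input)
    importance     : {entity_name: user-set weight}           (from the UI)
    """
    total_weight = sum(importance.values())
    if total_weight <= 0:
        raise ValueError("at least one entity needs a positive importance weight")

    def score(view_id):
        # Entities the user did not select contribute with weight 0.
        weighted = sum(
            importance.get(entity, 0.0) * rri
            for entity, rri in per_entity_rri[view_id].items()
        )
        return weighted / total_weight

    scores = {view_id: score(view_id) for view_id in per_entity_rri}
    best = max(scores, key=scores.get)
    return best, scores


# The user tapped two entities and marked the sofa as more important.
importance = {"sofa": 3.0, "table": 1.0}
predictions = {
    "view_a": {"sofa": 0.1, "table": 0.6},
    "view_b": {"sofa": 0.4, "table": 0.1},
}
best_view, view_scores = select_next_view(predictions, importance)
```

Here view_b is selected even though view_a improves the table more, because the sofa carries three times the weight. The resulting view id is what the guidance overlay would then point the user toward.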

1.2.6 Phase 9: Real-World Deployment & Evaluation (Months 7-8, ~200 hours)

Goal: Deploy on mobile devices and validate in real environments