1 VinOfflineDatasetStats

data_handling.VinOfflineDatasetStats(
    store_dir,
    version,
    num_samples,
    sampled_samples,
    split_counts,
    num_scenes,
    num_snippets,
    materialized_blocks,
    candidate_count,
    rri,
    vin_points,
    numeric_bytes,
    block_shapes=dict(),
    block_diagnostics=list(),
    sample_summaries=list(),
    candidate_count_values=list(),
    rri_values=list(),
    vin_point_values=list(),
    rri_component_values=dict(),
    rri_component_summaries=dict(),
    candidate_pose_values=dict(),
    candidate_pose_summaries=dict(),
    memory_diagnostics=list(),
    backbone_diagnostics=list(),
    batch_shapes=dict(),
)

Store-level diagnostics for an immutable VIN offline dataset.

1.1 Attributes

Name                      Description
store_dir                 Absolute store directory inspected for diagnostics.
version                   Offline dataset format version from manifest.json.
num_samples               Number of rows listed in sample_index.jsonl.
sampled_samples           Number of rows scanned for per-sample statistics.
split_counts              Sample counts keyed by split name.
num_scenes                Number of unique scene IDs in the sample index.
num_snippets              Number of unique (scene_id, snippet_id) pairs.
materialized_blocks       Optional block flags from the manifest.
candidate_count           Distribution of valid candidate counts per sample.
rri                       Distribution of finite oracle RRI values across sampled candidates.
vin_points                Distribution of VIN point lengths across sampled snippets.
numeric_bytes             Approximate bytes occupied by manifest-declared numeric shard blocks.
block_shapes              Stored numeric block shapes keyed by logical block name.
block_diagnostics         Manifest-declared block diagnostics for Streamlit and CLI tables.
sample_summaries          Per-row sanity summaries for sampled records.
candidate_count_values    Sampled candidate-count values used for histograms.
rri_values                Sampled finite oracle RRI values used for histograms.
vin_point_values          Sampled VIN point lengths used for histograms.
rri_component_values      Sampled finite RRI component values keyed by component name.
rri_component_summaries   Finite-value summaries for RRI component values.
candidate_pose_values     Candidate pose diagnostics in the reference-rig frame.
candidate_pose_summaries  Finite-value summaries for candidate pose diagnostics.
memory_diagnostics        Estimated per-sample runtime memory summaries.
backbone_diagnostics      Streaming statistics for sampled backbone numeric fields.
batch_shapes              Shape preview from one VinOracleBatch read path.
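
To make the index-level attributes concrete, the following is a minimal sketch of how counts such as num_samples, split_counts, num_scenes, and num_snippets could be derived from sample-index rows, and how non-finite values could be dropped before building rri_values. It is not the library's implementation: the helper names summarize_index and finite_values are hypothetical, and the "split" row key is an assumption (only scene_id and snippet_id appear in the attribute descriptions above).

```python
import json
import math
from collections import Counter


def summarize_index(lines):
    """Count rows, splits, scenes, and snippets from JSONL index lines.

    Hypothetical helper: assumes each row carries "scene_id", "snippet_id",
    and an assumed "split" key.
    """
    rows = [json.loads(line) for line in lines if line.strip()]
    split_counts = Counter(row["split"] for row in rows)
    scenes = {row["scene_id"] for row in rows}
    # A snippet is identified by the (scene_id, snippet_id) pair, not by
    # snippet_id alone, so the same snippet_id in two scenes counts twice.
    snippets = {(row["scene_id"], row["snippet_id"]) for row in rows}
    return {
        "num_samples": len(rows),
        "split_counts": dict(split_counts),
        "num_scenes": len(scenes),
        "num_snippets": len(snippets),
    }


def finite_values(values):
    """Keep only finite numbers, dropping NaN and +/-inf entries."""
    return [float(v) for v in values if math.isfinite(v)]


# Illustrative in-memory rows standing in for sample_index.jsonl content.
index_lines = [
    '{"scene_id": "s1", "snippet_id": "a", "split": "train"}',
    '{"scene_id": "s1", "snippet_id": "b", "split": "train"}',
    '{"scene_id": "s2", "snippet_id": "a", "split": "val"}',
]
summary = summarize_index(index_lines)
rri_values = finite_values([0.25, float("nan"), 0.5, float("inf")])
```

In this sketch, snippet "a" appears in both scenes and is counted as two snippets, which matches the (scene_id, snippet_id) definition of num_snippets.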