1 VinModelV3Config

vin.model_v3.VinModelV3Config()

Configuration for VinModelV3 (streamlined one-step VIN baseline).

1.1 Attributes

Name Description
target_type Factory target for BaseConfig.setup_target (config-as-factory).
backbone Frozen EVL backbone configuration that supplies voxel features.
pose_encoder Pose encoder configuration (R6D + LFF; stable relative pose encoding).
pos_grid_encoder_lff LFF encoder for XYZ voxel position keys used by global pooling.
head_hidden_dim Hidden dimension for the scorer MLP (Optuna sweeps favored compact heads).
head_num_layers Number of MLP layers before the CORAL layer (best trials used 1).
head_dropout Dropout probability in the MLP (the best sweep trial used near-zero dropout).
num_classes Number of ordinal bins (VIN-NBV uses 15 for sweep comparability).
coral_preinit_bias Pre-initialize CORAL biases for faster, more stable ordinal learning.
field_dim Channel dimension of the projected voxel field (compact by design).
field_gn_groups Requested GroupNorm groups (clamped to a divisor of field_dim) for stability.
semidense_proj_grid_size Grid size for semidense projection stats (higher for tiny-CNN cues).
semidense_proj_max_points Maximum number of semidense points used for projection stats (the best sweep trial used 4096).
semidense_cnn_enabled Whether to apply a tiny 2D CNN encoder to the semidense projection grid.
semidense_cnn_channels Hidden channel width for the semidense projection CNN.
semidense_cnn_out_dim Output feature dimension of the semidense projection CNN.
use_traj_encoder Whether to encode snippet trajectories and append trajectory context.
traj_encoder Optional trajectory encoder for snippet rig poses (R6D + LFF).
use_voxel_valid_frac_gate Deprecated: voxel gate removed. Keep False (use FiLM only).
semidense_obs_count_min Global minimum of semidense observation count n_obs (cache summary).
semidense_obs_count_max Global maximum of semidense observation count n_obs (cache summary).
semidense_obs_count_p95 Global 95th percentile of semidense observation count n_obs (cache summary).
semidense_obs_count_mean Global mean of semidense observation count n_obs (cache summary).
semidense_obs_count_std Global standard deviation of semidense observation count n_obs (cache summary).
semidense_inv_dist_std_min Global minimum of semidense inverse depth std 1/sigma_d (cache summary).
semidense_inv_dist_std_max Global maximum of semidense inverse depth std 1/sigma_d (cache summary).
semidense_inv_dist_std_p95 Global 95th percentile of semidense inverse depth std 1/sigma_d (cache summary).
semidense_inv_dist_std_mean Global mean of semidense inverse depth std 1/sigma_d (cache summary).
semidense_inv_dist_std_std Global standard deviation of semidense inverse depth std 1/sigma_d (cache summary).
apply_cw90_correction Undo rotate_yaw_cw90 on poses (requires CW90-corrected cameras).
global_pool_grid_size Target grid size for pose-conditioned global pooling (best trials used ~5).
scene_field_channels Ordered list of scene-field channels to include in the voxel field.
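
Two of the attribute descriptions above imply small computations: field_gn_groups is "clamped to a divisor of field_dim", and coral_preinit_bias pre-initializes the CORAL biases. The sketch below illustrates one plausible reading of each in plain Python. The helper names clamp_gn_groups and coral_preinit_biases are hypothetical and not part of vin.model_v3. The clamping rule (largest divisor not exceeding the request) and the descending bias schedule (a convention commonly used with CORAL layers) are assumptions; the actual implementation may differ.

```python
def clamp_gn_groups(requested: int, channels: int) -> int:
    """Return the largest divisor of `channels` that is <= `requested`.

    Assumption: "clamped to a divisor of field_dim" means stepping the
    requested group count down until it divides the channel count.
    """
    if channels < 1:
        raise ValueError("channels must be positive")
    # Keep the starting point inside [1, channels].
    groups = max(1, min(requested, channels))
    # 1 divides every positive integer, so this loop always terminates.
    while channels % groups != 0:
        groups -= 1
    return groups


def coral_preinit_biases(num_classes: int) -> list[float]:
    """Return K-1 descending bias values for a K-class CORAL layer.

    Assumption: the biases start evenly spaced from 1.0 down to 1/(K-1),
    so lower ordinal thresholds begin with larger logits than higher ones.
    """
    if num_classes < 2:
        raise ValueError("CORAL needs at least two ordinal bins")
    k = num_classes - 1  # one binary threshold between each pair of bins
    return [(k - i) / k for i in range(k)]


if __name__ == "__main__":
    print(clamp_gn_groups(8, 12))         # 6, since 8 and 7 do not divide 12
    print(len(coral_preinit_biases(15)))  # 14 thresholds for 15 bins
```

Under these assumptions, a requested group count that does not divide field_dim degrades gracefully instead of raising an error when the normalization layer is built, and with num_classes set to 15 the CORAL layer starts from 14 ordered thresholds rather than all-zero biases.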