1 VinModelV3Config
vin.model_v3.VinModelV3Config()

Configuration for VinModelV3 (streamlined one-step VIN baseline).
1.1 Attributes
| Name | Description |
|---|---|
| target_type | Factory target for BaseConfig.setup_target (config-as-factory). |
| backbone | Frozen EVL backbone configuration that supplies voxel features. |
| pose_encoder | Pose encoder configuration (R6D + LFF; stable relative pose encoding). |
| pos_grid_encoder_lff | LFF encoder for XYZ voxel position keys used by global pooling. |
| head_hidden_dim | Hidden dimension for the scorer MLP (Optuna sweeps favored compact heads). |
| head_num_layers | Number of MLP layers before the CORAL layer (best trials used 1). |
| head_dropout | Dropout probability in the MLP (sweep best used near-zero dropout). |
| num_classes | Number of ordinal bins (VIN-NBV uses 15 for sweep comparability). |
| coral_preinit_bias | Pre-initialize CORAL biases for faster, more stable ordinal learning. |
| field_dim | Channel dimension of the projected voxel field (compact by design). |
| field_gn_groups | Requested GroupNorm groups (clamped to a divisor of field_dim) for stability. |
| semidense_proj_grid_size | Grid size for semidense projection stats (higher for tiny-CNN cues). |
| semidense_proj_max_points | Maximum semidense points used for projection stats (sweep best used 4096). |
| semidense_cnn_enabled | Whether to encode a tiny 2D CNN over the semidense projection grid. |
| semidense_cnn_channels | Hidden channel width for the semidense projection CNN. |
| semidense_cnn_out_dim | Output feature dimension of the semidense projection CNN. |
| use_traj_encoder | Whether to encode snippet trajectories and append trajectory context. |
| traj_encoder | Optional trajectory encoder for snippet rig poses (R6D + LFF). |
| use_voxel_valid_frac_gate | Deprecated: the voxel valid-fraction gate has been removed. Keep this set to False (FiLM only). |
| semidense_obs_count_min | Global minimum of semidense observation count n_obs (cache summary). |
| semidense_obs_count_max | Global maximum of semidense observation count n_obs (cache summary). |
| semidense_obs_count_p95 | Global 95th percentile of semidense observation count n_obs (cache summary). |
| semidense_obs_count_mean | Global mean of semidense observation count n_obs (cache summary). |
| semidense_obs_count_std | Global standard deviation of semidense observation count n_obs (cache summary). |
| semidense_inv_dist_std_min | Global minimum of semidense inverse depth std 1/sigma_d (cache summary). |
| semidense_inv_dist_std_max | Global maximum of semidense inverse depth std 1/sigma_d (cache summary). |
| semidense_inv_dist_std_p95 | Global 95th percentile of semidense inverse depth std 1/sigma_d (cache summary). |
| semidense_inv_dist_std_mean | Global mean of semidense inverse depth std 1/sigma_d (cache summary). |
| semidense_inv_dist_std_std | Global standard deviation of semidense inverse depth std 1/sigma_d (cache summary). |
| apply_cw90_correction | Undo rotate_yaw_cw90 on poses (requires CW90-corrected cameras). |
| global_pool_grid_size | Target grid size for pose-conditioned global pooling (best trials used ~5). |
| scene_field_channels | Ordered list of scene-field channels to include in the voxel field. |
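
The field_gn_groups entry says the requested group count is clamped to a divisor of field_dim. The source does not state the exact rule, so the following is only a minimal sketch of one plausible reading: pick the largest divisor of the channel count that does not exceed the request. The helper name `clamp_gn_groups` is hypothetical and is not part of `vin.model_v3`.

```python
def clamp_gn_groups(requested: int, channels: int) -> int:
    """Return the largest divisor of `channels` that is <= `requested`.

    Assumption: this mirrors the "clamped to a divisor of field_dim"
    behaviour described for field_gn_groups; the real rule may differ.
    """
    if channels < 1:
        raise ValueError("channels must be a positive integer")
    # Keep the request inside the valid range [1, channels].
    groups = max(1, min(requested, channels))
    # Walk down until the group count divides the channel count evenly.
    # The loop always terminates because 1 divides every integer.
    while channels % groups != 0:
        groups -= 1
    return groups


if __name__ == "__main__":
    print(clamp_gn_groups(8, 16))  # 8 already divides 16
    print(clamp_gn_groups(8, 12))  # falls back to 6
    print(clamp_gn_groups(5, 16))  # falls back to 4
```

Clamping downward rather than raising an error lets a hyperparameter sweep vary field_dim freely without every combination needing a matching field_gn_groups value.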