Sensor Fusion Approaches
Early Fusion
Fuses raw data from multiple sources (camera, LiDAR, GPS/IMU) and then performs object detection on the combined data.
Late Fusion
First detects objects independently in each sensor stream, then fuses the detection results.
Modified Late Fusion (This Implementation)
This notebook uses a hybrid approach:
- Detect objects in 2D camera images using YOLOv5
- Associate detected object centers with LiDAR point cloud data to obtain depth
- Use GPS/IMU data to determine world coordinates
Note: While labeled as "Early Fusion" in the title, this is fundamentally a late fusion approach because detection happens first on camera data, then detections are enriched with LiDAR depth information.
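A minimal sketch of this hybrid flow, assuming the LiDAR points have already been projected into image space (see the projection sketch under the pipeline steps below); `detect_and_add_depth` and the nearest-point depth lookup are illustrative helpers, not the notebook's exact code:

```python
import numpy as np
import torch

# Pretrained YOLOv5 model via the public ultralytics/yolov5 hub entry point.
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

def detect_and_add_depth(image, lidar_uv, lidar_depth):
    """Run 2D detection, then attach depth from the nearest projected LiDAR point.

    lidar_uv: (N, 2) pixel coordinates of projected LiDAR points.
    lidar_depth: (N,) depth (camera-frame z) for each projected point.
    """
    results = model(image)
    detections = results.xyxy[0].cpu().numpy()  # rows: x1, y1, x2, y2, conf, class
    enriched = []
    for x1, y1, x2, y2, conf, cls in detections:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2             # bounding-box center
        d2 = np.sum((lidar_uv - np.array([cx, cy])) ** 2, axis=1)
        depth = lidar_depth[np.argmin(d2)]                # depth of nearest projected point
        enriched.append((cx, cy, depth, conf, int(cls)))
    return enriched
```

Taking the median depth of all points falling inside the box is a common, more robust alternative to the single nearest point.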
Sensor Details
- RGB Camera: 15 Hz for visual data capture
- LiDAR: Velodyne LiDAR at 10 Hz for 3D point cloud generation
- GPS/IMU: OXTS navigation system at 100 Hz for positioning
Fusion Pipeline
The pipeline combines the sensor streams for robust object detection and tracking in four stages (a projection sketch follows the list):
- Calibrate sensors to establish coordinate-system relationships (Calibration)
- Detect objects in camera images (Detection)
- Project the 3D LiDAR point cloud into 2D image space (Fusion)
- Associate LiDAR depth with each detected object (Depth Association)
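The fusion stage hinges on the calibration matrices established in the first step. Below is a minimal sketch of the LiDAR-to-image projection, assuming KITTI-style calibration (a 3x4 `Tr_velo_to_cam` extrinsic matrix, a 3x3 `R0_rect` rectification matrix, and a 3x4 `P2` camera projection matrix); the function name is illustrative, not the notebook's exact code:

```python
import numpy as np

def project_lidar_to_image(points, Tr_velo_to_cam, R0_rect, P2):
    """Project (N, 3) LiDAR points into 2D pixel coordinates plus depth."""
    n = points.shape[0]
    pts_h = np.hstack([points, np.ones((n, 1))])             # homogeneous LiDAR points (N, 4)
    cam = Tr_velo_to_cam @ pts_h.T                           # LiDAR frame -> camera frame (3, N)
    cam = R0_rect @ cam                                      # apply rectifying rotation
    cam = cam[:, cam[2, :] > 0]                              # drop points behind the camera
    img = P2 @ np.vstack([cam, np.ones((1, cam.shape[1]))])  # project to image plane (3, N)
    uv = (img[:2, :] / img[2, :]).T                          # perspective divide -> pixels (N, 2)
    return uv, cam[2, :]                                     # pixel coords and per-point depth (z)
```

In practice, points projecting outside the image bounds would also be filtered out; the sketch omits that check for brevity.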
Why 3D Detection? Detecting objects in 3D space is crucial for autonomous vehicles: it gives the precise physical location of each object in the world, enabling better path planning and collision avoidance.