Real-Time Multi-Camera Soccer Tracking System

I helped develop a real-time, multi-camera system for tracking soccer players and the ball. Unlike competing products, our approach was entirely visual: no GPS trackers and no player- or ball-mounted sensors. The foundation of the system was multi-view stereo computer vision. In production, the full pipeline ran at 60 frames per second, and TV broadcasters used our data stream to annotate the video feed in real time.

Many talented consultants, in-house researchers, and engineers worked together to make this technology a reality. My contributions were camera calibration and a triangulation engine that localized players and the ball in 3D space from 2D object detections.

I created a process for calibrating the fixed cameras installed in soccer stadiums using structure from motion (SfM). This process required special attention because, unlike in typical SfM scenarios, the camera positions were non-negotiable and the interior of a stadium is highly symmetrical. Tools like COLMAP are prone to misregistering images in these situations unless careful guardrails are established. With this process we were able to create highly accurate reconstructions.

I built an engine that used the camera calibration to triangulate 3D object positions from sets of per-camera 2D object detections. Its design was based on the RANSAC algorithm, bolstered with epipolar geometry to improve both accuracy and performance. The engine achieved sub-pixel reprojection error and ran at more than 120 Hz, triangulating the positions of all 22 players plus the ball from 20+ 4K camera views. The results were so accurate that we fed them back into our camera calibration procedure to refine camera poses and model the optical distortion of the lenses.
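
To give a feel for the idea, here is a minimal Python/NumPy sketch of RANSAC-based multi-view triangulation. It is not the production engine: the function names, the 2-pixel inlier threshold, and the iteration count are all illustrative assumptions, and it leaves out the epipolar-geometry checks and lens distortion handling mentioned above. Each hypothesis is a linear (DLT) triangulation from two randomly chosen views; the hypothesis with the most reprojection inliers wins, and the point is then re-estimated from all of its inlier views.

```python
import numpy as np


def triangulate_dlt(projections, points_2d):
    """Linear (DLT) triangulation of one 3D point from two or more views.

    projections: sequence of 3x4 camera projection matrices.
    points_2d:   matching sequence of (u, v) pixel observations.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]
    return X[:3] / X[3]


def reprojection_errors(projections, points_2d, X):
    """Pixel distance between each observation and the projection of X."""
    Xh = np.append(X, 1.0)
    errors = []
    for P, uv in zip(projections, points_2d):
        x = P @ Xh
        errors.append(np.linalg.norm(x[:2] / x[2] - np.asarray(uv)))
    return np.array(errors)


def ransac_triangulate(projections, points_2d, threshold_px=2.0,
                       iterations=100, seed=0):
    """Robustly triangulate one object from per-camera detections.

    Returns (X, inlier_mask); X is None if fewer than two views agree.
    threshold_px, iterations, and seed are illustrative defaults.
    """
    rng = np.random.default_rng(seed)
    n = len(projections)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(iterations):
        # Minimal sample: two views are enough to hypothesize a 3D point.
        i, j = rng.choice(n, size=2, replace=False)
        X = triangulate_dlt([projections[i], projections[j]],
                            [points_2d[i], points_2d[j]])
        inliers = reprojection_errors(projections, points_2d, X) < threshold_px
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 2:
        return None, best_inliers
    # Refit using every view that agreed with the best hypothesis.
    idx = np.flatnonzero(best_inliers)
    X = triangulate_dlt([projections[k] for k in idx],
                        [points_2d[k] for k in idx])
    return X, best_inliers
```

In this sketch a false detection in one camera simply ends up outside the inlier mask, so it does not pull the final 3D estimate. A real-time version would likely also use epipolar constraints to discard incompatible detection pairs before triangulating them at all.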