A Cheat Sheet For Multi-Object Tracking

Renu Khandelwal
8 min read · May 12, 2022


Multiple Object Tracking (MOT)

MOT takes a single continuous video, splits it into discrete frames at a specific frame rate (fps), and outputs:

  • Detection: what objects are present in each frame
  • Localization: where objects are in each frame
  • Association: whether detections in different frames belong to the same object or to different objects
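
To make the association step concrete, here is a minimal sketch that links bounding boxes across frames by greedily matching the nearest box centers. It is an illustration only, not a production tracker: the names `centroid` and `associate`, the `(x1, y1, x2, y2)` box format, and the `max_dist` threshold are all assumptions made for this example, and the detections are hard-coded rather than produced by a real detector.

```python
from itertools import count
from math import hypot


def centroid(box):
    """Center point of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def associate(tracks, boxes, next_id, max_dist=50.0):
    """Greedily match previous tracks to the current frame's boxes.

    tracks:   dict mapping track ID -> box from the previous frame
    boxes:    list of boxes detected in the current frame
    next_id:  iterator that yields fresh track IDs
    max_dist: hypothetical threshold; farther pairs are never matched
    """
    # Every (distance, track ID, box index) pair, closest first.
    pairs = sorted(
        (
            hypot(
                centroid(prev)[0] - centroid(box)[0],
                centroid(prev)[1] - centroid(box)[1],
            ),
            track_id,
            j,
        )
        for track_id, prev in tracks.items()
        for j, box in enumerate(boxes)
    )

    updated, used = {}, set()
    for dist, track_id, j in pairs:
        if dist > max_dist:
            break  # all remaining pairs are even farther apart
        if track_id in updated or j in used:
            continue  # this track or this box is already matched
        updated[track_id] = boxes[j]
        used.add(j)

    # Unmatched boxes start new tracks. Unmatched tracks are simply
    # dropped in this sketch; real trackers usually keep them for a while.
    for j, box in enumerate(boxes):
        if j not in used:
            updated[next(next_id)] = box
    return updated


# Three frames of made-up detections; box order changes between frames.
frames = [
    [(10, 10, 30, 30), (100, 100, 140, 140)],
    [(104, 102, 144, 142), (14, 12, 34, 32)],
    [(18, 14, 38, 34), (300, 50, 330, 80)],
]
ids = count(1)
tracks = {}
for frame_index, boxes in enumerate(frames):
    tracks = associate(tracks, boxes, ids)
    print(frame_index, tracks)
```

In the second frame the two objects keep their IDs even though the detector lists them in a different order. In the third frame one object has no detection and loses its track, while a new object far from every existing track receives a fresh ID.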

Typical Applications of MOT

Multi-object tracking (MOT) has applications in:

  • Video surveillance for traffic control and digital forensics
  • Gesture recognition
  • Robotics
  • Augmented Reality
  • Self-driving vehicles

Challenges with MOT

  • Object detection accuracy: the objects of interest must be detected in each frame with high confidence. Detection can fail by missing an object of interest, assigning the wrong class label to a detected object, or localizing a detected object incorrectly.
  • ID switching: occurs when two similar objects overlap or blend and their identities are swapped, which makes it difficult to keep track of each object's ID.
  • Background distortion: a busy background makes small objects difficult to detect.
  • Occlusion: occurs when an object of interest is partly or fully hidden by another object.
  • Multiple spatial spaces, deformation, or object rotation
  • Image illumination changes
  • Motion blur: visual streaking or smearing captured by the camera
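
Overlap between two objects, which underlies both ID switching and occlusion, is commonly quantified as intersection over union (IoU) of their bounding boxes. The sketch below is one way to compute it, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates. The function name `iou` and that box format are choices made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the part counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 10, 10), (5, 0, 15, 10)))    # half-overlapping boxes
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))  # disjoint boxes
```

A value near 1 means the boxes almost coincide, which is the situation where identities are most easily swapped. A value of 0 means the boxes do not overlap at all.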

Characteristics of a Multi-object tracker (MOT)

A good multi-object tracker (MOT):

  1. Tracks objects by identifying the correct number of trackers at precise locations in each frame.
  2. Identifies objects by tracking each individual object consistently over a long period.
  3. Tracks objects despite occlusion, illumination changes, busy backgrounds, motion blur, etc.
  4. Detects and tracks objects fast.

Popular MOT Algorithms

Centroid based Object



