YOLOX: New Improved YOLO
An Advanced, Anchor-free, High-performing YOLO Detector
YOLOX is a high-performing object detector that improves on the existing YOLO series. The YOLO series continually explores techniques that give the best speed-accuracy trade-off for real-time object detection.
Key features of the YOLOX object detector
- An anchor-free design significantly reduces the number of design parameters
- A decoupled head that separates classification from box regression (localization) improves convergence speed
- SimOTA, an advanced label assignment strategy, reduces training time and avoids additional solver hyperparameters
- Strong data augmentations such as MixUp and Mosaic boost YOLOX performance
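To make the first bullet concrete, the sketch below counts how many boxes a detector predicts per image with and without anchors. The 640 × 640 input, the strides of 8, 16, and 32, and the three anchors per location are assumptions based on common YOLOv3-style settings, not values stated above. The function name `count_predictions` is hypothetical.

```python
def count_predictions(image_size, strides, anchors_per_location):
    """Count predicted boxes over all feature-map levels."""
    total = 0
    for stride in strides:
        cells = (image_size // stride) ** 2  # grid cells at this level
        total += cells * anchors_per_location
    return total


# Assumed settings: 640x640 input and three feature levels.
anchor_based = count_predictions(640, (8, 16, 32), anchors_per_location=3)
anchor_free = count_predictions(640, (8, 16, 32), anchors_per_location=1)
print(anchor_based, anchor_free)  # 25200 8400
```

Under these assumptions, an anchor-free head predicts one box per grid cell, so there are no anchor sizes or aspect ratios left to tune.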
Speed-accuracy trade-off of different object detector models and Size-accuracy curve of lite object detector models

YOLOX Architecture
Baseline YOLOv3 Architecture
YOLOX uses YOLOv3 with a Darknet53 backbone and a Spatial Pyramid Pooling (SPP) layer as its baseline. The SPP layer applies max-pooling at several kernel sizes to the same feature map produced by the convolutional layers and combines the results, which strengthens feature extraction.
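The idea behind an SPP layer can be sketched in plain Python on a single-channel feature map. This is a minimal illustration, not the YOLOX implementation: the kernel sizes 5, 9, and 13 are an assumption taken from common YOLOv3-SPP configurations, the helper names are hypothetical, and a real layer would operate on multi-channel tensors and concatenate along the channel axis.

```python
def max_pool_same(grid, kernel):
    """Stride-1 max pooling that keeps the input height and width.

    Border windows are clipped to the grid, which is equivalent to
    padding the input with negative infinity.
    """
    height, width = len(grid), len(grid[0])
    radius = kernel // 2
    pooled = []
    for row in range(height):
        out_row = []
        for col in range(width):
            window = [
                grid[r][c]
                for r in range(max(0, row - radius), min(height, row + radius + 1))
                for c in range(max(0, col - radius), min(width, col + radius + 1))
            ]
            out_row.append(max(window))
        pooled.append(out_row)
    return pooled


def spp(grid, kernels=(5, 9, 13)):
    """Return the input plus one pooled copy per kernel, stacked as channels."""
    return [grid] + [max_pool_same(grid, k) for k in kernels]


# Toy 7x7 feature map with one strong activation in the centre.
feature = [[0] * 7 for _ in range(7)]
feature[3][3] = 1
channels = spp(feature)
print(len(channels))  # 4 channels: the input and three pooled maps
```

Because every pooled copy keeps the spatial size of the input, the outputs can be stacked directly. Larger kernels spread the strong activation over a wider neighbourhood, so each location sees context at several scales.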
Compared with the original YOLOv3 training setup, the YOLOX baseline adds Exponential Moving Average (EMA) weight updates, a cosine learning-rate schedule, an IoU loss, and an IoU-aware branch.
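Two of these training changes, the EMA weight update and the cosine learning-rate schedule, are easy to sketch in plain Python. The example below is a toy illustration under assumed values: the base learning rate, the decay of 0.9, and the single scalar weight are all hypothetical, and real training typically uses a decay much closer to 1.

```python
import math


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine decay from base_lr at step 0 to min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def ema_update(ema_weights, weights, decay):
    """Blend the current weights into the running average, in place."""
    for name, value in weights.items():
        ema_weights[name] = decay * ema_weights[name] + (1.0 - decay) * value


# Toy run: one scalar "weight" trained toward 1.0 with plain gradient steps.
weights = {"w": 0.0}
ema_weights = dict(weights)
total_steps = 100
for step in range(total_steps):
    lr = cosine_lr(step, total_steps, base_lr=0.1)
    grad = weights["w"] - 1.0  # gradient of 0.5 * (w - 1)^2
    weights["w"] -= lr * grad
    ema_update(ema_weights, weights, decay=0.9)

print(weights["w"], ema_weights["w"])
```

The EMA copy trails the raw weight and changes more smoothly, which is why the averaged weights are usually the ones kept for evaluation.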
Decoupled Head
YOLOX uses a decoupled head for classification and regression, and the IoU branch is added to the regression branch.
For each level of FPN features, a 1 × 1 convolutional layer first reduces the feature channels to 256. Two parallel branches, each with two 3 × 3 convolutional layers, then handle the classification and regression tasks separately. The decoupled head converges significantly faster than the coupled head used in earlier YOLO models.
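The layout of the head can be traced as a list of tensor shapes. The sketch below is a hypothetical illustration rather than the YOLOX code. It assumes that every convolution preserves height and width, and that the branches end in outputs with `num_classes`, 4, and 1 channels for the class scores, box coordinates, and IoU score. Those output sizes are not stated in the text above.

```python
def decoupled_head_shapes(in_channels, height, width, num_classes, hidden=256):
    """List (layer, output shape) pairs for one FPN level, shapes as (C, H, W).

    All convolutions are assumed to preserve height and width.
    """
    size = (height, width)
    return [
        ("input", (in_channels, *size)),
        ("stem 1x1 conv", (hidden, *size)),
        # Classification branch: two 3x3 convs, then the class prediction.
        ("cls 3x3 conv 1", (hidden, *size)),
        ("cls 3x3 conv 2", (hidden, *size)),
        ("cls output", (num_classes, *size)),
        # Regression branch: two 3x3 convs shared by the box and IoU outputs.
        ("reg 3x3 conv 1", (hidden, *size)),
        ("reg 3x3 conv 2", (hidden, *size)),
        ("box output", (4, *size)),
        ("iou output", (1, *size)),
    ]


# Hypothetical FPN level: 512 channels on a 20x20 grid, 80 classes.
for layer, shape in decoupled_head_shapes(512, 20, 20, num_classes=80):
    print(f"{layer:<16} {shape}")
```

Both branches read from the same 256-channel stem output, so classification and regression share early features but learn separate convolutional weights afterwards.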