Training-free framework that converts SAM3 into a real-time multi-class open-vocabulary detector. Achieves 55.8 AP on COCO val2017 (80 classes) and runs at 15.8 FPS (4 classes, 1008 px input) on a single RTX 4080.
HOI-DETR is a transformer-based framework for detecting hands, hand-held objects, and their interactions in images and video. Built on the Co-DETR architecture, it adds a lightweight interaction ...
Abstract: Traditional real-time object detection networks deployed in autonomous aerial vehicles (AAVs) struggle to extract features from small objects in complex backgrounds with occlusions and ...
This important work introduces an integrated open-source platform for behavioral acquisition and pose estimation that substantially improves the accessibility and speed of real-time animal tracking ...
Abstract: Tiny-object detection is increasingly crucial in fields such as remote sensing, traffic monitoring, and robotics. Inspired by human visual perception, the attention mechanism has become a ...