title: [Paper Note] Learning from Semantic Dictionaries: Discriminative Codebook Contrastive Learning for Unified Visual ...
HOI-DETR is a transformer-based framework for detecting hands, hand-held objects, and their interactions in images and video. Built on the Co-DETR architecture, it adds a lightweight interaction ...