Abstract: The detection of traffic objects in aerial scenes holds significant application potential in both military and civilian sectors. However, current aerial traffic object detection techniques ...
Abstract: Object pose estimation is central to how robots understand and interact with their environment. For this task, monocular category-level methods are attractive because they require only a ...
A comprehensive repository for fine-tuning the Donut model for document image classification and parsing tasks. This project provides optimized training pipelines using Hugging Face Transformers with ...