LLaVA-3D can perform both 2D and 3D vision-language tasks. The left block (b) shows that, compared with previous 3D LMMs, our LLaVA-3D achieves state-of-the-art performance across a wide range of 3D ...
Abstract: Augmented Reality (AR) integration via handheld devices is a subject of significant interest for mobile applications and interactions between humans and machines. The mobile AR technique ...