Abstract: Foundation models like the Segment Anything Model (SAM) have significantly advanced promptable image segmentation in computer vision. However, extending these capabilities to videos presents ...