ObjectPose9D Dataset
A newly constructed dataset that aggregates images with 9D pose annotations from diverse sources.
A method for accurate and flexible multi-object 9-DoF pose manipulation in image generation.
A benchmark dataset designed to test the visual reasoning capabilities of models in mathematical contexts.
A widely used image classification dataset consisting of 60,000 32×32 color images in 10 classes.
A smaller version of the ImageNet dataset with 200 classes and 500 training images per class, used for image classification tasks.
CLIP is a model that learns visual concepts from natural language descriptions, enabling zero-shot transfer to various vision tasks.