Poster
Test-Time Visual In-Context Tuning
Jiahao Xie · Alessio Tonioni · Nathalie Rauschmayr · Federico Tombari · Bernt Schiele
Visual in-context learning (ICL), a new paradigm in computer vision, allows a model to rapidly adapt to various tasks with only a handful of prompts and examples. While effective, the existing visual ICL paradigm generalizes poorly under distribution shifts. In this work, we propose test-time Visual In-Context Tuning (VICT), a method that adapts visual ICL models on the fly with a single test sample. Specifically, we flip the roles of the task prompts and the test sample and use a cycle-consistency loss to reconstruct the original task prompt output. Our key insight is that a model should be aware of a new test distribution if it can successfully recover the original task prompts. Extensive experiments on six representative vision tasks with 15 corruptions demonstrate that VICT improves the generalizability of visual ICL to unseen new domains. Code will be released to facilitate future research.
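The role-flipping and cycle-consistency idea described in the abstract can be sketched in a hedged toy form. The snippet below is not the authors' implementation: a scalar linear map stands in for the visual ICL model (names such as `icl_model` and `vict_step` are invented here), so that the cycle update, which predicts an output for the test sample, then uses that (input, prediction) pair as the prompt to reconstruct the original prompt output, is visible end to end.

```python
# Toy sketch of test-time Visual In-Context Tuning (VICT).
# Assumption: a scalar linear "model" replaces the real visual ICL model
# so the cycle-consistency update can be written with a manual gradient.

def icl_model(w, prompt_in, prompt_out, query_in):
    """Toy stand-in for a visual ICL model: maps query_in to a prediction.
    A real model would also condition on the (prompt_in, prompt_out) pair;
    this toy ignores the prompt for simplicity."""
    return w * query_in

def vict_step(w, prompt_in, prompt_out, test_in, lr=0.05):
    """One VICT adaptation step on a single test sample.
    1) Predict an output for the test sample.
    2) Flip roles: treat (test_in, prediction) as the prompt and ask the
       model to reconstruct the original prompt output from prompt_in.
    3) Take a gradient step on the squared cycle-consistency error."""
    test_pred = icl_model(w, prompt_in, prompt_out, test_in)
    prompt_rec = icl_model(w, test_in, test_pred, prompt_in)
    # gradient of (prompt_rec - prompt_out)^2 w.r.t. w for the toy model
    grad = 2.0 * (prompt_rec - prompt_out) * prompt_in
    return w - lr * grad

# Usage: adapt on one test sample, then predict with the tuned weights.
w = 0.5
prompt_in, prompt_out, test_in = 2.0, 4.0, 3.0  # underlying mapping: x -> 2x
for _ in range(200):
    w = vict_step(w, prompt_in, prompt_out, test_in)
print(round(w * test_in, 2))  # adapted prediction for the test sample -> 6.0
```

The key property the toy preserves is that adaptation needs no test-time labels: the supervision signal comes entirely from reconstructing the already-known prompt output.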