

Poster

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Wei-Jin Huang · Yuan-Ming Li · Zhi-Wei Xia · Yu-Ming Tang · Kun-Yu Lin · Jian-Fang Hu · Wei-Shi Zheng


Abstract:

Error detection in procedural activities is essential for consistent and correct outcomes in AR-assisted and robotic systems. Some existing methods can only detect errors in action labels, while others detect errors only by comparing the actual action with static prototypes. Prototype-based methods overlook situations where more than one action is valid following a sequence of executed actions. This leads to two issues. First, static prototypes cannot detect errors effectively when the inference environment or the distribution of action execution differs from training. Second, the model may use the wrong prototypes to detect errors if the ongoing action's label differs from the predicted one. To address this problem, we propose an Adaptive Multiple Normal Action Representation (AMNAR) framework. AMNAR predicts all valid next actions and reconstructs their corresponding normal action representations, which are compared against the ongoing action to detect errors. Extensive experiments demonstrate that AMNAR achieves state-of-the-art performance, highlighting its effectiveness and the importance of modeling multiple valid next actions in error detection.
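
The comparison step described in the abstract can be illustrated with a minimal sketch. This is not the AMNAR implementation: the abstract does not specify the distance measure, the decision rule, or how representations are reconstructed. The cosine distance, the `threshold` value, the function names, and the hand-written toy vectors below are all assumptions made for illustration. The sketch shows only the core idea of checking the ongoing action against every valid next action rather than against a single prototype.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def detect_error(ongoing, normal_reps, threshold=0.5):
    """Flag the ongoing action as an error if it is far from every
    normal representation of the valid next actions.

    ongoing      -- feature vector of the ongoing action
    normal_reps  -- dict mapping each valid next action (hypothetical
                    names) to its normal representation
    threshold    -- hypothetical cut-off on the distance
    Returns (is_error, closest_action, distance_to_closest).
    """
    if not normal_reps:
        raise ValueError("at least one valid next action is required")
    closest, dist = min(
        ((name, cosine_distance(ongoing, rep)) for name, rep in normal_reps.items()),
        key=lambda pair: pair[1],
    )
    return dist > threshold, closest, dist


# Toy example: two actions are both valid after the executed sequence.
valid_next = {
    "add coffee": [1.0, 0.0, 0.0],
    "stir": [0.0, 1.0, 0.0],
}
normal_case = detect_error([0.9, 0.1, 0.0], valid_next)
error_case = detect_error([0.0, 0.0, 1.0], valid_next)
```

In this toy setup, an ongoing action close to either valid representation is accepted, while one that is far from all of them is flagged. Taking the minimum over all valid next actions is what distinguishes this from a comparison against one static prototype.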
