Explaining agent behavior through intentional sequences
University of Wisconsin - Whitewater
One of the main goals of this research, and of Explainable Artificial Intelligence (XAI) more broadly, is to improve trust between humans and AI. Local XAI approaches address trust through explanations of individual decisions but fail to provide an overall perspective on an agent’s policy in sequential decision-making settings. Global surrogate methods provide an overall view of agent behavior but explain decisions only in terms of the current features of the environment. These drawbacks are compounded by the very large state spaces common in such settings, which can make an agent’s policy incomprehensible; placing trust in these artificially intelligent agents is therefore difficult, because humans are unlikely to understand the agent’s aptitude at a task. We propose a policy summarization technique that extracts intentional sequences of behavior using an information-theoretic approach. These summaries capture the essence of an agent’s intentions through high-belief and high-desire situations that demonstrate meaningful short- and long-term trade-offs made within the agent’s policy as it achieves its goals. Results from the pilot study indicate that the proposed intentionality measure correlates weakly with informativeness about the agent’s decision-making process, that the perceived usefulness of the summaries is positive compared to random sequences (which are generally not useful), and that the cognitive effort required to understand the summaries is neutral compared to random sequences (which generally require more effort). Preliminary results indicate that the proposed approach falls short of enabling human users to appropriately trust an agent’s policy; however, it is difficult to draw any final conclusions until a larger human evaluation is conducted.