Explaining agent behavior through intentional sequences
Date
2021-12
Author
Reilly, Evan
Publisher
University of Wisconsin - Whitewater
Advisor(s)
Nguyen, Hien
Ganguly, Arnab
Mukherjee, Lopamudra
Abstract
One of the main goals of this research, and of Explainable Artificial Intelligence (XAI) more broadly, is to improve trust between humans and AI. Local XAI approaches address trust through explanations of individual decisions but fail to provide an overall perspective of an agent’s policy in sequential decision-making settings. Global surrogate methods provide an overall view of agent behavior but explain decisions only in terms of the current features of the environment. These drawbacks are compounded by the very large state spaces common in such settings, which render an agent’s policy incomprehensible; placing trust in these artificially intelligent agents is therefore difficult, because humans are unlikely to understand the agent’s aptitude at a task. We propose a policy summarization technique that extracts intentional sequences of behavior using an information-theoretic approach. These summaries capture the essence of an agent’s intentions through high-belief and high-desire situations that demonstrate meaningful short- and long-term trade-offs made within the agent’s policy as it pursues its goals. Results from the pilot study indicate that the proposed intentionality measure correlates weakly with informativeness about the agent’s decision-making process; that the perceived usefulness of the summaries is positive, whereas random sequences are generally not useful; and that the cognitive effort required to understand the summaries is neutral, whereas random sequences generally require more effort. Preliminary results indicate the proposed approach falls short of enabling human users to appropriately trust an agent’s policy; however, it is difficult to draw final conclusions until a larger human evaluation is conducted.
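The abstract does not define the intentionality measure itself, so the following is only an illustrative sketch of how "high-belief, high-desire" states might be scored and windowed into summary sequences. Both proxies are assumptions introduced here, not the thesis's measure: "desire" is proxied by the advantage of the best action over the mean action value, and "belief" (confidence) by the information gap between maximum entropy and the entropy of a softmax policy over the Q-values.

```python
import numpy as np

def intentionality_scores(q_values, temperature=1.0):
    """Score each state on a trajectory by how 'intentional' the choice looks.

    q_values: (n_states, n_actions) array of action values.
    Desire proxy: advantage of the best action over the mean action value.
    Belief proxy: max entropy minus softmax-policy entropy (confidence).
    Both proxies are illustrative assumptions, not the thesis's measure.
    """
    q = np.asarray(q_values, dtype=float)
    desire = q.max(axis=1) - q.mean(axis=1)       # how much the best action stands out
    logits = q / temperature
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability for softmax
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)
    belief = np.log(q.shape[1]) - entropy         # information gained over uniform
    return desire * belief

def top_sequences(scores, window=3, k=2):
    """Start indices of the k highest-scoring non-overlapping windows."""
    sums = np.convolve(scores, np.ones(window), mode="valid")
    chosen = []
    for i in np.argsort(sums)[::-1]:
        if all(abs(int(i) - j) >= window for j in chosen):
            chosen.append(int(i))
        if len(chosen) == k:
            break
    return chosen
```

States where one action clearly dominates (high desire, low policy entropy) score highest, and the windowing step surfaces the contiguous stretch of such decisions as a candidate summary sequence.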
Subject
Artificial intelligence.
Intentionality (Philosophy)
Trust.
Reinforcement learning.
Permanent Link
http://digital.library.wisc.edu/1793/82594
Description
This file was last viewed in Adobe Acrobat Pro.