Modelling Agent Policies with Interpretable Imitation Learning

EasyChair Preprint 2959

6 pages • Date: March 14, 2020

Tom Bewley, Jonathan Lawry and Arthur Richards

Abstract

As we deploy autonomous agents in safety-critical domains, it becomes important to develop an understanding of their internal mechanisms and representations. We outline an approach to imitation learning for reverse-engineering black-box agent policies in MDP environments, yielding simplified, interpretable models in the form of decision trees. As part of this process, we explicitly model and learn agents’ latent state representations by selecting from a large space of candidate features constructed from the Markov state. We present promising initial results from an implementation in a multi-agent traffic environment.
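The idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation, which this page does not show. It assumes a hypothetical black-box policy (`black_box_policy`) over a toy traffic state `(own_pos, own_speed, lead_pos)`, a hypothetical set of candidate features (`CANDIDATE_FEATURES`), and a plain Gini-impurity tree learner written from scratch. The learner picks, at each node, the candidate feature and threshold that best separate the observed actions.

```python
import random

def black_box_policy(state):
    # Hypothetical opaque agent: brake (0) if the gap to the lead vehicle is short, else accelerate (1).
    own_pos, own_speed, lead_pos = state
    return 0 if lead_pos - own_pos < 10.0 else 1

# Hypothetical candidate features constructed from the Markov state.
CANDIDATE_FEATURES = {
    "own_pos": lambda s: s[0],
    "own_speed": lambda s: s[1],
    "lead_pos": lambda s: s[2],
    "gap": lambda s: s[2] - s[0],
}

def gini(actions):
    # Gini impurity for binary action labels.
    if not actions:
        return 0.0
    p = sum(actions) / len(actions)
    return 2.0 * p * (1.0 - p)

def best_split(states, actions):
    # Search every candidate feature and midpoint threshold; return (impurity, name, threshold) or None.
    best = None
    for name, feature in CANDIDATE_FEATURES.items():
        values = sorted(set(feature(s) for s in states))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [a for s, a in zip(states, actions) if feature(s) < t]
            right = [a for s, a in zip(states, actions) if feature(s) >= t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(states)
            if best is None or score < best[0]:
                best = (score, name, t)
    return best

def fit_tree(states, actions, depth=2):
    # A leaf is an action label; an internal node is (feature_name, threshold, left, right).
    majority = int(2 * sum(actions) >= len(actions))
    if depth == 0 or gini(actions) == 0.0:
        return majority
    split = best_split(states, actions)
    if split is None:
        return majority
    _, name, t = split
    feature = CANDIDATE_FEATURES[name]
    left = [(s, a) for s, a in zip(states, actions) if feature(s) < t]
    right = [(s, a) for s, a in zip(states, actions) if feature(s) >= t]
    return (name, t,
            fit_tree([s for s, _ in left], [a for _, a in left], depth - 1),
            fit_tree([s for s, _ in right], [a for _, a in right], depth - 1))

def predict(tree, state):
    # Walk from the root to a leaf, following the learned thresholds.
    while isinstance(tree, tuple):
        name, t, left, right = tree
        tree = left if CANDIDATE_FEATURES[name](state) < t else right
    return tree

def sample_state(rng):
    own_pos = rng.uniform(0.0, 100.0)
    return (own_pos, rng.uniform(0.0, 30.0), own_pos + rng.uniform(0.0, 50.0))

rng = random.Random(0)
train = [sample_state(rng) for _ in range(300)]
tree = fit_tree(train, [black_box_policy(s) for s in train])  # imitate from state-action pairs

test_states = [sample_state(rng) for _ in range(1000)]
fidelity = sum(predict(tree, s) == black_box_policy(s) for s in test_states) / len(test_states)
print(tree, fidelity)
```

In this toy setting the learned tree should split on the derived `gap` feature rather than on any raw state component, which is the sense in which feature selection recovers a plausible latent representation. The printed `fidelity` is the fraction of held-out states on which the tree agrees with the black-box policy.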

Keyphrases: Decision Tree, Explainable Artificial Intelligence, Imitation Learning, Representation Learning, Interpretability, Traffic Modelling

Links:

https://easychair.org/publications/preprint/VtvR

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:2959,
  author    = {Tom Bewley and Jonathan Lawry and Arthur Richards},
  title     = {Modelling Agent Policies with Interpretable Imitation Learning},
  howpublished = {EasyChair Preprint 2959},
  year      = {EasyChair, 2020}}
