LIBERTy: A Causal Framework for Benchmarking Concept-Based Explanations of LLMs with Structural Counterfactuals
Intermediate · Gilat Toker, Nitay Calderon et al. · Jan 15 · arXiv
This paper introduces LIBERTy, a causal framework for benchmarking how well concept-based explanations capture the influence of high-level human concepts, such as age, race, or experience, on an LLM's decisions.
#concept-based explanations #structural counterfactuals #structured causal models