  1. Explaining AI through mechanistic interpretability. Lena Kästner & Barnaby Crook - 2024 - European Journal for Philosophy of Science 14 (4):1-25.
    Recent work in explainable artificial intelligence (XAI) attempts to render opaque AI systems understandable through a divide-and-conquer strategy. However, this fails to illuminate how trained AI systems work as a whole. Precisely this kind of functional understanding is needed, though, to satisfy important societal desiderata such as safety. To remedy this situation, we argue, AI researchers should seek mechanistic interpretability, viz. apply coordinated discovery strategies familiar from the life sciences to uncover the functional organisation of complex AI systems. Additionally, theorists (...)