  1. Conceptual Engineering Using Large Language Models. Bradley Allen - forthcoming - In Vincent C. Müller, Aliya R. Dewey, Leonard Dung & Guido Löhr, Philosophy of Artificial Intelligence: The State of the Art. Berlin: Springer Nature.
    We describe a method, based on Jennifer Nado’s proposal for classification procedures as targets of conceptual engineering, that implements such procedures by prompting a large language model. We apply this method, using data from the Wikidata knowledge graph, to evaluate stipulative definitions related to two paradigmatic conceptual engineering projects: the International Astronomical Union’s redefinition of PLANET and Haslanger’s ameliorative analysis of WOMAN. Our results show that classification procedures built using our approach can exhibit good classification performance and, through the generation (...)
  2. Carnap’s Robot Redux: LLMs, Intensional Semantics, and the Implementation Problem in Conceptual Engineering (extended abstract). Bradley Allen - manuscript
    In his 1955 essay "Meaning and synonymy in natural languages", Rudolf Carnap presents a thought experiment wherein an investigator provides a hypothetical robot with a definition of a concept together with a description of an individual, and then asks the robot whether the individual is in the extension of the concept. In this work, we show how to realize Carnap's Robot through knowledge probing of a large language model (LLM), and argue that this provides a useful cognitive tool for conceptual (...)
  3. A Benchmark for the Detection of Metalinguistic Disagreements between LLMs and Knowledge Graphs. Bradley Allen & Paul Groth - forthcoming - In Reham Alharbi, Jacopo de Berardinis, Paul Groth, Albert Meroño-Peñuela, Elena Simperl & Valentina Tamma, ISWC 2024 Special Session on Harmonising Generative AI and Semantic Web Technologies. CEUR-WS.
    Evaluating large language models (LLMs) for tasks like fact extraction in support of knowledge graph construction frequently involves computing accuracy metrics using a ground truth benchmark based on a knowledge graph (KG). These evaluations assume that errors represent factual disagreements. However, human discourse frequently features metalinguistic disagreement, where agents differ not on facts but on the meaning of the language used to express them. Given the complexity of natural language processing and generation using LLMs, we ask: do metalinguistic disagreements occur (...)