Evaluation and Design of Generalist Systems (EDGeS)

AI Magazine (2023)

Abstract

The field of AI has undergone a series of transformations, each marking a new phase of development. The initial phase emphasized the curation of symbolic models, which excelled at capturing reasoning but were fragile and difficult to scale. The next phase was characterized by machine learning models, most recently large language models (LLMs), which were more robust and easier to scale but struggled with reasoning. Now we are witnessing a return to symbolic models as a complement to machine learning. The successes of LLMs contrast with their inscrutability, inaccuracy, and hallucinations, which underlie concerns over the reliability and trustworthiness of these systems and motivate investigations into commonsense reasoning, AI explainability, and formal verification techniques. Moreover, proper assessment of hybrid machine learning/symbolic systems requires novel strategies to facilitate performance comparisons and guide future AI progress. The EDGeS AAAI 2023 Spring Symposium brought together researchers focusing on novel assessments and benchmarks for evaluating hybrid and artificial general intelligence (AGI) systems. The symposium confirmed what was already suspected: research on the evaluation of machine learning/symbolic hybrids and AGI is, unfortunately, lacking. Even so, the discussion was fruitful.
