  1. The Challenges of Large‐Scale, Web‐Based Language Datasets: Word Length and Predictability Revisited. Stephan C. Meylan & Thomas L. Griffiths - 2021 - Cognitive Science 45 (6):e12983.
    Language research has come to rely heavily on large‐scale, web‐based datasets. These datasets can present significant methodological challenges, requiring researchers to make a number of decisions about how they are collected, represented, and analyzed. These decisions often concern long‐standing challenges in corpus‐based language research, including determining what counts as a word, deciding which words should be analyzed, and matching sets of words across languages. We illustrate these challenges by revisiting “Word lengths are optimized for efficient communication” (Piantadosi, Tily, & Gibson, (...)