Papers by T. Florian Jaeger
Frontiers in Psychology, 2012
Supervised and unsupervised learning in phonetic adaptation. Do people use category labels during adaptation? Our question: Language learning doesn't stop once you reach adulthood: talkers use linguistic cues to realize their intentions in different ways. To adapt to a new talker, you have to learn the way they use cues. If you know their intended meaning, this learning should be a lot easier. Learning with known category labels is called supervised learning, and learning from cues only is called unsupervised learning.
Cognitive Science, 2015
The role of processing constraints on sentence structure has been a topic of central interest in cognitive science. One proposal (Hawkins, 2004) suggests that the language production system is organized to facilitate efficient parsing. We experimentally test this hypothesis using a miniature artificial language learning paradigm. Our findings support this account. Even though the input languages did not favor early placement of cues to grammatical function assignment (case and word order), participants used these cues in their own productions significantly more often in such a way as to allow early correct parsing commitments. This preference interacted with a bias to mark the less expected: Participants tended to use more case-marking in non-English OSV sentences. Our results underscore the potential of miniature artificial learning for language production research.
Cognitive Science, 2016
Speech perception is made much harder by variability between talkers. As a result, listeners need to adapt to each different talker’s particular acoustic cue distributions. Thinking of this adaptation as a form of statistical inference, we explore the role that listeners’ prior expectations play in adapting to an unfamiliar talker. Specifically, we test the hypothesis that listeners will have a harder time adapting to talkers whose cue distributions fall outside the range of normal variation across talkers. We also show that it is possible to infer listeners’ shared prior expectations based on patterns of adaptation to different cue distributions. This provides a potentially powerful tool for directly probing listeners’ prior expectations about talkers that does not rely on speech produced by many different talkers, which is costly to collect and annotate, and only indirectly related to listeners’ subjective expectations.
Frontiers in Psychology, 2021
A central component of sentence understanding is verb-argument interpretation, determining how the referents in the sentence are related to the events or states expressed by the verb. Previous work has found that comprehenders change their argument interpretations incrementally as the sentence unfolds, based on morphosyntactic (e.g., case, agreement), lexico-semantic (e.g., animacy, verb-argument fit), and discourse cues (e.g., givenness). However, it is still unknown whether these cues have a privileged role in language processing, or whether their effects on argument interpretation originate in implicit expectations based on the joint distribution of these cues with argument assignments experienced in previous language input. We compare the former, linguistic account against the latter, expectation-based account, using data from production and comprehension of transitive clauses in Swedish. Based on a large corpus of Swedish, we develop a rational (Bayesian) model of incremental argum...
Bilingualism: Language and Cognition, 2019
Native language (L1) processing draws on implicit expectations. An open question is whether non-native learners of a second language (L2) similarly draw on expectations, and whether these expectations are based on learners’ L1 or L2 knowledge. We approach this question by studying inverse preference effects on lexical encoding. L1 and L2 speakers of Spanish described motion events, while they were either primed to express path, manner, or neither. In line with other work, we find that L1 speakers adapted more strongly after primes that are unexpected in their L1. For L2 speakers, adaptation depended on their L2 proficiency: The least proficient speakers exhibited the inverse preference effect on adaptation based on what was unexpected in their L1; but the more proficient speakers were, the more they exhibited inverse preference effects based on what was unexpected in the L2. We discuss implications for L1 transfer and L2 acquisition.
Linguistics Vanguard, 2018
It has long been noted that language production seems to reflect a correlation between message redundancy and signal reduction. More frequent words and contextually predictable instances of words, for example, tend to be produced with shorter and less clear signals. The same tendency is observed in the language code (e.g. the phonological lexicon), where more frequent words and words that are typically contextually predictable tend to have fewer segments or syllables. Average predictability in context (informativity) also seems to be an important factor in understanding phonological alternations. What has received little attention so far is the relation between various information-theoretic indices – such as frequency, contextual predictability, and informativity. Although each of these indices has been associated with different theories about the source of the redundancy-reduction link, different indices tend to be highly correlated in natural language, making it difficult to tease...
Language Learning, 2016
Can recent second language (L2) exposure affect what we judge to be similar events? Using a priming paradigm, we manipulated whether native Swedish adult learners of L2 Spanish were primed to use path or manner during L2 descriptions of scenes depicting caused motion events (encoding phase). Subsequently, participants engaged in a nonverbal task, arranging events on the screen according to similarity (test phase). Path versus manner priming affected how participants judged event similarity during the test phase. The effects we find support the hypotheses that (a) speakers create or select ad hoc conceptual categories that are based on linguistic knowledge to carry out nonverbal tasks, and that (b) short‐term, recent L2 experience can affect this ad hoc process. These findings further suggest that cognition can flexibly draw on linguistic categories that have been implicitly highlighted during recent exposure. Open Practices: This article has been awarded an Open Data badge. All data ar...
Annual Meeting of the Berkeley Linguistics Society, 2006
Speech perception requires ongoing perceptual category learning. Each talker speaks differently, and listeners need to learn each talker's particular acoustic cue distributions in order to comprehend speech robustly from multiple talkers. This phonetic adaptation is a semi-supervised learning problem, because sometimes a particular cue value occurs with information that labels the talker's intended category for the listener, but other times no such labels are available. Previous work has shown that adaptation can occur in both purely supervised (all labeled) and purely unsupervised (all unlabeled) settings, but the interaction between them has not been investigated. We compare unsupervised with (semi-)supervised phonetic adaptation and find, surprisingly, that adult listeners do not take advantage of labeling information to adapt more quickly or effectively, even though the labels affect their categorization. This suggests that, like language acquisition, phonetic adaptat...
Zwischen Kern und Peripherie, 2014
WIREs Cognitive Science, 2010
Functionalist typologists have long argued that pressures associated with language usage influence the distribution of grammatical properties across the world's languages. Specifically, grammatical properties may be observed more often across languages because they improve a language's utility or decrease its complexity. While this approach to the study of typology offers the potential of explaining grammatical patterns in terms of general principles rather than domain‐specific constraints, the notions of utility and complexity are more often grounded in intuition than empirical findings. A suitable empirical foundation might be found in terms of processing preferences: in that case, psycholinguistic measures of complexity are expected to correlate with typological patterns. We summarize half a century of psycholinguistic work on ‘processing complexity’ in an attempt to make this work accessible to a broader audience: What makes something hard to process for comprehend...
Proceedings of the National Academy of Sciences, 2012
Languages of the world display many structural similarities. We test the hypothesis that some of these structural properties may arise from biases operating during language acquisition that shape languages over time. Specifically, we investigate whether language learners are biased toward linguistic systems that strike an efficient balance between robust information transfer, on the one hand, and effort or resource demands, on the other hand, thereby increasing the communicative utility of the acquired language. In two experiments, we expose learners to miniature artificial languages designed in such a way that they do not use their formal devices (case marking) efficiently to facilitate robust information transfer. We find that learners restructure such languages in ways that facilitate efficient information transfer compared with the input language. These systematic changes introduced by the learners follow typologically frequent patterns, supporting the hypothesis that some of th...
Language and Cognitive Processes, 2013
Frontiers in Psychology, 2013
the uncertainty over the next step in the syntactic derivation (single step entropy) and the surprisal of the verb’s complement.
We additionally estimate word-by-word surprisal and total entropy over parses of the sentence using a probabilistic context-free grammar (PCFG). Surprisal and total entropy, but not single step entropy, were significant predictors of reading times in different parts of the sentence. This suggests that a complete model
of sentence processing should incorporate both entropy and surprisal.
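As a rough illustration of the two quantities contrasted in this abstract, the sketch below computes surprisal for a single outcome and entropy for a whole distribution. The distribution over complement types is hypothetical and chosen only so the arithmetic is easy to follow; it is not taken from the study, which estimated these quantities from corpus-based models and a PCFG.

```python
import math


def surprisal(p: float) -> float:
    """Surprisal (Shannon information), in bits, of an outcome with probability p."""
    return -math.log2(p)


def entropy(dist: dict[str, float]) -> float:
    """Entropy, in bits, of a discrete distribution: the expected surprisal."""
    return sum(p * surprisal(p) for p in dist.values() if p > 0)


# Hypothetical distribution over the complement types a verb might take.
# These numbers are invented for illustration only.
next_step = {"NP": 0.5, "S": 0.25, "PP": 0.25}

# Uncertainty before the complement is seen (analogous to single-step entropy).
step_entropy = entropy(next_step)

# Cost of the complement that actually occurred (its surprisal).
observed_surprisal = surprisal(next_step["S"])

print(step_entropy, observed_surprisal)
```

Entropy is a property of the comprehender's state before the input arrives, while surprisal is a property of the input that actually arrives, which is why the two can make different reading-time predictions.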
extent to which the speech production system is organized for robust communication. One view holds that speakers’ decision to produce more or less clear signals or to speak faster or slower is primarily or even exclusively driven by the demands inherent to production planning. The opposing view holds that these demands are balanced against the goal to be understood. We investigate the degree of hyperarticulation in the presence of easily confusable minimal pair neighbors (e.g., saying "pill" when "bill" is contextually co-present and thus a plausible alternative). We directly test whether production difficulty alone can explain such hyperarticulation. The results argue against production-centered accounts. We also investigate how specific hyperarticulation is to the segment that contrasts the target against the contextually plausible alternative. Our evidence comes from a novel web-based speech recording paradigm.
as the growing difference in viewing duration between predictable and predictive items. In other words, as participants learned, they processed predictable items increasingly faster. Our results indicate that participants who make implicit predictions as they learn, and have their expectations met, achieve higher learning outcomes on an offline post-test. Potential links between these findings, obtained
with novel stimuli in an experimental context, and the role of prediction in natural language comprehension are considered.
by the principle of efficient information transmission:
Speakers tend to omit elements whose information content is
contextually predictable, while providing more linguistic signal
to convey otherwise less predictable information. However,
previous findings in support of this hypothesis are also compatible
with alternative accounts based on production difficulty.
To distinguish between these competing accounts, we conducted
experiments on speakers’ preferences in optional case-marking
in Japanese. The results suggest that Japanese speakers
are more likely to omit the object case-marker when an
associated noun has properties (e.g., animacy) that are prototypical to a grammatical object. Moreover, case-marker omission was facilitated when other elements in a sentence made
the grammatical function assignment more predictable. The
results were obtained with all the factors related to production
difficulty held constant, and thus provide support for the models
of communicatively efficient language production.
non-linguistic categories are probabilistic. However, in linguistic
theory, quantifier meanings have traditionally been defined
set-theoretically in terms of categorical evaluation functions.
In 4 “adaptation” experiments, we provide evidence for
the alternative hypothesis that quantifiers are represented as
probability distributions over scales (e.g., Zadeh, 1965). We
manipulate exposure to different distributions of “some” and
“many” and find that listeners adapt to those distributions, as
predicted. Our results suggest that the interpretation of quantifiers
is best modeled as a process involving rich, probabilistic
representations.
Keywords: sentence processing, adaptation, Bayesian modeling
modifications in response to a visual contrast in a video description task with speakers of Yucatec Maya. We analyzed modifications of referring expressions on the part of a speaker, and we also examined the effect of over- and under-informativeness on the listener’s comprehension. We found that prior experience with difficult comprehension did not significantly affect the listener’s rate of informativeness when in the role of speaker, but we found that experience as a speaker did result in reduced rates of under-informativeness. That is to say, as a speaker’s own experience progressed, the speaker became less under-informative. We discuss these results as audience design-based learning.
Keywords: referring expressions, informativeness, audience
design, learning, Yucatec Maya, field-based psycholinguistics
Keywords: Language Comprehension; Ambiguity Resolution;
Learning Effects; Language Experience
has focused on indirect tests of the hypothesis that speakers
aim to keep per-word entropy constant across discourses to
achieve communicative efficiency (Genzel & Charniak, 2002).
We present novel and more direct evidence by examining the
role of topic shift in discourse planning. If speakers aim for
constant per-word entropy, they should encode less unconditional per-word entropy (as estimated based on only sentence internal cues) following topic shifts, as there is less relevant context to condition on. Applying latent topic modeling to a large set of English texts, we find that speakers are indeed sensitive to the recent topic structure in the predicted way.
Keywords: discourse production; topic shift; communicative
efficiency
account for a large proportion of actual human languages.
To explain this distribution, typologists often invoke principles
of human cognition which might make certain orders easier or
harder to learn or use. We present a novel method for carrying
out very large scale artificial language learning tasks over
the internet, which allows us to test large batteries of systematically designed languages for differential learnability. An exploratory study of the learnability of all possible configurations
of subject, verb, and object finds that the two most frequent
orders in human languages are the most easily learned, and
yields suggestive evidence compatible with other typological
and psycholinguistic observations.
Keywords: artificial grammar; language acquisition; language
typology; psycholinguistics; word order
inputs remains a question of ongoing debate. One important data point comes from DeLong et al. (2005) who reported that an N400-like event-related potential correlated with a probabilistic index of upcoming input. This result is often cited as evidence for gradient probabilistic prediction of form and/or semantics, prior to the bottom-up input becoming available. However, a recent multi-lab study reports a failure to find these effects (Nieuwland et al., 2017). We review the evidence from both studies, including differences in the design and analysis approach between them. Building on over a decade of research on prediction since DeLong et al.’s (2005) original study, we also begin to spell out the computational nature of predictive processes that one might expect to correlate with ERPs that are evoked by a functional element whose form is dependent on an upcoming predicted word. For paradigms with this type of design, we propose an index of anticipatory processing, Bayesian surprise, and apply it to the updating of semantic predictions. We motivate this index both theoretically and empirically. We show that, for studies of the type discussed here, Bayesian surprise can be closely approximated by another, more easily estimated information-theoretic index, the surprisal (or Shannon information) of the input. We re-analyze the data from Nieuwland and colleagues using surprisal rather than raw probabilities as an index of prediction. We find that surprisal is gradiently correlated with the amplitude of the N400, even in the data shared by Nieuwland and colleagues. Taken together, our review suggests that the evidence from both studies is compatible with anticipatory semantic processing. We do, however, emphasize the need for future studies to further clarify the nature and degree of form prediction, as well as its neural signatures, during language comprehension.
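The relation between Bayesian surprise and surprisal can be seen in a toy setting. The sketch below assumes Bayesian surprise is measured as the Kullback-Leibler divergence from the prior to the posterior over the upcoming noun, and that the article form ("a" or "an") is a deterministic function of that noun. The nouns, probabilities, and helper names are hypothetical and are not taken from either study.

```python
import math

# Hypothetical prior over the upcoming noun, and the article each noun requires.
prior = {"kite": 0.5, "airplane": 0.25, "umbrella": 0.25}
article_of = {"kite": "a", "airplane": "an", "umbrella": "an"}


def update(prior: dict[str, float], article: str) -> tuple[dict[str, float], float]:
    """Return the posterior over nouns after seeing `article`, and P(article)."""
    p_article = sum(p for noun, p in prior.items() if article_of[noun] == article)
    posterior = {
        noun: (p / p_article if article_of[noun] == article else 0.0)
        for noun, p in prior.items()
    }
    return posterior, p_article


def kl_bits(posterior: dict[str, float], prior: dict[str, float]) -> float:
    """KL divergence D(posterior || prior) in bits, skipping zero-mass outcomes."""
    return sum(
        q * math.log2(q / prior[noun]) for noun, q in posterior.items() if q > 0
    )


posterior, p_an = update(prior, "an")
bayesian_surprise = kl_bits(posterior, prior)  # size of the belief update
article_surprisal = -math.log2(p_an)           # surprisal of the article itself

print(bayesian_surprise, article_surprisal)
```

In this toy case the two quantities coincide exactly, because an article that is fully determined by the noun simply rescales the prior over the compatible nouns. With noisier cues the match would only be approximate, in line with the abstract's wording of "closely approximated".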