Part II Language Comprehension
4 Perception of Language
5 The Internal Lexicon
6 Sentence Comprehension and Memory
7 Discourse Comprehension and Memory
4 Perception of Language
Main Points:
The study of speech sounds is called phonetics. Articulatory phonetics refers to the study of how speech sounds are produced. Acoustic phonetics refers to the study of the resulting speech sounds.
Speech exhibits characteristics not found in other forms of auditory perception.
The phenomenon of categorical perception suggested that speech is a special mode of perception.
Perception of speech is influenced by the contexts in which it appears. We use top-down processing to identify some sounds in context.
Visual perception of language is achieved through a succession of processing levels. Perception of letters in a word context is superior to perception of isolated or unrelated letters.
Recent models of the perception of language assume that we process information at multiple levels in an interactive way. These models can account for several findings in speech perception and visual word perception.
The structure of speech
The process of speech perception is an extraordinarily complex one, for two major reasons. First, the environmental context often interferes with the speech signal: under normal listening conditions, the speech we hear competes with other stimuli for our limited processing capacity, and other auditory signals, such as a conversation across the room or someone sneezing or burping, can reduce the fidelity of the speech signal. Second, the speech signal itself is variable: there is no one-to-one correspondence between the characteristics of the acoustic stimulus and the speech sound we hear.
How do we achieve stable phonetic perception when the acoustic stimulus competes with other stimuli and contains a good deal of inherent variability?
The ease with which we recognize phonetic segments suggests that listeners make a series of adjustments in the course of perceptual recognition. Some of these adjustments are based on the implicit knowledge of the way speech sounds are produced.
Prosodic Factors: stress, intonation, and rate. Ferreira (2003) has defined prosody as "a general term that refers to the aspects of an utterance's sound that are not specific to the words themselves." Prosodic factors influence the overall meaning of an utterance. That is, we can take a given word or utterance, change its stress or intonational pattern, and create an entirely different meaning. Prosodic factors are sometimes called suprasegmentals. The same word or sentence may be expressed prosodically in different ways, and these variations become important cues to the speaker's meaning and emotional state. With prosodic variation in mind, we can turn to the smaller speech segments on which prosodic factors are superimposed.
Articulatory Phonetics
The study of speech sounds is called phonetics. Articulatory phonetics refers to the study of how speech sounds are produced.
Speech sounds differ principally in whether the airflow is obstructed and, if so, at what point and in what way. Whereas vowels are produced by letting air flow from the lungs in an unobstructed way, consonants are produced by impeding the airflow at some point.
The utility of distinctive features is that they allow us to describe the relationships that exist among various speech sounds in an economical manner.
Acoustic Phonetics
One of the most common ways of describing the acoustical energy of speech sounds is called a sound spectrogram.
Each of the spectrograms contains a series of dark bands, called formants, at various frequency levels. Formant transitions are the large rises or drops in formant frequency that occur over short durations of time. These transitions nearly always occur either at the beginning or the end of the syllable. In between is the formant's steady state, during which formant frequency is relatively stable. It is a bit oversimplified but basically correct to say that the transitions correspond to the consonantal portion of the syllable, and the steady state to the vowel.
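The distinction between a formant transition and a steady state can be illustrated with a small sketch. The code below is only a toy, not a model of real speech: it synthesizes a single sinusoid standing in for one formant, lets its frequency glide upward and then hold, and tracks the strongest frequency in successive short frames, which is roughly what the dark band in a spectrogram shows. The sample rate, frame size, and frequencies are all assumed values chosen for illustration.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz (assumed for illustration)
FRAME = 128          # samples per analysis frame (16 ms at 8000 Hz)
N_FRAMES = 12        # total length of the toy signal, in frames

def synth_formant(f_start=1000.0, f_end=1500.0, transition_s=0.05):
    """Single sinusoid whose frequency glides (transition), then holds (steady state)."""
    samples, phase = [], 0.0
    for n in range(N_FRAMES * FRAME):
        t = n / SAMPLE_RATE
        if t < transition_s:
            freq = f_start + (f_end - f_start) * t / transition_s
        else:
            freq = f_end
        phase += 2 * math.pi * freq / SAMPLE_RATE
        samples.append(math.sin(phase))
    return samples

def peak_frequency(frame):
    """Frequency (Hz) of the strongest DFT bin in one frame (naive DFT)."""
    best_bin, best_mag = 0, 0.0
    for k in range(1, FRAME // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / FRAME)
                  for n, x in enumerate(frame))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * SAMPLE_RATE / FRAME

signal = synth_formant()
track = [peak_frequency(signal[i:i + FRAME])
         for i in range(0, len(signal), FRAME)]

for i, freq in enumerate(track):
    print(f"frame {i:2d}: peak near {freq:.1f} Hz")
```

The early frames show a rising peak (the transition) and the later frames show a constant peak (the steady state). A real spectrogram would show several formants at once and use overlapping, windowed frames.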
Parallel transmission: the fact that different phonemes of the same syllable are encoded into the speech signal simultaneously. There is no sharp physical break between adjacent sounds in a syllable.
Context-conditioned variation: the phenomenon that the exact spectrographic appearance of a given phone is related to (or conditioned by) the speech context. Context-conditioned variation is closely related to the manner in which syllables are produced, that is, to coarticulation.
Summary
Speech may be described in terms of the articulatory movements needed to produce a speech sound and the acoustic properties of the sound. Vowels differ from consonants in that the airflow from the lungs is not obstructed during production; consonants differ from one another in terms of the manner and place of the obstruction, as well as the presence or absence of vocal cord vibration during articulation.
The acoustic structure of speech sounds is revealed by spectrographic analyses of formants, their steady states, and formant transitions. The spectrographic pattern associated with a consonant is influenced by its vowel context and is induced by the coarticulated manner in which syllables are produced. Moreover, prosodic factors such as stress, intonation, and speech rate also contribute to the variability inherent in the speech signal.
Perception of isolated speech segments
Levels of speech processing:
At the auditory level, the signal is represented in terms of its frequency, intensity, and temporal attributes as with any auditory stimulus.
At the phonetic level, we identify individual phones by a combination of acoustic cues, such as formant transitions.
At the phonological level, the phonetic segment is converted into a phoneme, and phonological rules are applied to the sound sequence.
These levels may be construed as successive discriminations that we apply to the speech signal. We first discriminate auditory signals from other sensory signals and determine that the stimulus is something that we have heard. Then we identify the peculiar properties that qualify it as speech, only later recognizing it as meaningful speech of a particular language.
Speech as a modular system:
Lack of invariance--- there is no one-to-one correspondence between acoustic cues and the speech sounds we perceive. This suggests that the perception of speech segments must occur through a process that is different from, and presumably more complex than, that of "ordinary" auditory perception. In other words, speech is a special mode of perception.
Categorical perception--- to comprehend speech, we must impose an absolute or categorical identification on the incoming speech signal rather than simply a relative determination of the various physical characteristics of the signal.
Two criteria determine categorical perception: the presence of sharp identification functions and the failure to discriminate between sounds within a given sound class.
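The two criteria can be made concrete with a small numerical sketch. It assumes a voice onset time (VOT) continuum between a voiced and a voiceless stop, a logistic identification function, and one commonly cited labeling-based prediction for discrimination, 0.5 + 0.5 * (p1 - p2)^2, under which listeners can tell two sounds apart only to the extent that they label them differently. The boundary location, slope, and step size are illustrative assumptions, not values taken from the text.

```python
import math

BOUNDARY_MS = 25.0   # assumed category boundary, for illustration only
SLOPE_MS = 2.0       # smaller value gives a sharper identification function

def p_voiceless(vot_ms):
    """Probability of labeling a stimulus as the voiceless category."""
    return 1.0 / (1.0 + math.exp(-(vot_ms - BOUNDARY_MS) / SLOPE_MS))

def predicted_discrimination(vot_a, vot_b):
    """Predicted proportion correct if discrimination relies only on labels."""
    diff = p_voiceless(vot_a) - p_voiceless(vot_b)
    return 0.5 + 0.5 * diff ** 2

continuum = list(range(0, 61, 10))   # VOT steps in ms

print("Identification function:")
for vot in continuum:
    print(f"  VOT {vot:2d} ms -> P(voiceless) = {p_voiceless(vot):.3f}")

print("Predicted discrimination of adjacent pairs:")
for a, b in zip(continuum, continuum[1:]):
    print(f"  {a:2d} vs {b:2d} ms -> {predicted_discrimination(a, b):.3f}")
```

In this sketch the identification function jumps sharply between 20 and 30 ms, and predicted discrimination stays near chance (0.5) for pairs within a category but rises well above chance for the pair that straddles the boundary, even though every pair differs by the same 10 ms.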
The motor theory of speech perception--- listeners use implicit articulatory knowledge---knowledge about how sounds are produced---as an aid in perception. Sounds produced in similar ways but with varying acoustic representations are perceived in similar ways.
Liberman and Mattingly updated the motor theory to bring it in line with current thinking in cognitive psychology. In the revised theory, the claim is that the objects of speech perception are the intended phonetic gestures of the speaker.
Summary
Various investigators have argued that speech is perceived through a special mode of perception. Part of the argument rests on the failure to find invariant relationships between acoustic properties and perceptual experiences, and part is supported by the empirical phenomena of categorical perception, duplex perception and phonetic trading relations.
The motor theory of speech perception claims that we perceive speech sounds by identifying the intended phonetic gestures that may produce the sounds. Although the status of the concept of phonetic gestures is somewhat controversial, the theory has been supported by studies of visual processing during speech perception. In addition, the theory has implications for neurolinguistics and language acquisition in children.
Perception of continuous speech
Prosodic factors in speech recognition
stress: Martin (1972) has argued that the stress pattern of speech provides cues for listeners to anticipate what is coming next and that listeners tend to organize their perception around stressed syllables.
rate: listeners adjust for differences in speaking rate (rate normalization) and for differences among speakers' voices (speaker normalization).
Semantic and Syntactic Factors in speech perception
Context and Speech Recognition: a word isolated from its context becomes less intelligible.
Phonemic Restoration: a most dramatic demonstration of the role of top-down processing of speech signals comes from what is called phonemic restoration.
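The top-down logic of phonemic restoration can be sketched as a toy lookup. In the sketch below, a word with one masked segment is matched against a tiny lexicon (bottom-up constraint), and a later context word picks among the surviving candidates (top-down constraint). The lexicon, the rough phoneme symbols, and the context associations are all hypothetical and chosen only for illustration; this is not a model of how listeners actually perform restoration.

```python
# Hypothetical mini-lexicon: word -> rough phoneme sequence (illustrative symbols)
LEXICON = {
    "wheel": ("w", "iy", "l"),
    "heel": ("h", "iy", "l"),
    "peel": ("p", "iy", "l"),
    "meal": ("m", "iy", "l"),
    "deal": ("d", "iy", "l"),
}

# Hypothetical associations between a later context word and a candidate
CONTEXT_ASSOCIATIONS = {
    "axle": "wheel",
    "shoe": "heel",
    "orange": "peel",
    "table": "meal",
}

def candidates(masked):
    """Words whose phonemes match the input; None marks the masked segment."""
    return [
        word for word, phones in LEXICON.items()
        if len(phones) == len(masked)
        and all(m is None or m == p for m, p in zip(masked, phones))
    ]

def restore(masked, context_word):
    """Pick the candidate supported by context, or None if context does not decide."""
    options = candidates(masked)
    preferred = CONTEXT_ASSOCIATIONS.get(context_word)
    return preferred if preferred in options else None

masked_input = (None, "iy", "l")   # "*eel": first segment replaced by noise
for context in ("axle", "shoe", "orange", "table"):
    print(f"... the *eel was on the {context} -> {restore(masked_input, context)}")
```

The acoustic input is identical in every case; only the context word differs, yet a different word is selected each time.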
The trace model of speech perception
...to be continued...