Term
|
Definition
PHONETICS: the study of the production (physiology) and perception (acoustics) of speech sounds.
Physiological Phonetics: the speech mechanism and how speech sounds are produced.
Acoustic Phonetics: how speech sounds are heard/perceived. |
|
|
Term
|
Definition
Two ways of transcribing will be covered in this course:
Broad Transcription: tries to capture the speech sounds that make up a word. Indicated by virgules / /
Narrow Transcription: includes detail to capture precisely how each sound is produced (for example, nasality). Indicated by brackets [ ] |
|
|
Term
IPA The International Phonetic Alphabet |
|
Definition
There are 44 symbols for transcription OF ENGLISH. ONE SYMBOL = ONE SOUND. Symbols represent sounds, NOT our orthographic system (spelling).
IPA symbols not used in transcribing English: c, q, x, y
Please note the difference between phonetics and phonics. Phonetics is the study of the production and perception of speech sounds. That is the focus of this course. Phonics is the link between speech sounds and the printed alphabet of a language. THIS IS NOT of interest in this course. |
|
|
Term
|
Definition
This course covers descriptive phonetics, a system of describing how sounds are made. We are not interested in prescriptive phonetics, which mandates “correct” pronunciation. Why? |
|
|
Term
|
Definition
Speech Mechanism is a term used to refer to the various structures and systems, discussed in this lecture, that are involved in producing speech sounds.
oral cavity – space inside mouth
alveolar ridge – ridge behind top teeth
nasal cavity – above oral cavity (separated by palate)
pharynx – throat
pharyngeal cavity – space in throat
hard palate – bone at roof of mouth
soft palate (velum) – muscular tissue at back of mouth & beginning of throat. It can open/close the passage from the throat to the nasal cavity. This is important for directing air to the oral cavity to speak.
velopharyngeal port – space between nasal and pharyngeal cavities
Why do you think cleft palate patients are often very nasal? Hint: A cleft palate occurs when the hard palate doesn’t fuse during gestation. How would that affect the soft palate? |
|
|
Term
|
Definition
Inspiration – breathing in: chest expands, diaphragm contracts
Expiration – breathing out: chest contracts, diaphragm relaxes
The respiratory system consists of lungs, rib cage, thorax, abdomen, trachea, and muscles associated with breathing. The main muscle for inhalation is the diaphragm. When relaxed, the diaphragm is dome-shaped (like an upside-down bowl). It separates the chest cavity from the abdominal cavity. When it contracts, the diaphragm moves downward toward the abdominal area, causing the chest cavity to expand. This lowers the air pressure in the lungs, causing air in the environment to be pulled into the lungs through the air passages. This process is called inhalation/inspiration/breathing in.
As the diaphragm relaxes, space in the chest cavity is reduced forcing air out of the lungs. This is called exhalation/expiration/breathing out. Neurons signal the diaphragm and the intercostal muscles to regulate the contractions which initiate the breathing process.
Breathing is controlled by the medulla oblongata. The diaphragm is innervated by the phrenic nerves.
Respiration enables phonation. |
|
|
Term
|
Definition
Phonation is the process of generating voice. Through controlled exhalation, the respiratory system passes air through the larynx. The larynx is the main structure of the phonatory system. The vocal folds (VF) are the central and most important part of the larynx and, using the controlled, pressurized subglottal* air passed through them, generate voice.
- glottis = space between vocal folds
- *subglottal = area below the vocal folds
- adduction of VF = closing the VF
- abduction of VF = opening the VF
ADDUCTOR muscles: lateral cricoarytenoid, interarytenoid (transverse and oblique portions)
ABDUCTOR muscles: posterior cricoarytenoid
Muscles that ELONGATE or TENSE the VF: cricothyroid (pars recta and pars oblique portions)
The primary role of the vocal folds is not to generate voice. The primary role of the VF is to keep foreign objects (such as food, drink, saliva) out of the airway or to forcefully expel them if they do fall below the VF. |
|
|
Term
|
Definition
The resonatory system is made up of three cavities within the vocal tract. (The vocal tract = all speech-related systems that are located above the VF.)
Resonance plays a major role in determining the quality and characteristic of the voice.
A healthy voice has a specific ratio of oral to nasal resonance.
The resonatory system is comprised of pharyngeal, oral and nasal cavities.
The pharyngeal cavity is a tube of muscles and membrane that extends from the epiglottis to the soft palate. It is divided into three portions:
Nasopharynx (hyperpharynx): extends from upper part of nasal cavity to soft palate; can be closed off from the oropharynx when the velum closes off the velopharyngeal port
Oropharynx (mediopharynx): extends from soft palate to hyoid bone; opens to mouth; can create a variety of forms for speech
Laryngopharynx (hypopharynx): extends from oropharynx to the entrance of the esophagus; sits above the trachea and the vibrating mechanism that houses the vocal folds
The nasal cavity extends from the nares (nostrils) to the pharynx (throat).
The opening/closing of the velopharyngeal port is important to resonance.
Resonance within the nasal cavity adds a nasal quality to the voice.
The velopharyngeal valve/port is formed by the velum coming into contact with the pharyngeal wall. It must be closed to produce English vowels and non-nasal consonants. The oral cavity is the mouth area, extending from the lips to the soft palate.
Air resonates within the oral cavity. The shape it forms is important for the sounds made.
The oral cavity contains most of the articulators that form speech sounds for the English language. |
|
|
Term
|
Definition
Articulators are parts of the speech mechanism that form different sounds: tongue, lips, jaw (mandible), hard palate, soft palate (velum), teeth, glottis (space between VF).
The process of coordinating articulators for speech is driven by centers in the brain. |
|
|
Term
|
Definition
Sound: an audible disturbance of a medium produced by a source.
Frequency is measured in Hz (cycles per second; for voice, cycles of vocal fold vibration per second).
Average male voice: 125 Hz
Average female voice: 200-250 Hz
Spectrogram: speech made visible
vertical striations represent vocal folds opening and closing
y-axis: frequency
x-axis: time
darkness: intensity
gap: likely gap in speech
stacks of energy: vowel |
|
|
Term
|
Definition
The “sound generator” is the larynx.
Approximate fundamental frequency (F0) of the human voice
Infant cooing: 380 Hz
9-year-old talking: 260 Hz
Adult woman: 256 Hz
Adult man: 128 Hz
/s/ or /z/: 4000-8000 Hz |
|
|
Term
|
Definition
Phone: a speech sound. Phones produced by a speaker of a language will fall into the category of a recognizable phoneme.
Phoneme: the sounds of a language (consonants and vowels). English has approximately 44 phonemes.
Allophones: various ways of making one sound that still fall into the same phoneme category.
*The difference between two allophones is not phonemic. For example, peach and beach start with a phonemic difference, creating two separate words, because /p/ and /b/ are different phonemes. Allophones, on the other hand, have a phonetic difference and do not change the meaning. For example, the /p/ in spy vs. pie.
Broad transcription captures phonemic detail: /p/
Narrow transcription captures phonetic detail: [pʰ]
stress: emphasis of a syllable or word, created by the degree of respiratory effort
morpheme: smallest unit of language that carries meaning
grapheme: letter or combination of letters that represents a speech sound
orthography: refers to letters of an alphabet and rules on how they form words. Orthographic spelling is NOT the same as phonetic transcription. It is what you are reading right now. |
|
|
Term
|
Definition
Place: where in the vocal tract For example: bilabial, labiodental, interdental, alveolar, palatal, velar, glottal, alveolopalatal, uvular, retroflex ***Review articulators from last presentation!
Manner: how the sound is produced For example: stop, fricative, affricate, nasal, trill, flap, approximant
Voice: with or without phonation For example: /s/ or /z/ |
|
|
Term
|
Definition
Four Dimensions / Parameters
Tongue Height: high to low
Tongue Carriage: front to back
Lip Rounding: rounded or unrounded
Tension: tense to lax (such as short and long vowels)
ALL vowels are voiced in the English language.
Terminology
Vowel and Consonant (nouns): the phonemes
Vocalic and Consonantal (adjectives): describe the properties
Monophthong: pure vowel, /i/ (meet)
Diphthong: vowels that go together, /eɪ/ (day), /oʊ/ (goat) |
|
|
Term
|
Definition
Consonants are classified by voice, place, manner, and described in that order.
Voice: with or without phonation For example: /s/ or /z/
Place: where in the vocal tract For example: bilabial, labiodental, interdental, alveolar, palatal, velar, glottal, alveolopalatal, uvular, retroflex ***Review articulators from last presentation!
Manner: how sound is being produced For example: stop, fricative, affricate, nasal, trill, flap, approximant |
|
|
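The voice, place, manner ordering can be sketched as a small lookup table. This is only an illustration: the `CONSONANTS` table and the `describe` helper are hypothetical names, and the table covers a handful of consonants rather than the full English inventory.

```python
# Each entry is stored in the order the card prescribes: voice, place, manner.
# Only a few consonants are listed; the table is illustrative, not complete.
CONSONANTS = {
    "p": ("voiceless", "bilabial", "stop"),
    "b": ("voiced", "bilabial", "stop"),
    "f": ("voiceless", "labiodental", "fricative"),
    "s": ("voiceless", "alveolar", "fricative"),
    "z": ("voiced", "alveolar", "fricative"),
    "k": ("voiceless", "velar", "stop"),
    "m": ("voiced", "bilabial", "nasal"),
}


def describe(symbol):
    """Return the voice-place-manner description of a consonant symbol."""
    return " ".join(CONSONANTS[symbol])


for symbol in ("s", "z", "m"):
    print(f"/{symbol}/ is a {describe(symbol)}")
```

Note that /s/ and /z/ differ only in the first field, which is why they serve as the example pair for voicing.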
Term
Classification Place of Articulation |
|
Definition
bilabial – two lips
labiodental – lip & teeth
interdental – tongue between teeth
alveolar – tongue tip to alveolar ridge
palatal – tongue at or near hard palate
velar – back of tongue and soft palate
glottal – space between the vocal folds |
|
|
Term
Classification Manner of Articulation |
|
Definition
stops (plosives) – briefest speech sound / air stopped
fricatives – friction
affricates – combination of stop & fricative
nasals – soft palate open / nasal resonance
glides – semivowel approximants / gliding movement of the active articulator from a partly constricted position into a more open position
liquids – lateral (airstream flows along the sides of the tongue, L-like) and rhotic (R-like) consonants |
|
|
Term
|
Definition
voiced – with laryngeal vibration voiceless – without |
|
|
Term
Morphemes: /s/ & /z/ forming plurals |
|
Definition
Regular plurals in English are formed in three ways (3 allomorphs of the regular English plural):
book + /s/
car + /z/
glass + /əz/
The plural allomorph is determined by the preceding sound.
/s/ follows a voiceless sound
/z/ follows a voiced sound
/əz/ follows a sibilant such as /s/ or /z/
Free morphemes can stand alone and be meaningful (the words book, car, glass above). These are nouns, root verbs, adjectives, etc.
Bound morphemes must be attached to a free morpheme. For example, the plural markers shown above, past tense markers, prefixes, and suffixes |
|
|
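The plural rule on the card above can be sketched as a short function. This is a minimal sketch under stated assumptions: `plural_allomorph` is a hypothetical helper, the phoneme sets are simplified, and the sibilant set extends the card's "/s/ or /z/" to the other sibilants and the affricates.

```python
# Simplified phoneme sets for illustration; not a complete English inventory.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}   # assumption: extends the card's /s/, /z/
VOICELESS = {"p", "t", "k", "f", "θ", "h"}     # voiceless non-sibilants


def plural_allomorph(final_phoneme):
    """Pick the regular plural allomorph from the noun's final phoneme."""
    if final_phoneme in SIBILANTS:   # checked first: glass -> /əz/
        return "əz"
    if final_phoneme in VOICELESS:   # book -> /s/
        return "s"
    return "z"                       # any other (voiced) sound: car -> /z/


for word, final in [("book", "k"), ("car", "r"), ("glass", "s")]:
    print(f"{word} + /{plural_allomorph(final)}/")
```

The sibilant check comes first because /s/ is itself voiceless; testing voicing first would wrongly give glass + /s/.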
Term
Morphemes: /t/ & /d/ forming regular past tense |
|
Definition
The regular past tense in English is formed in three ways (3 allomorphs of the regular English past tense):
toss + /t/
grab + /d/
load + /əd/
The past tense allomorph is determined by the preceding sound.
/t/ follows a voiceless sound such as /s/
/d/ follows a voiced sound such as /b/
/əd/ follows /d/ or /t/
Free morphemes can stand alone and be meaningful (the words toss, grab, load above). These are nouns, root verbs, adjectives, etc.
Bound morphemes must be attached to a free morpheme. For example, the past tense markers shown above, plural markers, prefixes, and suffixes |
|
|
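The past tense rule follows the same pattern as the plural rule and can be sketched the same way. Again this is only a sketch: `past_tense_allomorph` is a hypothetical helper and the voiceless set is a simplified assumption.

```python
# Simplified set of voiceless phonemes other than /t/; illustrative only.
VOICELESS_NOT_T = {"p", "k", "f", "θ", "s", "ʃ", "tʃ", "h"}


def past_tense_allomorph(final_phoneme):
    """Pick the regular past tense allomorph from the verb's final phoneme."""
    if final_phoneme in {"t", "d"}:          # checked first: load -> /əd/
        return "əd"
    if final_phoneme in VOICELESS_NOT_T:     # toss -> /t/
        return "t"
    return "d"                               # any other (voiced) sound: grab -> /d/


for verb, final in [("toss", "s"), ("grab", "b"), ("load", "d")]:
    print(f"{verb} + /{past_tense_allomorph(final)}/")
```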
Term
|
Definition
Sibilants are fricatives with greater acoustic energy and more high-frequency components than other fricatives.
Sibilants are: /s/ /z/ /ʃ/ /ʒ/ |
|
|
Term
|
Definition
hybrid features of stop & fricative |
|
|
Term
|
Definition
soft palate in open position, letting air into nasal cavity |
|
|
Term
|
Definition
|
|
Term
|
Definition
Syllabics are consonant phonemes that serve as an entire syllable. The diacritic ̩ (a small vertical line) beneath a consonant, as in [m̩], [n̩], or [l̩], indicates that the consonant functions as an entire syllable. |
|
|
Term
|
Definition
Obstruents require a build-up of air pressure (obstruction of air), which is important for making the sound.
They include stops, fricatives, and affricates.
Stops, fricatives, and affricates are considered “pressure” consonants. They are referred to as such because a build-up of pressure in the oral cavity (by closing the velopharyngeal port and closing off the nasal cavity) is required to produce them. Individuals with cleft palate have difficulty producing pressure consonants. Why? |
|
|
Term
|
Definition
Some consonants can be produced without velopharyngeal closure, including /h/, /w/, /j/, /l/ and /r/.
Sonorants are produced with a relatively open vocal tract. This creates a resonant quality of the sounds.
Sonorants include: nasals, glides, and liquids |
|
|
Term
|
Definition
Speech consists of overlapping motor movements.
• In typical speech, we average about 14 phonemes per second.
• The articulators are not all capable of the same speed of motion.
• The TONGUE is faster than the jaw and lips are.
• Phonemes are affected not only by the immediately adjacent segments, but sometimes by a phoneme a few segments away. |
|
|
Term
|
Definition
Assimilation (also called consonant harmony): the production of a word is influenced by a particular sound in that word.
Remote (or Non-Contiguous) Assimilation – a phoneme is influenced by a feature of another phoneme that is not directly preceding or following it. Ex: “dog” becomes “gog”, “duck” becomes “guck”, “yellow” becomes “lello”. This is also called Regressive Assimilation because a later sound affects one that comes before it.
Coalescence occurs when two neighboring speech sounds merge to form a new and different segment. Example: “sandwich” becomes “sammich”.
Contact (or Contiguous) Assimilation – the adaptive articulatory change that results in neighboring sound segments becoming similar in their production. This is a type of motor simplification or economy of effort. Ex: Influence of /u/ causes /s/ to be lip-rounded in “Sue”. The last segment in “phone” becomes a bilabial /m/ in “phone booth”, which at a normal rate sounds like “phomebooth”. The /mp/ in “pumpkin” becomes a velar nasal because of the velar /k/: “punkin” /pʌŋkɪn/.
Spreading: a form of anticipatory coarticulation. In saying “see”, lip spreading occurs in anticipation of /i/. In contrast, in saying “sue”, lip rounding occurs with the initial consonant in anticipation of /u/. Compare by saying “see” and “sue”. Then have a partner say both words and watch their lips. Watch for anticipatory lip rounding and lip spreading. |
|
|
Term
|
Definition
Suprasegmentals
• Sound segments (speech sounds) relate to what we say/produce.
• Suprasegmental features relate to how we say something to convey meaning.
• Suprasegmental phenomena extend beyond a single segment in an utterance.
• Suprasegmentals include: Intonation, Stress, Rate, Juncture |
|
|
Term
Intonation
vocal pitch contour |
|
Definition
Vocal pitch contour reveals a lot of information.
• Intonation contours – where the rise and fall pattern of a sentence reflects emphasis, importance, and/or meaning
“You bought that.” versus “You bought that?” |
|
|
Term
|
Definition
Rate – how fast or slowly someone says something
Increased rate may result in vowel reduction.
Clear, precise speech is slower than conversational speech. |
|
|
Term
|
Definition
|
|
Term
|
Definition
Loudness: relative sound intensity of a syllable
Pitch: relative height of a syllable / perceptual correlate of frequency
Tone: changes in pitch that function linguistically at the word (morpheme) level
Intonation: refers to pitch when it functions linguistically at the semantic level
Intonation contours: where the rise and fall pattern of a sentence reflects emphasis, importance, and/or meaning
“You bought that.” versus “You bought that?”
Duration: relative length of a syllable |
|
|
Term
|
Definition
In 1968, linguist D. Stampe came up with a new way of evaluating developmental speech errors. This system looks for patterns that are followed like rules.
• Stampe called these errors phonological processes (patterns of sound change).
• MULTIPLE EXAMPLES must be identified in a speech sample in order to classify errors as patterns/phonological processes.
- A good clinician will always be logical. NO MATTER WHAT the child does, they will describe the error and patterns (not simply rely on this list). |
|
|
Term
|
Definition
Consonant clusters occur when two or three consonants occur sequentially in a word. For example: play, sprig
• Consonant Cluster Reduction: part of the cluster is omitted. hand “han”, street “treet”, tree “tee”
• Consonant Cluster Simplification: the child begins to produce a cluster and reduces it. As they start to add the additional element, they simplify it. tree “twee”
• Cluster Deletion: the entire cluster is deleted. paste “pa”
• Coalescence: in a cluster, the child does not delete an element, but combines features. swim “fim”
Varies by cluster type. Often suppressed by age 3;5 to 5;0. Some cases have shown instances of this process through age 8;0 or 9;0. |
|
|
Term
|
Definition
- Insertion of a vowel between consonants in a cluster.
• /bəlu/ for “blue”
- Typically suppressed between ages 2;6 and 8;0 |
|
|
Term
|
Definition
- Making a velar sound alveolar.
• t/k, d/g, n/ŋ (error/target)
- Typically suppressed by age 3;6 |
|
|
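Because a phonological process is a rule-like pattern, a substitution such as the one above can be sketched as a mapping applied to every phoneme in a word. This is a hypothetical illustration: `FRONTING`, `BACKING`, and `apply_process` are made-up names, and the reversed map corresponds to backing, where an alveolar sound is made velar.

```python
# target -> error substitutions for fronting (velar made alveolar)
FRONTING = {"k": "t", "g": "d", "ŋ": "n"}
# Backing is the reverse pattern (alveolar made velar).
BACKING = {error: target for target, error in FRONTING.items()}


def apply_process(phonemes, mapping):
    """Apply a substitution pattern to each phoneme; others pass through."""
    return [mapping.get(p, p) for p in phonemes]


print(apply_process(["k", "i"], FRONTING))   # ['t', 'i']
print(apply_process(["t", "i"], BACKING))    # ['k', 'i']
```

A real analysis works in the opposite direction: the clinician collects multiple error/target pairs from a speech sample and checks whether they fit one pattern.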
Term
Consonant Assimilation (Harmony) |
|
Definition
- The pronunciation of the whole word is influenced by a particular sound in that word.
- /gag/ for “dog”
- /keɪk/ for “take”
- Will vary with assimilation type. Should not persist beyond age 3;0. |
|
|
Term
|
Definition
- Deletion of a consonant at the end of a syllable or word
• /tɑ/ for “top”
- Often suppressed by age 2;2 |
|
|
Term
|
Definition
- Substitution of a glide for a liquid
- /waɪt/ for /raɪt/
- Only pertains to /r/ and /l/ that are changed, NOT to /ɚ/ and /ɝ/
- Often suppressed by age 5;0 to 7;0. |
|
|
Term
|
Definition
-Substitution of a stop for a fricative -/ɪd/ for /ɪz/ -As fricatives and affricates are acquired at different ages, stopping is not a unified phonological process. See next page for summary of suppression of stopping. |
|
|
Term
|
Definition
-Substitution of a vowel, typically /ʊ/ for syllabic [l] -/bɑɾʊ/ for “bottle” -Typically suppressed by age 4;7 |
|
|
Term
|
Definition
- Replication of a syllable. A two-syllable word with different syllables is turned into a repeated syllable.
- /wa wa/ for “water”, /kiki/ for “kitty”
- Common during the first-fifty-word stage. Typically suppressed by age 1;6 to 1;9 |
|
|
Term
|
Definition
- The loss of r-coloring in the r-colored central vowels /ɝ/ and /ɚ/
- /mʌðɚ/ becomes /mʌðə/
- Typically suppressed by age 4;0 |
|
|
Term
Context Sensitive Voicing |
|
Definition
-Voicing a consonant before a vowel -/doʊ/ for /toʊ/ or /gi/ for /ki/ |
|
|
Term
|
Definition
-A normally voiced sound is replaced by a voiceless sound. -/pit/ for “beet” |
|
|
Term
Unstressed/Weak Syllable Deletion |
|
Definition
-Deletion of an entire weak syllable -/nænə/ for “banana” -Often suppressed by age 2;0. May last until approximately 4;0 |
|
|
Term
|
Definition
-Deletion of a consonant at the end of a word or syllable -/pæ/ for “pat” |
|
|
Term
|
Definition
-Backing -Initial Consonant Deletion -Glottal Substitution
"BIG" |
|
|
Term
Initial Consonant Deletion |
|
Definition
-Deletion of a consonant at the beginning of a word or syllable -/æt/ for “pat” |
|
|
Term
|
Definition
-Substitution of a glottal stop (in some cases glottal fricative) for another consonant. -/aʔɪn/ for “Austin” |
|
|
Term
|
Definition
- An alveolar sound is made velar.
- k/t, g/d, ŋ/n (error sound/target sound) |
|
|
Term
|
Definition
Acoustics is a branch of physics devoted to the study of sound.
• Acoustic phonetics is a branch of phonetics that documents the transmission properties of speech sounds.
• /s/ and /z/ contain high-frequency components (from approx. 4,000 to 12,000 Hz), which give them their characteristic quality.
• Vowels contain more intense frequency areas, below 4,000 Hz. |
|
|
Term
|
Definition
Sound – the sensation produced in the hearing mechanism by vibrations from the disturbance of a medium by a source.
• Vibrations vary according to their frequency, intensity, and duration.
• The result of this disturbance is a propagation of air molecules.
- This alternating pattern of high- and low-pressure areas moving outward from a vibrating object forms a sound wave. |
|
|
Term
|
Definition
• Pure Tone (sine wave): acoustic energy at just one discrete frequency
• Frequency is measured in Hz (Hertz).
• Hz = # of cycles per second
• Amplitude (volume) = the maximum displacement of the wave above or below its resting position |
|
|
Term
|
Definition
- Pure tones (sine waves) do not typically occur in nature. Nearly all sounds heard in our natural environment involve energy at many frequencies.
• There are two pieces of equipment that produce a pure tone:
• pure tone generator
• audiometer
• Sine waves are periodic, meaning there is a predictable, duplicated, up-and-down pattern. |
|
|
Term
|
Definition
• Two pure tones are “in phase” with each other when cycles of vibration are occurring at exactly the same time.
• Beat – when two sound waves of different frequencies are heard simultaneously, the interference between the sounds (beat) is perceived as variations in volume (like a pulsing sound).
• Beat Frequency = difference between the two frequencies.
• Example: First Wave at 40 Hz, Second Wave at 50 Hz | Combined Wave (Beat Frequency) = 10 Hz
• Two pure tones (of the same frequency) are “out of phase” with each other when cycles peak and trough at opposite times. If the two tones have equal amplitude, the amplitude of the resulting wave is zero and you hear nothing. This is destructive interference (phase cancellation). |
|
|
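The beat-frequency arithmetic and the in-phase versus out-of-phase cases can be checked numerically. This is a sketch under assumptions: `beat_frequency` and `summed_tone` are hypothetical helpers, both tones have equal amplitude, and the 100 Hz tone and 8,000 samples-per-second rate are arbitrary illustrative choices.

```python
import math


def beat_frequency(f1_hz, f2_hz):
    """Beat frequency is the difference between the two tone frequencies."""
    return abs(f1_hz - f2_hz)


def summed_tone(freq_hz, phase_offset_rad, t):
    """Sum of two equal-amplitude pure tones of the same frequency at time t."""
    first = math.sin(2 * math.pi * freq_hz * t)
    second = math.sin(2 * math.pi * freq_hz * t + phase_offset_rad)
    return first + second


print(beat_frequency(40, 50))  # 10

# One 10 ms cycle of a 100 Hz tone, sampled at an assumed 8,000 samples/second.
times = [n / 8000 for n in range(80)]
in_phase_peak = max(abs(summed_tone(100, 0.0, t)) for t in times)
out_of_phase_peak = max(abs(summed_tone(100, math.pi, t)) for t in times)

# In phase the peaks add (about 2); half a cycle apart they cancel (about 0).
print(round(in_phase_peak, 3), round(out_of_phase_peak, 3))
```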
Term
Adding additional pure tones |
|
Definition
Adding additional pure tones creates energy at more than one frequency, creating a complex tone |
|
|
Term
|
Definition
- Frequency relates directly to pitch.
• Pitch is a sensation.
• Frequency is a fact of physics. It is measurable.
• Pitch is a psychological phenomenon.
• There is NOT a linear relationship! As frequency gets higher, it takes a larger change in frequency to effect a change in the sensation of pitch. |
|
|
Term
|
Definition
• dB (decibel) is a logarithmic unit that expresses sound intensity
• dB SPL – dB Sound Pressure Level – measure for sound in air
• sound pressure level meter – volume measurement
• dB HL – dB Hearing Level – used in audiograms as a measure of hearing loss |
|
|
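What "logarithmic" means for dB SPL can be shown with the standard formula, 20 times the base-10 logarithm of the measured pressure over a reference pressure. The card does not give the formula; the 20 micropascal reference is the conventional value for sound in air, and `db_spl` is a hypothetical helper name.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # 20 micropascals, the conventional dB SPL reference


def db_spl(pressure_pa):
    """Convert a sound pressure in pascals to dB SPL."""
    return 20 * math.log10(pressure_pa / REFERENCE_PRESSURE_PA)


# Each tenfold increase in pressure adds 20 dB.
print(round(db_spl(20e-6)))  # 0
print(round(db_spl(20e-5)))  # 20
print(round(db_spl(2e-2)))   # 60
```

This is why a 60 dB sound is not "three times" a 20 dB sound: its pressure is 100 times greater.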
Term
|
Definition
•Intensity relates directly to loudness. •Intensity is a measurable physical property of the acoustic signal. •Loudness is the subjective, psychological sensation of judged volume. |
|
|
Term
|
Definition
• 20 Hz to 20,000 Hz
• Below 20 Hz: subsonic
• Above 20,000 Hz: ultrasonic
• Speech: 100-5000 Hz
• Bats: 20 kHz-100 kHz (sonar)
• Noise Notch – hearing loss at around 4 kHz caused by constant noise exposure. |
|
|
Term
|
Definition
Sound generator: the larynx
• Fundamental frequency (F0) of the human voice
• Infant cooing: 380 Hz
• 9-year-old talking: 260 Hz
• Adult woman: 256 Hz
• Adult man: 128 Hz
• /s/ or /z/: 4000-8000 Hz |
|
|
Term
|
Definition
- Periodic (or quasi-periodic)
• Vowels
• When successive disturbances of air causing sound waves occur at regular intervals and are all the same shape, they produce a periodic wave. Vowels are quasi-periodic.
• Aperiodic
• Transient (stop consonant)
• Modifications made to air passing through the vocal tract produce transient/aperiodic sound waves. |
|
|
Term
|
Definition
- Spectrogram: speech made visible
• vertical striations represent VF opening and closing (glottal pulse)
• y-axis: frequency
• x-axis: time
• darkness: intensity
• gap: likely gap in speech
• stacks of energy: vowel
• The bottom line is the voice bar, a person’s fundamental frequency (F0) |
|
|
Term
|
Definition
•Periodic complex vibrations produce signals in which the component frequencies are multiples of the lowest frequency of pattern repetition, or fundamental frequency. •Plucking a guitar string at 100 Hz also generates harmonics at 200 Hz, 300 Hz, 400 Hz, etc |
|
|
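The guitar-string example can be written as a one-line rule: each component frequency is an integer multiple of the fundamental. The `harmonics` helper below is a hypothetical name used only for this sketch.

```python
def harmonics(f0_hz, count):
    """Return the first `count` components: the fundamental and its multiples."""
    return [f0_hz * n for n in range(1, count + 1)]


print(harmonics(100, 4))  # [100, 200, 300, 400]
print(harmonics(128, 3))  # [128, 256, 384]
```

The second call uses the 128 Hz adult male fundamental listed earlier in these notes.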
Term
|
Definition
- A resonator is something that is set into forced vibration by another vibration.
• Resonators do not initiate the sound energy.
• The oral cavity isn’t where a vowel sound originates; it has an impact on the sound but does not initiate it. |
|
|
Term
|
Definition
Resonances of the vocal tract
• The vocal tract acts as a filter for sound produced by the vocal folds.
• The vocal tract is a variable resonator.
• As its shape varies, the formants change in frequency.
• The frequency of a formant is measured at the center of the band of energy.
• Formants are seen on a spectrogram when vowels are produced.
• A vowel has a stack of bands of energy.
• The human ear only needs to hear F1 and F2 to differentiate among all vowels.
- F2: tongue carriage
- F1: vowel height
- F0: vocal folds |
|
|
Term
|
Definition
•Fundamental frequency •Determined by length and mass of vocal folds. •The bottom line of a spectrogram is the voice bar, a person’s fundamental frequency (F0) •Everyone has an optimal pitch that is just the right pitch for their vocal anatomy. •It should be their habitual pitch. •Not talking at this optimal pitch can cause fatigue, hoarseness, and vocal nodules. |
|
|
Term
|
Definition
•Related to tongue height •As the tongue moves from a high to low position, the pharyngeal cavity decreases in volume. Vowels with lower tongue positions will have smaller pharyngeal cavities that will resonate to higher frequencies. •High F1= low vowel •Low F1= high vowel |
|
|
Term
|
Definition
F2 •Correlates with the length of the oral cavity •Tongue retraction and lip rounding extend the length of the oral cavity and lower F2 •High F2= front vowel •Low F2= back vowel |
|
|
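The F1 and F2 rules of thumb can be combined into a rough vowel description. This is a sketch only: `describe_vowel` is a hypothetical helper, and the cutoff values in Hz are assumptions chosen to show the direction of each relationship, not measured category boundaries. The sample inputs are rough illustrative values.

```python
# Hypothetical cutoffs in Hz, for illustration only.
F1_HIGH_VOWEL_MAX = 400
F1_LOW_VOWEL_MIN = 600
F2_BACK_VOWEL_MAX = 1200
F2_FRONT_VOWEL_MIN = 1800


def describe_vowel(f1_hz, f2_hz):
    """Return (height, carriage) labels from the first two formants."""
    if f1_hz <= F1_HIGH_VOWEL_MAX:
        height = "high"        # low F1 = high vowel
    elif f1_hz >= F1_LOW_VOWEL_MIN:
        height = "low"         # high F1 = low vowel
    else:
        height = "mid"

    if f2_hz >= F2_FRONT_VOWEL_MIN:
        carriage = "front"     # high F2 = front vowel
    elif f2_hz <= F2_BACK_VOWEL_MAX:
        carriage = "back"      # low F2 = back vowel
    else:
        carriage = "central"
    return height, carriage


print(describe_vowel(300, 2300))  # ('high', 'front')
print(describe_vowel(700, 1100))  # ('low', 'back')
```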
Term
|
Definition
VOT – voice onset time
• The time that elapses between the release of a stop consonant and the onset of phonation; it differentiates voiced from voiceless sounds.
• Categorical Perception – we hear categories of sounds.
• Stop consonants are perceived categorically.
• We can’t hear the degree of change in VOT; we simply note the difference, for example in /ti/ versus /di/. |
|
|
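Categorical perception can be sketched as a threshold: many different VOT values collapse into just two perceived categories. The 25 ms boundary below is an assumption made for illustration and does not come from these notes; `perceived_stop` is a hypothetical helper.

```python
ASSUMED_BOUNDARY_MS = 25  # hypothetical category boundary, for illustration only


def perceived_stop(vot_ms):
    """Map a continuous VOT value onto one of two perceived categories."""
    return "voiceless" if vot_ms >= ASSUMED_BOUNDARY_MS else "voiced"


# VOT changes smoothly, but the perceived category changes only once.
for vot in (5, 15, 35, 60):
    print(vot, "ms ->", perceived_stop(vot))
```

Here 5 ms and 15 ms are both heard as voiced (as in /di/), while 35 ms and 60 ms are both heard as voiceless (as in /ti/), even though each pair differs in VOT.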
Term
|
Definition
•Semivowels •Glides /w/ /j/ •Liquids /r/ /l/ •Formant structures are like vowels and diphthongs •Nasal consonants •antiresonances |
|
|
Term
|
Definition
-There are two ways to get a computer to talk -Speech synthesis -Digitized speech |
|
|
Term
Speech Synthesis (TTS Text to Speech) |
|
Definition
- The task of the synthesizer is to take written text and turn it into an intelligible utterance.
- Formant synthesis by rule: spoken phonemes are articulated in a context of adjacent phonemes, e.g. “saw me” versus “saw off” (more nasality in the first selection).
- Siri is an example of synthesized speech. |
|
|
Term
|
Definition
Reading machines for the blind
OCR: optical character recognition
Users prefer a high speaking rate (up to 600 words per minute)
Formant synthesis is faster than concatenative synthesis |
|
|
Term
|
Definition
- Concatenative synthesis: computer assembly of speech from pieces of natural speech, but without coarticulation (concatenation – stringing segments together). Prerecorded speech sounds are connected to form new words and/or phrases.
- $146.95 balance in checking: “Your checking account balance is One Hundred Forty Six Dollars”
- What is the unit that should be stored?
Word: impractical because there are too many to store (500,000+); lack of coarticulation at word boundaries would result in unnatural connected speech
Syllable: impractical for the same reasons
Phoneme: large coarticulatory effects between adjacent phonemes
- Diphone
Acoustic piece of speech from the middle of one phoneme to the middle of the next phoneme
Contains acoustic transitions from one phoneme to the next
Requires a minimum of 1,000 diphones to synthesize unrestricted English text
Highly intelligible
Storing, selecting, smoothly concatenating snippets of speech |
|
|
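The diphone idea, a unit that runs from the middle of one phoneme to the middle of the next, can be sketched by pairing each phoneme with its successor. This is a hypothetical illustration: `to_diphones` is a made-up helper, and the `#` silence marker at the word edges is an assumption. A real synthesizer stores recorded audio for each pair, not strings.

```python
def to_diphones(phonemes):
    """Pair each phoneme with the next one, padding both edges with silence."""
    padded = ["#"] + list(phonemes) + ["#"]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


print(to_diphones(["k", "æ", "t"]))  # ['#-k', 'k-æ', 'æ-t', 't-#']
```

Each unit contains one phoneme-to-phoneme transition, which is how stored diphones preserve coarticulation between adjacent phonemes.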
Term
|
Definition
Coarticulation can extend over several phonemes; diphones can’t capture this. |
|
|
Term
|
Definition
Advantage: With a limited program and limited rules, a computer could produce an unlimited number of words.
Disadvantage: Speech does not sound natural. No suprasegmentals! |
|
|
Term
|
Definition
Digitized speech varies from synthesized speech in that the full text is prerecorded.
High-fidelity recording
Recording a human being, thus it is very natural
For example, your outgoing message on voicemail is your voice.
LIMITED to exactly what you record
You must record every variation necessary. |
|
|
Term
|
Definition
Speech recognition is more complex than speech synthesis.
Synthesis requires a relatively small amount of programming.
Recognition is more challenging because the computer has to be programmed to recognize spoken language and the MANY variables of pronunciation.
Problems:
‘unit’ ‘unintentional’
Complete the project; Project your voice
She gave her dog food.
Reach, react, create, breath |
|
|