... LLC
Sondhi, M.M. & Schroeter, J. “Speech Production Models and Their Digital Implementations,” Digital Signal Processing Handbook, Ed. Vijay K. Madisetti and Douglas B. Williams, Boca Raton: CRC Press ...

Nasal Coupling

Nasal sounds are produced by opening the velum and thereby coupling the nasal cavity to the vocal tract. In nasal consonants, the vocal tract itself is closed at some point between the velum and the lips, and all the airflow is diverted into the nostrils. In nasal vowels the vocal tract remains open. (Nasal vowels are common in French and several other languages. They are not nominally phonemes of English. However, some nasalization of vowels commonly occurs in English speech.)

In terms of chain matrices, the nasal coupling can be handled without too much additional effort. As far as its acoustical properties are concerned, the nasal cavity can be treated exactly like the vocal tract, with the added simplification that its shape may be regarded as fixed. The common assumption is that the nostrils are symmetric, in which case the cross-sectional areas of the two nostrils can be added and the nose replaced by a single, fixed, variable-area tube.

The description of the computations is easier to follow with the aid of the block diagram shown in Fig. 44.5. From a knowledge of the area functions and losses for the vocal and nasal tracts, three chain matrices $K_{gv}$, $K_{vt}$, and $K_{vn}$ are first computed. These represent, respectively, the matrices from glottis to velum, velum to tract closure (or velum to lips, in the case of a nasal vowel), and velum to nostrils. From $K_{vn}$, with some assumed impedance termination at the nostrils, the input impedance of the nostrils at the velum may be computed as indicated in Eq. (44.16b). Similarly, $K_{vt}$ gives the input impedance at the velum of the vocal tract looking toward the lips. At the velum, these two impedances are combined in parallel to give a total impedance, say $Z_v$. With this as termination, the velocity-to-velocity transfer function, $T_{gv}$, from glottis to velum can be computed from $K_{gv}$ as shown in Eq. (44.16b). For a given volume velocity at the glottis, $U_g$, the volume velocity at the velum is $U_v = T_{gv} U_g$, and the pressure at the velum is $P_v = Z_v U_v$. Once $P_v$ and $U_v$ are known, the volume velocity and/or pressure at the nostrils and lips can be computed by inverting the matrices $K_{vn}$ and $K_{vt}$.

© 1999 by CRC Press LLC

FIGURE 44.5: Chain matrices for synthesizing nasal sounds.

44.4 Sources of Excitation

As mentioned earlier, speech sounds may be classified by type of excitation: periodic, turbulent, or transient. All of these types of excitation are created by converting the potential energy stored in the lungs due to excess pressure into sound energy in the audible frequency range of 20 Hz to 20 kHz. The lungs of a young adult male may have a maximum usable volume (“vital capacity”) of about 5 L. While reading aloud, the pressure in the lungs is typically in the range of 6 to 15 cm of water (roughly 600 to 1500 Pa, since 1 cm of water is about 98 Pa). Vocal cord vibrations can be sustained with a pressure as low as 0.2 cm of water. At the other extreme, a pressure as high as 195 cm of water has been recorded for a trumpet player. Typical average airflow for normal speech is about 0.1 L/s. It may peak as high as 5 L/s during rapid inhales in singing.

Periodic excitation originates mainly at the vibrating vocal folds, turbulent excitation originates primarily downstream of the narrowest constriction in the vocal tract, and transient excitations occur whenever a complete closure of the vocal pathway is suddenly released. In the following, we will explore these three types of excitation in some detail. The interested reader is referred to [18] for more information.

44.4.1 ...
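As a rough illustration of the nasal-coupling computation described under Nasal Coupling (Fig. 44.5), the following Python sketch combines the oral and nasal branches in parallel at the velum and propagates a unit glottal volume velocity to the nostrils. It is a sketch under stated assumptions, not the chapter's implementation. It uses lossless uniform tube sections (the chapter requires a loss model), the convention [P_out, U_out] = K [P_in, U_in] (which may differ from the convention behind Eqs. (44.16a) and (44.16b)), an ideal zero-impedance termination at the nostrils, and illustrative areas and lengths. All function names and constants are hypothetical.

```python
import math

RHO = 1.14e-3  # air density in g/cm^3 (assumed value, cgs units)
C = 3.5e4      # speed of sound in cm/s (assumed value)


def tube_section(area, length, freq):
    """Chain matrix of one LOSSLESS uniform tube section (a simplification).

    Assumed convention: [P_out, U_out]^T = K [P_in, U_in]^T.
    """
    kl = 2.0 * math.pi * freq * length / C
    z0 = RHO * C / area  # characteristic impedance of the section
    return ((complex(math.cos(kl)), -1j * z0 * math.sin(kl)),
            (-1j * math.sin(kl) / z0, complex(math.cos(kl))))


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def cascade(sections, freq):
    """Chain matrix of (area, length) sections listed from input to output."""
    k = ((1.0, 0.0), (0.0, 1.0))
    for area, length in sections:
        k = matmul(tube_section(area, length, freq), k)
    return k


def input_impedance(k, z_load):
    """P_in / U_in for a load z_load; None means a closed end (U_out = 0)."""
    (k11, k12), (k21, k22) = k
    if z_load is None:
        return -k22 / k21
    return (k22 * z_load - k12) / (k11 - k21 * z_load)


def velocity_transfer(k, z_load):
    """U_out / U_in for a given load impedance."""
    (_, _), (k21, k22) = k
    return k21 * input_impedance(k, z_load) + k22


def nasal_coupling(k_gv, k_vt, k_vn, u_g, z_lips=None, z_nostrils=0.0):
    """Return (Z_v, U_v, P_v, U_nostrils) for glottal volume velocity u_g."""
    z_oral = input_impedance(k_vt, z_lips)      # oral branch seen from the velum
    z_nose = input_impedance(k_vn, z_nostrils)  # nasal branch seen from the velum
    z_v = z_oral * z_nose / (z_oral + z_nose)   # parallel combination
    u_v = velocity_transfer(k_gv, z_v) * u_g    # U_v = T_gv * U_g
    p_v = z_v * u_v                             # P_v = Z_v * U_v
    u_nose_in = p_v / z_nose                    # share of U_v entering the nose
    u_nostrils = k_vn[1][0] * p_v + k_vn[1][1] * u_nose_in
    return z_v, u_v, p_v, u_nostrils


def nasal_consonant_output(freq):
    """|U_nostrils| for unit U_g, using purely illustrative dimensions."""
    k_gv = cascade([(3.0, 10.0)], freq)  # glottis to velum: 3 cm^2, 10 cm
    k_vt = cascade([(3.0, 7.0)], freq)   # velum to oral closure: 3 cm^2, 7 cm
    k_vn = cascade([(2.0, 12.0)], freq)  # velum to nostrils: 2 cm^2, 12 cm
    return abs(nasal_coupling(k_gv, k_vt, k_vn, u_g=1.0)[3])
```

Calling `nasal_consonant_output(freq)` over a grid of positive frequencies gives the magnitude response of this toy nasal consonant. With the illustrative 7 cm closed oral branch, the lossless model places a zero near c/(4 × 7 cm) = 1250 Hz, where the closed branch short-circuits the velum. With a loss model the dip would have finite depth.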
Synthesis

The vocal tract is approximated by a concatenation of about 20 uniform sections. The cross-sectional areas of these sections are either specified directly or computed from a specification of articulatory parameters as shown in Fig. 44.3. The chain matrix for each section is computed at an adequate sampling rate in the frequency domain to avoid time-aliasing of the corresponding time functions. (Computation of the chain matrices also requires a specification of the losses. Several models exist which assign the losses in terms of the cross-sectional area [11, 16].) The chain matrices for the individual sections are combined to derive the matrices for various portions of the tract, as appropriate for the particular speech sound being synthesized.

For voiced sounds, the matrices for the sections from the glottis to the lips are sequentially multiplied to give the matrix from the glottis to the lips. From the $k_{11}$, $k_{12}$, $k_{21}$, $k_{22}$ components of this matrix, the transfer function $U_{out}/U_{in}$ and the input impedance are obtained as in Eqs. (44.16a) and (44.16b). Knowing the radiation impedance $Z_R$ at the lips, we can compute the transfer function for output pressure, $H = (U_{out}/U_{in})\,Z_R$. The inverse FFTs of the transfer function $H$ and the input impedance $Z_{in}$ give the corresponding time functions $h(n)$ and $z_{in}(n)$, respectively. These functions are computed every 20 ms, and the intermediate values are obtained by linear interpolation.

For the current time sampling instant $n$, the current pressure $p_1(n)$ at the input to the vocal tract is then computed by convolving $z_{in}$ with the past values of the glottal volume velocity $u_g$. With $p_1$ known, the pressure difference $P_s - p_1$ on the left-hand side of Eq. (44.22) is known. Equation (44.18) is discretized by using a backward difference for the time derivative. Thus, a new value of the glottal volume velocity is derived. This, together with the current values of the displacements of the vocal folds, gives us new values for the driving forces $F_1$ and $F_2$ for the coupled oscillator Eqs. (44.24a) and (44.24b). The coupled oscillator equations are also discretized by backward differences for time derivatives. Thus, the new values of the driving forces give new values for the displacements of the vocal folds. The new value of volume velocity also gives a new value for $p_1$, and the computational cycle repeats, to give successive samples of $p_1$, $u_g$, and the vocal fold displacements. The glottal volume velocity obtained in this way is convolved with the impulse response $h(n)$ to produce voiced speech.

If the speech sound calls for frication, the chain matrix of the tract is derived as the product of two matrices: one from the glottis to the narrowest constriction and one from the constriction to the lips, as discussed in the section on turbulent excitation. This enables us to compute the volume velocity at the constriction, and thus introduce a noise source on the basis of the Reynolds number.

Finally, to produce nasal sounds, the chain matrix for the nasal tract is also computed, and the output at the nostrils is computed as discussed in the section on chain matrices. If the lips are open, the output from the lips is also computed and added to the output from the nostrils to give the total speech signal. Details of the synthesis procedure may be found in [24].

References

[1] Edwards, H.T., Applied Phonetics: The Sounds of American English, Singular Publishing Group, San Diego, 1992, Chap. 3.
[2] Olive, J.P., Greenwood, A., and Coleman, J., Acoustics of American English Speech, Springer Verlag, New York, 1993.
[3] Fant, G., Acoustic Theory of Speech Production, Mouton Book Co., Gravenhage, 1960, Chap. 2.1, 93-95.
[4] Baer, T., Gore, J.C., Gracco, L.C., and Nye, P.W., Analysis of vocal tract shape and dimensions using magnetic resonance imaging: Vowels, J. Acoust. Soc. Am., 90(2), 799-828, Aug 1991.