Uploaded: 22/01/2014, 12:20
... LLC
Sondhi, M.M. & Schroeter, J. “Speech Production Models and Their Digital Implementations”
Digital Signal Processing Handbook
Ed. Vijay K. Madisetti and Douglas B. Williams
Boca Raton: CRC Press ...
Nasal Coupling
Nasal sounds are produced by opening the velum and thereby coupling the nasal cavity to the vocal tract. In nasal consonants, the vocal tract itself is closed at some point between the velum and the lips, and all the airflow is diverted into the nostrils. In nasal vowels the vocal tract remains open. (Nasal vowels are common in French and several other languages. They are not nominally phonemes of English. However, some nasalization of vowels commonly occurs in English speech.)
In terms of chain matrices, the nasal coupling can be handled without too much additional effort. As far as its acoustical properties are concerned, the nasal cavity can be treated exactly like the vocal tract, with the added simplification that its shape may be regarded as fixed. The common assumption is that the nostrils are symmetric, in which case the cross-sectional areas of the two nostrils can be added and the nose replaced by a single tube of fixed shape whose area varies along its length.
The description of the computations is easier to follow with the aid of the block diagram shown in Fig. 44.5. From a knowledge of the area functions and losses for the vocal and nasal tracts, three chain matrices $K_{gv}$, $K_{vt}$, and $K_{vn}$ are first computed. These represent, respectively, the matrices from glottis to velum, velum to tract closure (or velum to lips, in the case of a nasal vowel), and velum to nostrils.
FIGURE 44.5: Chain matrices for synthesizing nasal sounds.
© 1999 by CRC Press LLC
From $K_{vn}$, with some assumed impedance termination at the nostrils, the input impedance of the nostrils at the velum may be computed as indicated in Eq. (44.16b). Similarly, $K_{vt}$ gives the input impedance at the velum of the vocal tract looking toward the lips. At the velum, these two impedances are combined in parallel to give a total impedance, say $Z_v$. With this as termination, the velocity-to-velocity transfer function, $T_{gv}$, from glottis to velum can be computed from $K_{gv}$ as shown in Eq. (44.16a). For a given volume velocity at the glottis, $U_g$, the volume velocity at the velum is $U_v = T_{gv} U_g$, and the pressure at the velum is $P_v = Z_v U_v$. Once $P_v$ and $U_v$ are known, the volume velocity and/or pressure at the nostrils and lips can be computed by inverting the matrices $K_{vn}$ and $K_{vt}$.
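The nasal-coupling computation can be sketched at a single frequency as follows. This is a minimal illustration and not the chapter's implementation. Since Eqs. (44.16a) and (44.16b) are not reproduced here, the sketch assumes the common two-port convention in which the chain matrix maps pressure and volume velocity at the output of a tube to those at its input; the chapter's own convention may differ. Each branch is reduced to one lossless uniform tube, and all areas, lengths, terminations, and helper names (`tube_matrix`, `input_impedance`, `velocity_transfer`, `output_from_input`) are hypothetical.

```python
import math

RHO = 1.14e-3  # air density in g/cm^3 (assumed value)
C = 3.5e4      # speed of sound in cm/s (assumed value)

def tube_matrix(area, length, freq):
    """Chain matrix of a lossless uniform tube: (P, U) at input = K * (P, U) at output."""
    kl = 2.0 * math.pi * freq * length / C
    z0 = RHO * C / area  # characteristic impedance of the tube
    return ((complex(math.cos(kl)), 1j * z0 * math.sin(kl)),
            (1j * math.sin(kl) / z0, complex(math.cos(kl))))

def input_impedance(K, z_load):
    """Impedance seen at the tube input when its output is loaded by z_load."""
    (k11, k12), (k21, k22) = K
    return (k11 * z_load + k12) / (k21 * z_load + k22)

def velocity_transfer(K, z_load):
    """Ratio U_out / U_in for a tube loaded by z_load (uses det K = 1)."""
    (k11, k12), (k21, k22) = K
    return 1.0 / (k21 * z_load + k22)

def output_from_input(K, p_in, u_in):
    """Invert K to obtain (P, U) at the far end from (P, U) at the near end."""
    (k11, k12), (k21, k22) = K
    det = k11 * k22 - k12 * k21
    return ((k22 * p_in - k12 * u_in) / det, (-k21 * p_in + k11 * u_in) / det)

freq = 500.0                          # analysis frequency in Hz (illustrative)
K_gv = tube_matrix(3.0, 8.0, freq)    # glottis to velum
K_vt = tube_matrix(4.0, 9.0, freq)    # velum to lips (nasal vowel case)
K_vn = tube_matrix(2.0, 11.0, freq)   # velum to nostrils
Z_lips = 5.0 + 2.0j                   # assumed termination at the lips
Z_nostrils = 8.0 + 3.0j               # assumed termination at the nostrils

Z_n = input_impedance(K_vn, Z_nostrils)  # nasal branch seen from the velum
Z_t = input_impedance(K_vt, Z_lips)      # oral branch seen from the velum
Z_v = Z_n * Z_t / (Z_n + Z_t)            # parallel combination at the velum

U_g = 1.0                                # glottal volume velocity (arbitrary)
U_v = velocity_transfer(K_gv, Z_v) * U_g
P_v = Z_v * U_v

# The two branches share the pressure P_v and split the flow U_v between them.
U_nose_in = P_v / Z_n
U_tract_in = P_v / Z_t
P_nostrils, U_nostrils = output_from_input(K_vn, P_v, U_nose_in)
P_lips, U_lips = output_from_input(K_vt, P_v, U_tract_in)
```

For a nasal consonant the oral branch would instead be terminated by a closure (a very large load impedance), so that essentially all of the flow at the velum enters the nasal branch.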
44.4 Sources of Excitation
As mentioned earlier, speech sounds may be classified by type of excitation: periodic, turbulent, or transient. All of these types of excitation are created by converting the potential energy stored in the lungs due to excess pressure into sound energy in the audible frequency range of 20 Hz to 20 kHz. The lungs of a young adult male may have a maximum usable volume (“vital capacity”) of about 5 liters. While reading aloud, the pressure in the lungs is typically in the range of 6 to 15 cm of water (about 600 to 1500 Pa). Vocal cord vibrations can be sustained with a pressure as low as 0.2 cm of water. At the other extreme, a pressure as high as 195 cm of water has been recorded for a trumpet player. Typical average airflow for normal speech is about 0.1 l/s. It may peak as high as 5 l/s during rapid inhales in singing.
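As a quick check on the pressure units, the short sketch below converts centimeters of water to pascals using the conventional factor 1 cm of water = 98.0665 Pa. The function name `cm_h2o_to_pa` is chosen here for illustration only.

```python
CM_H2O_TO_PA = 98.0665  # pascals per centimeter of water (conventional value)

def cm_h2o_to_pa(cm):
    """Convert a pressure in centimeters of water to pascals."""
    return cm * CM_H2O_TO_PA

# Pressures quoted in the text, in cm of water.
for cm in (0.2, 6, 15, 195):
    print(f"{cm:>6} cm of water = {cm_h2o_to_pa(cm):8.1f} Pa")
```

With this factor, the reading-aloud range of 6 to 15 cm of water comes to roughly 590 to 1470 Pa.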
Periodic excitation originates mainly at the vibrating vocal folds, turbulent excitation originates primarily downstream of the narrowest constriction in the vocal tract, and transient excitations occur whenever a complete closure of the vocal pathway is suddenly released. In the following, we will explore these three types of excitation in some detail. The interested reader is referred to [18] for more information.
44.4.1 ... Synthesis
The vocal tract is approximated by a concatenation of about 20 uniform sections. The cross-sectional areas of these sections are either specified directly or computed from a specification of articulatory parameters as shown in Fig. 44.3. The chain matrix for each section is computed at an adequate sampling rate in the frequency domain to avoid time-aliasing of the corresponding time functions. (Computation of the chain matrices also requires a specification of the losses. Several models exist which assign the losses in terms of the cross-sectional area [11, 16].)
The chain matrices for the individual sections are combined to derive the matrices for various portions of the tract, as appropriate for the particular speech sound being synthesized. For voiced sounds, the matrices for the sections from the glottis to the lips are sequentially multiplied to give the matrix from the glottis to the lips. From the $k_{11}$, $k_{12}$, $k_{21}$, $k_{22}$ components of this matrix, the transfer function $U_{out}/U_{in}$ and the input impedance are obtained as in Eqs. (44.16a) and (44.16b).
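The sequential multiplication of section matrices can be sketched as below, at one frequency and without losses. This is an illustration under stated assumptions, not the chapter's code: the two-port convention (input quantities equal the chain matrix times output quantities) is assumed because Eqs. (44.16a) and (44.16b) are not shown here, and the area values, the load `Z_R`, and the helper names are hypothetical.

```python
import math

RHO = 1.14e-3  # air density in g/cm^3 (assumed value)
C = 3.5e4      # speed of sound in cm/s (assumed value)

def tube_matrix(area, length, freq):
    """Chain matrix of one lossless uniform section (input = K * output)."""
    kl = 2.0 * math.pi * freq * length / C
    z0 = RHO * C / area
    return ((complex(math.cos(kl)), 1j * z0 * math.sin(kl)),
            (1j * math.sin(kl) / z0, complex(math.cos(kl))))

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return ((A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
            (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]))

def tract_matrix(areas, section_length, freq):
    """Multiply the section matrices in order from the glottis to the lips."""
    K = ((1 + 0j, 0j), (0j, 1 + 0j))
    for area in areas:
        K = matmul(K, tube_matrix(area, section_length, freq))
    return K

def transfer_and_impedance(K, z_load):
    """Return (U_out / U_in, Z_in) for a tract loaded by z_load."""
    (k11, k12), (k21, k22) = K
    den = k21 * z_load + k22
    return 1.0 / den, (k11 * z_load + k12) / den

# Twenty illustrative section areas in cm^2, glottis first.
areas = [0.6, 0.8, 1.2, 1.8, 2.5, 3.2, 3.8, 4.2, 4.5, 4.6,
         4.4, 4.0, 3.4, 2.8, 2.2, 1.8, 1.6, 1.8, 2.4, 3.0]
K = tract_matrix(areas, 17.0 / 20, 500.0)
Z_R = 2.0 + 1.0j                          # assumed radiation load at the lips
T, Z_in = transfer_and_impedance(K, Z_R)
H = T * Z_R                               # output pressure per unit input flow
```

A useful sanity check on such a routine is that 20 equal sections reproduce the matrix of a single uniform tube of the same total length.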
Knowing the radiation impedance $Z_R$ at the lips, we can compute the transfer function for output pressure, $H = \frac{U_{out}}{U_{in}} Z_R$. The inverse FFT of the transfer function $H$ and the input impedance $Z_{in}$ give the corresponding time functions $h(n)$ and $z_{in}(n)$, respectively. These functions are computed every 20 ms, and the intermediate values are obtained by linear interpolation.
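The step from sampled frequency responses to time functions, followed by interpolation between frames, might look like the sketch below. For self-containment it uses a direct inverse DFT of order N squared instead of an FFT; the result is the same, only slower. The test spectra are pure delays chosen so the answer is easy to verify, and the names `inverse_real_dft` and `interpolate_frames` are hypothetical.

```python
import cmath
import math

def inverse_real_dft(half_spectrum):
    """Real time function from bins 0..N/2 of a conjugate-symmetric spectrum."""
    half = len(half_spectrum) - 1
    n_points = 2 * half
    h = []
    for n in range(n_points):
        # The DC and Nyquist bins are real for a real time function.
        acc = half_spectrum[0].real + ((-1) ** n) * half_spectrum[half].real
        for k in range(1, half):
            # Each remaining bin and its mirror image add twice the real part.
            acc += 2.0 * (half_spectrum[k] * cmath.exp(2j * math.pi * k * n / n_points)).real
        h.append(acc / n_points)
    return h

def interpolate_frames(h_start, h_end, fraction):
    """Linear interpolation between two frames; fraction runs from 0 to 1."""
    return [(1.0 - fraction) * a + fraction * b for a, b in zip(h_start, h_end)]

N = 16
# Two illustrative "frames": pure delays of 3 and 5 samples.
frame_a = [cmath.exp(-2j * math.pi * k * 3 / N) for k in range(N // 2 + 1)]
frame_b = [cmath.exp(-2j * math.pi * k * 5 / N) for k in range(N // 2 + 1)]
h_a = inverse_real_dft(frame_a)
h_b = inverse_real_dft(frame_b)

# A quarter of the way between the two frames (5 ms into a 20 ms interval).
h_quarter = interpolate_frames(h_a, h_b, 0.25)
```

In a synthesizer the frequency grid must be dense enough that h(n) has decayed within N samples; otherwise the tail wraps around, which is the time-aliasing the text warns about.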
For the current time sampling instant $n$, the current pressure $p_1(n)$ at the input to the vocal tract is then computed by convolving $z_{in}$ with the past values of the glottal volume velocity $u_g$. With $p_1$ known, the pressure difference $P_s - p_1$ on the left-hand side of Eq. (44.22) is known. Equation (44.18) is discretized by using a backward difference for the time derivative. Thus, a new value of the glottal volume velocity is derived. This, together with the current values of the displacements of the vocal folds, gives us new values for the driving forces $F_1$ and $F_2$ for the coupled oscillator Eqs. (44.24a) and (44.24b). The coupled oscillator equations are also discretized by backward differences for time derivatives. Thus, the new values of the driving forces give new values for the displacements of the vocal folds. The new value of volume velocity also gives a new value for $p_1$, and the computational cycle repeats, to give successive samples of $p_1$, $u_g$, and the vocal fold displacements.
The glottal volume velocity obtained in this way is convolved with the impulse response $h(n)$ to produce voiced speech.
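Two building blocks of this cycle can be sketched in isolation. The glottal flow equation and the coupled oscillator equations themselves are not reproduced in this excerpt, so the sketch below substitutes a generic single damped oscillator, m x'' + r x' + k x = F, purely to show how backward differences turn a differential equation into a per-sample update. The other function shows the convolution over past samples. All names and parameter values are hypothetical.

```python
def pressure_from_past_flow(z_in, u_g, n):
    """p1(n) = sum over k of z_in[k] * u_g[n - k], using samples up to time n only."""
    return sum(z_in[k] * u_g[n - k] for k in range(min(len(z_in), n + 1)))

def oscillator_step(x_prev, x_prev2, force, mass, damping, stiffness, dt):
    """One backward-difference step of mass*x'' + damping*x' + stiffness*x = force.

    x'' is replaced by (x_n - 2*x_prev + x_prev2) / dt**2 and
    x'  is replaced by (x_n - x_prev) / dt; the result is solved for x_n.
    """
    a = mass / dt ** 2
    b = damping / dt
    return (force + a * (2.0 * x_prev - x_prev2) + b * x_prev) / (a + b + stiffness)

# Convolution example with short illustrative sequences.
p1 = pressure_from_past_flow([1.0, 0.5, 0.25], [2.0, 0.0, 4.0], 2)

# Drive the stand-in oscillator with a constant force until it settles.
x_prev = x_prev2 = 0.0
for _ in range(5000):
    x_new = oscillator_step(x_prev, x_prev2, force=50.0, mass=1.0,
                            damping=20.0, stiffness=100.0, dt=1e-3)
    x_prev2, x_prev = x_prev, x_new
x_final = x_prev
```

Under a constant force the update settles at force divided by stiffness, which gives a simple check that the discretization has the right steady state. In the synthesizer described above, the force would itself be recomputed at every sample from the new volume velocity and fold displacements.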
If the speech sound calls for frication, the chain matrix of the tract is derived as the product of two matrices: one from the glottis to the narrowest constriction and one from the constriction to the lips, as discussed in the section on turbulent excitation. This enables us to compute the volume velocity at the constriction, and thus introduce a noise source on the basis of the Reynolds number.
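A rough sketch of how a Reynolds number could gate a noise source is given below. The section on turbulent excitation is not part of this excerpt, so everything here is an assumption made for illustration: the constriction is treated as circular, the threshold `RE_CRIT` is a placeholder value, and the rule that the noise level grows with the excess of the squared Reynolds number over the squared threshold is one plausible choice and not necessarily the chapter's.

```python
import math

NU_AIR = 0.15     # kinematic viscosity of air in cm^2/s (approximate)
RE_CRIT = 2000.0  # placeholder critical Reynolds number (assumed)

def reynolds_number(volume_velocity, area):
    """Reynolds number for flow through a constriction assumed to be circular."""
    velocity = abs(volume_velocity) / area       # mean particle velocity, cm/s
    diameter = math.sqrt(4.0 * area / math.pi)   # equivalent diameter, cm
    return velocity * diameter / NU_AIR

def noise_gain(re, re_crit=RE_CRIT):
    """Unscaled noise level: zero below the threshold, growing above it."""
    return max(0.0, re * re - re_crit * re_crit)

# Wide opening with low flow: laminar, so no noise is injected.
quiet = noise_gain(reynolds_number(10.0, 1.0))

# Narrow constriction with higher flow: turbulent, so noise is injected.
re_fricative = reynolds_number(200.0, 0.1)
noisy = noise_gain(re_fricative)
```

The returned gain is in arbitrary units; a synthesizer would scale it and use it to set the amplitude of a random source placed just downstream of the constriction.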
Finally, to produce nasal sounds, the chain matrix for the nasal tract is also computed, and the output at the nostrils is computed as discussed in the section on chain matrices. If the lips are open, the output from the lips is also computed and added to the output from the nostrils to give the total speech signal. Details of the synthesis procedure may be found in [24].
References
[1] Edwards, H.T., Applied Phonetics: The Sounds of American English, Singular Publishing Group, San Diego, 1992, Chap. 3.
[2] Olive, J.P., Greenwood, A., and Coleman, J., Acoustics of American English Speech, Springer Verlag, New York, 1993.
[3] Fant, G., Acoustic Theory of Speech Production, Mouton Book Co., Gravenhage, 1960, Chap. 2.1, 93-95.
[4] Baer, T., Gore, J.C., Gracco, L.C., and Nye, P.W., Analysis of vocal tract shape and dimensions using magnetic resonance imaging: Vowels, J. Acoust. Soc. Am., 90(2), 799-828, Aug 1991.