|
PATTERNS — Similarities between spectral voice patterns
Basic variables (such as pitch, loudness, energy, dynamics, duration
of pauses and utterances) are used to describe speech recordings
in terms of scalar parameters.
In contrast, the multivariate spectral pattern approach to speech
analysis has its main focus on the spectral composition of a
speech recording together with the individual variability of each
spectral component. Spectral patterns enable the application of
a variety of algorithms that have proved powerful in the field of
Pattern Recognition. This program constructs spectral patterns in
such a way that the total information included in the speech samples
can be decomposed into (1) a static component that represents the
individual, genetically determined characteristics of the speaker's
voice and (2) a dynamic component which reflects reactive changes
due, for example, to the current situation of the speaker (short-term
fluctuations) or to the speaker's global affective state (long-term
fluctuations). Spectral patterns are derived from 4-8 consecutive
4-second epochs out of a total recording time of 30-60 seconds.
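The decomposition itself is not spelled out in this section. As a rough
illustration only, the sketch below takes the mean over epochs as a stand-in
for the stable part of a pattern and the per-epoch deviations from that mean
as the fluctuating part. The function name, the input layout (one list of
intensities per epoch) and the use of the standard deviation as a measure of
variability are assumptions made for this example, not the program's method.

```python
import math
from statistics import mean, pstdev


def epoch_pattern(epoch_spectra, log_transform=False):
    """Summarise per-epoch spectra component by component.

    epoch_spectra: list of 4-8 epochs, each a list of positive spectral
    intensities with one value per spectral component (assumed layout).
    Returns (means, spreads, deviations).
    """
    if not 4 <= len(epoch_spectra) <= 8:
        raise ValueError("expected 4-8 consecutive epochs")
    n_components = len(epoch_spectra[0])
    if any(len(epoch) != n_components for epoch in epoch_spectra):
        raise ValueError("all epochs must have the same number of components")

    # Optional logarithmic transformation (in the spirit of TLOG=1).
    if log_transform:
        epoch_spectra = [[math.log10(v) for v in epoch] for epoch in epoch_spectra]

    # Transpose so that each tuple holds one component across all epochs.
    columns = list(zip(*epoch_spectra))
    means = [mean(col) for col in columns]      # stable part (assumption)
    spreads = [pstdev(col) for col in columns]  # variability per component
    # Fluctuating part: what each epoch adds to or removes from the mean.
    deviations = [[v - m for v, m in zip(epoch, means)] for epoch in epoch_spectra]
    return means, spreads, deviations
```

With four epochs of two components each, for instance
`[[1.0, 2.0], [3.0, 2.0], [1.0, 2.0], [3.0, 2.0]]`, the first component
varies from epoch to epoch while the second stays constant, so only the
first component receives a non-zero spread.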
Specification list: PATTERNS
------------------------------
I4 FRST 1 Default-value
I4 NSPK 15 Default-value
I4 PROT 0 Default-value
I4 PLOT 0 Default-value
I4 PMAX 0 Default-value
I4 LPRT 6 Default-value
I4 SAVE 0 Default-value
I4 TLOG 0 Default-value
01 FRST Specifies first spectrum to be included
02 NSPK Specifies number of spectra to be used
03 PROT Controls output to display/printer
04 PLOT Controls graphic output
05 PMAX Specifies maximum number of plot pages
06 LPRT Logical unit number of plot-device
07 SAVE Saves newly constructed spectral pattern in databank
08 TLOG Logarithmic transformation of spectral lines
09 DEMO Examples that illustrate program function
- FRST = p: Specifies first spectrum to be included
- NSPK = q: Specifies number of spectra to be used
- PROT = 0: No print output
       = 1: Basic characteristics of spectral voice patterns
       = 2: Details on optimization
- PLOT = 0: No plot output
       = 1: Spectral voice patterns
       = 2: Similarity between spectral voice patterns
- PMAX = q: Maximum number of plot pages
       = 0: Unlimited number
- LPRT = q: Logical unit number of plot-device (standard=6;
            valid numbers are 46-96)
- SAVE = 0: No effect
       = 1: Newly constructed spectral pattern is to be saved
       = 2: Existing patterns will be replaced
- TLOG = 0: No effect
       = 1: Logarithmic transformation of spectral lines
- DEMO: Voice sound characteristics and spectral patterns
Example
&&START CSELECT=Normative speech study zurich: males (study 600)
STUD=600,ATAG=1,STAT=2,SEX=1,AMIN=20,AMAX=30,PROT=1,RSET=1
POPU=MALES
?
&&START CSELECT=Normative speech study zurich: females (study 600)
SEX=2,RSET=0
POPU=FEMALES
&&START PATTERNS=Normative speech study zurich (study 600)
PROT=1,PLOT=2,LPRT=82,SAVE=0,FRST=3,NSPK=15
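
To show how the parameter line of the PATTERNS step above combines with the
default values of the specification list, the following sketch merges such a
line with the defaults and lists the spectra that FRST and NSPK would select.
This is not part of the program: the helper names are hypothetical, and the
reading of FRST as a 1-based index with NSPK consecutive spectra following it
is an assumption.

```python
# Default values as given in the specification list for PATTERNS.
DEFAULTS = {
    "FRST": 1, "NSPK": 15, "PROT": 0, "PLOT": 0,
    "PMAX": 0, "LPRT": 6, "SAVE": 0, "TLOG": 0,
}


def parse_spec(line, defaults=DEFAULTS):
    """Merge a comma-separated KEY=value line with the defaults."""
    settings = dict(defaults)
    for item in line.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        key = key.strip().upper()
        if not sep or key not in settings:
            raise ValueError(f"unknown or malformed specification: {item!r}")
        settings[key] = int(value)
    return settings


def selected_spectra(settings):
    """Spectrum numbers used, assuming 1-based, consecutive numbering."""
    first = settings["FRST"]
    return list(range(first, first + settings["NSPK"]))


settings = parse_spec("PROT=1,PLOT=2,LPRT=82,SAVE=0,FRST=3,NSPK=15")
spectra = selected_spectra(settings)
```

Under these assumptions the example line leaves PMAX and TLOG at their
defaults of 0 and selects spectra 3 through 17.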
|
|
Fig. 24: Voice sound characteristics ("timbre") of a female
speaker as quantified through spectral analyses. Spectral
intensities are plotted along the y-axis on log-proportional
scales and as a function of frequency (x-axis: 7 octaves covering
the frequency range of 64-8192Hz). The shaded area denotes the
variability around mean spectral intensities ("characteristic
variability"). It is this frequency-dependent variability that
makes voices easily distinguishable from each other. The maxima
represent "overtones" at fixed, physically well-defined intervals
over the fundamental frequency F0 (please compare with the
spectrum of the male speaker in Fig. 22).
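
The x-axis described in the caption can be reproduced with a few lines: the
range of 64-8192Hz spans exactly 7 octaves, and overtones lie at integer
multiples of F0. The sketch below places such overtones on an octave axis.
The function names and the F0 of 256Hz are chosen for illustration only and
are not taken from the figure.

```python
import math

F_LOW = 64.0     # lower edge of the plotted range in Hz
F_HIGH = 8192.0  # upper edge of the plotted range in Hz


def octave_position(freq_hz, f_low=F_LOW):
    """Position of a frequency on the x-axis, in octaves above f_low."""
    return math.log2(freq_hz / f_low)


def overtones(f0_hz, f_high=F_HIGH):
    """Integer multiples of F0 (including F0 itself) up to f_high."""
    count = int(f_high // f0_hz)
    return [k * f0_hz for k in range(1, count + 1)]


# Illustrative F0 of 256Hz (hypothetical value).
positions = [octave_position(f) for f in overtones(256.0)]
```

Because the overtones are equally spaced in Hz, they move closer together
on the octave axis as frequency increases.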
|