Institute for Response-Genetics (e.V.)

Chairman: Prof. Dr. Hans H. Stassen

Psychiatric Hospital (KPPP), University of Zurich


TRANS — Tonal spectral analysis covering 7 octaves

Determines the frequency characteristics of a speech recording through spectral analysis. Spectral analysis results in a set of 168 spectral parameters that cover 7 octaves between 64 Hz and 8192 Hz at a quartertone resolution (24 equidistant tones per octave). Spectra may be stored in the databank and subsequently used for investigations into the voice sound characteristics of male and female speakers (cf. program FDFREQ) or used for the construction of spectral patterns (cf. program PATTERNS, computerized identification of persons by spectral voice patterns).
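The count of 168 parameters follows from 7 octaves times 24 quartertones per octave. The following is a minimal sketch of such a frequency grid, not the program's actual code. It assumes that the first parameter sits exactly at 64 Hz, that the tones are equidistant on a logarithmic scale, and that the upper limit of 8192 Hz is excluded; the function name `quartertone_grid` is hypothetical.

```python
# Values taken from the description above.
F_MIN_HZ = 64.0          # lower edge of the analysed range
TONES_PER_OCTAVE = 24    # quartertone resolution
N_OCTAVES = 7            # 64 Hz up to 8192 Hz


def quartertone_grid(f_min=F_MIN_HZ, tones_per_octave=TONES_PER_OCTAVE,
                     n_octaves=N_OCTAVES):
    """Return one frequency per spectral parameter (assumed layout).

    Neighbouring tones differ by the constant factor 2**(1/24), so the
    grid is equidistant on a logarithmic frequency axis.
    """
    n_tones = tones_per_octave * n_octaves
    return [f_min * 2.0 ** (k / tones_per_octave) for k in range(n_tones)]


grid = quartertone_grid()
print(len(grid))          # 168
print(grid[0], grid[24])  # 64.0 128.0
```

Under these assumptions the highest tone lies one quartertone below 8192 Hz, at roughly 7959 Hz.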

            Specificationlist:       TRANS
            ------------------------------
            A8 TANA              Undefined
            I4 NREC                      0  Default-value
            I4 NBIT                     16  Default-value
            I4 IBM                       0  Default-value
            I4 PROT                      0  Default-value
            I4 PLOT                      0  Default-value
            I4 MODE                      2  Default-value
            I4 FFT                       1  Default-value
            I4 NORM                   2000  Default-value
            I4 NSEC                      8  Default-value
            I4 NSPK                      4  Default-value
            I4 LG30                     30  Default-value
            I4 LPRT                      6  Default-value
            I4 PROB                      0  Default-value
            I4 PMAX                      0  Default-value
            I4 SAVE                      0  Default-value
 
            01 TANA Name of input tape
            02 NREC Number of files to be processed
            03 NBIT Specifies number of bits used in A/D-conversion
            04 IBM  Specifies data format (IBM/SPARC vs. X86/X64)
            05 PROT Controls protocol output
            06 PLOT Controls graphic output
            07 MODE Controls data processing
            08 FFT  Specifies type of spectral analysis
            09 NORM Controls normalization of spectra
            10 NSEC Specifies epoch length
            11 NSPK Number of epochs used in the calculation of spectra
            12 LG30 Logical unit number associated with input tape
            13 LPRT Logical unit number of plot-device
            14 PROB Specifies proband to be processed
            15 PMAX Specifies maximal number of plots
            16 SAVE Saves newly constructed segmentation table
            17 DEMO Examples that illustrate program function
 
            - TANA:     Name of the input tape ("NO" means that
                        DSN-name will not be checked)
 
            - NREC = 0: All files
                   = n: First "n" files are to be processed
 
            - NBIT = n: Number of bits used in A/D-conversion (default=16)
 
            - IBM  = 0: Input data are stored in X86/X64-mode   (little endian)
                   = 1: Input data are stored in IBM/SPARC-mode (big endian)
 
            - PROT = 0: No protocol
                   = 1: Short protocol
                   = 2: Detailed protocol
 
            - PLOT = 0: No plots
                   = 1: Spectra (1/4-tones 64-8192HZ)
                   = 2: Spectra (1/8-tones 64-1024HZ, FALLS IFFT>1)
                   = 3: Spectral patterns (1/4-tones 64-8192Hz)
                   = 4: Spectral patterns (1/8-tones 64-1024Hz, if IFFT>1)
                   = 5: Compressed time series
 
            - MODE = 0: Spectral analysis of raw time series
                   = 1: Spectral analysis of manually segmented time series
                   = 2: Spectral analysis of automatically segmented time series
 
            - FFT  = 0: Fast Fourier Transformation (FFT)
                   = 1: Tonal spectral analysis
 
            - NORM = 0: Relative amplitudes (with respect to calibration signal)
                   = p: Amplitudes are normalized with respect to "p" and the
                        actual spectral distribution (standard value: p=2000)
 
            - NSEC = k: Number of consecutive 1-second epochs used for
                        the calculation of spectra
 
            - NSPK = n: Number of epochs included in the calculation of
                        spectral parameters
 
            - LG30 = u: Logical unit number associated with input tape
 
            - LPRT = q: Logical unit number of plot-device (standard=6;
                        valid numbers are 46-96)
 
            - PROB = 0: All probands
                   = n: Only proband "n" is to be processed
 
            - PMAX = 0: Unlimited number of plots
                   = p: Number of plots is limited to "p"
 
            - SAVE = 0: No effect
                   > 0: Computed spectra will be stored in databank
 
 
            - DEMO: Spectral analysis of speech recordings
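The NBIT and IBM parameters together describe how raw samples are laid out on the input tape. The following sketch illustrates the effect of the byte order only. It assumes signed 16-bit samples (the signedness is not stated above) and does not handle other NBIT values; the function name `decode_samples` is hypothetical and not part of TRANS.

```python
import struct


def decode_samples(raw: bytes, ibm: int = 0, nbit: int = 16):
    """Decode raw A/D samples into a list of Python integers.

    ibm=0: little endian (X86/X64), ibm=1: big endian (IBM/SPARC).
    ASSUMPTION: samples are signed 16-bit integers.
    """
    if nbit != 16:
        raise ValueError("this sketch only handles 16-bit samples")
    if len(raw) % 2:
        raise ValueError("byte count must be a multiple of 2")
    order = ">" if ibm == 1 else "<"   # struct byte-order prefix
    count = len(raw) // 2
    return list(struct.unpack(f"{order}{count}h", raw))


raw = bytes([0x01, 0x00, 0xFF, 0x7F])
print(decode_samples(raw, ibm=0))  # [1, 32767]
print(decode_samples(raw, ibm=1))  # [256, -129]
```

The same four bytes yield entirely different amplitudes under the two settings, so a wrong IBM value typically shows up as noise-like time series.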
 
           

Example

 
            &&START TRANS=Spectral analysis of re-formatted bli-tapes (stud600_1)
             TANA=NO,LG30=35,PLOT=0,PROT=1,LPRT=61,NSPK=4,NSEC=1,PMAX=800,SAVE=1
             MODE=0
            &&START TRANS=Spectral analysis of re-formatted bli-tapes (stud600_2)
             LG30=36,LPRT=67
            &&START TRANS=Spectral analysis of re-formatted bli-tapes (stud601)
             LG30=37,LPRT=68
            &&START TRANS=Spectral analysis of re-formatted bli-tapes (stud602_1)
             LG30=38,LPRT=69
            &&START TRANS=Spectral analysis of re-formatted bli-tapes (stud602_2)
             LG30=39
            &&START TRANS=Spectral analysis of re-formatted bli-tapes (stud602_3)
             LG30=40
            &&START TRANS=Spectral analysis of re-formatted bli-tapes (stud602_7)
             LG30=41
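The job deck above consists of `&&START` cards followed by comma-separated KEY=VALUE lists. The following sketch shows how such a deck might be read into a list of steps. It is not the program's own parser. In particular, the `inherit` option carries parameters forward from the previous step; that behaviour is only an assumption suggested by the example (later steps repeat just LG30 and LPRT) and is not documented above. The function name `parse_job_deck` is hypothetical.

```python
def parse_job_deck(text, inherit=True):
    """Split a job deck into steps with program name, title and parameters.

    ASSUMPTION: with inherit=True each step starts from the parameters of
    the previous step; the documentation does not confirm this.
    """
    steps = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("&&START"):
            _, _, rest = line.partition(" ")
            program, _, title = rest.partition("=")
            base = dict(steps[-1]["params"]) if (inherit and steps) else {}
            current = {"program": program.strip(),
                       "title": title.strip(),
                       "params": base}
            steps.append(current)
        elif current is not None:
            # Parameter line: comma-separated KEY=VALUE items.
            for item in line.split(","):
                key, sep, value = item.partition("=")
                if sep:
                    current["params"][key.strip()] = value.strip()
    return steps


deck = """
&&START TRANS=Spectral analysis of re-formatted bli-tapes (stud600_1)
 TANA=NO,LG30=35,PLOT=0,PROT=1,LPRT=61,NSPK=4,NSEC=1,PMAX=800,SAVE=1
 MODE=0
&&START TRANS=Spectral analysis of re-formatted bli-tapes (stud600_2)
 LG30=36,LPRT=67
"""
steps = parse_job_deck(deck)
print(steps[1]["params"]["LG30"], steps[1]["params"]["TANA"])  # 36 NO
```

With `inherit=False` the second step would contain only LG30 and LPRT, and all other parameters would fall back to the defaults of the specification list.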
 
           
Fig. 22: Voice sound characteristics ("timbre") of a male speaker as quantified through spectral analyses. Spectral intensities are plotted along the y-axis on log-proportional scales as a function of frequency (x-axis: 7 octaves covering the frequency range of 64-8192 Hz). The shaded area denotes the variability around mean spectral intensities ("characteristic variability"). It is this frequency-dependent variability that makes voices easily distinguishable from each other. The maxima represent "overtones" at fixed, physically well-defined intervals above the fundamental frequency F0 (compare with the spectrum of the female speaker in Fig. 24).
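Because harmonic overtones lie at integer multiples of F0, their positions on a quartertone axis depend only on the harmonic number. The following sketch maps the first harmonics onto quartertone indices. It assumes that index 0 corresponds to 64 Hz and that each harmonic is assigned to the nearest quartertone; the value F0 = 128 Hz and the function names are hypothetical and chosen only for illustration.

```python
import math


def quartertone_index(freq_hz, f_min=64.0, tones_per_octave=24):
    """Index of the quartertone nearest to freq_hz (index 0 = f_min)."""
    return round(tones_per_octave * math.log2(freq_hz / f_min))


def harmonic_indices(f0_hz, n_harmonics=5):
    """Quartertone indices of F0 and its first overtones (n * F0)."""
    return [quartertone_index(n * f0_hz) for n in range(1, n_harmonics + 1)]


f0 = 128.0  # hypothetical fundamental frequency
print(harmonic_indices(f0))  # [24, 48, 62, 72, 80]
```

Relative to F0 the overtones sit 24, 38, 48 and 56 quartertones higher, which corresponds to the octave, the twelfth, the double octave and a major third above that, regardless of the speaker's F0.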