Age Dependence of Children’s Speech Parameters

This paper deals with the search for age-dependent parameters in children’s speech. These parameters are compared in terms of their age dependence, and their suitability for recognizing the age of a speaker is assessed using discrimination analysis.


Introduction
An analysis of the relationship between acoustic-phonetic aspects of speech and the speaker's age may have numerous applications. This paper has been motivated by practical experience in the field of phoniatrics and logopaedics. When examining children's pathological speech, there is often an effort to answer the question "What age does this particular speech correspond to?", and thus, for example, to estimate at what age a child's speech development stopped.
Chronological age is unambiguously given by the date of birth. Logopaedic age is the age estimated on the basis of acoustic-phonetic aspects of human speech.

Frequency of a basic glottic tone F0
An analysis was made of separate vowels in syllables /la/, /le/, /li/, /lo/, /lu/ from the words škola, košile, zmrzlina, letadlo and maluje, and then of complete vocal sections of the speech.
The analysis was made using the autocorrelation method in Praat v. 5.0.15 [6] with the following parameters: time step = 0.0, pitch floor = 100 Hz and pitch ceiling = 600 Hz. The resulting values were verified using WaveSurfer v. 1.8.5 [7] and were manually corrected where necessary. The most frequent error was an F0 value detected one octave too low.
In order to make the frequency intervals correspond better to the way human hearing perceives intonation intervals, the F0 values were converted to a semitone scale with its origin at 100 Hz.
For statistical confirmation of the age dependence of F0, a null hypothesis H0 was considered, which denies such dependence. H0 can be rejected on the basis of the results of a t-test for correlated measurements:
$$t = \frac{\bar{d}}{s_d / \sqrt{n}},$$
where
$$\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i, \qquad s_d = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (d_i - \bar{d})^2}.$$
The variable $d_i$ in this case is the difference between F0 and the age of speaker No. i, and n is the number of speakers.
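The two steps described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the semitone conversion uses the common definition 12·log2(f / 100 Hz), which the text implies but does not spell out, and the t statistic is the standard paired form. The function names and the input values are hypothetical.

```python
import math

def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert a frequency in Hz to semitones above ref_hz (assumed 100 Hz)."""
    return 12.0 * math.log2(f0_hz / ref_hz)

def paired_t(x, y):
    """Paired t statistic for the differences d_i = x_i - y_i."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    # Unbiased sample variance of the differences
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)
    return mean_d / math.sqrt(var_d / n)

# Made-up example values, not data from the study
f0_st = [hz_to_semitones(f) for f in (300.0, 280.0, 250.0)]
ages = [4.0, 7.0, 11.0]
t_value = paired_t(f0_st, ages)
```

The resulting t value would then be compared with the critical value of Student's distribution for n - 1 degrees of freedom.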
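In our case, H0 can be rejected at the significance level p < 0.001 (n = 193). The strength of the correlation can be expressed using the Pearson correlation coefficient:
$$r = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^2 \, \sum_{i} (y_i - \bar{y})^2}},$$
where $x_i$ is the age and $y_i$ the parameter value of speaker No. i.
For the age dependence of F0 for the vowel /a/, r = 0.43, which is a moderate correlation. For all vocal sections of speech, r = 0.41 with p < 0.001, n = 113. The F0 trend is shown in Fig. 1.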
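As an illustration, the Pearson correlation coefficient can be computed directly from its textbook definition. The sketch below is not the authors' code; the function name and the sample values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up ages (years) and F0 values (semitones), not data from the study
ages = [4.0, 6.0, 8.0, 10.0, 12.0]
f0_st = [19.5, 18.0, 17.2, 16.9, 15.0]
r = pearson_r(ages, f0_st)
```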

F0 variance
The variance of the basic voice frequency is associated with the intonation range of a piece of speech. This parameter reflects the overall tunefulness and melodiousness of the speech typical of pre-school children.
The F0 variance was analysed for all vocal speech sections, and showed a declining tendency with age. The correlation coefficient was r = -0.61 (p < 0.001, n = 113).

F1, F2 formants
Formant frequencies correspond to the resonance frequencies of the vocal organ cavities [1]. They were estimated for particular vowels from an LPC (linear predictive coding) spectrum using Burg's algorithm [6].

Spectral centre of gravity
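If the complex spectrum is given by $S(f)$, where f is the frequency, the centre of gravity $f_c$ is given by the integral $\int_0^\infty f \, |S(f)|^2 \, df$ divided by the energy $\int_0^\infty |S(f)|^2 \, df$:
$$f_c = \frac{\int_0^\infty f \, |S(f)|^2 \, df}{\int_0^\infty |S(f)|^2 \, df}.$$
Thus, the centre of gravity is the average of f over the entire frequency domain, weighted by the power spectrum.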
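For a sampled spectrum, the integrals become sums over frequency bins. The following minimal sketch assumes equally spaced bins (so the bin width cancels) and a power spectrum already computed as |S(f)|^2; the function name and input values are hypothetical.

```python
def centre_of_gravity(freqs, power):
    """Power-weighted mean frequency of a sampled spectrum.

    freqs: bin frequencies in Hz (assumed equally spaced)
    power: |S(f)|^2 for each bin
    """
    energy = sum(power)
    return sum(f * p for f, p in zip(freqs, power)) / energy

# Made-up three-bin spectrum with most energy in the top bin
cog = centre_of_gravity([2000.0, 4000.0, 6000.0], [1.0, 2.0, 5.0])
```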

Central spectral moment
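The n-th central spectral moment is given by
$$\mu_n = \frac{\int_0^\infty (f - f_c)^n \, |S(f)|^2 \, df}{\int_0^\infty |S(f)|^2 \, df},$$
where $f_c$ is the spectral centre of gravity. Thus, the n-th central moment is the average of $(f - f_c)^n$ over the entire frequency domain, weighted by the power spectrum.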

Spectral standard deviation
The standard deviation of a spectrum is the square root of the second central moment of this spectrum.

Skewness of a spectrum
The (normalized) skewness of a spectrum is the third central moment of this spectrum, divided by the 1.5 power of the second central moment.
Skewness is a measure of how much the shape of the spectrum below the centre of gravity differs from the shape above the mean frequency. For white noise, the skewness is zero.

Kurtosis of a spectrum
The (normalized) kurtosis of a spectrum is the fourth central moment of this spectrum, divided by the square of the second central moment, minus 3.
Kurtosis is a measure of how much the shape of the spectrum around the centre of gravity differs from a Gaussian shape. For white noise, the kurtosis is -6/5.
The above-mentioned spectral characteristics of sibilant consonants were measured for consonants /s/, /ss/ and /cc/.
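The spectral moments defined in the preceding sections can be sketched for a sampled power spectrum as follows. This is an illustrative discretization assuming equally spaced frequency bins, not the implementation used for the measurements; the function names are hypothetical.

```python
import math

def central_moment(freqs, power, n):
    """n-th central spectral moment, weighted by the power spectrum."""
    energy = sum(power)
    fc = sum(f * p for f, p in zip(freqs, power)) / energy  # centre of gravity
    return sum((f - fc) ** n * p for f, p in zip(freqs, power)) / energy

def spectral_shape(freqs, power):
    """Return (standard deviation, skewness, kurtosis) of a spectrum."""
    mu2 = central_moment(freqs, power, 2)
    mu3 = central_moment(freqs, power, 3)
    mu4 = central_moment(freqs, power, 4)
    sd = math.sqrt(mu2)              # square root of the 2nd central moment
    skewness = mu3 / mu2 ** 1.5      # normalized 3rd central moment
    kurtosis = mu4 / mu2 ** 2 - 3.0  # normalized 4th central moment minus 3
    return sd, skewness, kurtosis

# Flat (white-noise-like) spectrum sampled at 0, 1, ..., 1000 Hz
flat_freqs = [float(i) for i in range(1001)]
flat_power = [1.0] * 1001
sd, skewness, kurtosis = spectral_shape(flat_freqs, flat_power)
```

For the flat spectrum in this example, the skewness vanishes and the kurtosis comes out close to -6/5, in line with the white-noise values quoted above.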

Voice onset time
Voice Onset Time (VOT) [5] is the time between the release of a plosive and the beginning of vocal cord vibration (Fig. 5). This period is measured in milliseconds (ms).
VOT measurements were performed on the syllable /ka/ from the word "babička". However, no age dependence of this parameter could be demonstrated from the measured values at the significance level p < 0.05.

Speech rate
Speech rate was determined for each speaker as the reciprocal of the duration of the entire utterance without pauses. Age dependence was not demonstrated for this parameter either.

Overview of age dependent parameters
The table below summarizes the examined phonetic characteristics. The individual attributes are ordered by the strength of their correlation with age (column r). The H0 column contains the significance levels at which the null hypothesis of age-independent parameters can be rejected. The parameters below the double line cannot be considered age-dependent at the 5 % significance level.

Discrimination analysis
In this part, we make use of the age-dependent parameters for a simple discrimination analysis. Data classification is based on accepting or rejecting the hypothesis that the data belong to a particular class. Four classes were designated (0: 3-5 years, 1: 6-7 years, 2: 8-9 years, 3: 10-12 years). The discrimination function being maximized is as follows [2]:
$$p(d \mid h_i) = \frac{1}{\sqrt{(2\pi)^k \, |C_i|}} \exp\left(-\frac{1}{2} (d - m_i)^T C_i^{-1} (d - m_i)\right),$$
where $C_i$ is the covariance matrix, $m_i$ is the mean value, k is the dimension of the data vector d, and $p(d \mid h_i)$ is the likelihood of observing data d on the assumption that hypothesis $h_i$ applies.
Training was performed using the RANSAC method for the vectors of 16 phonetic parameters.
The classification success rate is shown in Fig. 6, and the percentage enumeration is shown in a confusion matrix (Table 2).
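The idea of choosing the class whose Gaussian model gives the highest likelihood can be sketched as follows. This is a simplified illustration, not the procedure used in the paper: it assumes a diagonal covariance matrix (independent features), estimates the parameters directly from the samples instead of using RANSAC, and uses two made-up features and two made-up classes instead of the 16 parameters and four age classes. All names are hypothetical.

```python
import math

def fit_gaussian(samples):
    """Per-feature mean and variance of one class (diagonal covariance assumed)."""
    n = len(samples)
    k = len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(k)]
    var = [sum((s[j] - mean[j]) ** 2 for s in samples) / (n - 1) for j in range(k)]
    return mean, var

def log_likelihood(d, mean, var):
    """Log of the Gaussian density p(d | h_i) with diagonal covariance."""
    return sum(-0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)
               for x, m, v in zip(d, mean, var))

def classify(d, models):
    """Return the class label whose model maximizes the likelihood of d."""
    return max(models, key=lambda label: log_likelihood(d, *models[label]))

# Made-up training vectors (two illustrative features per speaker)
training = {
    0: [[14.0, 9.0], [15.0, 10.0], [16.0, 11.0]],
    3: [[9.0, 4.0], [10.0, 5.0], [11.0, 6.0]],
}
models = {label: fit_gaussian(vectors) for label, vectors in training.items()}
predicted = classify([15.2, 9.5], models)
```

Working with the logarithm of the density avoids numerical underflow and does not change which class attains the maximum.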

Conclusion
The selected speech characteristics showed various degrees of age dependence. The characteristics based on the basic vocal frequency and some spectral properties of the consonant /s/ showed a correlation of about 0.5. Finally, it was shown that the selected speech attributes enable the training of a classifier that assigns speakers to age groups with a success rate of ca. 80 %. A similar classification method will be tested in the future on the speech of children with developmental speech defects.

Table 1: Overview of age-dependent features