The power spectrum displays the components of a sound with respect to frequency (x-axis, from zero up to 10 kHz) and relative intensity (y-axis). Voiced sounds are harmonic, having a fundamental and overtones. The individual harmonics, all of which are integer multiples of the fundamental frequency, are made visible in (narrow-band) spectrum analysis (Figure 1, lower right panel). The relative strength of the overtones generally diminishes with increasing frequency. For the singer, however, the most important factor in their amplitude is the influence of the variable resonances, or formants, of the vocal tract. Real-time spectrum analysis displays the sound components at virtually the same time as they are heard. This is of crucial importance to singers and singing teachers, who can use the display to associate characteristic visual patterns with sounds they have painstakingly learned to identify by ear.
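As a small illustration of the harmonic series described above, the following Python sketch lists the harmonics of a given fundamental that fall within the 0 to 10 kHz range of the display. The function name `harmonic_series` and the choice of 165 Hz (the pitch E3 used in Figure 1) are assumptions made for this example; nothing here is part of VoceVista.

```python
def harmonic_series(f0_hz, max_hz=10_000.0):
    """Return the harmonic frequencies n * f0 that lie at or below max_hz.

    Hypothetical helper for illustration only; not a VoceVista function.
    """
    if f0_hz <= 0:
        raise ValueError("fundamental frequency must be positive")
    count = int(max_hz // f0_hz)  # number of whole multiples that fit
    return [n * f0_hz for n in range(1, count + 1)]


# E3 at 165 Hz, the sung pitch of Figure 1 (assumed here as the example).
harmonics = harmonic_series(165.0)
print(len(harmonics))   # how many harmonics fit below 10 kHz
print(harmonics[:4])    # fundamental and the first three overtones
```

The spacing between neighboring harmonics always equals the fundamental, so a low pitch fills the display far more densely than a high one.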

By ‘freezing’ the display, the operator can at any moment capture a ‘snapshot’ of this complex, time-varying signal. This snapshot is displayed as a power spectrum (plural: spectra). Two such frozen spectra can, furthermore, easily be compared in detail. One of the most revealing points of this analysis is the influence of the formants (the two lowest of which determine the vowel) on the harmonics of the voice source. One learns to discern which formant acts on which harmonic, and with what effect on the final sound. To investigate this more precisely, it is useful to produce a continuous spectrum, showing the resonance profile of the vocal tract without the gaps that appear between the harmonics of voiced sounds. This can be done using vocal fry, a light popping of the vocal folds at a low and irregular frequency, while precisely preserving the vocal-tract posture of the sung sound being investigated (Figure 1, below). VoceVista allows an exact comparison of the two spectra, both by means of a cursor for reading the frequencies and by direct overlay of one spectrum on the other. (With VoceVista it is also possible to display the formants by the more familiar, but less precise, method of broadband spectrum analysis.)
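To make the question of which formant acts on which harmonic concrete, here is a minimal Python sketch that finds the harmonics lying just below and just above a formant frequency. It uses the values given in the Figure 1 caption (a fundamental of 165 Hz and a first formant at 444 Hz). The function names are hypothetical, and distance in hertz is only a rough guide: the actual reinforcement also depends on the bandwidth of the formant and on the strength of each harmonic at the voice source.

```python
def neighboring_harmonics(f0_hz, formant_hz):
    """Return ((n, freq) below, (n, freq) above) for the harmonics around a formant.

    Hypothetical helper for illustration only; not a VoceVista function.
    """
    if f0_hz <= 0 or formant_hz < f0_hz:
        raise ValueError("need 0 < f0_hz <= formant_hz")
    n_below = int(formant_hz // f0_hz)  # highest harmonic at or below the formant
    n_above = n_below + 1
    return (n_below, n_below * f0_hz), (n_above, n_above * f0_hz)


def nearest_harmonic(f0_hz, formant_hz):
    """Return the (n, freq) pair of the harmonic closest in Hz to the formant."""
    below, above = neighboring_harmonics(f0_hz, formant_hz)
    return min((below, above), key=lambda h: abs(h[1] - formant_hz))


# Values taken from the Figure 1 caption: E3 at 165 Hz, first formant at 444 Hz.
below, above = neighboring_harmonics(165.0, 444.0)
print(below, above)                    # the two harmonics bracketing the formant
print(nearest_harmonic(165.0, 444.0))  # the closer of the two
```

In this case the first formant falls between the second and third harmonics, somewhat closer to the third.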

[Figure 1: VoceVista screenshot]

Fig. 1. Spectrogram of bass, E3 (165 Hz), vowel [o], sung, followed by vocal fry. In the panels on the right are power spectra of sung (‘line spectrum’, above) and vocal fry (‘continuous spectrum’, below) moments. The cursors in the right panels indicate at what frequency (444 Hz) the first formant is located among the sung harmonics. The cursors in the left panels indicate the two moments in the spectrogram from which the respective power spectra are taken.

Figure 1 shows an easy pitch in a range where the distance between harmonics is relatively small. Here, comparing the harmonic spectrum with that of the formants alone may not yield much surprising information. In Figure 2, below, we see how the formants influence the distribution of the sound components on a pitch an octave higher, near the top of the range of the same voice. It becomes more apparent what a critical role the exact ‘placement’ of the formants plays in the resultant sound. This sound is dominated by the frequency component at the second formant, which lies at four times the fundamental frequency (two octaves above it), giving the sound the ‘brightness’ desired on a full high note produced by a male opera singer. Figure 2 also illustrates the option of overlaying one spectrum on another.

[Figure 2: VoceVista screenshot]

Fig. 2. Power spectra of bass, E4 (330 Hz), vowel [a]. The sung spectrum (blue) is overlaid with the vocal fry spectrum, showing how the second formant is responsible for the dominance of the fourth harmonic at 1320 Hz, two octaves above the fundamental. The sixth harmonic also gets reinforcement from the third formant.
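The relation between harmonic number and musical interval used in this caption can be checked with a short Python sketch: the n-th harmonic lies log2(n) octaves above the fundamental. The function name `harmonic_interval` is an assumption made for this example, and the semitone figures are in equal temperament.

```python
import math


def harmonic_interval(n):
    """Return (octaves, semitones) of the n-th harmonic above the fundamental.

    Hypothetical helper for illustration only; semitones are equal-tempered.
    """
    if n < 1:
        raise ValueError("harmonic number must be at least 1")
    octaves = math.log2(n)
    return octaves, 12 * octaves


f0_hz = 330.0  # E4, the sung pitch given in the Figure 2 caption
for n in (4, 6):
    octaves, semitones = harmonic_interval(n)
    print(f"harmonic {n}: {n * f0_hz:.0f} Hz, "
          f"{octaves:.3f} octaves ({semitones:.2f} semitones) above the fundamental")
```

The fourth harmonic comes out at exactly two octaves (1320 Hz), while the sixth, at 1980 Hz, lies roughly two octaves and a fifth above the fundamental.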

Power Spectrum (VoceVista)