
Using VoceVista


VoceVista is a simple-to-use software application for the PC. While it can be used to analyze vocal signals for a variety of applications, such as research and vocal pathology, it was developed primarily for singing teachers to analyze the singing voice, thereby helping their students progress more rapidly.

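This page provides an overview of some of the features of VoceVista. It is divided into the following topics: Power Spectrum; Analyzing Voices on Recordings; Waveforms: the Electroglottograph; Spectrogram and Audio Envelope; Software/hardware Requirements and Purchase Information.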

Power Spectrum

The power spectrum displays the components of a sound with respect to frequency (x-axis, from zero up to 10 kHz) and relative intensity (y-axis). Voiced sounds are harmonic, having a fundamental and overtones. The individual harmonics, all of which are whole-number multiples of the fundamental frequency, are made visible in (narrow-band) spectrum analysis (Figure 1, lower right panel). The relative strength of the overtones generally diminishes with increasing frequency. For the singer, however, the most important factor in the amplitude of the overtones is the influence of the variable resonances, or formants, of the vocal tract. Real-time spectrum analysis displays the sound components at virtually the same time as they are heard. This is of crucial importance to singers and singing teachers, who can use the display to associate characteristic visual patterns with sounds they have painstakingly learned to identify by ear.
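As a rough illustration of how a narrow-band analysis separates the harmonics, the following Python sketch builds a synthetic tone on 165 Hz (the pitch of Figure 1) and computes its power spectrum with a plain discrete Fourier transform. It is not VoceVista's algorithm: the sampling rate, the Hann window, the 1/h amplitude rolloff, and all function names are assumptions chosen to keep the example small.

```python
import cmath
import math

def synth_harmonic_tone(f0_hz, n_harmonics, sample_rate_hz, n_samples):
    """Sum of sine harmonics whose amplitude falls off as 1/h (an assumed rolloff)."""
    return [
        sum(math.sin(2 * math.pi * h * f0_hz * i / sample_rate_hz) / h
            for h in range(1, n_harmonics + 1))
        for i in range(n_samples)
    ]

def power_spectrum_db(samples, sample_rate_hz):
    """Return (frequency_hz, level_db) pairs from a naive DFT of Hann-windowed samples."""
    n = len(samples)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(samples)]
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        # Small offset avoids log10(0) in empty bins.
        spectrum.append((k * sample_rate_hz / n,
                         10 * math.log10(abs(acc) ** 2 + 1e-12)))
    return spectrum

def find_peaks(spectrum, range_db=40.0):
    """Frequencies of local maxima within range_db of the strongest component."""
    top = max(level for _, level in spectrum)
    peaks = []
    for i in range(1, len(spectrum) - 1):
        level = spectrum[i][1]
        if (level > top - range_db
                and level > spectrum[i - 1][1]
                and level >= spectrum[i + 1][1]):
            peaks.append(spectrum[i][0])
    return peaks

# Assumed analysis settings: 6600 Hz sampling, 400 samples (16.5 Hz per bin).
tone = synth_harmonic_tone(165.0, 5, 6600.0, 400)
print(find_peaks(power_spectrum_db(tone, 6600.0)))
```

With these settings the detected peaks should fall on the first five multiples of 165 Hz, which is the 'line spectrum' pattern described above. A real voice adds the formant shaping that this synthetic tone leaves out.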

By 'freezing' the display, the operator can capture a 'snapshot' of this complex, time-varying signal at any moment. This snapshot is displayed as a power spectrum (plural: spectra). Two such frozen spectra can also be easily compared in detail. One of the most revealing points of this analysis is the influence of the formants (the two lowest of which determine the vowel) on the harmonics of the voice source. One learns to discern which formant acts on which harmonic, and with what effect on the final sound. To investigate this more precisely, it is useful to produce a continuous spectrum, showing the resonance profile of the vocal tract without the gaps that appear between the harmonics of voiced sounds. This can be done using vocal fry -- a light popping of the vocal folds at a low and irregular frequency -- while precisely preserving the vocal-tract posture of the sung sound being investigated (Figure 1, below). VoceVista allows an exact comparison of the two spectra, both by means of a cursor for reading the frequencies and by direct overlay of one spectrum on the other. (With VoceVista it is also possible to display the formants by the more familiar, but less precise, method of broadband spectrum analysis.)



Fig. 1. Spectrogram of bass, E3 (165 Hz), vowel [o], sung, followed by vocal fry. In the panels on the right are power spectra of sung ('line spectrum', above) and vocal fry ('continuous spectrum', below) moments. The cursors in the right panels indicate at what frequency (444 Hz) the first formant is located among the sung harmonics. The cursors in the left panels indicate the two moments in the spectrogram from which the respective power spectra are taken.

Figure 1 shows an easy pitch in a range where the distance between harmonics is relatively small. The comparison of the harmonic spectrum with that of the formants alone may not yield much surprising information. In Figure 2, below, we see how the formants influence the distribution of the sound components on a pitch an octave higher, near the top of the range in the same voice. Here it becomes more apparent what a critical role the exact 'placement' of the formants plays in the resultant sound. This sound is dominated by the frequency component at the second formant, which lies at four times the fundamental frequency, or two octaves above it, giving the sound the 'brightness' desired on a full high note produced by a male opera singer. Figure 2 also illustrates the option of overlaying one spectrum on another.



Fig. 2. Power spectra of bass, E4 (330 Hz), vowel [a]. The sung spectrum (blue) is overlaid with the vocal fry spectrum, showing how the second formant is responsible for the dominance of the fourth harmonic at 1320 Hz, two octaves above the fundamental. The sixth harmonic also gets reinforcement from the third formant.
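The arithmetic behind these captions can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, the pitch and formant values are those quoted in the captions of Figures 1 and 2, and 'nearest' is measured on a linear frequency scale, which is an assumption.

```python
import math

def harmonic_frequencies(f0_hz, count):
    """Frequencies of the first `count` harmonics (harmonic 1 is the fundamental)."""
    return [f0_hz * n for n in range(1, count + 1)]

def interval_in_octaves(f_hz, reference_hz):
    """Musical interval between two frequencies, in octaves."""
    return math.log2(f_hz / reference_hz)

def nearest_harmonic(f0_hz, formant_hz):
    """Return (harmonic_number, frequency_hz) of the harmonic closest to a formant."""
    n = max(1, round(formant_hz / f0_hz))
    return n, n * f0_hz

# Figure 2: E4 at 330 Hz; the fourth harmonic lies two octaves above the fundamental.
h4 = harmonic_frequencies(330.0, 6)[3]
print(h4, interval_in_octaves(h4, 330.0))

# Figure 1: E3 at 165 Hz with the first formant read at 444 Hz.
print(nearest_harmonic(165.0, 444.0))
```

On the lower pitch the harmonics are only 165 Hz apart, so a formant is never far from one of them. An octave higher the spacing doubles, which is why the exact formant frequency matters more there.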


Analyzing Voices on Recordings

Once one has become familiar with typical resonance patterns in a variety of voices, it is generally possible to interpret the 'resonance strategies' used by singers from the sung sound alone, even without having a vocal fry spectrum for comparison. This allows us to examine the recorded literature, observing how both greater and lesser singers cope with the challenge of producing a beautiful sound on a specified pitch and vowel in a given phrase. Figure 3, below, shows the different strategies used by two of our most famous tenors for a difficult and exposed high note. 

Fig. 3. Tenors singing the final phrase of the aria "Celeste Aida," taken from commercial recordings. The right panels are power spectra of the final B4-flat (466 Hz), just before the orchestra enters: the cursors mark the singer's formant dominance (P. Domingo, above), and the second formant dominance (L. Pavarotti, below).


Waveforms: the Electroglottograph

One of the displays of VoceVista (Figure 4, below) shows the waveforms during a short segment of time: above, the microphone (audio) signal, and below, the electroglottograph (EGG), which allows one to follow the pattern of vocal fold contact in the glottal cycle. The moment of closing of the glottis (increasing contact) is usually abrupt and can be determined fairly precisely in the EGG signal. The opening (theoretically, the moment of the steepest falling slope) can be more problematical to determine, but an experienced observer can make consistent and useful estimates. 

Fig. 4. Waveform signals of three glottal cycles, taken from spectrogram (left) of sustained G2-flat (89 Hz): above right, microphone signal; below right, electroglottograph. The horizontal cursor is manually adjusted to the estimated moment of glottal opening. The values of the glottal period (11.21 ms) and the closed quotient (0.54) are displayed. The two signals have been adjusted to align their moments of glottal closing.

Because of the distance the sound must travel from the glottis, the microphone signal lags slightly behind, but the moment of closing is usually apparent in the microphone signal as well. Using a convenient operation of the software, an experienced operator can align the closing moment in EGG and audio, gaining more information from the combination of the two than from either signal alone.
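The size of this lag can be estimated from the travel distance and the speed of sound. The sketch below is a physical back-of-envelope estimate, not the alignment operation VoceVista itself provides: the 0.47 m path (roughly 0.17 m of vocal tract plus 0.30 m from lips to microphone), the 343 m/s speed of sound, the 44100 Hz sampling rate, and the function names are all assumptions.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees C

def acoustic_delay_ms(distance_m):
    """Travel time of sound over distance_m, in milliseconds."""
    return 1000.0 * distance_m / SPEED_OF_SOUND_M_PER_S

def delay_in_samples(distance_m, sample_rate_hz):
    """The same delay expressed as a whole number of audio samples."""
    return round(distance_m / SPEED_OF_SOUND_M_PER_S * sample_rate_hz)

def shift_earlier(signal, n_samples):
    """Advance a signal by n_samples, padding the end with zeros."""
    return signal[n_samples:] + [0.0] * n_samples

# Assumed geometry: glottis to lips plus lips to microphone.
path_m = 0.17 + 0.30
print(acoustic_delay_ms(path_m))         # lag in milliseconds
print(delay_in_samples(path_m, 44100))   # lag in samples at an assumed 44.1 kHz
```

A lag of this order is a noticeable fraction of a glottal period of about 11 ms, so shifting the microphone signal earlier by that amount gives a starting point that the operator can then refine by eye.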

The closed quotient -- the fraction of the glottal cycle in which the glottis is closed, preventing the flow of air -- is an important variable in the singing voice, giving indications of both 'register' (chest or falsetto vocal-fold vibration pattern -- see Figure 5, below) and vocal effort. By adjusting a horizontal cursor in the EGG signal to the estimated point of opening of the glottis, one can get a readout of this quantity.
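The relation between these quantities is simple enough to sketch in Python. The function names are hypothetical, and the event times in the example are constructed to match the readout in Figure 4 (period 11.21 ms, closed quotient 0.54) rather than measured from any signal.

```python
def fundamental_frequency_hz(period_ms):
    """Fundamental frequency corresponding to a glottal period in milliseconds."""
    return 1000.0 / period_ms

def closed_quotient(closing_ms, opening_ms, next_closing_ms):
    """Closed time divided by the full glottal period, from three event times."""
    period = next_closing_ms - closing_ms
    return (opening_ms - closing_ms) / period

def closed_duration_ms(cq, period_ms):
    """Time per cycle during which the glottis is closed."""
    return cq * period_ms

# Values from the Figure 4 readout.
print(fundamental_frequency_hz(11.21))    # about 89 Hz
print(closed_duration_ms(0.54, 11.21))    # about 6 ms closed per cycle

# Illustrative event times: closing at 0 ms, opening at 6.05 ms, next closing at 11.21 ms.
print(closed_quotient(0.0, 6.05, 11.21))  # about 0.54
```

Because the closing instant is usually abrupt and the opening instant is estimated, most of the uncertainty in the closed quotient comes from where the opening cursor is placed.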

Fig. 5. Spectrogram of mezzo soprano arpeggio, A3 to A4, with waveform signals shown of C4-sharp (above, middle register) and A3 (below, chest register). Note the difference in the closed quotient (CQ) and form of the EGG signal.

Spectrogram and Audio Envelope

The display in the left panels of Figures 1, 3, 4, and 5 is the spectrogram of the microphone signal. This display provides a framework that locates the power spectrum and waveform displays within the sound as it is perceived. Like the waveform display, it shows time from left to right. The time scale, however, is on the order of the musical phrase, showing up to eight seconds, rather than the few glottal cycles of the high time-resolution waveform display. The frequency dimension -- the horizontal axis in the power spectrum -- is the vertical axis in the spectrogram. The relative intensity of the various frequency components is indicated by a color scale or, alternatively, by shades of gray.

Many important observations on the singing voice, such as the rate and extent of vibrato, the duration of legato-disrupting consonants, and the steadiness of vowels, can be made directly from the spectrogram. With VoceVista it is also possible to obtain a display of the power spectrum or the waveform signals from any point in a frozen spectrogram. All this information is retained in the memory of the computer and can be saved for further detailed analysis, for playback, or for instructional purposes such as matching a resonance pattern. A recommended application of the program is to monitor a lesson or a taped performance continuously, occasionally freezing the current spectrogram display for closer investigation or future reference.
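To make 'rate and extent of vibrato' concrete, here is a minimal Python sketch that estimates both from a track of fundamental-frequency values. In VoceVista these are read from the spectrogram by eye; the pitch track, its 100 Hz frame rate, the synthetic test signal, and the function names are all assumptions introduced for illustration.

```python
import math

def vibrato_rate_and_extent(f0_track_hz, frame_rate_hz):
    """Estimate vibrato rate (Hz) and extent (+/- cents) from an f0 track.

    Rate: cycles counted between upward crossings of the mean frequency.
    Extent: half of the peak-to-peak excursion, expressed in cents.
    """
    mean_f0 = sum(f0_track_hz) / len(f0_track_hz)
    crossings = [i for i in range(1, len(f0_track_hz))
                 if f0_track_hz[i - 1] < mean_f0 <= f0_track_hz[i]]
    if len(crossings) < 2:
        raise ValueError("track too short to contain a full vibrato cycle")
    duration_s = (crossings[-1] - crossings[0]) / frame_rate_hz
    rate_hz = (len(crossings) - 1) / duration_s
    extent_cents = 1200.0 * math.log2(max(f0_track_hz) / min(f0_track_hz)) / 2.0
    return rate_hz, extent_cents

# Synthetic track: 440 Hz with a 5.5 Hz vibrato of +/- 50 cents, 2 s at 100 frames/s.
track = [440.0 * 2 ** (50.0 / 1200.0 * math.sin(2 * math.pi * 5.5 * i / 100.0))
         for i in range(200)]
rate, extent = vibrato_rate_and_extent(track, 100.0)
print(round(rate, 1), round(extent))
```

The estimate is only as fine as the frame rate allows, and it assumes a steady note; a portamento or a change of pitch within the track would need to be removed first.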

A final display is that of the envelope of the microphone signal. Although not calibrated, this gives a good indication of sound pressure level and thus the effectiveness of resonance strategies. Close examination of vibrato-related modulations in the audio envelope can often reveal precisely where a harmonic of the voice source is getting strong reinforcement from a formant of the vocal tract.

Software/hardware Requirements and Purchase Information

Technical requirements for VoceVista: multimedia PC with processor of at least 133 MHz, 16 MB RAM, Windows 95 or above. VoceVista-Pro, which offers the waveform signals described above, needs extra hardware in the form of an electroglottograph.

For purchase information regarding VoceVista-Pro contact d.g.miller@vocevista.com.

For the most complete current description of the application of VoceVista as feedback for the singing studio, see Resonance in Singing: Voice Building through Acoustic Feedback, by Donald Gray Miller, Inside View Press, 2008. The book can be purchased at the publisher's website: www.voiceinsideview.com.

©2007 VoceVista