The Wonders of Voice through Voiceprint Technology

Rana King

10 years ago

Last year, more than 30,000 Australians registered their voiceprint with the Australian Taxation Office (ATO) as part of the agency’s biometric authentication rollout. The ATO took this bold step to reduce the collective hours its call centre staff spend verifying the identity of the 6 to 8 million people who call each year. According to Second Commissioner Geoff Leeper, “Voice verification will speed up the authentication process and cut the time they need to spend on the phone to the ATO.” The new automated voice authentication system uses both speech recognition (what is said) and voice identification (how it is said).

Since the start of this decade, the United States and 70 other nations have been investing in voice recognition technology and building databases to help their governments identify persons of interest. Companies all over the world are also investing in this biometric technology, which has now gone beyond hand and finger geometry.

What is Voiceprint?

A voiceprint is a visual record of speech, analyzed with respect to frequency, duration, and amplitude. It is a set of measurable characteristics of a human voice that uniquely identifies an individual. These characteristics, which are based on the physical configuration of a speaker’s mouth and throat, can be expressed as a mathematical formula. The term covers the vocal sample recorded for that purpose, the mathematical formula derived from it, and its graphical representation. Voiceprints are used in voice ID systems for user authentication.
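As a rough illustration of how a voice ID system might compare a new sample against a stored voiceprint, the sketch below assumes each voiceprint has already been reduced to a list of numbers. The function names, the cosine-similarity measure and the 0.9 threshold are all illustrative assumptions, not the method used by the ATO or by any particular product.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no usable information
    return dot / (norm_a * norm_b)


def matches_voiceprint(enrolled, sample, threshold=0.9):
    """Accept the caller if the sample is close enough to the enrolled print.

    The threshold is a hypothetical value chosen for illustration only.
    """
    return cosine_similarity(enrolled, sample) >= threshold


# Made-up feature vectors, purely for demonstration.
enrolled_print = [0.8, 0.1, 0.4, 0.3]
same_speaker = [0.79, 0.12, 0.41, 0.28]
other_speaker = [0.1, 0.9, 0.2, 0.7]

print(matches_voiceprint(enrolled_print, same_speaker))
print(matches_voiceprint(enrolled_print, other_speaker))
```

A real system would derive the numbers from recorded speech and tune the threshold to balance false accepts against false rejects.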

The data used in a voiceprint is a sound spectrogram. Different speech sounds create different shapes within the graph. Spectrograms also use colors or shades of grey to represent the acoustical qualities of sound.
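To make the idea of a spectrogram concrete, here is a minimal sketch that slices a signal into short overlapping frames and measures how much energy each frame holds at each frequency. It uses only the Python standard library and a plain (slow) Fourier transform. The frame size, hop size, sample rate and the synthetic 1000 Hz test tone are assumptions chosen for illustration; real systems work on recorded speech and use optimized libraries.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame lists one magnitude per frequency bin."""
    # Hann window: tapers each frame's edges to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        # Naive discrete Fourier transform over the non-negative frequencies.
        for k in range(frame_size // 2 + 1):
            total = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


# Synthetic test signal: a pure 1000 Hz tone sampled at 8000 Hz (assumed values).
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = spectrogram(signal)

# Find the strongest frequency bin in the first frame and convert it to hertz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), "frames;", "strongest frequency in first frame:", peak_hz, "Hz")
```

Plotting the magnitudes with time on one axis and frequency on the other, and mapping magnitude to color or shade, gives the familiar spectrogram image.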

What characteristics make up a unique vocal fingerprint?

What makes your voice unique, and well suited to authentication and validation technology, is the shape of your vocal cavity and the way your mouth moves when you speak. Sound has three main characteristics: pitch, volume and timbre. Timbre is what makes every voice unique. The vibration of the vocal cords, combined with the shaping done by the resonators, produces a unique set of vocal formants. These formants are the particular frequencies added to or subtracted from the vibration, and they are ultimately responsible for the uniqueness of each voice.

Technology experts, though, are still concerned about the overall security of using voiceprints for speaker recognition. Much research effort has gone into what are called “liveness detection” and “playback detection”. These are ways to ensure that a live person is speaking, rather than a voice recording or someone mimicking another person’s voice.

Again, it is a bold step to use voice for something as critical as validation and authentication. We take security measures every day: keys for our houses and cars, usernames and passwords for our computers and phones. But instead of using something you have or something you know, biometrics uses who you are to identify you: your face, fingerprints, irises or veins, or behavioral characteristics like your voice or handwriting. Unlike keys and passwords, your personal traits are extremely difficult to lose or forget. They are also very difficult to copy. It is not as easy as James Bond or Ethan Hunt make it out to be.

The beauty of voiceprint lies not just in its technological applications but also in the wonderful images created by each voice spectrogram sample.