Voice biometrics identifies speakers using only their vocal characteristics. The concept is similar to other well-known biometric technologies, such as fingerprint and face recognition. All of these methods rely on physiological identifiers that distinguish one individual from another. In voice biometrics, these identifiers are related to the shape of the vocal tract.
During enrollment of new speakers, the identifiers, also known as features, are extracted from several voice samples and are used to create a voice template, or voiceprint, which is stored in the system's database. The voice template describes the distribution of the features, but does not contain actual voice samples.
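The enrollment step can be illustrated with a minimal Python sketch. It assumes the features have already been extracted as fixed-length numeric vectors, and it summarizes their distribution with a per-dimension mean and variance. The `VoiceTemplate` class, the `enroll` function, and the sample numbers are hypothetical names and values chosen for illustration; a production system would likely use a richer statistical model.

```python
from dataclasses import dataclass
from statistics import fmean, pvariance
from typing import Tuple


@dataclass(frozen=True)
class VoiceTemplate:
    """Per-dimension summary of a speaker's features; holds no audio."""
    speaker_id: str
    mean: Tuple[float, ...]
    variance: Tuple[float, ...]


def enroll(speaker_id, feature_vectors, variance_floor=1e-6):
    """Build a template from feature vectors pooled over several voice samples."""
    if not feature_vectors:
        raise ValueError("enrollment needs at least one feature vector")
    # Transpose so that each tuple holds one feature dimension across all vectors.
    dims = list(zip(*feature_vectors))
    mean = tuple(fmean(d) for d in dims)
    # Floor the variance so a constant dimension cannot cause division by zero later.
    variance = tuple(max(pvariance(d), variance_floor) for d in dims)
    return VoiceTemplate(speaker_id, mean, variance)


# Hypothetical 2-dimensional features taken from several enrollment samples.
template = enroll("alice", [[1.0, 4.0], [3.0, 4.0], [2.0, 4.0]])
```

Only the summary statistics are kept in the template, which mirrors the point that the stored voiceprint describes the feature distribution rather than the recordings themselves.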
During verification, features are extracted from the test segment and compared with a single voice template or a set of voice templates. Each comparison yields a numerical score that indicates how likely it is that the speaker in the test segment is the same speaker who created the voice template. Comparing this score with a threshold yields a binary accept/reject decision. Repeating the process for several voice templates provides one-to-many identification results.
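The scoring and decision steps can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of any particular product: each template is assumed to be a (mean, variance) pair, the score is assumed to be the average log-likelihood of the test features under a diagonal Gaussian, and the function names, sample values, and threshold are all hypothetical.

```python
import math


def score(template, test_features):
    """Average log-likelihood of the test features under a diagonal Gaussian template."""
    mean, variance = template
    total = 0.0
    for vector in test_features:
        for x, m, v in zip(vector, mean, variance):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(test_features)


def verify(template, test_features, threshold):
    """One-to-one check: accept when the score reaches the threshold."""
    return score(template, test_features) >= threshold


def identify(templates, test_features, threshold):
    """One-to-many search: best-scoring enrolled speaker, or None if all are rejected."""
    best_id = max(templates, key=lambda sid: score(templates[sid], test_features))
    best_score = score(templates[best_id], test_features)
    return best_id if best_score >= threshold else None


# Hypothetical templates as (mean, variance) pairs; the numbers are made up.
templates = {
    "alice": ((2.0, 4.0), (1.0, 1.0)),
    "bob": ((8.0, 1.0), (1.0, 1.0)),
}
test_segment = [[2.0, 4.0], [2.0, 4.0]]
THRESHOLD = -5.0  # illustrative value only; a real threshold would be tuned

accepted = verify(templates["alice"], test_segment, THRESHOLD)
rejected = verify(templates["bob"], test_segment, THRESHOLD)
speaker = identify(templates, test_segment, THRESHOLD)
```

Raising the threshold makes false accepts less likely at the cost of more false rejects, so the value chosen reflects how the system weighs security against convenience.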