Voice Recognition System Identity Verification (VRSiv)

Jun 2021 | Research Papers


The 2017 Unisys Security Index1 survey showed that Filipinos are deeply concerned about identity theft and have a high level of concern about data security threats. Ninety-three percent (93%) of Filipinos are extremely or very concerned about unauthorised access to, or misuse of, personal information. “Approximately nine in ten Filipinos are seriously concerned about identity theft, credit card/debit fraud and computer security reflecting the very real threat that their personal details may be stolen and sold on the dark web – highlighted by last year’s breach of the Philippine voter registry database”, the Unisys report said.

Various protective measures have been suggested to prevent fraud and identity theft, but identity thieves keep finding ways to steal personal information from victims. Passwords for online accounts do not offer enough protection against identity theft and fraud because most passwords are so simple that they can be easily guessed (especially with social engineering methods) or broken by simple dictionary attacks2.

The limitations of password-based protection can be mitigated by incorporating biometric authentication. Biometrics refers to establishing identity based on the physical and behavioral characteristics (also known as traits or identifiers) of an individual such as face, fingerprint, hand geometry, iris, keystroke, signature, voice, etc.3 Biometric authentication is more reliable than traditional passwords because biometric traits cannot be lost or forgotten, they require the person being authenticated to be present, and they are difficult to forge.

Voice biometrics accounts for only 3% of the world's commercial biometric market4. However, speaker identification has a number of advantages and can be used to authorize access to many services and systems, such as voice dialing, telephone banking, shopping by phone, database access, voicemail, information services, access to restricted zones, access to computers, etc.5

Statement of the Problem

To circumvent the limitations of password-based protection, we will implement a voice recognition system that exploits the strengths of voice biometrics. The project aims to resolve mismatches in clients' identities through the implementation of this system.

Goal Statement

The objective of the project is to create a voice recognition system that can efficiently identify mismatches between clients' identities and the transactions they request. The AI software will monitor transactions by cross-referencing the ID presented with the corresponding client's identity stored in the system. This strengthens the system's security and, at the same time, safeguards the client's identity, so the program helps prevent fraudulent transactions and identity theft. Furthermore, the aim is to employ speaker recognition in remittance centers, in conjunction with other information protection measures, to protect clients from identity theft and from unauthorized transactions made on their behalf.
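The cross-referencing step described above can be sketched as follows. This is only a minimal illustration, not the project's implementation: it assumes that each enrolled client is represented by a fixed-length voiceprint vector, that a new voice sample has already been converted to a vector of the same length, and that cosine similarity with a fixed threshold decides the match. The names verify_claim and enrolled, the client IDs, the vectors, and the threshold of 0.8 are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_claim(enrolled, claimed_id, sample_embedding, threshold=0.8):
    """Return True if the voice sample matches the voiceprint of the claimed ID.

    `enrolled` maps client IDs to stored voiceprint vectors (hypothetical format).
    An unknown ID or a similarity below `threshold` counts as a mismatch.
    """
    voiceprint = enrolled.get(claimed_id)
    if voiceprint is None:
        return False
    return cosine_similarity(voiceprint, sample_embedding) >= threshold


# Toy data: the vectors are made up for illustration only.
enrolled = {
    "client-001": [0.9, 0.1, 0.3],
    "client-002": [0.1, 0.8, 0.5],
}
matching_sample = [0.88, 0.12, 0.31]    # close to client-001's voiceprint
mismatched_sample = [0.1, 0.79, 0.52]   # close to client-002's voiceprint

print(verify_claim(enrolled, "client-001", matching_sample))
print(verify_claim(enrolled, "client-001", mismatched_sample))
```

In this sketch, a transaction would be flagged whenever verify_claim returns False for the ID presented. In practice the threshold would have to be tuned on real enrollment data.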

Literature Review

Scientific benchmarking

Speaker recognition methods can be divided into two main categories6: text-dependent, where speaker recognition is performed on the basis of a specified word or phrase, and text-independent, where recognition is based on the characteristics of speech regardless of what is spoken. Text-dependent methods are typically based on DTW (dynamic time warping) or HMM (hidden Markov model)5. Text-independent methods usually apply one of two approaches5: Vector Quantization (VQ) or Gaussian mixture model (GMM).
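To make the text-dependent case more concrete, the sketch below shows the classic dynamic time warping recurrence on two short sequences. It is a simplified illustration under stated assumptions: each frame is a single number and the local cost is the absolute difference, whereas real systems typically compare multi-dimensional feature vectors per frame. The function name dtw_distance and the sample sequences are hypothetical.

```python
def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of numbers.

    cost[i][j] holds the minimal accumulated cost of aligning the first i
    frames of seq_a with the first j frames of seq_b.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance seq_a only
                cost[i][j - 1],      # advance seq_b only
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


template = [1, 2, 3]          # enrolled utterance (toy values)
slower_repeat = [1, 2, 2, 3]  # same pattern spoken more slowly
other_phrase = [2, 3, 4]      # a different pattern

print(dtw_distance(template, slower_repeat))
print(dtw_distance(template, other_phrase))
```

The slower repetition aligns with the template at zero cost because DTW allows one frame to match several frames of the other sequence, which is why the technique tolerates differences in speaking rate.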

CNN-based approaches7 and i-vector approaches8 have also become popular in recent years as the de facto state-of-the-art speaker recognition algorithms9. These two newer approaches achieve better accuracy than traditional speaker recognition algorithms, as presented in the table below.

Industry Application

In the commercial biometrics market, only 3% is captured by voice biometrics. Several companies offer speaker recognition and identification software, but its accuracy is limited by the low distinctiveness and permanence of voice among speakers3, despite the high acceptability of voice as an authentication method among people. However, when voice is used in conjunction with other biometric technologies such as face recognition, the accuracy of authentication and recognition of individuals increases significantly.
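One simple way to combine voice with another biometric is score-level fusion, sketched below. This is an illustrative assumption rather than a method taken from the vendors discussed here: it supposes that a voice matcher and a face matcher each return a similarity score between 0 and 1, and it combines them with a weighted sum. The function names, the weight of 0.4, and the threshold of 0.7 are hypothetical.

```python
def fuse_scores(voice_score, face_score, voice_weight=0.4):
    """Weighted-sum fusion of two match scores, each assumed to lie in [0, 1]."""
    if not 0.0 <= voice_weight <= 1.0:
        raise ValueError("voice_weight must be between 0 and 1")
    return voice_weight * voice_score + (1.0 - voice_weight) * face_score


def accept(voice_score, face_score, threshold=0.7, voice_weight=0.4):
    """Accept the identity claim if the fused score reaches the threshold."""
    return fuse_scores(voice_score, face_score, voice_weight) >= threshold


# A borderline voice score is accepted only when the face score supports it.
print(accept(voice_score=0.6, face_score=0.9))
print(accept(voice_score=0.6, face_score=0.3))
```

Giving voice the lower weight reflects the point made above that voice is less distinctive than some other traits. Any real deployment would need to choose the weight and threshold from measured error rates.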

Microsoft Azure Speaker Authentication11 is a US-only speaker recognition service offered by Microsoft. It provides voice recognition and voice authentication tools and is relatively cheaper than the offerings of other companies.

Voice Biometric Group12 specializes in fraud detection in call centers. It offers flexible payment schemes based on client expectations.

VoiceIt13 offers the following features: unlimited API calls; an SLA with 99.95% uptime; anti-spoofing technology; enhanced support by email, phone, and Slack; wrappers and SDKs; and liveness detection.