Creating an effective speech recognition system is a complex undertaking. Language, grammar, accents, input devices, voice and ambient noise levels are a few of the prime factors that affect performance. Word recognition accuracy is only part of the challenge. Often overlooked, but critical to performance, are subtler factors such as the situational context in which the words were spoken and the consequences of misrecognition. These are major reasons why many commercially available recognizers perform poorly when applied to simulation or command and control environments. Tools that enable users to edit and create their own phraseologies are another important consideration in selecting the right system. Click the link for a more thorough discussion of requirements. The examples used refer to an air traffic control simulation, but the premise is equally relevant to other environments.
Not just another speech engine: Lexix is a suite of speech recognition products and services designed for the unique needs of simulation and system control applications. Lexix achieves consistently high recognition rates under noisy and otherwise very human conditions, but the Lexix advantage goes well beyond a recognition engine and a text-to-speech response system. Lexix makes implementing and maintaining an accurate, robust speech system easy. It employs modern graphical user interfaces and open W3C standards, supports any SAPI 5.x voice, and obtains superior results by compensating for inconsistent audio signals and poor microphone technique. The system is speaker independent, supports ultra-large voice models and grammar files, can be updated for accents and additional languages, and provides tools for grammar editing. Lexix can be acquired as a complete system or as individual components.
The Lexix SDK comes with all the tools necessary to implement a robust speech system, adding a Voice User Interface (VUI) to any system controls. Bear in mind that the end user must define the domain-specific grammar, phonetic dictionary and associated synthetic responses for their application, and their software must incorporate the executable actions that the Lexix speech system will trigger. Adacel’s support team can help to define the grammar, integrate the communication behavior and merge it with the existing entity actions in the third-party system. The SDK includes a Lexix ASR run-time, the Lexix Command Audio, a sample grammar (ideal for cockpit and UAV control), the Lexix Dialog Editor to support adding new phraseology, an API to integrate Lexix with the simulator or operational device, sample code to serve as a development guide, a Unity3D sample program, documentation, and an integration guide.
Lexix ASR is a high-accuracy recognizer providing a voice user interface for simulation and control systems. It supports multiple grammars and can assume many roles simultaneously. The ASR is the component that executes the actual recognition task. It processes audio input to produce a recognition hypothesis that identifies the actions to be executed by the attached device. Integrated grammar files list the supported phrases, and a dictionary provides phonetic representations of the words in those phrases. The acoustic model provides the ASR with the speaking characteristics of the target user group. Post-recognition processing refines the recognition hypothesis with intelligent context-based error reduction and in turn generates any text-to-speech response. Lexix has tools to optimize the system to handle noise, unique pronunciations, non-native English accents and out-of-grammar phrases.
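The relationship between grammar files and the phonetic dictionary can be illustrated with a small consistency check. The sketch below is not the Lexix file format or API: the phrases, the ARPAbet-style pronunciations and the function name `missing_pronunciations` are all assumptions made up for illustration. It only shows the general idea that every word in a supported phrase needs at least one phonetic entry.

```python
# Hypothetical sample data: phrases and pronunciations are illustrative only
# and do not reflect any actual Lexix grammar or dictionary file.
GRAMMAR = [
    "turn left heading zero niner zero",
    "climb flight level two four zero",
]

DICTIONARY = {
    "turn": ["T ER N"],
    "left": ["L EH F T"],
    "heading": ["HH EH D IH NG"],
    "zero": ["Z IH R OW"],
    "climb": ["K L AY M"],
    "flight": ["F L AY T"],
    "level": ["L EH V AH L"],
    "two": ["T UW"],
    "four": ["F AO R"],
    # "niner" is deliberately left out to show what the check reports.
}


def missing_pronunciations(grammar, dictionary):
    """Return the sorted words used in grammar phrases that lack a phonetic entry."""
    words = {word for phrase in grammar for word in phrase.lower().split()}
    return sorted(word for word in words if not dictionary.get(word))


if __name__ == "__main__":
    print(missing_pronunciations(GRAMMAR, DICTIONARY))
```

Running a check of this kind after each grammar edit is one way to catch domain words, such as "niner", that a general-purpose dictionary may not contain.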
Lexix Command Audio (LCA) is a powerful system that automatically optimizes audio input to enhance recognizer performance. In typical speech applications, individual characteristics such as soft voices in noisy environments, microphone placement, voice frequency and register are not compensated for in the input signal. When a computer is the receiving entity, poor signal quality leads to unpredictable results. Worse than the dreaded "say again", the system could produce a false positive, recognizing and implementing an unrelated command and creating chaos. Through Adacel’s patented technology, LCA automatically compensates for background noise, microphone placement and poor push-to-talk technique, and eliminates low volume and clipping conditions. The resulting clean signal generates superior recognition performance. LCA can be used in real time or in batch mode on saved audio files.
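To make the terms "low volume" and "clipping" concrete, the following minimal sketch flags both conditions in a block of samples and applies a simple peak normalization. It is not Adacel's patented LCA method, which the text does not describe. The function names, the thresholds and the assumption that samples are floats in the range -1.0 to 1.0 are all illustrative choices.

```python
import math


def analyze(samples, clip_level=0.99, quiet_rms=0.05):
    """Flag clipping and low volume in a block of float samples in [-1.0, 1.0].

    clip_level and quiet_rms are illustrative thresholds, not tuned values.
    """
    if not samples:
        return {"rms": 0.0, "clipped": False, "too_quiet": True}
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    clipped = any(abs(s) >= clip_level for s in samples)
    return {"rms": rms, "clipped": clipped, "too_quiet": rms < quiet_rms}


def normalize(samples, target_peak=0.9):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    quiet_block = [0.01, -0.02, 0.015]
    print(analyze(quiet_block))
    print(analyze(normalize(quiet_block)))
```

Peak normalization alone cannot repair a signal that has already clipped, and it amplifies background noise along with the voice, which is why a production system needs considerably more than this sketch.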
Speech recognition systems require grammar files that provide all the phraseology to be recognized. The Lexix Dialog Editor tools enable the user to easily modify or add variations in terminology to the supported grammar and associate them with their system’s functional commands. This includes edits to the corresponding computer-spoken responses. The editor also permits the addition of local spoken names, for example vehicle callsigns and geographic points, without any system software changes. The editing tools also allow the user to implement multiple variations of a supported command linked to the same action. Likewise, multiple phonetic spellings can be added to help the recognition system understand alternate pronunciations and unique or domain-specific words. Changes can be made in less than a minute and can be performed and saved offline or while a scenario is running.
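The idea of several phrase variants linked to one action, and several phonetic spellings linked to one word, can be sketched as a pair of lookup tables. The class below is a hypothetical illustration, not the Lexix Dialog Editor's data model: `PhraseBook`, its method names, the action name `GO_AROUND` and the sample pronunciations are all assumptions.

```python
class PhraseBook:
    """Hypothetical in-memory store for phrase variants and pronunciations."""

    def __init__(self):
        self._actions = {}         # normalized phrase -> action name
        self._pronunciations = {}  # lowercase word -> list of phonetic spellings

    @staticmethod
    def _normalize(phrase):
        # Lowercase and collapse repeated whitespace so variants compare equal.
        return " ".join(phrase.lower().split())

    def add_variants(self, action, phrases):
        """Link every phrase in phrases to the same action."""
        for phrase in phrases:
            self._actions[self._normalize(phrase)] = action

    def add_pronunciation(self, word, phonemes):
        """Add one more phonetic spelling for word."""
        self._pronunciations.setdefault(word.lower(), []).append(phonemes)

    def lookup(self, phrase):
        """Return the action for phrase, or None if it is not supported."""
        return self._actions.get(self._normalize(phrase))

    def pronunciations(self, word):
        """Return a copy of the phonetic spellings recorded for word."""
        return list(self._pronunciations.get(word.lower(), []))


if __name__ == "__main__":
    book = PhraseBook()
    book.add_variants("GO_AROUND", ["go around", "execute missed approach"])
    book.add_pronunciation("niner", "N AY N ER")
    book.add_pronunciation("niner", "N AY N AH")
    print(book.lookup("Execute missed approach"))
    print(book.pronunciations("niner"))
```

Because the tables are plain data, new variants or callsigns can be added while the program is running, which mirrors the editing workflow described above without requiring any change to the software itself.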
Careful attention must be paid to how words are used in a particular domain. The speech system must determine a reasonable result true to the intent of the spoken words, then implement the intended action. The key issue in speech recognition is not the accuracy against supported phrases; it is the system's performance when it is given unsupported phraseology, which leads to misrecognition and false positive behaviors. Recognizer performance can be improved by tuning voice models for various accents and specific background noise, enhancing the grammar with commonly used alternate phrases, and adding more phonetic pronunciations of unique words to the dictionary. In addition, a Contextual Post Processor using well-defined rules can preclude nonsensical results. Beyond customization and optimization for specific applications, Adacel has an experienced engineering team available for integration assistance.
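A rule-based post-processing step of the kind mentioned here can be sketched as a filter over ranked recognition hypotheses. This is a generic illustration under stated assumptions, not the Lexix Contextual Post Processor: the function `post_process`, the confidence threshold, the command names and the `airborne` state flag are all hypothetical.

```python
def post_process(hypotheses, state, rules, min_confidence=0.6):
    """Return the best hypothesis that is confident and valid in context.

    hypotheses: list of (command, confidence) pairs from a recognizer.
    rules: maps a command to a predicate on the current state; commands
           without a rule are treated as always valid (an assumption).
    Returns None when nothing qualifies, i.e. the caller should ask the
    speaker to say again rather than act on a doubtful result.
    """
    ranked = sorted(hypotheses, key=lambda item: item[1], reverse=True)
    for command, confidence in ranked:
        if confidence < min_confidence:
            break  # every remaining hypothesis is even less confident
        rule = rules.get(command)
        if rule is None or rule(state):
            return command
    return None


# Hypothetical rules: each command is only sensible in some situations.
RULES = {
    "TAKEOFF": lambda state: not state["airborne"],
    "LOWER_GEAR": lambda state: state["airborne"],
}

if __name__ == "__main__":
    candidates = [("TAKEOFF", 0.82), ("LOWER_GEAR", 0.78)]
    # The top-ranked hypothesis is nonsensical for an aircraft already in
    # flight, so the rule rejects it and the next candidate is chosen.
    print(post_process(candidates, {"airborne": True}, RULES))
```

Returning `None` instead of the least bad candidate reflects the priority stated above: a request to repeat the command is preferable to a false positive.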