In this paper we present two new techniques that have been used to build a large-vocabulary continuous Hindi speech recognition system. We present a technique for fast bootstrapping of initial phone models of a new language. The training data for the new language is aligned using an existing speech recognition engine for another language. This aligned data is used to obtain the initial acoustic models for the phones of the new language. Following this approach requires less training data. We also present a technique for generating baseforms (phonetic spellings) for phonetic languages such as Hindi. As is inherent in phonetic languages, rules generally capture the mapping of spelling to phonemes very well. However, deep linguistic knowledge is required to write all possible rules, and there are some ambiguities in the language that are difficult to capture with rules. On the other hand, pure statistical techniques for baseform generation require large amounts of training data, which is not readily available. We propose a hybrid approach that combines rule-based and statistical approaches in a two-step fashion. We evaluate the performance of the proposed approaches through various phonetic classification and recognition experiments.

Large vocabulary continuous speech recognition (LVCSR) systems have advanced significantly due to the ability to handle extremely large problem spaces in fairly small amounts of memory. [...] the search problem, discusses in detail a typical implementation of a search engine, and demonstrates the efficacy of this approach on a range [...]. The approach presented is scalable across a wide range of applications. It is designed to address research needs, where a premium is placed on the flexibility of the system architecture, and the needs of application prototypes, which require near-real-time speed without a great sacrifice in word error rate (WER). [...] researchers is the development of real-time systems. With only minor degradations in performance (typically, no more than a 25% increase in WER), the systems described in this article can be transformed into [...]. First, more intelligent pruning algorithms that prune the search space more heavily are required. Look-ahead and N-best strategies at all levels of the system are key to achieving such large reductions in the search space. Second, systems that perform a quick search using a simple system, and then rescore only the N-best resulting hypotheses using better models, are very popular for real-time implementation. Third, since much of the computation in these systems is devoted to acoustic model processing, fast-matching strategies within the acoustic model are important. Finally, since Gaussian evaluation at each state in the system is a major consumer of CPU time, vector quantization-like approaches that enable one to compute only a small number of Gaussians per frame are [...]. In some sense, the Viterbi (1967) based system presented represents only one path through this continuum of recognition [...].

The goal is to develop a system that recognizes system commands given by voice and converts them into equivalent text: the system accepts voice commands from the user and displays the equivalent text. It accepts a voice command and processes it to recognize the actual command before displaying the corresponding output. For this system, the processing consists of noise removal, feature extraction, and pattern matching. These steps are entirely application dependent, i.e. a particular feature is extracted for a particular application. After this processing, the text form of the voice command is displayed. To enter voice commands, the user must use a good-quality microphone. Each voice command is recorded and saved as a .wav file; the WAV format is used because it stores the data in digital form. Initially, the features of each command are saved in a file. Once 'init' is recognized, the system waits for the user's commands. On receiving a command, the system saves the input as a .wav file. The features of this command are then matched against the predefined command features. If a match is found, the command is valid. If the command is not valid, the system simply discards it.
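The matching of a spoken command's features against the predefined command features can be sketched roughly as follows. This is only an illustration under stated assumptions, not the system's actual method: the post does not say which features or which distance measure are used, so the three-element feature vectors, the Euclidean distance, the `max_distance` threshold, and the names `COMMAND_FEATURES` and `match_command` are all hypothetical. Noise removal and feature extraction from the .wav file are assumed to have already happened.

```python
import math
from typing import Dict, List, Optional

# Hypothetical stored templates: one feature vector per known command.
# In the described system these would be read from the saved features file.
COMMAND_FEATURES: Dict[str, List[float]] = {
    "init": [0.9, 0.1, 0.3],
    "open": [0.2, 0.8, 0.5],
    "close": [0.1, 0.4, 0.9],
}


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_command(
    features: List[float],
    templates: Dict[str, List[float]],
    max_distance: float = 0.25,  # assumed acceptance threshold
) -> Optional[str]:
    """Return the closest known command, or None if nothing is close enough.

    Returning None corresponds to discarding an invalid command.
    """
    best_name: Optional[str] = None
    best_dist = math.inf
    for name, template in templates.items():
        dist = euclidean(features, template)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


if __name__ == "__main__":
    # A vector close to the "init" template is accepted.
    print(match_command([0.88, 0.12, 0.31], COMMAND_FEATURES))
    # A vector far from every template is discarded (prints None).
    print(match_command([0.5, 0.5, 0.5], COMMAND_FEATURES))
```

With these made-up templates, the first call returns `init` and the second returns `None`, which mirrors the valid/discard decision described above. A real system would likely use richer features and a threshold tuned on recorded data.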