Sub-field of computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers.
About Speech Recognition
Speech recognition is the task of detecting spoken words, but it involves more than recognizing individual sounds in the audio: sequences of sounds need to match existing words, and sequences of words should make sense in the language. Modelling which word sequences are plausible is called “language modelling.” Language models are typically trained on very large corpora of text, often orders of magnitude larger than the acoustic data.
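To make the role of the language model concrete, here is a minimal sketch in Python of how word-sequence plausibility can be scored. It is not taken from SpeechLab, Kaldi or any other tool listed on this page: the tiny corpus, the candidate transcripts and the function names (train_bigram_model, log_probability, pick_best) are all hypothetical, and the bigram model with add-one smoothing is only one simple way to build a language model. A real recognizer would also combine this score with an acoustic score, which is left out here.

```python
import math
from collections import Counter

# Hypothetical toy corpus; real language models are trained on very large text corpora.
CORPUS = [
    "it is easy to recognize speech",
    "we recognize speech with a computer",
    "the computer can recognize speech",
    "it is a nice day at the beach",
]


def train_bigram_model(sentences):
    """Count single words and adjacent word pairs, with sentence boundary markers."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_probability(sentence, unigrams, bigrams):
    """Log probability of a sentence under the bigram model with add-one smoothing."""
    # Vocabulary size is taken from the training data only (a simplification).
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen word pairs from getting zero probability.
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total


def pick_best(candidates, unigrams, bigrams):
    """Return the candidate transcript the language model finds most plausible."""
    return max(candidates, key=lambda s: log_probability(s, unigrams, bigrams))


UNIGRAMS, BIGRAMS = train_bigram_model(CORPUS)

# Two hypothetical transcripts that sound alike but differ in plausibility.
CANDIDATES = [
    "it is easy to recognize speech",
    "it is easy to wreck a nice beach",
]
BEST = pick_best(CANDIDATES, UNIGRAMS, BIGRAMS)
print(BEST)
```

Both candidates could match similar sounds, but the model prefers the first because every word pair in it was seen in the training text, while the second contains several unseen pairs.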
Whilst speech recognition has been around for decades, recent advances in deep learning have finally made it accurate enough to be useful outside of carefully controlled environments. Speech recognition is now built into our phones, game consoles and smart watches. It is even automating our homes.
Common Tools and Libraries
AI Speech Lab
SpeechLab technology is available as a service for both batch and near-real-time processing. Please contact AI Singapore for further information.
Kaldi GStreamer: https://github.com/jcsilva/docker-kaldi-gstreamer-server
Developer's Resource: https://github.com/Picovoice/Porcupine
Azure Cognitive Services
Developer's Resource: http://boofcv.org/index.php?title=Download
Developer's Resource: https://github.com/PaddlePaddle/DeepSpeech
100E Use Cases
- SCDF – Uses SpeechLab technology for verbatim transcription of calls, so that call-takers can focus on listening rather than typing, and for transcription into English, so that call-takers can better understand the conversation. Transcripts can also be used for further analysis.
- Socibot – This AI demonstration platform will use SpeechLab technology, integrated with an Azure Cognitive Services knowledge base, to answer questions from local Singaporeans more accurately. Socibot also uses Porcupine for wake word detection to reduce latency and improve user experience.
- How to start with Kaldi and Speech Recognition
Link to article: https://towardsdatascience.com/how-to-start-with-kaldi-and-speech-recognition-a9b7670ffff6
- Simple guide to Kaldi – an efficient open source speech recognition tool for extreme beginners
Link to article: https://medium.com/@nikhilamunipalli/simple-guide-to-kaldi-an-efficient-open-source-speech-recognition-tool-for-extreme-beginners-98a48bb34756
- Creating voice assistant for games tutorial for Fifa
Link to article: https://towardsdatascience.com/creating-voice-assistant-for-games-tutorial-for-fifa-71cfbe428bd1