Category: Speech recognition
-
FFmpeg installation
FFmpeg is a set of open source programs for recording and converting digital audio and video and turning them into streams, released under the LGPL or GPL license. It provides a complete solution for recording, converting, and streaming audio and video. There are four ways to install ffmpeg, namely apt installation, precompiled version […]
-
ASRPRO voice chip programming tutorial based on the Tianwen Block programming environment (I): software download and basic program statements
The ASRPRO chip is a general-purpose, portable, low-power, high-performance speech recognition chip developed for low-cost offline speech applications. It adopts third-generation BNPU technology, supports neural networks such as DNN, TDNN, and RNN as well as convolution operations, and supports speech recognition, voiceprint recognition, speech enhancement, speech detection, etc. It also has strong echo cancellation and ambient noise suppression capabilities. This […]
-
Integrating Vosk with Spring Boot for simple speech recognition
Vosk open source speech recognition: Vosk is an open source speech recognition toolkit. Its features include: support for nineteen languages (Chinese, English, Indian English, German, French, Spanish, Portuguese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Persian, Filipino, Ukrainian, Kazakh); offline operation on mobile devices (Raspberry Pi, Android, iOS). Install it with […]
-
Summary of speech coding techniques: AMR, AMR-NB, AMR-WB, EVS
I’ve recently become interested in real-time speech coding technology, so I read up on it. I first heard about AMR-NB narrowband coding, and a search turned up more coding techniques, which are summarized here for future reference. I. What are AMR and AMR-WB? The full names are Adaptive Multi-Rate and Adaptive Multi-Rate Wideband, mainly […]
-
OpenAI’s Artificial Intelligence Speech Recognition Model Whisper: Explanation and Usage
1. Introduction to Whisper: OpenAI, the company behind the ChatGPT language model, has open-sourced the Whisper automatic speech recognition system, and OpenAI states that Whisper approaches human-level robustness and accuracy in English speech recognition. Whisper is a general-purpose speech recognition model trained using a large amount of multilingual and multi-task supervised data, capable of achieving near-human […]
-
Speech recognition in action (Python code)
Speech recognition in action (Python: pyttsx, SAPI, SpeechLib example code) (I). Table of contents for this article: I. Basic Principles of Speech Recognition: (1) The origin and development of speech recognition; (2) Basic principles of speech recognition; (3) Speech recognition process; (4) Recent developments in speech recognition. II. Python Speech Recognition: (1) Text-to-speech conversion […]
-
Librosa Library – Speech Recognition, Speech Tone Recognition Training and Applications
Many students think that speech recognition is very difficult, but it is not. I thought so too at first, but later found it to be quite easy, because, as students may not know, Python has an audio processing library called Librosa. This library is very powerful and can handle audio processing, spectrogram representation, amplitude conversion, time-frequency conversion, feature […]
-
Introduction to JavaCV
Understand the history and development background of JavaCV: JavaCV is an open source Java framework that provides Java-based interfaces for accessing various computer vision libraries and toolkits such as OpenCV and FFmpeg. JavaCV is designed to provide Java developers with fast, simple and reliable image and video processing capabilities. The history of JavaCV dates back […]
-
Parameter description of the speech recognition model Whisper
I. Introduction to Whisper: Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitask model that performs multilingual speech recognition, speech translation, and language identification. II. Parameters of whisper: 1. -h, --help: view the parameters of whisper. 2. --model {tiny.en,tiny,base.en,base,small.en,small,medium.en,medium,large-v1,large-v2,large}: Select the model to […]
-
STM32F103 Driving LD3320 Speech Recognition Module
Contents: LD3320 Speech Recognition Module Introduction; Module Pin Definitions; STM32F103ZET6 development board and module wiring; Test code; Results. LD3320 Speech Recognition Module Introduction: Based on the LD3320, voice recognition, voice control, and human-machine dialogue functions can be easily implemented in any electronic product, even the simplest system with a 51-series (8051) microcontroller as the main controller. Add VUI (Voice User Interface) […]