speech recognition - Develop Pile

Category: speech recognition

ffmpeg installation

Time: 2024-5-28

FFmpeg is a suite of open source programs for recording, converting, and streaming digital audio and video. It is licensed under the LGPL or GPL. There are four ways to install ffmpeg, namely apt installation, precompiled version […]

Tags: artifact, speech recognition
ASRPRO voice chip programming tutorial in the Tianwen Block programming environment (I): software download and basic program statements

Time: 2024-5-10

The ASRPRO chip is a general-purpose, portable, low-power, high-performance speech recognition chip developed for low-cost offline speech applications. It adopts third-generation BNPU technology, supports neural networks such as DNN, TDNN, and RNN as well as convolution operations, and supports speech recognition, voiceprint recognition, speech enhancement, speech detection, and more. It also has strong echo cancellation and ambient noise suppression capabilities. This […]

Tags: embedded hardware, microcontroller, speech recognition
Integrating Vosk with Spring Boot to implement simple speech recognition

Time: 2024-5-9

Vosk is an open source speech recognition toolkit. It supports nineteen languages: Chinese, English, Indian English, German, French, Spanish, Portuguese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Persian, Filipino, Ukrainian, and Kazakh. It works offline on mobile and embedded devices: Raspberry Pi, Android, iOS. Install it with […]

Tags: back end, speech recognition, spring boot
Speech coding techniques: a summary of AMR, AMR-NB, AMR-WB, and EVS

Time: 2024-4-16

I’ve recently become interested in real-time speech coding technology, so I read up on it. I first heard about AMR-NB narrowband coding, and searching for it turned up more coding techniques, which are summarized here for future reference. I. What are AMR and AMR-WB? Their full names are Adaptive Multi-Rate and Adaptive Multi-Rate Wideband, mainly […]

Tags: artificial intelligence (ai), audio development, ffmpeg, speech recognition
OpenAI’s Speech Recognition Model Whisper: Explanation and Usage

Time: 2024-3-26

1. Introduction to Whisper. OpenAI, the company behind the ChatGPT language model, has open-sourced the Whisper automatic speech recognition system, and OpenAI states that Whisper approaches human-level robustness and accuracy on English speech recognition. Whisper is a general-purpose speech recognition model trained using a large amount of multilingual and multi-task supervised data, capable of achieving near-human […]

Tags: ai digital human technology, deep learning, pytorch, speech recognition, whisper
Speech recognition in action (python code)

Time: 2024-2-14

Speech recognition in action (Python: pyttsx, SAPI, SpeechLib example code) (I). Table of contents for this article: I. Basic principles of speech recognition: (1) The origin and development of speech recognition; (2) Basic principles of speech recognition; (3) Speech recognition process; (4) Recent developments in speech recognition. II. Python speech recognition: (1) Text-to-speech conversion […]

Tags: artificial intelligence (ai), deep learning, development language, python, python applications, speech recognition
Librosa Library – Speech Recognition, Speech Tone Recognition Training and Applications

Time: 2024-2-10

Many students think that speech recognition is very difficult, but it is not. I thought so too at first, but later found that speech recognition is actually quite approachable, because students may not know that Python has an audio processing library called Librosa. This library is very powerful and can perform audio processing, spectrogram representation, amplitude conversion, time-frequency conversion, feature […]

Tags: artificial intelligence (ai), deep learning, speech recognition
Introduction to javacv

Time: 2024-2-5

Understand the history and development background of JavaCV. JavaCV is an open source Java framework that provides Java-based interfaces for accessing various computer vision libraries and toolkits such as OpenCV and FFmpeg. JavaCV is designed to provide Java developers with fast, simple, and reliable image and video processing capabilities. The history of JavaCV dates back […]

Tags: java, javacv, opencv, sound and video, speech recognition, video codec
Parameter description of the speech recognition model Whisper

Time: 2024-1-9

I. Introduction to Whisper: Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that performs multilingual speech recognition, speech translation, and language identification. II. Parameters of whisper: 1. -h, --help: view the parameters of whisper. 2. --model {tiny.en,tiny,base.en,base,small.en,small,medium.en,medium,large-v1,large-v2,large}: select the model to […]

Tags: colloquial (rather than literary) pronunciation of a chinese character, openai, parameters, whisper, writing style
STM32F103 Driving LD3320 Speech Recognition Module

Time: 2023-10-10

Contents: LD3320 Speech Recognition Module Introduction; Module Pin Definitions; STM32F103ZET6 development board and module wiring; Test code; Results. LD3320 Speech Recognition Module Introduction: Based on the LD3320, voice recognition, voice control, and human-machine dialogue functions can be easily realized in any electronic product, even the simplest system with an 8051-series (51) microcontroller as the main controller. Add VUI (Voice User Interface) […]

Tags: electronic module testing, microcontroller, speech recognition, stm32, stm32 column, transducers