
Audio Transcription in Python

Convert an audio file into a text file using Speech Recognition

5 min read · Aug 22, 2022

In this article, you will record audio from a microphone, save it to a file, and then transcribe that file to text using speech recognition.

Audio is sound within the range the human ear can hear, roughly 20 Hz to 20 kHz. Speech, music, or any other sound is captured with a microphone and saved to an audio file, which stores the signal in a digital format. Speech recognition processes that audio data to identify the spoken words and convert them into readable text.
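To make "stores the audio signals in a digital format" concrete, here is a minimal sketch that uses only Python's standard `wave` module. It synthesizes a one-second tone instead of recording from a microphone, so the sample rate, tone frequency, and the helper names `write_tone` and `describe` are assumptions chosen for illustration, not part of the original walkthrough.

```python
import math
import struct
import wave

SAMPLE_RATE = 16_000  # samples per second (assumed; a common rate for speech)
DURATION_S = 1.0      # length of the synthetic clip in seconds
FREQ_HZ = 440.0       # tone frequency (assumed, stands in for real speech)


def write_tone(path):
    """Write a mono, 16-bit PCM sine tone to a WAV file at `path`."""
    n_frames = int(SAMPLE_RATE * DURATION_S)
    # Each sample is the signal amplitude at one instant, scaled to a 16-bit integer.
    samples = (
        int(0.5 * 32767 * math.sin(2 * math.pi * FREQ_HZ * i / SAMPLE_RATE))
        for i in range(n_frames)
    )
    frames = b"".join(struct.pack("<h", s) for s in samples)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)            # mono
        wf.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wf.setframerate(SAMPLE_RATE)
        wf.writeframes(frames)


def describe(path):
    """Read back the parameters stored in the WAV header."""
    with wave.open(path, "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate": wf.getframerate(),
            "frames": wf.getnframes(),
        }
```

Calling `write_tone("tone.wav")` followed by `describe("tone.wav")` returns the channel count, sample width, sample rate, and number of frames, which together describe how the sound was digitized.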

Basic steps of Speech Recognition (Inspired by: Understanding the CMU Sphinx Speech Recognition System)

The bit layout of the audio data, excluding metadata, is called the audio coding format. The audio coding format can be uncompressed, or it can be compressed to reduce the file size, often using lossy compression.
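As a rough illustration of why compression matters, the size of uncompressed PCM audio follows directly from its sample rate, bit depth, channel count, and duration. The helper name `pcm_size_bytes` below is hypothetical, and the figure covers the audio payload only, not any file header or metadata.

```python
def pcm_size_bytes(sample_rate, bit_depth, channels, seconds):
    """Size of raw, uncompressed PCM audio in bytes (payload only, no header)."""
    bytes_per_sample = bit_depth // 8
    return int(sample_rate * bytes_per_sample * channels * seconds)


# One minute of CD-quality audio: 44.1 kHz, 16-bit, stereo.
cd_minute = pcm_size_bytes(44_100, 16, 2, 60)
print(cd_minute)  # 10584000 bytes, a little over 10 MB
```

A lossy codec can shrink this considerably by discarding detail, at some cost in sound quality.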

A codec encodes and decodes the raw audio data, and the encoded data is stored in a container.

The different audio file formats based on size, sound quality, and compatibility are broadly divided into

Uncompressed audio formats include WAV and AIFF, both of which typically store PCM data. WAV and AIFF are widely supported and hold the same kind of uncompressed PCM audio that is used on compact discs. Uncompressed audio formats…

Written by Renu Khandelwal
