Once upon a time, an angry student came to Master Coffee. She complained about “stupid rules disallowing taking videos as notes during lectures in the digital age”, and Master Coffee offered her a “smarter than the rules” solution – Transcribe audio to text in class.
Yes, it’s the digital age. There are many technologies to solve stupid problems and restrictions. The rules didn’t say it’s illegal to transcribe audio in class, so let Master Coffee walk you through a simple example – Let’s go.
CODE DOWNLOAD
I have released this under the MIT license, feel free to use it in your own project – Personal or commercial. Some form of credits will be nice though. 🙂
TRANSCRIBE AUDIO DEMO
Click the button to start, and speak something into the mic. Take note that this requires the Speech Recognition API, which is only supported in some browsers at the time of writing.
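Since support varies between browsers, it may help to check for the API before using it. Below is a minimal sketch of such a check. The function name getSpeechRecognition is made up for this example, and the window object is passed in as a parameter so the check can also be tried with a stand-in object.

```javascript
// Returns the speech recognition constructor, or null if there is none.
// "win" is normally the browser's window object.
// getSpeechRecognition is a hypothetical helper name, not part of any API.
function getSpeechRecognition (win) {
  // Some browsers only expose the prefixed webkitSpeechRecognition.
  return win.SpeechRecognition || win.webkitSpeechRecognition || null;
}

// Possible usage in the page (browser only):
// const SR = getSpeechRecognition(window);
// if (SR === null) { alert("Speech recognition is not supported."); }
```

If the function returns null, it is probably best to keep the toggle button disabled and tell the user to try another browser.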
1) THE HTML
<textarea id="result"></textarea>
<input type="button" id="toggle" value="Loading" onclick="transcribe.toggle()" disabled>
- <textarea> To output the speech-to-text result.
- <input type="button"> A button to toggle start/stop transcribe.
2) THE JAVASCRIPT
var transcribe = {
  // (PART A) PROPERTIES & FLAGS
  hres : null, // html textarea
  htog : null, // html toggle button
  sr : null, // speech recognition object
  listening : false, // speech recognition in progress

  // (PART B) INIT
  init : () => {
    // (B1) GET HTML ELEMENTS
    transcribe.hres = document.getElementById("result");
    transcribe.htog = document.getElementById("toggle");
    transcribe.htog.value = "Click to start";
    transcribe.htog.disabled = false;

    // (B2) INIT SPEECH RECOGNITION
    const SR = window.SpeechRecognition || window.webkitSpeechRecognition;
    if (!SR) {
      // browser has no speech recognition - keep the button disabled
      transcribe.htog.value = "Not supported";
      transcribe.htog.disabled = true;
      return;
    }
    transcribe.sr = new SR();
    transcribe.sr.lang = "en-US";
    transcribe.sr.continuous = true;
    transcribe.sr.interimResults = false;

    // (B3) OUTPUT RESULT
    transcribe.sr.onresult = e => {
      // take the latest result, capitalize the first letter, add a full stop
      let said = e.results[e.results.length-1][0].transcript.trim();
      said = said.charAt(0).toUpperCase() + said.slice(1) + ".";
      transcribe.hres.value += said + "\n";
    };

    // (B4) ON ERROR
    transcribe.sr.onerror = e => {
      console.error(e);
      transcribe.listening = false;
      transcribe.htog.value = "ERROR";
      transcribe.htog.disabled = true;
      alert("Make sure a mic is attached and permission is granted.");
    };
  },

  // (PART C) TOGGLE START/STOP RECOGNITION
  toggle : () => {
    if (transcribe.listening) {
      transcribe.sr.stop();
      transcribe.htog.value = "Click to start";
    } else {
      transcribe.sr.start();
      transcribe.htog.value = "Click to stop";
    }
    transcribe.listening = !transcribe.listening;
  }
};

// (PART D) START
window.addEventListener("load", transcribe.init);
Keep calm, drink some coffee, and it is easier to trace in this order:
- (B & D) On window load, we initialize the “transcriber app” with transcribe.init().
- (B) Long-winded but straightforward initialize process.
  - (B1) Get the HTML text area and toggle button.
  - (B2) Create a new speech recognition object.
  - (B3) On successful transcribing, output the text into the HTML text area.
  - (B4) On errors, show the error.
- (C) Self-explanatory. Toggle start and stop speech recognition.
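As an alternative take on (B3), here is a rough sketch that reads every new result in the event instead of only the last one. It relies on the resultIndex and isFinal properties of the speech recognition event. The helper names formatSentence and collectFinal are made up for this example.

```javascript
// Tidy up one piece of recognized text, same idea as (B3):
// trim, capitalize the first letter, and add a full stop.
function formatSentence (text) {
  const said = text.trim();
  if (said === "") { return ""; }
  return said.charAt(0).toUpperCase() + said.slice(1) + ".";
}

// Collect all new, final results from a speech recognition event.
// e.resultIndex is the first result that changed in this event.
function collectFinal (e) {
  const lines = [];
  for (let i = e.resultIndex; i < e.results.length; i++) {
    if (e.results[i].isFinal) {
      // [0] is the first (top) alternative of each result
      const line = formatSentence(e.results[i][0].transcript);
      if (line !== "") { lines.push(line); }
    }
  }
  return lines;
}

// Possible usage inside init():
// transcribe.sr.onresult = e => {
//   for (const line of collectFinal(e)) { transcribe.hres.value += line + "\n"; }
// };
```

This should behave the same as the original when one result arrives per event, and it skips interim results if interimResults is ever turned on.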
MICROPHONE PERMISSION
Take note, the user needs to give the microphone permission for this to work. If the user denies access permission, the only way is to enable it manually – In most browsers, click on the icon beside the URL and allow “microphone”.
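To give the user a more helpful message than a generic alert, the error event carries an error code that can be checked. Below is a small sketch that maps a few of the standard codes to messages. The function name describeError and the message wording are my own.

```javascript
// Turn a speech recognition error code into a readable message.
// describeError is a hypothetical helper; the codes are e.error values.
function describeError (code) {
  switch (code) {
    case "not-allowed":
    case "service-not-allowed":
      return "Microphone permission was denied. Allow it in the browser settings.";
    case "audio-capture":
      return "No microphone was found. Make sure a mic is attached.";
    case "no-speech":
      return "No speech was detected. Try speaking again.";
    case "network":
      return "A network error occurred during recognition.";
    default:
      return "Speech recognition error: " + code;
  }
}

// Possible usage in (B4):
// transcribe.sr.onerror = e => { alert(describeError(e.error)); };
```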
THE END – TRANSCRIBE RESTRICTION
That’s all for this short tutorial and sharing. Just a few “small notes” to end this one:
- How accurate the transcription is depends on the browser and platform. You can, however, change the language in (B2) and tweak the output in (B3).
- A small worry I have is with (B3) let said = e.results[e.results.length-1][0].transcript. This array seems to grow infinitely long, and can be a potential problem if you transcribe for hours.
- You may want to stop at a certain limit and save the results to persistent storage: if (e.results.length==100) { STOP TRANSCRIBE & SAVE }.
- If you want to save the transcribed text somewhere, check out my tutorial on storing data in Javascript.
- Lastly, there is seemingly no way to transcribe an audio file directly… Play the audio file, and put your microphone beside the speaker.
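The "stop at a limit and save" idea can be sketched roughly as below. This assumes the text goes into local storage. The names LIMIT, STORAGE_KEY, and saveIfFull are made up for this example, and the recognition object and storage are passed in as parameters.

```javascript
const LIMIT = 100; // stop after this many results (example number)
const STORAGE_KEY = "transcript"; // hypothetical storage key

// Save the text and stop recognition once the results list is full.
// e       : the speech recognition event from onresult
// sr      : the speech recognition object
// storage : anything with setItem(), such as localStorage
// text    : the transcribed text so far
// Returns true if it saved and stopped, false otherwise.
function saveIfFull (e, sr, storage, text) {
  if (e.results.length < LIMIT) { return false; }
  storage.setItem(STORAGE_KEY, text);
  sr.stop();
  return true;
}

// Possible usage at the end of (B3):
// saveIfFull(e, transcribe.sr, localStorage, transcribe.hres.value);
```

Take note that the sketch uses "at least the limit" rather than an exact match, in case the check is skipped for one event. After it stops, the listening flag and button text in the toggle will also need to be reset.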