On Friday, I saw a write-up on using the Web Speech API for capturing and transcribing speech. I was inspired to see how this could be incorporated into Ext JS, so I started exploring.

First, I discovered that you can already add the x-webkit-speech attribute on a text field, and it will automatically create an audio-capture-ready text field in Chrome. All you have to do is hook up listeners to handle particular events. While this was promising, I found a big problem: if you try this in Canary, you’ll be greeted with a nasty deprecation warning. Apparently, Chrome will eventually be ditching the input-support in favor of full-on use of the JavaScript API.

No matter, that’s more flexible anyway. Based on this conclusion, I dove into the API and created an Ext JS wrapper that supports interactions with the Web Speech API. You can try out an example, and grab the source on GitHub.

About the API

The API has a fair amount to it, so I’ll highlight some of the configuration options, as well as describe a bit about the results, which can be a bit confusing.

  • continuous: If false, you get a one-shot stab at capturing audio. Once the audio is no longer detected, the capture More >