Monthly Archives: October 2011

“silicon alley”

People who have done tech in NYC for a long time think of the phrase “Silicon Alley” as referring to NYC during the dot-com bubble, from roughly 1995-2000. If you hear someone use the phrase to refer to today’s NYC tech scene, you can be pretty sure they know very little about the topic.


Client-side mobile speech recognition

Imagine if on your iPhone you had to type a whole paragraph, and then wait a few seconds for it to get sent to Apple’s server, and then get the text back to see if any words were mistyped or miscorrected. 

That is how speech recognition works on mobile devices today. It is all done server side (to try out a state-of-the-art example, download the Dragon Dictation app on your iPhone, or try the built-in speech rec on an Android device). Perhaps Apple’s Siri will improve this (I hope!). But until speech recognition gets very close to 100% accuracy, the best way to improve the user experience will be to show each word and sentence as you speak and let you correct as you go, without waiting on round trips to web servers.

Open source projects like CMU’s PocketSphinx seem to provide sophisticated client-side mobile speech rec. My understanding is that modern mobile devices don’t have the resources (processing/memory) to let client-side speech rec reach nearly the accuracy you can get on the server side. At least not yet.
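To give a feel for what on-device decoding looks like, here is a minimal sketch using PocketSphinx’s Python bindings. It is not a mobile build (a real iPhone or Android port would feed microphone buffers into the same decoder), and the audio file name and chunk size are assumptions, but it shows the key idea: you can pull partial hypotheses out of the decoder while audio is still streaming in, which is exactly what would let a UI display words as they are spoken with no server round trip.

```python
# Sketch of streaming, client-side decoding with CMU PocketSphinx
# (pip install pocketsphinx). Uses the bundled default English models.
# "utterance.wav" is a hypothetical 16 kHz, 16-bit mono recording.
import wave

from pocketsphinx import Decoder

decoder = Decoder()      # default acoustic model, dictionary, language model
decoder.start_utt()

with wave.open("utterance.wav", "rb") as wav:
    while True:
        chunk = wav.readframes(1024)
        if not chunk:
            break
        decoder.process_raw(chunk, False, False)
        hyp = decoder.hyp()
        if hyp:
            # Partial hypothesis is available immediately -- this is what
            # lets the UI show (and correct) words as you speak.
            print("partial:", hyp.hypstr)

decoder.end_utt()
final = decoder.hyp()
print("final:", final.hypstr if final else "(no speech recognized)")
```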