“Sensing” Voice

In my vision of the future state of enterprise applications, the sixth element is "Sensing".

I've used this term to capture how future applications will create new value for users by sensing relevance, context and personal preferences through analytics of voice, video, text, location, attention or other ambient and declarative data from the user.  The ability to capture, store, index, search and analyze voice recordings is fundamental to this future state vision.

Nuance, BBN, TellMe/Microsoft, Nexidia, CallMiner, Utopy, SER, IBM and others have invested in improving speech-to-text (STT), text-to-speech (TTS), automatic speech recognition (ASR) and speech analytics technologies, all of which are critical to this "sensing" end-state.

Recently, at the Mobile World Congress in Barcelona, Microsoft announced Microsoft Recite, a voice capture and search application for Windows Mobile devices.

To get a sense of the UX and VUI, check out this video clip…

 


With Recite, users can record voice messages and then, through a voice interface, search for specific terms or phrases to find earlier messages… and, I suspect, with time… earlier conversations.

It appears that Recite uses some type of voice pattern matching or phonetic search engine.  There is no translation from speech to text, and the accuracy improves with longer search phrases.  Both of these characteristics point to phonetic processing.
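To see why longer search phrases help a phonetic engine, here is a toy sketch in Python. It is not how Recite works internally (I don't know that); the recordings, the hand-written phoneme strings and the `search` helper are all made up for illustration, and a real engine would score approximate acoustic matches rather than require exact ones.

```python
# Toy illustration only. The phoneme strings below are hand-written stand-ins
# for what an acoustic front end might produce; nothing here reflects
# Recite's actual implementation.
RECORDINGS = {
    "msg1": "K AO L JH AA N AH B AW T DH AH B AH JH AH T".split(),
    "msg2": "B AY M IH L K AH N D B R EH D".split(),
    "msg3": "DH AH B AH JH AH T M IY T IH NG IH Z AE T N UW N".split(),
}


def contains(haystack, needle):
    """Return True if `needle` occurs as a contiguous run inside `haystack`."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def search(query_phonemes):
    """Return the names of recordings whose phoneme stream contains the query."""
    return sorted(
        name for name, phonemes in RECORDINGS.items()
        if contains(phonemes, query_phonemes)
    )


short_query = "AH T".split()                       # two phonemes
long_query = "B AH JH AH T M IY T IH NG".split()   # ten phonemes

print(search(short_query))  # matches more than one recording
print(search(long_query))   # narrows down to a single recording
```

The short query hits two of the three recordings because a two-phoneme pattern turns up in many unrelated words, while the ten-phoneme query is distinctive enough to single out one message. No text transcript is ever produced; the match happens entirely on the sound units.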

You can download the app here.
