Windows Vista comes with speech recognition built in, at no extra cost. So what is speech recognition, and can it be used to improve your life?
Many people have envisioned a time when the computer is so intelligent that it can understand commands, respond with natural speech, and even hold a conversation. That sort of system is still a long way off. However, if you happen to own a microphone and a copy of Windows Vista, you can try speech recognition out for yourself, and you will see that it’s actually not half bad. In fact, it actually works!
This is my experience with the speech recognition that comes with Vista. The very first thing you should do is run through the training. If you don’t, the computer will battle to understand you correctly when you are dictating. Once you have run through the training two or more times, though, the computer starts getting better at understanding what you are trying to say.
There are two ways you can use speech recognition on your PC.
1. You can use it to navigate around Windows. In other words, you can say commands like “start”, which brings up the Start menu, “Maximize”, which maximizes a window, or “switch application”, which lets you switch between running applications. I like to think of this as command mode. This is where the speech recognition performs extremely well and is most impressive: the computer usually understands exactly what you want it to do.
2. You can use your microphone to dictate a letter. First, forget about using speech recognition in instant messengers like Skype or MSN. It simply will not work. You might have thought you could sit around all day dictating messages to your friends over Skype and save your fingers. Sorry, not going to happen. Instead, dictation is only available in popular programs like Microsoft Word, WordPad or Notepad. Email programs also seem to be supported.
The overall quality of dictation is very good. I would say it’s about 7-8 out of 10. I would also say that a lot of the errors come from the speaker not hitting the right tone. This is a general problem with speech: some words in English sound the same, for example “hear” and “here”. This is where AI comes into play, and the speech engine is certainly not dumb. It can usually pick the right word from the context in such cases. It also seems to recognize and capitalize names and proper nouns, so you don’t have to do a whole lot of correction. Another great benefit is that long and complicated words always come out spelled correctly, since the speech engine does not misspell the words it produces.
Having said all those nice things about Microsoft Vista Speech Recognition, I am now going to tell you that it’s simply not practical enough: it’s roughly one third the speed of typing the same content. This, as you would imagine, is simply too slow. There are also times when manual intervention is required, and after about 10-15 minutes you get the feeling that dictating a single letter would end up taking the whole day.
I would be really proud, though, if I were part of the development team for the speech recognition. It is a marvel of modern technology and really a masterpiece. I would imagine that within the next 2-3 years it will be truly ready for production use. Right now it has only just been born.
For the critics of the speech system, remember that when the engine says “Listening…”, it is listening for you to say anything at all. Think about that “anything” factor. Out of a near-infinite choice of options, it works out what you actually said, and within milliseconds that has been translated to text. This is a near miracle, and it has taken some great contributing minds to get that right.