KnowBrainer Speech Recognition
Topic Title: "We're" and "were"
Topic Summary: How to improve recognition
Created On: 06/04/2012 01:22 PM
 06/04/2012 01:22 PM
SusanG
Junior Member

Posts: 21
Joined: 06/04/2012

Hi-

I have DNS 11.5 and generally the recognition is outstanding. However, when I dictate "we're," more often than not DNS types "were," even when I'd think the context would be clear. Sometimes it even types "where" instead, and Word will flag it.

I pronounce the former as "weer" and the latter as "wur" and I've confirmed this by playing back my dictation.

Is there a way to get DNS to distinguish between the two more reliably?

Thanks,

Susan

 

 

 06/04/2012 02:02 PM


Alan Cantor
Top-Tier Member

Posts: 1481
Joined: 12/08/2007

Are you dictating words in isolation, or in long phrases? DNS won't always get it right, but you increase the likelihood of better accuracy if you dictate entire phrases without pausing.

 

Training the words individually is unlikely to have much of an effect.

 

When I dictated "We're going to where we were before" in one fell swoop, DNS didn't get it 100% right, but it certainly had the right idea: "We are going to where we were before."

 06/04/2012 02:17 PM
SusanG
Junior Member

Posts: 21
Joined: 06/04/2012

Hi Alan,

I get "we are" a lot, too, and that's "wrong" -- not really, of course, but based on what I need it to do it is. And yes, I do dictate in phrases for context.

It used to have problems with "will" and "we'll," too, but that's cleared up considerably. These two are really the only ones it's still confusing on a regular basis.

Thanks,

Susan

 06/04/2012 02:44 PM
Lunis Orcutt
Top-Tier Member

Posts: 22579
Joined: 10/01/2006

          Welcome to the World's Most Popular Speech Recognition Forum

 

In theory, NaturallySpeaking should be able to differentiate between “were” and “we're” from the way you use the word in a phrase, which DNS compares against its internal tables (think of it as a pseudo-grammar checker). However, we also experience this recognition error a little too frequently and haven't managed to come up with a workaround either.
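
To make the "internal tables" idea concrete, here is a toy sketch of how word-pair counts can favor one spelling over another in context. It is not how NaturallySpeaking is actually implemented; the corpus, the function names (`train_bigrams`, `score`, `pick`) and the add-one scoring are all assumptions chosen for illustration.

```python
from collections import Counter

# Tiny hand-written corpus, purely illustrative (an assumption, not Dragon data).
CORPUS = [
    "we're going to the store",
    "we're happy to help",
    "we were going to call",
    "they were happy",
    "where we were before",
    "we're going to where we were before",
]


def train_bigrams(sentences):
    """Count adjacent word pairs, with start and end markers."""
    counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        counts.update(zip(words, words[1:]))
    return counts


def score(sentence, counts):
    """Multiply (count + 1) over every adjacent pair; unseen pairs contribute 1."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    total = 1
    for pair in zip(words, words[1:]):
        total *= counts[pair] + 1
    return total


def pick(candidates, counts):
    """Return the candidate transcription with the highest pair score."""
    return max(candidates, key=lambda c: score(c, counts))


BIGRAMS = train_bigrams(CORPUS)

# Two spellings of similar-sounding audio; surrounding words decide the winner.
best_opening = pick(["we're going to help", "were going to help"], BIGRAMS)
best_middle = pick(["they were happy", "they we're happy"], BIGRAMS)
```

With this corpus, `best_opening` comes out as "we're going to help" and `best_middle` as "they were happy", because those word pairs were seen more often. A short utterance gives such a model very few pairs to work with, which is one reason longer phrases tend to help.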




 06/04/2012 03:25 PM


Alan Cantor
Top-Tier Member

Posts: 1481
Joined: 12/08/2007

I suggest creating a writing sample that contains sentences that include several examples of the phrases you dictate, with emphasis on phrases that contain the words "we're" and "were."

 

Five or ten pages is probably plenty. Save it as a plain text file.

 

Create a new profile, and skip training. When DNS prompts you to allow it to analyze documents and email, decline the offer.

 

After you have created the profile, choose "Learn from specific documents," which is on the "Vocabulary" menu. Let DNS analyze your writing sample.

 

Test your new profile. How does it work?

 

If the problem is resolved, or mostly resolved, switch back to your original profile, and export your custom words and commands. Review the word list before proceeding. There will be one word (or phrase) per line. If you find words you don't need or want, delete them.

Finally, switch to your new profile and import your custom words and commands.
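
For anyone who wants a head start on the writing sample, the following is a rough sketch that fills a few sentence templates and saves the result as a plain text file. The templates, the activity list and the file name `writing_sample.txt` are all made up for illustration; sentences taken from your own real documents would likely serve better than generated ones.

```python
from pathlib import Path

# Hypothetical templates; replace them with phrases you actually dictate.
WE_RE_TEMPLATES = [
    "We're {x} this afternoon.",
    "I think we're {x} right now.",
]
WERE_TEMPLATES = [
    "We were {x} yesterday.",
    "They were {x} when we arrived.",
]
ACTIVITIES = [
    "meeting the client",
    "reviewing the report",
    "driving to the office",
]


def build_sample(activities):
    """Return one sentence per (activity, template) combination."""
    lines = []
    for activity in activities:
        for template in WE_RE_TEMPLATES + WERE_TEMPLATES:
            lines.append(template.format(x=activity))
    return lines


def write_sample(path, lines):
    """Save the sentences as plain text, one per line."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


sample = build_sample(ACTIVITIES)

if __name__ == "__main__":
    # The file name is an assumption; any plain text file should do.
    write_sample("writing_sample.txt", sample)
```

The resulting file is what you would then point "Learn from specific documents" at in the new profile.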

 06/04/2012 03:39 PM
SusanG
Junior Member

Posts: 21
Joined: 06/04/2012

Hi Alan,

That sounds like a plan! I'll try it next chance and report back.

 

Thanks!

 

Susan

 

 06/05/2012 06:27 PM
brainybanana
Top-Tier Member

Posts: 339
Joined: 08/27/2010


Susan, what you outline is one of the most intractable problems with DNS! Often, when dictating, I am frustrated by the inappropriate use of possessive pronouns by DNS. Nonetheless, it's always difficult trying to remember not to treat DNS anthropomorphically. The suggestion made by Alan is excellent.

Also, it is not beyond the realms of possibility that the recognition error is being caused by your microphone/soundcard combination. If you are using the microphone that came with DNS, that may well be the real culprit. Regional accent is not to be overlooked either. Being Irish, I find that DNS will not recognise some words, no matter how I pronounce them, as a result of my accent, while those who speak with more finesse have no recognition problems with these words!




-------------------------
DNS 12.0 Professional, Windows 7, Intel Core i7 2630QM, 16GB of RAM. Second-Generation SpeechWare 6-in-1.

 06/05/2012 10:20 PM


Alan Cantor
Top-Tier Member

Posts: 1481
Joined: 12/08/2007

Brainy Banana and Susan,

When I teach somebody to use speech recognition software, a lesson I try to impart is that users need to have realistic expectations about what the technology can do and what it cannot do. NaturallySpeaking is not intelligent in any sense of the word: underlying the ability of the software to "recognize" speech is not an understanding of language, but the application of sophisticated mathematics. Put another way, the human brain and the digital computer use completely different ways to interpret sounds and translate sounds into strings of words. It's astonishing that programs like NaturallySpeaking work as well as they do, given the software is as dumb as a doorknob.

With that in mind, one should expect the software to make silly mistakes. Success in using speech recognition technology is, at least in part, a matter of knowing how to recover gracefully and quickly from the inevitable errors, without getting (overly) frustrated!

 06/06/2012 08:38 AM
Chucker
Top-Tier Member

Posts: 9667
Joined: 10/10/2006


Alan,

 

Bravo!!!

 

I'm glad to see someone else is positing the difference between human speech and computer speech recognition.

 

One additional point in emphasizing the difference between the way the human brain works and the way that computer-based speech recognition works: speech recognition attempts to convert speech to text. When two people enter into a conversation, there is no results box displaying in your mind. That is, when you talk to another person, neither you nor the other person sees words flashing before your eyes. Words are transparent in human conversation. Context in human speech is based on perception and meaning. Context in speech recognition is based on the relationship of each word to every other word in an utterance. In this sense, for human beings, transcription is instantaneous, and because of this we are much better at understanding what another person is saying, even if we don't know what a particular word or words mean. Human speech recognition does not require sophisticated thrashing to find the "BestMatch". We do this instantly and automatically.

The human brain functions on the basis of what are called autonomous ego functions (i.e., memory, motility, perception, and judgment). These do not require conscious thought. For example, how many times have you driven through a stoplight and then looked in your rearview mirror to see if the light was green? Our conscious perception in such cases is based on our perceptual focus, i.e., on the traffic around us and the other circumstances that are more important for conscious concentration. Even in this context, we don't consciously think about the perceptual input of everything around us. It's all subconscious and automatic. Speech recognition simply cannot do this.

Perhaps someday, when better forms of speech rhythms and artificial intelligence programming are incorporated into speech recognition, the results will be much more like human speech. However, even in this context, computer speech recognition will never absolutely equal the amazing characteristics and capabilities of the human brain.

 

Many users wonder why Dragon gives them some bizarre results that don't seem to make sense. This is simply because, as you put it so aptly, they are anthropomorphizing speech recognition. Even a three-year-old child just learning to use speech understands basically what others are saying to them significantly better than contemporary speech recognition. As long as users misinterpret speech recognition as being equivalent to carrying on a conversation with another person, they're going to continue to be baffled and confused when it doesn't work the way they expect it to. This is simply because their expectations are inappropriate to the context of how speech recognition works.

 

Chuck Runquist
Technical Project Manager
VoiceTeach LLC
Home of VoicePower®: We don't make Dragon NaturallySpeaking, We make it easier!

Don't anthropomorphize speech recognition, it hates that - Unknown



-------------------------



 06/06/2012 04:02 PM


Alan Cantor
Top-Tier Member

Posts: 1481
Joined: 12/08/2007

Originally posted by: Chucker I'm glad to see someone else is positing the difference between human speech and computer speech recognition.

 

Chucker, some of the things I know about speech recognition have come about from reading your posts on this forum!

 

 06/08/2012 03:03 AM


maxr
Senior Member

Posts: 113
Joined: 12/08/2009

Nice one, Chuck! Bravo to you too. Perfect speech recognition will require the development of AI as intelligent as we are, because of all the conscious and subconscious mechanisms at work when interpreting speech. However, I think we can get 90% of the way there with just a few more environmental cues. Just knowing a few basic things will make a tremendous difference, even without ever touching AI or handling these hard machine learning issues. Just knowing whether the user is looking at the PC (rather than distracted, on the phone, talking to someone else, etc.) will make a tremendous difference. Face tracking and lip reading via a webcam will likely solve a lot of misrecognition issues.

So will simpler operating systems that remove a lot of clutter and complexity. A good byproduct of Metro, for instance, is having just one app at a time. This will greatly simplify the speech interface. Right now we basically have to do kung fu to navigate an operating system that was never intended to be speech friendly. Simplification will narrow the scope and provide a better voice experience.

I don't think we are that far off from having really great voice recognition and bridging the divide between computer speech and human speech interaction. I think people will mistake relatively simple tricks for AI before long.

 

Max Roth

Maker of DynamicKeyboardOne for Naturally Speaking



-------------------------

ErgoArchitect Assistive Technologies - 

ShowNumbers Plus! Addon to Naturally Speaking - 

www.ergoarchitect.com 
