KnowBrainer Speech Recognition
Topic Title: Different accuracy in different programs
Topic Summary: Accuracy is different in different programs
Created On: 03/01/2007 09:14 PM
Status: Post and Reply
Although I really like Word 2007, there is no doubt that dictating in it is less accurate than in Word 2000. This has nothing to do with the lack of full support for corrections in Word 2007. I have also noted this in other programs. Why is this and can something be done about it?
johnx
Are you using NaturallySpeaking 9.5 or an earlier version?
-------------------------
Change "No" to "Know" w/KnowBrainer 2020
johnx, Word 2007 requires a lot more memory than Word 2000 did. Do you have at least 2 GB of RAM?
Martin
-------------------------
Martin Markoe - BANNED USER: This user has been banned from these forums
I have the latest build of DNS, SP1, and 1 GB of RAM. However, even before I used DNS in Word 2007, I noticed this problem. My best recognition has been in Word 2000, and I get noticeable differences in other programs.
johnx
Quote: I have the latest build of DNS, SP1, and 1 GB of RAM. However, even before I used DNS in Word 2007, I noticed this problem. My best recognition has been in Word 2000, and I get noticeable differences in other programs.

John, I don't question whether or not this occurs for you. However, consider the following. DNS loads entirely into memory and functions according to the way you have the Options settings set up (DNS Options). Therefore, it functions exactly the same regardless of what application you are dictating into. Personally, I have never seen this in any version of DNS from 4.0 right straight through 9.5.

The only possible explanation would be related to dictating into a text editor window that does not fully support Select-and-Say, and if that is the case then something is wrong with your installation of DNS, or there is something else going on under the hood on your system. Basically, DNS's code always executes exactly the same way in every application. That you would get different degrees of accuracy doesn't make any sense in terms of the way the code functions. The only thing that might help me understand what's going on is to see several examples of these differences.

Chuck Runquist
Former DNS SDK & Senior Technical Solutions PM for DNS
If you hear the sound of hoofbeats, think horses not zebras.
-------------------------
VoiceComputer: the only global speech interface.
Chuck,
You are probably right. There is no explanation for this, and I hate to say it, but maybe I am imagining it - after all, I think I imagined I was in love when I got married. Could it be that some programs take up more memory and this has an effect on accuracy?
johnx
My experience is similar to johnx's, except that in my case, dictating into an Access 2003 application is much less accurate than dictating into Word 2003. I definitely see a difference in the accuracy of DNS with different applications.
Fred
-------------------------
fred.albert Dragon 10.1 Pro (UK English), KB 2008, Acer Veriton 2800, Pentium D, 3.4, 2GB RAM, Revolabs xTag USB wireless microphone, Win XP Pro, Office 2003
Thanks Fred. I thought I was going crazy. But I worked in both Word 2000 and Word 2007 this week to see if I could see a difference, and I did. When writing an e-mail in Outlook Express, the problem is even more pronounced. But as Chuck said, this doesn't make any sense. There must be some sort of explanation for it.
johnx
John, I too have had the same experience. I have excellent accuracy in Word/DragonPad and poor accuracy in Outlook especially. It has improved in the few months I have been using Outlook.
David (DNS 9.1 Medical)
I am trying to see if creating a new user for Word 2007 works better. Right now, I use DNS in Word 2000 and also have Word 2007 loaded when I don't need to dictate. Cumbersome, but it works for me. Maybe the problem is that the user profile for one program doesn't work well for another.
johnx
I can also verify that this is true. Dictating into Thunderbird is not as accurate as dictating into Word 2003.
Well, that's quite easy to explain - although to my knowledge it has not been explained before...
The reason is that, in any given text document, NaturallySpeaking is fully aware of the context around the caret position -- but only if the text document is being worked on in an application that is fully "Select and Say" enabled. NaturallySpeaking is also context-aware in applications that are only partly "Select and Say" enabled, but only for as long as you just dictate and don't use your mouse or keyboard in the given window.
As soon as NaturallySpeaking loses the context of your dictation (i.e. as soon as you use the mouse or keyboard in a window that is not "Select and Say" enabled), the accuracy level goes down considerably, since in this case NaturallySpeaking no longer knows anything about the text around the caret (current text cursor) position.
Try the following:
1. Open a Word, WordPad or DragonPad window, and type the following two lines using the keyboard (or copy them from here and paste them into your document):
- There are two prisoners per
- He used to go to the shop to buy and
German users can try
- Ich fahre mit dem
- Er gibt mir einen
Now use the mouse to position the text cursor behind the word "per" in the first line, and dictate "sell". Sure enough, NaturallySpeaking is going to type "cell". (German users: dictate "Rat", and NaturallySpeaking will produce "Rad".)
Similarly, use the mouse to position the text cursor behind the word "and" in the second line, and dictate "cell". NaturallySpeaking is going to type "sell". (German users: dictate "Rad", and NaturallySpeaking will type "Rat".)
2. Open a Thunderbird (or Notepad) window, and again type the same two lines using the keyboard:
- There are two prisoners per
- He used to go to the shop to buy and
Now use the mouse to position the text cursor behind the word "per" in the first line, and dictate "sell". This time NaturallySpeaking will probably type "sell".
Similarly, use the mouse to position the text cursor behind the word "and" in the second line, and dictate "cell". NaturallySpeaking will probably type "cell".
You can make up similar examples for yourself with other homonyms, e.g.
- I used a (board)
- He was (bored)
or
- The item costs 1 (cent)
- The shipment is going to be (sent)
In all such cases, NaturallySpeaking is going to type everything correctly each and every time (especially the words in brackets, although they are homonyms that can't be distinguished at all by their sound), even if you move around in your document by hand, dictate into and edit your text here and there, and even if you (which you shouldn't normally do) dictate only single words at a time.
This, however, only works as long as you dictate in an application that is fully Select-and-Say enabled.
The same basically holds for applications that are only partially Select-and-Say enabled, such as Thunderbird or Notepad -- but only as long as you do not click even once, and do not press a single key, during your current dictation. As soon as you click or press a single key while dictating in such applications, the accuracy level will go down.
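If you like to see the idea in code, here is a small toy sketch in Python (my own illustration only -- this is emphatically not Dragon's actual code, and the tiny "training text" and all counts are invented) of how an n-gram language model can use the words to the left of the caret to choose between acoustically identical candidates:

```python
# Toy illustration, NOT Dragon's implementation: given acoustically identical
# candidates, score each one against trigram counts built from the words
# immediately to the left of the caret.
from collections import defaultdict

trigram_counts = defaultdict(int)
training_text = (
    "there are two prisoners per cell . "
    "he used to go to the shop to buy and sell . "
    "the prisoners share a cell . "
    "they buy and sell goods ."
).split()
for w1, w2, w3 in zip(training_text, training_text[1:], training_text[2:]):
    trigram_counts[(w1, w2, w3)] += 1

def pick_candidate(left_context, candidates):
    """Choose the homophone whose trigram (two words left of the caret + candidate) is most frequent."""
    w1, w2 = left_context[-2], left_context[-1]
    return max(candidates, key=lambda w: trigram_counts[(w1, w2, w)])

# Caret sits behind "...two prisoners per": the context favours "cell".
print(pick_candidate("there are two prisoners per".split(), ["sell", "cell"]))  # cell
# Caret sits behind "...to buy and": the context favours "sell".
print(pick_candidate("he used to go to the shop to buy and".split(), ["sell", "cell"]))  # sell
```

In a window that is not Select-and-Say enabled (or once you have clicked or typed there), the recognizer has no reliable left context to score against, which is exactly why the homophone choice degrades in those cases.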
I hope this explanation will clear up some of the questions asked in this thread,
Cheers
David.P
-------------------------
Sennheiser MKH Mic
David, this has been explained before, I guess it was you, but on the old forum, wasn't it? Anyway, marvellous work actually, showing how some parts of DNS function internally, and always worth a read.

But as for your introducing the term "homonyms" in this context, let me please add a few more observations from a linguistic point of view, because one has to be very careful with terminology to understand the problems that may arise from DNS not being able to discern the context when homophones are encountered during the recognition process (some people around here call them "speech homophones", which doesn't make sense - it's just a hendiadyoin). It couldn't be explained better than in the passage I've just found, which I've pasted in at the bottom.

For those who don't understand the German examples which David has given, let me explain that the pairs of words he chose all sound alike (homophones), although they are spelled differently - and of course mean different things - such as "Rad" (wheel) and "Rat" (advice), which would both be transcribed /ra:t/ in the IPA.

"Homonyms (in Greek homoios = identical and onoma = name) are words that have the same phonetic form (homophones) or orthographic form (homographs) but unrelated meaning. In derivation, homonym means the same name, homophone means the same sound, and homograph means the same letters." (from www.homonym.org)
Rüdiger Wilke
-------------------------
Hi Rüdiger,

Quote: this has been explained before, I guess it was you, but on the old forum, wasn't it?

Nah -- I just found out about this in the last few days, so... But you're right insofar as we had another (trailblazing) thread where it was found that even when doing single-word correction in the correction menu or Spell dialog box, DNS *does* take into account the context of the entire utterance, and changes the language model accordingly.

This time, however, the new find is that DNS, when you are dictating, accounts for the caret context REGARDLESS of the extent of any dictated utterance; it even takes into account context that has not been dictated at all, but has only been typed, pasted, or was already in the document before you started dictating. THIS is the latest sensation (and there's more to come, so be prepared...).

Cheers
David.P
-------------------------
Sennheiser MKH Mic
This is a very cool discovery. I did not realize that NaturallySpeaking surveyed the text around the utterance; I thought it only took context into account within an utterance. Just goes to show there is a lot about the software still to learn.
The examples are great, too.
-------------------------
Quote: But you're right insofar as we had another (trailblazing) thread where it was found that even when doing single-word correction in the correction menu or Spell dialog box, DNS *does* take into account the context of the entire utterance, and changes the language model accordingly.

David, I sort of remember, but just vaguely - as an aside, I do like your attitude of putting things in such an unassuming manner. But help me, please: how did you find out that, or when, the language model was changed?

Rüdiger Wilke
-------------------------
A few months ago I carried out another experiment which proved that, even with "single word selection and correction", the word's surrounding context IS taken into account by DNS, and the n-grams of the language model ARE changed. Here's how:
1.) I dictate "I used a board".
2.) Then I select ONLY(!) "board", hit the correction hotkey to bring up the Spell dialog box, and change "board" to "bored".
The corrected sentence now of course says "I used a bored", which is stupid, but we carry on.
Because "I used a board" has a much higher statistical probability of occurrence within an n-gram than a nonsense utterance such as "I used a bored", it is hard to convince NaturallySpeaking of the opposite, and therefore steps 1.) and 2.) have to be repeated about five to ten times.
After this, however, NaturallySpeaking is persuaded -- and sure enough it produces "I used a bored" every single time when I say "I used a board".
Now one could argue that I only changed the general probability of the single word "bored" vs. "board", without NaturallySpeaking having taken into account any context of the utterance or the n-grams around "bored" or "board".
However, this is not the case, since, even after the above experiment, NaturallySpeaking still correctly types all sorts of sentences containing "board". This means that NaturallySpeaking still correctly types "board" in a general "board" context (e.g. "hammer, nails and board", "He's sitting on the board" etc.) and not "bored"!
ONLY if I dictate something that actually contains THE VERY SAME CONTEXT as in step 1.) above -- like "to make use of a board" or "usage of a board" -- does NaturallySpeaking type "bored" instead of "board", and it does so EVERY SINGLE TIME.
Another example that works really well is forcing NaturallySpeaking into producing "His mother aloud" every time you say "His mother allowed" -- again this can be done by selecting and correcting only the single word "allowed". Contexts of "allowed" other than "his mother" will not be affected -- NaturallySpeaking will still correctly type "allowed" in a general "allowed" context.
This proves that - although I was only doing single-word correction, without giving DNS any context in the Spell dialog box - the language model and the n-gram assignments in the context of the word pairs "board/bored" and "allowed/aloud" actually DID change.
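To picture the effect, here is another small Python sketch (again purely my own illustration -- the starting counts and the "boost" per correction are invented, and Dragon's real language model is certainly more sophisticated than this): correcting the single word only shifts the statistics for the n-gram it sits in, while the word's other contexts are left alone.

```python
# Toy illustration, NOT Dragon's implementation: model each correction of
# "board" -> "bored" in the context "used a ..." as an update to that one
# trigram only, leaving "board" untouched in every other context.
from collections import defaultdict

trigram_counts = defaultdict(int, {
    ("used", "a", "board"): 10,     # hypothetical starting counts
    ("nails", "and", "board"): 8,
    ("on", "the", "board"): 12,
    ("used", "a", "bored"): 0,
})

def correct_in_context(w1, w2, wrong, right, boost=2):
    """Model one correction: weaken the rejected trigram, strengthen the accepted one."""
    trigram_counts[(w1, w2, wrong)] = max(0, trigram_counts[(w1, w2, wrong)] - boost)
    trigram_counts[(w1, w2, right)] += boost

def pick(w1, w2, candidates):
    return max(candidates, key=lambda w: trigram_counts[(w1, w2, w)])

# Repeat the correction several times in the "used a" context, as in the experiment:
for _ in range(6):
    correct_in_context("used", "a", "board", "bored")

print(pick("used", "a", ["board", "bored"]))     # bored  (the corrected context)
print(pick("nails", "and", ["board", "bored"]))  # board  (other contexts unaffected)
print(pick("on", "the", ["board", "bored"]))     # board
```

That is the behaviour observed above: "bored" wins only after a "used a"-style context, while "board" keeps winning everywhere else.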
Note that, on the other hand, it takes the same number of corrections with the "complete utterance correction" method to convince NaturallySpeaking to produce "I used a bored" when you say "I used a board". This shows that the "single word correction" method is just as effective as the "complete utterance correction" method.
Still, the overall best way to correct is simply to place the cursor anywhere inside (or behind) an utterance and then say "correct that".
Cheers
David.P
-------------------------
Sennheiser MKH Mic
David, that's really interesting and worth experimenting with, at least for me, so I'll look into it sometime when I get around to it. You mentioned one has to repeat entering non-standard contexts several times (5-10) until the changes apparently take effect in the language model. Did you, perchance, also test at which point the user files would need saving? At the least, one would be asked to save on closing out of DNS, thus indicating the files have been changed.

For those who are interested in this, and who have the Pro version, I've provided a little script for testing this. The SpeakerModified method does exactly this, testing whether the files have been changed; if so, you would be asked whether you want to save.

Sub Main
Rüdiger Wilke
-------------------------
Hi Rüdiger,
Not systematically. It could be noted, however, that after every experiment, when closing the user, NaturallySpeaking would ask me to save the user files, thus indicating that they had been changed.
Since this by itself doesn't prove that the language model has been updated accordingly, I only took it as an additional indication.
The proof of the language model update was, as described, being able to say "his mother allowed" and have NaturallySpeaking produce "his mother aloud" -- while, on the other hand, still being able to dictate "I am allowed" etc. without problems.
Cheers
David.P
-------------------------
Sennheiser MKH Mic