KnowBrainer Speech Recognition
Topic Title: Better accuracy in supported applications?
Created On: 04/26/2020 12:37 PM
 04/26/2020 12:37 PM
xxtraloud
Top-Tier Member

Posts: 261
Joined: 12/14/2010

Is it just my impression, or is the recognition accuracy substantially better in supported applications? For example, when I dictate in Microsoft Word the accuracy is close to 99%. But when I dictate in software which is not supported, for example Thunderbird, the recognition seems way lower, probably around 80%. I tend to dictate in a very similar style and with similar content, so I was wondering whether these differences are due to some kind of compatibility issue. Although it seems strange, because the recognition should be exactly the same.

 

I would like to know what the experts think about this.



-------------------------

Win 10 - DPI 15 - AT 8 pro + Andrea USB

 04/26/2020 02:55 PM
dilligence
Top-Tier Member

Posts: 1432
Joined: 08/16/2010

I've got that impression as well. But I have to say that Thunderbird is in a league of its own when it comes to horrible accuracy (that's why I'm using it in most of my demonstration videos :-)).



-------------------------


Auto Box© Demo now available



 04/26/2020 06:53 PM
Lunis Orcutt
Top-Tier Member

Posts: 39368
Joined: 10/01/2006

We don't use Thunderbird but haven't noted lower accuracy in Dragon-friendly or unfriendly applications. However, there are times when Dragon will refuse to accept certain words at the beginning of our dictation. When we run into this problem, we employ the KnowBrainer Type <dictation> command, which forces Dragon to take its best guess. When forced, Dragon seems very good at guessing. An example of being forced to use the Type <dictation> command is when we try to dictate "DPI" or "DPG" at the beginning of a phrase; all we get is "???". We could obviously train those abbreviations, but it's faster, easier and, not to mention, more fun to use KnowBrainer to brute-force the dictation.



-------------------------

Change "No" to "Know" w/KnowBrainer 2020
Trial Downloads
Dragon/Sales@KnowBrainer.com 
(615) 884-4558 ex 1



 04/27/2020 03:33 AM
Stephan Kuepper
Top-Tier Member

Posts: 2174
Joined: 10/04/2006

Recognition is much better in supported applications. That's why Nuance created the dictation box, which doesn't get much love around here. People seem to prefer all sorts of workarounds instead of simply adding a short "transfer text" command to their dictation.

-------------------------

www.egs-vertrieb.de - www.spracherkennungscloud.de

 04/27/2020 09:05 AM
xxtraloud
Top-Tier Member

Posts: 261
Joined: 12/14/2010

Originally posted by: Stephan Kuepper Recognition is much better in supported applications. That's why Nuance created the dictation box, which doesn't get much love around here. People seem to prefer all sorts of workarounds instead of simply adding a short "transfer text" command to their dictation.

What is the reason for this difference in accuracy? The speech engine is the same, so why does it produce very different results?



-------------------------

Win 10 - DPI 15 - AT 8 pro + Andrea USB

 04/27/2020 10:02 AM
Mav
Top-Tier Member

Posts: 440
Joined: 10/02/2008

From my experience with the inner workings of what makes up a "supported" application, I don't see any technical reason why this should be the case.

If you talk about accuracy then we're talking about dictation and not voice commands, right?

With voice commands there can be many reasons why they don't work, but with dictation you can immediately see what Dragon made of your speech, so I suggest we stick to this.

Whenever you start dictating, at the moment when the volume picked up by the microphone is above the background noise level, Dragon tries to find out the contents of the target window (usually the focused window).

If Dragon knows how to get this information from a given window, that window type is "supported" and the "full text control" indicator lights up.

Otherwise, you get the dictation window or Dragon resorts to emulating the keyboard for text transfer.

In the case of a supported window type, Dragon asks for the current text, the current selection/cursor position and the visible part of the text.

This information is used to build navigation grammars (so you can say 'select this and that') and sometimes words from the text are added as temporary words to the vocabulary (e.g. names from emails when replying).
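As a rough illustration of the kind of query involved, here is a minimal Python/ctypes sketch, assuming the target is a plain Win32 Edit control (Dragon's actual compatibility modules use their own, richer interfaces):

import ctypes

user32 = ctypes.windll.user32

# Standard window/edit-control messages (Windows only).
WM_GETTEXTLENGTH = 0x000E
WM_GETTEXT       = 0x000D
EM_GETSEL        = 0x00B0

def query_edit_control(hwnd):
    # Fetch the full text of the control.
    length = user32.SendMessageW(hwnd, WM_GETTEXTLENGTH, 0, 0)
    buf = ctypes.create_unicode_buffer(length + 1)
    user32.SendMessageW(hwnd, WM_GETTEXT, length + 1, buf)
    # Fetch the selection (start/end character positions).
    start, end = ctypes.c_ulong(), ctypes.c_ulong()
    user32.SendMessageW(hwnd, EM_GETSEL, ctypes.byref(start), ctypes.byref(end))
    return buf.value, start.value, end.value

# Usage: text, sel_start, sel_end = query_edit_control(hwnd_of_focused_control)

A custom editor that doesn't answer these (or equivalent accessibility) queries is exactly what ends up "unsupported".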

But apart from that, no other information distinguishes a supported window from an unsupported one.

When Dragon doesn't know if the cursor is at the beginning of the text or after a period, it doesn't know whether to capitalize the first letter of the first recognized word.

But there is nothing influencing the accuracy at which this first word or the rest of the utterance is being recognized.
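As a side note, that formatting decision for the first word is a trivial function of the text to the left of the cursor, which is exactly the information an unsupported window withholds. A toy sketch (my own illustration, not Dragon's code):

def capitalize_next_word(left_text):
    # Capitalize only if the left text is empty or ends a sentence.
    stripped = left_text.rstrip()
    return stripped == "" or stripped.endswith((".", "!", "?", ":"))

print(capitalize_next_word(""))                     # True  (start of document)
print(capitalize_next_word("This is a sentence."))  # True  (after a period)
print(capitalize_next_word("This is a sentence"))   # False (mid-sentence)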

In order to get objective results, you could try saying the exact same sentence into an empty supported window and into an unsupported window (since Dragon cannot know what's in an unsupported window, it doesn't matter for the recognition whether there's text in it or not).

You should get the same results.

Regards,

mav

 

 04/27/2020 10:58 AM
monkey8
Top-Tier Member

Posts: 3828
Joined: 01/14/2008

Originally posted by: xxtraloud 

What is the reason for this difference in accuracy? The speech engine is the same, so why does it produce very different results?

 

Sorry Mav, your hypothesis is not correct. You can and do indeed get better accuracy in a supported Select-and-Say application versus one that only accepts text. The reason is something called "left context". In an application that has full text control (Select-and-Say), Dragon has access to the existing content and can pass the current context to the recognizer. The recognizer uses the context to help decide the correct recognition result.

 

I’ll give you a small example to demonstrate.

 

A hurricane can wreck a nice beach.

Dragon can recognize speech.

If you consider an application that already contains the opening words ("A hurricane" or "Dragon") when the user turns the mic on, those existing words become the left context. The audio for "can wreck a nice beach" is very similar to "can recognize speech", but the context of weather vs. speech recognition can change the result.

 

This is an example to demonstrate the idea. Don't assume that if you try this in Word vs. Notepad++ you will see these exact results. In practice, of course, left context is often quite a bit larger than 1 or 2 words.
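To make the mechanism concrete, here is a toy Python sketch (the numbers are invented, the acoustic score is omitted entirely, and this is nothing like Dragon's actual language model) showing how the words already sitting in the document can tip a simple trigram score between the two acoustically similar candidates:

# Toy trigram log-probabilities, made up purely for illustration.
TRIGRAMS = {
    ("hurricane", "can", "wreck"):     -2.0,
    ("hurricane", "can", "recognize"): -6.0,
    ("dragon",    "can", "wreck"):     -6.0,
    ("dragon",    "can", "recognize"): -2.0,
    ("can", "wreck", "a"):             -1.0,
    ("wreck", "a", "nice"):            -1.0,
    ("a", "nice", "beach"):            -1.5,
    ("can", "recognize", "speech"):    -1.5,
}

def lm_score(left_context, candidate, backoff=-8.0):
    # Score each candidate word conditioned on the two words before it,
    # which may come from text already in the document (left context).
    words = [w.lower() for w in left_context + candidate]
    return sum(TRIGRAMS.get(tuple(words[i - 2:i + 1]), backoff)
               for i in range(max(len(left_context), 2), len(words)))

candidates = [["can", "wreck", "a", "nice", "beach"],
              ["can", "recognize", "speech"]]
for context in (["a", "hurricane"], ["Dragon"]):
    best = max(candidates, key=lambda c: lm_score(context, c))
    print(" ".join(context), "->", " ".join(best))
# a hurricane -> can wreck a nice beach
# Dragon -> can recognize speech

In a real recognizer the acoustic evidence covers the same stretch of audio for both candidates, so the comparison isn't skewed by their different lengths the way a bare language-model score would be.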

 

So the conclusion from this is that, although Rob doesn't know it, dictation box tools like speechproductivity not only allow you to dictate into non-Select-and-Say edit boxes, they also make the text more accurate. Likewise DragonCapture and the dictation box, which you have all copied. Also, as Stephan says, this is one of the reasons why Nuance created the dictation box. Although versions of Dragon before the dictation box allowed you to dictate into non-Select-and-Say edit boxes, the accuracy was not as good as it would be going via the dictation box.

 

Finally, why is "left context" so important? Simply because context is probably the single biggest factor in making Dragon as accurate as it is. If you give the recogniser only "right context", it has a lot less context to work on.



-------------------------



 04/27/2020 12:26 PM
dilligence
Top-Tier Member

Posts: 1432
Joined: 08/16/2010

 

So the conclusion from this is that, although Rob doesn't know it, dictation box tools like speechproductivity not only allow you to dictate into non-Select-and-Say edit boxes, they also make the text more accurate.

 

 

Finally, why is "left context" so important? Simply because context is probably the single biggest factor in making Dragon as accurate as it is. If you give the recogniser only "right context", it has a lot less context to work on.

 

 

That sure is interesting news. I knew the recognition in my boxes was better :-).

 

Pondering on that some, and correct me if I'm wrong, but does that mean that the KB force dictation (open-ended) commands are less accurate than using a dictation box?



-------------------------


Auto Box© Demo now available

 04/27/2020 03:01 PM
Lunis Orcutt
Top-Tier Member

Posts: 39368
Joined: 10/01/2006

This has been an informative thread and it now looks like I could be wrong. Drat. However, no matter where I dictate, my accuracy still hangs around 99% and I never optimize my user profile. I'm also a former DJ (past life) and have been manipulating my personal vocabulary for over 2 decades. Additionally, my average dictation phrase is typically between 7 and 20 words. This might just be my personal edge.

 

Given the information contained within this thread, the KnowBrainer Type <dictation> command should potentially be slightly less accurate for end users with 98% or less accuracy. This isn't something I can test, but the Type <dictation> command not only seems just as accurate for me personally, it additionally increases my efficiency. 98% of my dictation is split between the Dragon-friendly second-generation Microsoft Edge, Microsoft Word and Microsoft Outlook. I use the Type <dictation> command not because Microsoft Word needs this function, but because Type <dictation> additionally moves my cursor to the end of the paragraph, saving a step.



-------------------------

Change "No" to "Know" w/KnowBrainer 2020
Trial Downloads
Dragon/Sales@KnowBrainer.com 
(615) 884-4558 ex 1

 04/27/2020 07:39 PM
dilligence
Top-Tier Member

Posts: 1432
Joined: 08/16/2010

"and have been manipulating my personal vocabulary over 2 decades. Additionally, my average dictation phrase is typically between 7 and 20 words. This might just be my personal edge."

 

Actually I think these are the two most important things we can learn from this post. They are so easy to forget sometimes but they probably have the most effect of all.

 

The difference in accuracy between Select-and-Say and Open Ended is mostly theoretical I think. The Type <dictation> command is effective for short dictations in speech unfriendly applications. For longer dictations however a fast dictation box with full Select-and-Say control is probably a better choice.

 

But of course both solutions remain workarounds..... 

 

If DPI 16 will allow full Select-and-Say control in each and every application, that would surely be best for everyone :-)



-------------------------


Auto Box© Demo now available

 04/28/2020 02:19 AM
Mav
Top-Tier Member

Posts: 440
Joined: 10/02/2008

Monkey, regarding your "left context" argument, you are right.

Dragon can use the current text in the window for this as well and not just for navigation grammars. That's why I suggested comparing an empty supported window to an arbitrary unsupported window.

I guess it heavily depends on your environment whether this left context has a noticeable effect on your accuracy. If you usually fill short text fields in a rather form-based application, then most of the time you'll start with empty fields.

If you create longish documents in a text processor, then the document contents at the beginning of each utterance will have a greater influence.

So long story short: Your mileage may vary, as always

 

Oh, and @dilligence:

Select-and-say in every text control will never happen, I'm afraid. For that to work, there would have to be a uniform, generally accepted and implemented way for one application to query the text and cursor position of every edit control you can imagine.

If an application developer uses standard controls (Win32 Edit or Rich Edit, .NET TextBox/RichTextBox, ...), then these standard controls support select-and-say out of the box, but there is a myriad of cases where custom editors are used which don't make the required information accessible (or which are too small a niche to justify a business case for Nuance to implement select-and-say for them).

 

 



 04/28/2020 07:01 PM
Ag
Top-Tier Member

Posts: 647
Joined: 07/08/2019

monkey8 mentions left context and right context, but there are more sorts of context.

 

monkey8's left and right context applies to dictating in a linear manner, without going backwards:



E.g. the words that I said a while back, which have already been transferred to the application, are the left context, while the words that I'm currently saying, which might be in the DragonCapture box and have not yet been transferred, are what is being recognized.



But when you already have an app/file containing


Here's a big chunk of text that's already in the file. ... 
Here's some later text that is also already in the app/file. 


and now you want to edit, e.g. to add a new sentence (bracketed below)


Here's a big chunk of text that's already in the file. 
[Here is some new text.] Here's some later text that is also already in the app/file. 


You can look at both the text before and after the text you are inserting. 


--


Actually, I just found some notes from a speech recognition class I took in the 1980s. The term "right context" in that class referred most specifically to recognizing patterns like postfixes - not just "prefix command <...>" but also "<...> postfix", and also "<...> infix <...>", and combinations thereof such as "<...> infix <...> infix <...> suffix".


Of course, recognizing such context-sensitive grammars is distinctly harder than recognizing stuff in a linear manner. Most programming languages are designed to require only limited lookahead. But since humans can handle context-sensitive input, IMHO computers should too.


--


I suspect that the original Dragon HMM (Hidden Markov Model) was very much left to right, linear, using mostly left context when computing state transition probabilities.


It's not that hard to have lookahead. I wonder if Dragon is currently doing so.


--


This gives me hope of improved recognition for apps that are not fully "speech ready".


While it might always be best if the app informs the recognition engine of all text and cursor position, there might be intermediate points that do not require the apps to be modified that much:


E.g. the recognition engine might simply be able to look at all of the words in the app's text buffer - it might not know where the cursor is, so it can't have an exact state transition, but at least it might know that the user is more likely to be talking about "cache" than "cash".


E.g. there might not be a nice API for the recognition engine to look at the app's text buffer(s) - but perhaps recognition engine can OCR a bitmap of the screen. That is already being used for some keyboard shortcut user automation programs on Mac OS. I'm not aware of anybody doing this for speech yet, but I wouldn't be surprised. At the moment this is mostly being done to recognize menus and commands, but there's no reason it should not also be used for dictation.


E.g. many apps nowadays are really just rendering HTML - really just presenting a subset of the HTML. If the recognition engine can determine what HTML content is being displayed... In fact, the recognition engine could be positioned in the middle, e.g. as a plug-in in a web browser or in an HTML rendering library used by an app, and might be able to parse the entire HTML content, not just the fraction that is being displayed.


E.g. the recognition engine may remember all of the words actually sent to the app, and what corrections might have been made. Again, it does not necessarily know where those corrections have been made, so it might not be able to figure out which cache/cash should be used where in a sentence like "building a hardware cache that is 128 way associative might take a lot of money cash" if these words are edited into place rather than uttered in continuous dictation. But it might be better than nothing.
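Here is a toy Python sketch of the buffer-harvesting idea above (purely hypothetical - not an existing Dragon or KnowBrainer feature): collect whatever words are visible in the target app, however they were obtained (text buffer, OCR, HTML), and use them to bias ambiguous homophones such as cache/cash.

from collections import Counter

# Illustrative homophone pair; a real system would have many more and
# would rescore the recognizer's n-best lists, not edit final text.
HOMOPHONES = {"cache": "cash", "cash": "cache"}

def bias_homophones(candidate_words, visible_text, margin=1):
    # Swap a homophone for its twin when the twin clearly dominates
    # in the text harvested from the target application.
    seen = Counter(w.lower().strip(".,;:!?") for w in visible_text.split())
    result = []
    for word in candidate_words:
        twin = HOMOPHONES.get(word.lower())
        result.append(twin if twin and seen[twin] > seen[word.lower()] + margin
                      else word)
    return result

screen = "The L2 cache is 128-way associative; cache lines are 64 bytes."
print(bias_homophones("building a hardware cash".split(), screen))
# ['building', 'a', 'hardware', 'cache']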


--

Early in the days of WIRED Magazine, Nicholas Negroponte, head of the MIT Media Lab, wrote an opinion piece saying "wouldn't it be better if this content were properly annotated XML rather than being displayed in the fax-like formatting used here?". My first reaction was that he was right - it sure would be nice if everybody used properly annotated content. But my second reaction was: if a human can look at a fax and easily figure out the from and to addresses, why should a computer not be able to do so? Well, with ML we are much closer to that being possible.



-------------------------

DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical, GPU=NVIDIA Quadro RTX 3000 with Max-Q Design.



 04/28/2020 01:14 PM
R. Wilke
Top-Tier Member

Posts: 7703
Joined: 03/04/2007

Talking about "left context", the way it works is that the visible part of the edit control in a Select-and-Say application becomes part of the internal dictation buffer, which you might also think of as a kind of "internal document". To accomplish this, particular methods are always called at the beginning of each utterance to keep the buffer updated, also accounting for any intervening user (keyboard) input.

 

You can see this in the way Dragon always manages to get capitalisation and spacing correct at the beginning of the follow-up dictation in a Select-and-Say application.

Whether the same also holds true for engaging the language model is up for grabs, however. I have done my share of testing this, but have never come to a firm conclusion.



-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage



 04/28/2020 07:05 PM
Ag
Top-Tier Member

Posts: 647
Joined: 07/08/2019

While we're at it: where do you find the recognition metrics like "I am 99% accurate" or "I am only 80% accurate in Thunderbird"?

I suppose for a properly speech-enabled app the recognition engine could keep track of how many things have been corrected. Or even for DictationBox or DragonCapture.


But I'm not sure how that would work for a non-speech-enabled app. It can't tell the difference between a correction of a speech recognition error and an edit of something that was recognized correctly but which I later decided needed to be changed.


But in any case, people throw around numbers, and I would like to know where they get them from.








-------------------------

DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical, GPU=NVIDIA Quadro RTX 3000 with Max-Q Design.

 04/29/2020 02:29 AM
Mav
Top-Tier Member

Posts: 440
Joined: 10/02/2008

Correction itself is another complex topic altogether.

For non-supported windows, Dragon cannot even know whether the recognized text has been transferred to the target window correctly, let alone monitor changes to the text after it has been written into the target window. That's why you can say "correct that" even in an unsupported window, but only as long as you haven't used the keyboard, clicked the mouse or switched to a different window.

For supported target windows, to my knowledge real correction is only performed when you explicitly trigger it (e.g. by saying "correct xyz" or by selecting something and saying "write that"). Simply overtyping is not considered a correction in the sense of "Dragon has recognized something different from what I wanted".

When you perform a correction in a supported target window, Dragon not only has to modify the text, but also the corresponding utterance in the background recognition session.

This means that when you have spoken 3 words in succession but want to correct the second word, Dragon has to split the audio information into three parts (one for each word), remember that the audio for the middle word should produce a different result, and adapt the statistics for these words in combination.

If you just overtype, then usually Dragon cannot notice it until you start the next utterance because that's the moment when the text contents are mirrored back to Dragon.

In that case Dragon tries to find out where its internal dictation buffer differs from the text it just got.

If only a single word has been changed (by overtyping), then Dragon also has to split the recorded audio for the utterance this word occurred in, but now this textual change is interpreted not as a recognition error but as a change of mind on the speaker's side. Depending on some settings, the audio for the modified word is discarded. Because you can perform arbitrary modifications on the text between two spoken utterances, the text can be completely different from what Dragon remembered, so interpreting this new text as a correction of everything that has been said in the session doesn't make sense.

I guess there is some internal threshold on the amount of changes in the text Dragon uses to decide what to do.
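A rough Python sketch of what such a diff-and-threshold step might look like (assumed logic for illustration only; Dragon's actual implementation isn't public):

import difflib

def classify_change(internal_buffer, window_text, max_word_edits=2):
    # Compare the engine's internal dictation buffer with the text now
    # found in the window and decide how to treat the difference.
    buf_words, win_words = internal_buffer.split(), window_text.split()
    matcher = difflib.SequenceMatcher(None, buf_words, win_words)
    changed = sum(max(i2 - i1, j2 - j1)
                  for tag, i1, i2, j1, j2 in matcher.get_opcodes()
                  if tag != "equal")
    if changed == 0:
        return "buffer and window are in sync"
    if changed <= max_word_edits:
        return "small edit: treat as the speaker changing their mind"
    return "large change: resynchronize the buffer from the window"

print(classify_change("building a hardware cash",
                      "building a hardware cache"))
# small edit: treat as the speaker changing their mind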

You can watch this yourself if you play back a phrase where you used correction by voice vs. playing back a phrase where you manually changed a word.

The big problem in finding out exactly what's happening is that Dragon's vocabularies are a black box more or less. Sure, you can access the words and find out which ones have been added by the user, but all the statistics or the acoustic model are not accessible.

So if you correct a phrase you'd expect Dragon to increment the statistics for this trigram, but you cannot be sure and a single correction probably won't show an effect immediately.

Regards,

mav

 04/29/2020 01:39 AM
R. Wilke
Top-Tier Member

Posts: 7703
Joined: 03/04/2007

But in any case, people throw around numbers, and I would like to know where they get them from?


Wild guessing. Don't take it too seriously.


-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage

 05/02/2020 05:12 AM
xxtraloud
Top-Tier Member

Posts: 261
Joined: 12/14/2010

It has been very interesting to read this topic, and indeed my statistics for accuracy are just guesses.

Indeed, my style of dictation might be somewhat challenging because I have never managed to utter long sentences, and this makes Dragon's job more difficult in unsupported applications where there is no left context.

-------------------------

Win 10 - DPI 15 - AT 8 pro + Andrea USB

 05/02/2020 07:36 AM
monkey8
Top-Tier Member

Posts: 3828
Joined: 01/14/2008

Originally posted by: Ag monkey8 mentions left context and right context, but there are more sorts of context.

 

and now you want to edit, e.g. to add a new sentence (bracketed below)


Here's a big chunk of text that's already in the file. [Here is some new text.] Here's some later text that is also already in the app/file. 


You can look at both the text before and after the text you are inserting.     

 

Ag,

 

You will notice a major difference between the left context example I gave above (where the existing words were already in the application) and the example you have given here, where you are saying that with the newly inserted sentence Dragon would consider left and right context (before and after, as you put it). The difference is that in the left context example I gave above, the context was in the same sentence! You are inserting a new sentence here between 2 existing sentences, so let me answer your point by asking you this question.

 

Do you think Dragon would consider left and right context when that text is in another sentence? It's multiple choice:

 

a) Dragon would consider it in the same way as if the text was in the same sentence. This is what you seem to be saying at the moment.

b) Dragon would consider it as context but not give it the same weight as it would give to text in the same sentence.

c) Dragon would not consider context from another sentence at all.

 

Originally posted by: xxtraloud it has been very interesting to read this topic and indeed my statistics for accuracy are just guesses. Indeed my style of dictation might be somewhat challenging because I never managed to utter long sentences, so this makes Dragon's job more difficult in unsupported application where there is no left context.

 

It was a good question/great observation. Unfortunately, you are at a disadvantage compared to those who can speak in longer sentences. It is an issue that lots of the ventilated patients we have as clients get.



-------------------------



 01/25/2021 10:16 AM
David.P
Top-Tier Member

Posts: 617
Joined: 10/05/2006

I'm only just now seeing this thread from last year.

 

I believe that the image below contributes to clarifying the original question about whether (and why) there is better accuracy in supported applications. It also goes hand in hand with what Lindsay was saying about left and right context further above.

 


[Image: click to enlarge]



-------------------------

Sennheiser MKH Mic
Visual & Acoustic Feedback + Automatic Mic Control



 01/25/2021 11:17 AM
bmac
Top-Tier Member

Posts: 822
Joined: 10/02/2006

Originally posted by: xxtraloud it has been very interesting to read this topic and indeed my statistics for accuracy are just guesses. Indeed my style of dictation might be somewhat challenging because I never managed to utter long sentences, so this makes Dragon's job more difficult in unsupported application where there is no left context.

 

Forward disclaimer -- started with version 1 of Dragon but am truly an amateur when it comes to understanding how it works. So, this thread has been very helpful and enlightening to me. 

 

The above quote, and also Lunis's comments, touched on what I believe has the most bearing on accuracy -- giving Dragon the ability to do proper context checking. When I demonstrate the capabilities of Dragon, I always start by reading a short article, and the accuracy is usually close to 100%. When I dictate, I pause way too often to collect my thoughts, and I imagine that has a negative effect on overall accuracy.

 

It would be interesting if those who have posted above and really understand the back end could also prepare a priority list of the most important dictation habits a Dragon user can adopt to improve overall accuracy.



-------------------------

Bill
DPI 15.61, KB 2017, SpeechStart +, customized Desktop PC (AMD 12 core 9th gen Ryzen 9 3900x@3.8 GHz, 32 GB DDD4 SSD), Crucial 1 TB m.2 PCIe NVMe SSD HDD, Windows 10 Pro 64-bit, Philips SpeechMike Air 4000, Philips speech Mike 3500, SpeechWare 3-in-1 TableMike, MS Office 365 Professional and Microsoft Surface Pro 2017, Windows 10, 512 GB SSD, 16 GB ram, i7

 01/25/2021 11:52 AM
David.P
Top-Tier Member

Posts: 617
Joined: 10/05/2006

Originally posted by: xxtraloud

it has been very interesting to read this topic and indeed my statistics for accuracy are just guesses. Indeed my style of dictation might be somewhat challenging because I never managed to utter long sentences, so this makes Dragon's job more difficult in unsupported application where there is no left context.

Originally posted by: bmac

When I dictate, I pause way too often to collect my thoughts and I imagine that has a negative effect on overall accuracy. 

 

Jim and Bill,

 

From the above-linked post you can take it that Dragon has full left-side context (in Select-and-Say applications) even when you only dictate single words. So while you will not have maximum accuracy dictating in shorter utterances, Dragon is still using the full left-side context (a maximum of three words), and as much of the three-word right-side context as your utterance provides.

 

Originally posted by: bmac 

It would be interesting to see from those who have posted above and really understand the backend to also prepare a priority list of the most important speech dictation attributes a Dragon user can use to improve overall accuracy.

 

Dictating in longer utterances will indeed improve accuracy asymptotically. It is however not compulsory for acceptable accuracy, due to Dragon's advanced context awareness.



-------------------------

Sennheiser MKH Mic
Visual & Acoustic Feedback + Automatic Mic Control


