KnowBrainer Speech Recognition |
Topic Title: I just learned that Nuance took my suggestion
Topic Summary: Just not in the version I am using.
Created On: 05/21/2023 04:55 PM
Status: Post and Reply |
|
- Dusty_Dingo | - 05/21/2023 04:55 PM |
- ax | - 05/21/2023 07:47 PM |
- Dusty_Dingo | - 05/21/2023 09:44 PM |
- Mav | - 05/22/2023 08:15 AM |
- MDH | - 05/22/2023 09:08 PM |
- Mav | - 05/23/2023 03:46 AM |
- Dusty_Dingo | - 05/22/2023 09:30 PM |
- ax | - 05/23/2023 03:19 AM |
- R. Wilke | - 05/23/2023 04:10 AM |
- ax | - 05/23/2023 01:02 PM |
- Ag | - 05/26/2023 01:29 AM |
- Ag | - 08/25/2023 01:38 AM |
|
I just learned that Dragon Professional Anywhere has a feature that can "lock" on any window, so that you can continue dictating while that window is in the background. Not sure how useful this feature is in practice.
However, I had given Nuance a suggestion to include a feature where Dragon can "lock" on a window, so that dictation can continue in the background. At the time, DPI 14 had just been released.
Why couldn't they include this feature in DPI 15/16!? :'( |
|
|
|
|
"Anchoring" (the speech focus) or "out of focus" dictation is the lingo.
And it's most definitely NOT "any window". All browser windows are excluded at this time. Bits and pieces of this topic are scattered all over this forum. A big chunk is below: https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=4&threadid=36887 |
|
|
|
|
Haha! Like many features Nuance implements, this also sounds very half-baked (just look at the new "multiscreen" mouse control).
Thank you for the suggestions! I had never heard of Autobox before! It seems so strange that third-party utilities offer better functionality than Dragon. It is what it is. |
|
|
|
|
Anchoring the speech focus is used a lot by radiologists. All day long they don't do anything but look at images and dictate what they see.
The problem is: They don't just look, but they pan and zoom the image, measure distances, scroll through multiple images of a sequence, in short: They click into their PACS/image viewer.
But their dictation is never supposed to land in the window they clicked into, but always in their EMR.
This is where anchoring the speech focus is really essential.
It's true, though, that Nuance delivered something half-baked. Anchoring speech focus does not work with all target windows, and once a window instance is closed, the anchor is released. If you imagine working on several Word documents in sequence, you anchor speech focus to the first document, and after it's saved and closed you will forget to anchor to the new document window, I guarantee!
Cheers mav |
|
|
|
|
Mav,
Lindsay wrote a program many years ago to address this very problem for radiologists.
MDH ------------------------- |
|
|
|
|
MDH wrote: "Lindsay wrote a program many years ago to address this very problem for radiologists."
Shameless self-ad: Our voice4medicine/voice4legal/voice4customer solution also has a feature (in fact, several ones) to make Dragon's output appear in a given target window, regardless of focus. On top of that, for "Cloud Dragon" (DMO/DPA/DLA) or on-prem DMD we offer a tool "DMOAutoFocus" which is able to automate the task of anchoring speech focus to a given window type (not just window instance like DMO does).
Cheers mav |
|
|
|
|
It is so interesting that there are third-party utilities that do this!
Like Ebox - first time I am hearing about this. Sounds interesting. For me personally it is useful in the legal work I do. When going through caselaw and articles, it is very helpful to be looking through the browser, downloading content, and dictating notes in a separate document in the background. It is annoying to have half of the screen occupied by the dictation window and the document in the other half. Also, constantly switching focus is just inefficient. It would be very easy to just be in the browser collecting the documents, cases, etc., and then process everything in the notes afterwards. |
|
|
|
|
Hey, you mean ChatGPT isn't taking over all your daily grind of clerical work yet? Next year! Or next month?
https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=25&threadid=37131
P.S. Add to the above the solutions from Mav's outfit. The anchoring to a window class is an interesting twist ... |
|
|
|
|
Have a look at the "Delayed Mode" in DragonCapture as per this demo video (starting at 10:10):
DragonCapture Tutorial 2 - Basics & Main Options on Vimeo
Mentioning this because Lunis typically only refers to "Instant Mode" when recommending DragonCapture. (Wondering whether he even knows it exists.)
Ax, in all fairness, you are making it look as if the "doubling letters" phenomenon was an integral part of DragonCapture, forgetting that it takes a certain combination of Dragon settings and dictation habits/contexts to appear. I for one have never had it happen, because I have always disabled "allow pauses in formatted phrases" by default, and for a reason.
-------------------------
The New Game in Town: DragonConnect |
|
|
|
|
^^^
5. Some might point out that DragonCapture "doesn't do formatting". I will hasten to say that formatting is a loaded subject. At best it sprinkles a little "icing on the cake". At worst it takes away from the cake. Does "colouring" in DragonPad transfer into Word or Wordpad? Probably. Into an HTML-aware browser field? Depends. Across Citrix or other VDI? Most unlikely. What about "WhatsApp" with its own pidgin formatting tags? It's best to leave formatting out of this discussion on an acceptable "out-of-focus" dictation box. Basically from the video, one can easily conclude that DragonCapture delivers the goods OP is after.
R. Wilke wrote: "Ax, in all fairness, you are making it look as if the "doubling letters" phenomenon was an integral part of DragonCapture, forgetting that it takes a certain combination of Dragon settings and dictation habits/contexts to appear. I for one have never had it happen, because I have always disabled "allow pauses in formatted phrases" by default, and for a reason."
1. Here I don't think you are right. I do not make any inference that DragonCapture introduces bugs. However, I do again highlight the fact that DC "exposes" the bugs that Dragon itself couldn't fix, but chose to hide by disabling the features it had prior to V13. These features were in some ways "turned back on" by DragonCapture.
2. "Doubling letters" is quite a different beast from the "123 123456" bug Ag et al (including myself) carried on and on about ad nauseam not long ago. I couldn't care less about the "123 123456" bug, which can be overcome easily with vocabulary prefixes, and desktop users can avoid it entirely by tweaking certain formatting settings (which cloud users can't access). "Doubling letters" is far more serious. Ag at one point even "flaunted" the letter doubling, as he is "amused" by the "seetttings". I don't find it funny at all. It remains a most serious bug on desktop Dragon.
3. As user mdl pointed out years ago, Nuance merely suppressed letter doubling in non-supported apps by mostly disabling unformatted dictation into these apps "natively". As DragonCapture restores that functionality, it also restores the manifestation of this bug. I suspect this only comes into play while in instant mode with AutoPaste on. But Ag could tell us more about the circumstances. I think he uses DragonCapture in instant mode to force-dictate into browser windows, as opposed to relying on any Nuance plug-in. Is that so, Ag?
4. That's why I personally wanted Ag to bite the bullet on V16, so I'd know whether letter doubling was fixed by Nuance under the hood. Ag, however, wants to bide his time until 16.1, which means I might have to wait a bit to find out ... or maybe I will take the plunge regardless. We shall see.
P.S., The degree of accuracy RW demonstrated in the video was truly amazing. I don't get that level on my DMO, although it can be close if I don't get sloppy. Then again I don't articulate the way RW does. However, using DMO's Clinical Administration Vocabulary to dictate generic communication is a lost cause for me ... thus eventually I will need to embrace V16, unless someone spruces up Whisper into something with a usable interface for Windows.
I don't think the accuracy in the video was entirely down to articulation. Desktop Dragon's ability to "curate vocabulary/style" and RW's profile maintenance over the years probably helped.
P.S. #2, During the last several years of using cloud Dragon, whether the embedded flavour or DMO, I have not once encountered letter doubling, even across Citrix using client side DMO through "Basic Text Control". This bug is downright unacceptable. My last speculation, which hopefully Ag can clarify for us, is that using DragonCapture in "delayed mode" with AutoPaste off, "letter doubling" doesn't occur.
Can you verify that, Ag? |
|
|
|
|
ax wrote: "Can you verify that, Ag?"
Sorry, only just noticed this question for me.
Now, I almost never use DragonCapture in delayed mode with AutoPaste off -- except for when I am trying to squelch all dictation. So I can't really talk to this ...
I may give it a try, but since I now have code that automatically turns AutoPaste on in most apps/Windows/text box contexts and turns AutoPaste off where CmDiK errors happen, it's a bit of a pain. ------------------------- DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical, GPU=NVIDIA Quadro RTX 3000 with Max-Q Design. |
|
|
|
|
In the thread I started, Programmatic way to (not) say "No Space", @ax is very persuasive that I should be using some out-of-focus / anchor-based pseudo-dictation box.
Actually, I don't need any persuading. I just need access to something that uses true anchoring on the versions of Dragon Professional that I have access to.
Or, as @ax says, I need to know the secret sauce: the stuff that is probably documented in the Dragon SDK, which I have been trying to avoid investing time/money in.
---
I mention the above only to explain why I am making this very late addition to this thread about anchoring. This thread talks about how radiologists frequently want to dictate notes into one window, the window to which the user has anchored Dragon's speech focus. ("Anchor speech focus" is apparently Dragon's name for this feature: https://www.nuance.com/asset/en_us/collateral/healthcare/misc/misc-dragon-medical-one-for-government-fast-tips-en-us.pdf)
It's not just radiologists. For that matter, it's not just speech recognition users.
Back in the days when I was doing a very large amount of "web.research" on various products and patents, back when I was not using speech recognition, I was constantly switching between various web browser windows and OneNote.
I *really* wanted to have text focus in OneNote, with mouse focus elsewhere.
Actually, I wanted to have text focus for ordinary text in OneNote. I wanted ^V paste to also focus into OneNote. But I wanted mouse clicks, ^C copy, and pretty much any key with a modifier like control/alt/shift to be focused elsewhere, except for ^V.
Since I didn't have that, I had AutoHotKey code to save/restore mouse position while swapping tasks. It worked, but was a little bit annoying. I think split focus would have been quite useful, whether for dictation or for the distinctions between typing and mouse clicking that I make in the previous paragraph.
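To make the routing rule from the previous paragraph concrete, here is a minimal sketch of it as a pure decision function. This is only an illustration of the idea, not anything Dragon, OneNote, or AutoHotKey actually provides: the names `Target`, `InputEvent`, and `route` are made up for this example, and no real windows are touched.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    NOTES = "notes"    # the window holding text focus (OneNote in the post)
    ACTIVE = "active"  # whatever window the mouse is currently working in


@dataclass(frozen=True)
class InputEvent:
    # Hypothetical event record, invented for this sketch.
    kind: str                           # "key" or "mouse"
    key: str = ""                       # e.g. "a", "c", "v"; unused for mouse events
    modifiers: frozenset = frozenset()  # subset of {"ctrl", "alt", "shift"}


def route(event: InputEvent) -> Target:
    """Decide which window should receive an input event under split focus."""
    # Mouse clicks always stay with the window being browsed.
    if event.kind == "mouse":
        return Target.ACTIVE
    # The one exception named in the post: Ctrl+V pastes into the notes window.
    if event.modifiers == frozenset({"ctrl"}) and event.key.lower() == "v":
        return Target.NOTES
    # Any other key carrying a modifier (Ctrl+C included) goes elsewhere.
    if event.modifiers:
        return Target.ACTIVE
    # Plain, unmodified typing lands in the notes window.
    return Target.NOTES
```

Note that this takes the description literally, so Shift counts as a modifier and a Shift+letter keystroke is routed away from the notes window. A practical version would probably exempt Shift so that capital letters still reach the notes.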
BTW, I think I just coined a new term: Dragon's "anchor speech focus" is just one possible way of "splitting focus" for different input types. Such "split focus" is a way to avoid having to constantly switch focus/tasks with save/restore of mouse position. ------------------------- DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical, GPU=NVIDIA Quadro RTX 3000 with Max-Q Design. |
|
|