KnowBrainer Speech Recognition
Topic Title: I just learned that Nuance took my suggestion
Topic Summary: Just not in the version I am using.
Created On: 05/21/2023 04:55 PM
 05/21/2023 04:55 PM
Dusty_Dingo
Power Member

Posts: 46
Joined: 11/20/2016

I just learned that Dragon Professional Anywhere has a feature that can "lock" on any window, so that you can continue dictating while that window is in the background. Not sure how useful this feature is in practice.

 

However, I had given Nuance a suggestion to include a feature where Dragon can "lock" on a window, so that dictation can continue in the background. At the time, DPI 14 had just been released.

 

Why couldn't they include this feature in DPI 15/16!? :'(

 05/21/2023 07:47 PM
ax
Top-Tier Member

Posts: 777
Joined: 03/22/2012

"Anchoring" (the speech focus) or "out of focus" dictation is the lingo. 

 

And it's most definitely NOT "any window".  All browser windows are excluded at this time.  Bits and pieces of this topic are littered all over this forum.  A big chunk is below:

https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=4&threadid=36887

Is it potentially useful?  Could be very ... as much as fog lights are useful.  When you need it, it is indispensable.  But one doesn't need it often.  It's primarily a "note-taking" tool for those occasions that call for such.

Do most DMO users use it?  Not by my observation, because it is not that useful "out of the box".

Why not?  Because Nuxxce tends to be "half-@$$" in doing something ... other than devoting wholeheartedly to the task of getting $$ out of ...

Why is it half-@ss?  Because as you click around somewhere else on the screen, while dictating happily into an "anchored" window, wouldn't you need to edit things on the fly here and there?  The simplest editing command, "Undo", invokes "Ctrl-Z" or equivalent.  But that voice command is executed on the window which currently has focus, i.e., the one you are clicking up and down on.  So you'd still have to mouse away to activate the note-taking window in order to apply "Undo".  This defeats the purpose of anchoring, at least for that moment.

Can you overcome it?  Yes, by using something along the lines of AutoHotKey to redirect "Ctrl-Z" to that anchored window (with a myriad of ways of identifying its WinTitle).  Practically speaking, I have to live with 2 "Undo" commands: one is the built-in "Undo it/that", the Nuance default that acts on the active window, and the other is my custom command "Undo text", which "sends" Ctrl-Z to my anchored window - out of focus, while I am in that mode.
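A rough sketch of the two-"Undo" routing described above, in Python rather than AutoHotKey. It only models windows as plain objects and calls no real Dragon or Windows API. The names `Window`, `Desktop`, `COMMANDS` and the `^z` key notation are assumptions made up for illustration. In a real setup the "anchored" branch would be whatever mechanism delivers keystrokes to a window that does not have focus (in AutoHotKey, ControlSend is one candidate, though the post does not say which command is used).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Window:
    """Stand-in for a real window; records the keystrokes sent to it."""
    title: str
    received: List[str] = field(default_factory=list)

    def send(self, keys: str) -> None:
        self.received.append(keys)


# Hypothetical command table: spoken phrase -> (keys, target).
# "active" = window with keyboard focus, "anchored" = dictation target.
COMMANDS = {
    "undo that": ("^z", "active"),    # built-in style: acts on the active window
    "undo text": ("^z", "anchored"),  # custom style: acts on the anchored window
}


@dataclass
class Desktop:
    active: Window
    anchored: Optional[Window] = None

    def run(self, phrase: str) -> Window:
        """Send the keys for `phrase` to its target window and return that window."""
        keys, target = COMMANDS[phrase]
        # Fall back to the active window when nothing is anchored.
        if target == "anchored" and self.anchored is not None:
            window = self.anchored
        else:
            window = self.active
        window.send(keys)
        return window


viewer = Window("image viewer")   # where the mouse is clicking
notes = Window("notes")           # anchored note-taking window
desk = Desktop(active=viewer, anchored=notes)

desk.run("undo that")   # goes to the viewer, because it has focus
desk.run("undo text")   # goes to the notes window, although it is out of focus
```

The point of keeping two phrases is that the target is decided by the command, not by where the mouse last clicked.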

You get the drift.  Now some navigational commands such as "Go to top", "Go to end", or "Insert before/after xxxx" do execute on the anchored window by default.  But to truly make anchoring "pretty", you'd have to add small custom commands for going to end/beginning of line, etc.

I primarily use Win32Pad suggested to us by Matt Chambers as my anchor.  It is a bit more involved than a 1 or 2-liner AHK script to use its Line Number feature out of focus in "anchored" mode.  But doable.

Basically, it takes end user customization to make anchoring "perform to spec".

Overall, I'd say this "lock"/anchoring is over-rated, especially given its default manner of implementation.  It is a niche feature.  If you've got a workflow that must have it ... and you are "crazy" enough for it, look up "Ebox", mentioned half in jest, somewhere over this forum.



 05/21/2023 09:44 PM
Dusty_Dingo
Power Member

Posts: 46
Joined: 11/20/2016

Haha! Like many features Nuance implements, this also sounds very half-baked (looking at the new "multiscreen" mouse control).

Thank you for the suggestions! Autobox was something I have never heard about before! It seems so strange that third-party utilities offer better functionality than Dragon. It is what it is.

 05/22/2023 08:15 AM
Mav
Top-Tier Member

Posts: 723
Joined: 10/02/2008

Anchoring the speech focus is used a lot by radiologists.

All day long they don't do anything but look at images and dictate what they see.

 

The problem is: They don't just look, but they pan and zoom the image, measure distances, scroll through multiple images of a sequence, in short: They click into their PACS/image viewer.

 

But their dictation is never supposed to land in the window they clicked into, but always in their EMR.

 

This is where anchoring the speech focus is really essential.

 

It's true, though, that Nuance delivered something half-baked.

Anchoring speech focus does not work with all target windows, and once a window instance is closed, the anchor is released.

Imagine working on several Word documents in sequence: you anchor speech focus to the first document, and after it's saved and closed you will forget to anchor to the new document window, I guarantee!

 

Cheers

mav

 05/22/2023 09:08 PM
MDH
Top-Tier Member

Posts: 2336
Joined: 04/02/2008

Mav,

 

Lindsay wrote a program many years ago to address this very problem for radiologists.

 

MDH





 05/23/2023 03:46 AM
Mav
Top-Tier Member

Posts: 723
Joined: 10/02/2008

Originally posted by: MDH

Mav,

Lindsay wrote a program many years ago to address this very problem for radiologists.

MDH

 

 

Shameless self-ad:

Our voice4medicine/voice4legal/voice4customer solution also has a feature (in fact, several ones) to make Dragon's output appear in a given target window, regardless of focus.

On top of that, for "Cloud Dragon" (DMO/DPA/DLA) or on-prem DMD we offer a tool "DMOAutoFocus" which is able to automate the task of anchoring speech focus to a given window type (not just window instance like DMO does).
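To make the "window type, not just window instance" distinction concrete, here is a toy Python model. It is not how DMO or DMOAutoFocus is implemented; `Win`, `InstanceAnchor`, `TypeAnchor`, `work_through` and the `"document"` type label are all hypothetical names, and the open/close callbacks stand in for whatever window events a real tool would watch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Win:
    handle: int        # unique per window instance
    window_type: str   # shared by all windows of the same kind


class InstanceAnchor:
    """Anchor tied to one window instance; released when that window closes."""

    def __init__(self) -> None:
        self.target: Optional[Win] = None

    def anchor(self, win: Win) -> None:
        self.target = win

    def on_open(self, win: Win) -> None:
        pass  # a new window never becomes the target on its own

    def on_close(self, win: Win) -> None:
        if self.target == win:
            self.target = None


class TypeAnchor(InstanceAnchor):
    """Anchor tied to a window type; re-anchors to the next matching window."""

    def __init__(self, window_type: str) -> None:
        super().__init__()
        self.window_type = window_type

    def on_open(self, win: Win) -> None:
        if self.target is None and win.window_type == self.window_type:
            self.target = win


def work_through(anchor: InstanceAnchor) -> Optional[Win]:
    """Open a document, anchor to it, close it, open the next; return the final target."""
    first, second = Win(1, "document"), Win(2, "document")
    anchor.on_open(first)
    anchor.anchor(first)
    anchor.on_close(first)
    anchor.on_open(second)
    return anchor.target


lost = work_through(InstanceAnchor())        # anchor is gone after the first close
kept = work_through(TypeAnchor("document"))  # anchor moved on to the second document
```

With the instance-based anchor the second document is left without a target, which is the "you will forget to anchor" situation; with the type-based one the anchor follows the next window of the same kind.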

 

Cheers

mav

 05/22/2023 09:30 PM
Dusty_Dingo
Power Member

Posts: 46
Joined: 11/20/2016

It is so interesting that there are third-party utilities that do this!

Like Ebox - first time I am hearing about this. Sounds interesting. For me personally it is useful in the legal work I do. When going through caselaw and articles, it is very helpful to be looking through the browser, downloading content, and dictating notes in a separate document in the back.

It is annoying to have half of a window occupied for dictation and the document operating in the other half of the screen. Also constantly switching focus is just inefficient.

It would be very easy to just be in the browser collecting the documents, cases, etc., and then process everything in the notes afterwards.
 05/23/2023 03:19 AM
ax
Top-Tier Member

Posts: 777
Joined: 03/22/2012

Hey you mean ChatGPT isn't taking over all your daily grind of clerical work yet?  Next year!  Or next month?

Mav is quite right that this "out of focus" dictation style is bread-and-butter for radiologists, pathologists, and perhaps legal professionals who prefer to work the way you describe.  For clinicians, it is not as indispensable.  I don't use it as often as I thought I would, even though I customized Win32Pad to the nines as a substrate for anchoring, meaning I could do all editing by voice while anchored (its "Go to Line Number" feature is a bonus).  Still, clicking around with the mouse is usually faster.

Let me clarify the "3rd Party" landscape for you as best as I understand it, outside of Nuance's own "dictation anchoring" utility for cloud Dragon.

1. DNS.Comfort from Sonic Labs in Germany - apparently has been available for the better part of a decade.  But nary a review from anybody who frequents this forum, AFAIK.  I suspect it isn't cheap.  Is it free of the kind of limitations and idiosyncrasies that Nuance's anchoring has?  Anybody's guess.

2. Lindsay (of PCbyVoice fame)'s "Reporter" - mentioned by MDH above, also has been available for years, as an add-on.  But not yet "publicly" released.  There could be many reasons.  Do keep in mind that his outfit is an authorized UK reseller and a major Dragon support node.  By that I mean there are things he has to be mindful of ... I am just speculating.

3. R. Wilke's DragonCapture, which could be adapted to take focus.  He has previously expressed a disinterest in making this a "thing", or a dedicated feature.  I think he did not wish to expand support footprint.  It's probably not worth it for him, especially with other endeavours on the go.

4. Speech Productivity's Auto Box (or similar), which "simulates" it to a degree.  It doesn't do true out-of-focus dictation.  It is popular as the "next best thing".  And it adds a bunch of additional features, which you can digest from its product video.

5. The so-called "EBox" is a semi-jest from yours truly, to that one-of-a-kind Ag.  First of all it is no "box".  It's just a workflow.  Secondly, it probably beats all of the boxes mentioned above, including my customized DMO "anchor box" - if all you are looking for is "out of focus" dictation.

I was in fact proposing the "EBox" workflow as a replacement for DragonCapture (yes you read that right) - minus the letter doubling, etc. (which is a Dragon thing that flows out of the DragonCapture "conduit" unmodified).  But to pull it off flawlessly you need a $30/mth subscription to TeamViewer.  Definitely not worth it, unless one already subscribes to it for other reasons.  DragonCapture would be far more cost-effective.

But if you just have to have an "anchored" workflow and don't need real-time "auto transfer" (through clipboard syncing), then the "Ebox" workflow may fill that niche for you, on those rare occasions.

"EBox" gives you the following advantages as compared to the real boxes above, for self-evident reasons:

A.  Can anchor everything, including browser windows, which none of the above boxes can, including Nuance's own - I'd love to be proven wrong on this by the proprietors of these boxes.  Unfortunately I am quite certain of this.  But would love to eat my words!

B.  Your editing commands by default go into the Ebox, without the kind of silliness I ran into with Nuance's own.

The downsides to the "Ebox" workflow?  The most glaring is that you'd have to juggle two microphones ... perhaps wear both the Shokz Gen 1 with its boom on the left, and the Shokz Gen 2 with its boom on the right?  Just kidding.  It helps when one mic is used in PTT style.  BTW, I used Nuance's Embedded Dragon with VoiceMacro for 2 or 3 years, which was very much an approximation of the "Ebox" workflow.

At the end of the day, the embedded "Ebox" at its simplest is nothing more than a virtual Voice Recorder, with real-time editing.

 

https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=25&threadid=37131

 

P.S., Add to above the solutions from Mav's outfit.  The anchoring to a window class is an interesting twist ...  



 05/23/2023 04:10 AM
R. Wilke
Top-Tier Member

Posts: 8104
Joined: 03/04/2007

Have a look at the "Delayed Mode" in DragonCapture as per this demo video (starting at 10:10):

 

DragonCapture Tutorial 2 - Basics & Main Options on Vimeo

 

Mentioning this because Lunis typically only refers to "Instant Mode" when recommending DragonCapture. (Wondering whether he even knows it exists.)

 

Ax, in all fairness, you are making it look as if the "doubling letters" phenomenon was an integral part of DragonCapture, forgetting to bear in mind that it takes a certain combination of Dragon settings and dictation habits/contexts to appear, which I for one have never had happen because I have always disabled "allow pauses in formatted phrases" by default, and for a reason.



-------------------------


The New Game in Town: DragonConnect



 05/23/2023 01:02 PM
ax
Top-Tier Member

Posts: 777
Joined: 03/22/2012

^^^

1. Thanks for posting the video link, RW.  This is the first time I have seen it.  You had the same link on your website.  But did you know that Vimeo blocks the direct linking to the video from your website?  At least for me.  I verified that it is the same link.  Copying and pasting the video URL into a browser works.  That might be one of the reasons why it wasn't getting the exposure it needed.

2. Literally, if desktop Dragon is sliced bread, DragonCapture is the peanut butter and jam.  To OP: look no further, DragonCapture in delayed mode is the answer to your quest, as far as I can see.  Transparently and reasonably priced, code-signed, and well supported.  Granted, its author can be a little crusty at times, depending on which side of the bed he gets up from on any given day ... but everybody is a work in progress.

3. Now in delayed mode, you may have to use DC's own custom command to "undo" changes ... much like I have had to.  But RW already programmed that in for you and all that is included in the price.  I had to do such essential usability tweaking myself for the DMO "anchoring".  The only two things DC lacks compared to Win32Pad are a dark mode and line numbering.  But perhaps enough demand will prevail upon the author to add those.

4. You can be darn sure I wouldn't be getting desktop Dragon myself without getting DragonCapture to go with it.  I am not saying this to "compliment anyone" ... but rather just to state a plain fact as I see it.

 

5. Some might point out that DragonCapture "doesn't do formatting".  I will hasten to say that formatting is a loaded subject.  At best it sprinkles a little "icing on the cake".  At worst it takes away from the cake.  Does "colouring" in DragonPad transfer into Word or Wordpad?  Probably.  Into an HTML-aware browser field?  Depends.  Across Citrix or other VDI?  Most unlikely.  What about "WhatsApp" with its own pidgin formatting tags?  It's best to leave formatting out of this discussion on an acceptable "out-of-focus" dictation box.   Basically from the video, one can easily conclude that DragonCapture delivers the goods OP is after.


To OP: above is all I can say about DragonCapture vis-a-vis your desired workflow.  Just that its author cares comparatively little about marketing.  Did I mention the links in RW's own website don't quite work?  Probably some anti-redirection feature at play by Vimeo.


Now directing myself to Herr Wilke's comment:



Originally posted by: R. Wilke

Ax, in all fairness, you are making it look as if the "doubling letters" phenomenon was an integral part of DragonCapture, forgetting to bear in mind that it takes a certain combination of Dragon settings and dictation habits/contexts to appear, which I for one have never had happen because I have always disabled "allow pauses in formatted phrases" by default, and for a reason.

 

 

1. Here I don't think you are right.  I do not make any inference that DragonCapture introduces bugs.  However, I do again highlight the fact that DC "exposes" the bugs that Dragon itself couldn't fix, but chose to hide by disabling the features it had prior to V13.  These features were in some ways "turned back on" by DragonCapture.

 

2. "Doubling letters" is quite a different beast from the "123 123456" bug Ag et al (including myself) carried on and on about ad nauseam not long ago.  I couldn't care less about the "123 123456" bug, which can be overcome easily with vocabulary prefixes, and desktop users can avoid it entirely by tweaking certain formatting settings (which cloud users can't access).  "Doubling letters" is far more serious.  Ag at one point even "flaunted" the letter doubling as he is "amused" by the "seetttings".  I don't find it funny at all.  It remains a most serious bug on desktop Dragon.

 

3. As user mdl pointed out years ago, Nuance merely suppressed letter doubling in non-supported apps by mostly disabling unformatted dictation into these apps "natively".  As DragonCapture restores that functionality, it also restores the manifestation of this bug.  I suspect this only comes into play while in instant mode with AutoPaste on.  But Ag could tell us more on the circumstances.  I think he uses DragonCapture in instant mode to force-dictate into browser windows, as opposed to relying on any Nuance plug-in.  Is that so, Ag?

 

4. That's why I personally wanted Ag to bite the bullet on V16 so I'd know whether letter doubling was fixed by Nuance under the hood.  Ag, however, wants to bide his time until 16.1, which means I might have to wait a bit to find out ... or maybe I will take the plunge regardless.  We shall see.

 

 

P.S., The degree of accuracy RW demonstrated in the video was truly amazing.  I don't get that level on my DMO, although it can be close if I don't get sloppy.  Then again I don't articulate the way RW does.  However, using DMO's Clinical Administration Vocabulary to dictate generic communication is a lost cause for me ... thus eventually I will need to embrace V16, unless someone spruces up Whisper into something with a usable interface for Windows.

 

I don't think the accuracy in the video was entirely down to articulation.  Desktop Dragon's ability to "curate vocabulary/style" and RW's profile maintenance over the years probably helped.

 

 

P.S. #2, During the last several years of using cloud Dragon, whether the embedded flavour or DMO, I have not once encountered letter doubling, even across Citrix using client side DMO through "Basic Text Control".  This bug is downright unacceptable.  My last speculation, which hopefully Ag can clarify for us, is that using DragonCapture in "delayed mode" with AutoPaste off, "letter doubling" doesn't occur. 

 

Can you verify that, Ag?



 05/26/2023 01:29 AM
Ag
Top-Tier Member

Posts: 1150
Joined: 07/08/2019

Originally posted by: ax My last speculation, which hopefully Ag can clarify for us, is that using DragonCapture in "delayed mode" with AutoPaste off, "letter doubling" doesn't occur. 

 

Can you verify that, Ag?

 

Sorry, only just noticed this question for me.

 

Now, I almost never use DragonCapture in delayed mode with AutoPaste off, except when I am trying to squelch all dictation. So I can't really speak to this ...

I may give it a try, but since I now have code that automatically turns AutoPaste on in most app/window/text box contexts and turns AutoPaste off where CmDiK errors happen, it's a bit of a pain.



-------------------------

DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical, GPU=NVIDIA Quadro RTX 3000 with Max-Q Design.

 08/25/2023 01:38 AM
Ag
Top-Tier Member

Posts: 1150
Joined: 07/08/2019

In the thread I started, Programmatic way to (not) say "No Space", @ax is very persuasive that I should be using some out-of-focus / anchor-based pseudo-dictation box.

Actually, I don't need any persuading. I just need access to something that uses true anchoring on the versions of Dragon Professional that I have access to.

Or, as @ax says, I need to know the secret sauce: the stuff that is probably documented in the Dragon SDK, which I have been trying to avoid investing time/money in.

 

---

 

I'm only saying the above to explain why I'm making this very late addition to this thread about anchoring:

This thread talks about how radiologists frequently want to dictate notes into one window, the window to which the user has anchored Dragon's speech focus. ("Anchor speech focus" is apparently Dragon's name for this feature: https://www.nuance.com/asset/en_us/collateral/healthcare/misc/misc-dragon-medical-one-for-government-fast-tips-en-us.pdf)

It's not just radiologists. For that matter, it's not just speech recognition users.

Back in the days when I was doing a very large amount of "web.research" on various products and patents (back when I was not using speech recognition), I was constantly switching between various web browser windows and OneNote.

 

I *really* wanted to have text focus in OneNote, with mouse focus elsewhere.

Actually, I wanted to have text focus for ordinary text in OneNote. I wanted ^V paste to also go to OneNote. But I wanted mouse clicks, ^C copy, and pretty much any key with a modifier like control/alt/shift to be focused elsewhere, except for ^V.
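The routing rule in the paragraph above can be written down as a small decision function. This is only a Python sketch of the rule as stated, not code from any real tool. `InputEvent`, `route` and the `"text"` / `"pointer"` labels are made-up names, and treating Shift as a routing modifier follows the post literally (a real tool would probably have to exempt Shift+letter so that capitals still reach the text window).

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class InputEvent:
    kind: str                                 # "key" or "mouse"
    key: str = ""                             # e.g. "a", "v", "c"; empty for mouse events
    modifiers: FrozenSet[str] = frozenset()   # subset of {"ctrl", "alt", "shift"}


def route(event: InputEvent) -> str:
    """Return "text" for the text-focus window, "pointer" for the window under the mouse."""
    if event.kind == "mouse":
        return "pointer"
    if not event.modifiers:
        return "text"                         # ordinary typing (or dictated text)
    if event.modifiers == frozenset({"ctrl"}) and event.key == "v":
        return "text"                         # the one exception: paste
    return "pointer"                          # ^C and every other modified key


ctrl = frozenset({"ctrl"})
events = [
    InputEvent("key", "h"),        # plain typing
    InputEvent("mouse"),           # click
    InputEvent("key", "c", ctrl),  # ^C copy
    InputEvent("key", "v", ctrl),  # ^V paste
]
targets = [route(e) for e in events]
```

Here `targets` comes out as text, pointer, pointer, text: typing and paste land in the notes window, while clicks and copy stay with the window being browsed.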

 

Since I didn't have that, I had AutoHotKey code to save/restore mouse position while swapping tasks. It worked, but was a little bit annoying. I think split focus would have been quite useful, whether for dictation or for the distinctions between typing and mouse clicking that I make in the previous paragraph.

BTW, I think I just coined a new term: Dragon's "anchor speech focus" is just one possible way of "splitting focus" for different input types. Such "split focus" is a way to avoid having to constantly switch focus/tasks with save/restore of mouse position.





