KnowBrainer Speech Recognition
Topic Title: VoiceMacro
Topic Summary: Let's dig
Created On: 07/31/2021 04:39 PM
 07/31/2021 04:56 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

First off, let's get "smooth scroll" out of the way.

 

Setting a variable "loop_large_scroll" is probably unnecessary, but doing so lends itself to changing the scrolling distance on the fly.  Again, that is not so useful, as you can use a voice command "stop scrolling" to abort all running macros, thereby stopping scrolling.

 

One can add a small pause in front of the mouse scroll increment of +/- 1 (the smallest), making it even slower than in this video.

 

A scroll step of 5 is plenty fast for me.  You can always up that to your heart's content.

 

Two caveats about variables:

 

1. They are case sensitive.

 

2. They are "space-sensitive" when you declare them!  I.e., "variable" and "variable " are two different variables!!  Oh well.
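
To make the two caveats concrete, here is a minimal Python sketch. It is not VoiceMacro code; it simply assumes the variable store behaves like a dictionary keyed by the exact name string, which is only an analogy for the behaviour described above. The helper names are made up for illustration.

```python
# Hypothetical stand-in for a macro tool's variable store: a dict keyed by
# the exact name string, so case and trailing spaces both matter.
variables = {}

def set_var(name, value):
    """Store a value under the exact name given (no trimming, no case folding)."""
    variables[name] = value

def get_var(name):
    """Return the stored value, or None when no variable has this exact name."""
    return variables.get(name)

set_var("loop_large_scroll", 5)
set_var("loop_large_scroll ", 9)   # trailing space: a second, different variable

print(get_var("loop_large_scroll"))    # 5
print(get_var("loop_large_scroll "))   # 9
print(get_var("Loop_Large_Scroll"))    # None, because the case differs
```

The practical takeaway is to copy and paste variable names rather than retype them.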

 

Otherwise VM provides built-in basic conditional structures such as loop and if/else and toggle.  The toggling and setting of "toggle state" (from a separate macro) is quite useful.

 

P.S., I moved the scrolling macro below to a different profile and changed "Abort all running macros" to "Abort all macros from this profile" so as to preserve my listening indicator OSD when I terminate scrolling.  Alternatively, one can probably combine "IgnoreCommands" with "IgnoreExceptions" to use voice to stop scrolling.

 

 



 07/31/2021 05:27 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Next up, basic Recognizer Settings:

Wordlist weight: utterance 100% against my custom dictionary (which is simply the list of defined commands in the active profile)

Dictionary weight: utterance 0% (channelling VM author: a low, but not too low digit preferred) against a "standard" dictionary - which is EXACTLY what I need as I am using VM as a Command-ONLY tool.

 

See this VM author comment on why 0% (or too low of a) dictionary weight is undesirable (I have changed mine consequently).  Apparently this whole "weight" thing under WSR could be a wee bit "voodoo" ... thus requiring some trial and error.


Recognition threshold: 85 - 90% is suitable for me as I prefer specificity over sensitivity.

 

I choose to uncheck the default 'Process "failed" recognition' - I'd rather it do nothing than do the wrong thing.
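
The "specificity over sensitivity" idea can be sketched in a few lines of Python. This is only an illustration of a confidence cut-off, not how VoiceMacro is implemented; the function name and the confidence numbers are assumptions for the example.

```python
def filter_recognition(phrase, confidence_percent, threshold_percent=85):
    """Return the phrase if its confidence meets the threshold, else None.

    Returning None models "do nothing" for a low-confidence result.
    """
    if confidence_percent >= threshold_percent:
        return phrase
    return None

print(filter_recognition("stop scrolling", 92))   # stop scrolling
print(filter_recognition("stop scrolling", 70))   # None
```

A higher threshold rejects more borderline matches, so fewer wrong commands fire, at the cost of having to repeat yourself now and then.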

 

 



 07/31/2021 05:44 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Carry on - NATO alphabet speller, which is an annoyingly weak link for my otherwise fit-for-purpose cloud Dragon in its designated browser sessions.

 

Here VM shines through.

 

Refer to VM Author's own explanation on this:


1. "RecCommand" is one of VM's reserved variables, if I am not mistaken.  It's a bit like desktop Dragon's "heardword", sort of.

2. My command name for NATO speller is in fact:
[upper-case;][alfadog;bravo;charlie;delta;echo;foxtrot;golfcourse; ... you get the idea]

In a loud hospital environment, even with the recognition threshold set to 85-90%, the letters represented by alpha, golf, india, and papa can still be falsely output to the screen because of ambient noise.  So I had to append a syllable to each.  [I will also try a low but non-zero dictionary weight, as per feedback, to see whether it makes any difference.]
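
For readers who think in code, the speller logic can be sketched in Python as follows. This is not VoiceMacro's macro language. It assumes the recognized phrase arrives as plain text with the optional "upper-case" prefix separated by a space, and it lists only the spoken words quoted above (the rest of the list is elided in the post). The function name is hypothetical.

```python
# Only the words quoted in the post; the remaining entries are elided there.
SPOKEN_WORDS = ["alfadog", "bravo", "charlie", "delta", "echo", "foxtrot", "golfcourse"]

def spell(rec_command):
    """Map a recognized phrase such as 'upper-case bravo' to the letter to type."""
    words = rec_command.lower().split()
    upper = bool(words) and words[0] == "upper-case"
    if upper:
        words = words[1:]
    if len(words) != 1 or words[0] not in SPOKEN_WORDS:
        return None  # not a speller command: do nothing
    letter = words[0][0]  # the letter is the first character of the spoken word
    return letter.upper() if upper else letter

print(spell("alfadog"))                 # a
print(spell("upper-case golfcourse"))   # G
print(spell("golf"))                    # None
```

Because the lengthened words ("alfadog", "golfcourse") still start with the right letter, appending a syllable does not complicate the mapping.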

 

 



 07/31/2021 05:59 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Numeral Enunciation:

For desktop Dragon users, this is almost certainly gratuitous.  But I need the functionality outside of DME browser sessions.


VM Author's comment:

 

Prepending with words such as "numeral" increases recognition reliability given that single digits are single-syllable phrases.
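
As a rough Python illustration (not VoiceMacro syntax), a prefixed numeral command boils down to stripping the prefix and looking up the digit. The function name and the idea that the phrase arrives as space-separated text are assumptions.

```python
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

def numeral_to_digit(rec_command, prefix="numeral"):
    """Return the digit for 'numeral <word>', or None if the prefix is missing."""
    words = rec_command.lower().split()
    if len(words) == 2 and words[0] == prefix:
        return DIGITS.get(words[1])
    return None

print(numeral_to_digit("numeral five"))   # 5
print(numeral_to_digit("five"))           # None
```

Requiring the prefix means a stray one-syllable sound on its own never types a digit.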

 

No similar compact macros exist for pressing F1 - F12, however, unless one resorts to 12 If statements, which I don't see any advantage of.  As far as I can figure out, one needs a separate macro for each F(unction) key press.  It is doable to combine, say, F1, Shift-F1, and Alt-F1 into one macro through string manipulation of RecCommand and 1 or 2 conditional statements.
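
The "string manipulation plus 1 or 2 conditionals" idea might look like this in Python. It is a sketch only: the spoken forms ("shift F1", "alt F1") and the function name are hypothetical, and a real VM macro would do the equivalent with its own RecCommand handling and conditional actions.

```python
MODIFIERS = ("shift", "alt")  # hypothetical spoken prefixes

def parse_function_key(rec_command, key="F1"):
    """Return (modifier, key) for 'F1', 'shift F1' or 'alt F1'; None otherwise."""
    words = rec_command.split()
    if not words or words[-1].upper() != key:
        return None
    prefix = [w.lower() for w in words[:-1]]
    if not prefix:
        return (None, key)          # plain key press
    if len(prefix) == 1 and prefix[0] in MODIFIERS:
        return (prefix[0], key)     # key press with one modifier
    return None

print(parse_function_key("F1"))         # (None, 'F1')
print(parse_function_key("shift F1"))   # ('shift', 'F1')
print(parse_function_key("ctrl F1"))    # None
```

One such macro per function key still leaves twelve macros, which matches the observation above that the keys themselves cannot be collapsed as neatly as the digits.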

 

 



 07/31/2021 06:21 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

VM Control:

That big fat Caps Lock key is basically sitting there wasting prime real estate.  I re-purpose it as much as possible (might have been another one of those tips I gleaned from PG).  It is already the primary push-to-talk key for my Dragon Medical Embedded, programmed through AHK.  I naturally use Shift-CapsLock to toggle listening on/off for VoiceMacro.

 

In the odd times when native CapsLock function gets triggered, pressing "Ctrl-CapsLock" will toggle it off.

 

The "RunOtherMacro" just switches over to a different profile so I can run AHK programs with different, machine-dependent paths.  This is for portability purposes.  Otherwise gratuitous.  

 

My main work profile is named "Production".  By relegating machine-dependent elements to a separate profile, I make sure I can sync my main "Production" profile by exporting and importing the XML file associated with it - among the different machines I use.

 

 

 

 

In the voice command "stop listening", one can use the thoughtfully implemented "Set Toggle State" to reset existing running toggles to an "off" state, so that keyboard toggle doesn't go out of sync.  Finally, a voice prompt on whether the toggle is on or off quickly gets old.  I find it much more useful to customize AHK 2-liners to pop an icon into the tray so I know when VM is listening or not. 

 

(Channelling VM author: an alternative to inserting custom tray icon in order to indicate a listening state is to take advantage of the built-in OSD functionality - which I am now using instead, and I have revised the screenshots accordingly).

 

 

 


The AHK 2-liner (call it "Listening_Tray_Indicator.ahk") to insert a tray icon looks like this: 

ListeningTrayIconFile:= "Name of your tray icon file" 
Menu,Tray,Icon,%ListeningTrayIconFile% 


Exiting (terminating) the tray icon when VM stops listening is simply this: 

; 0x111 is WM_COMMAND; 65307 is the ID of AutoHotkey v1's built-in tray-menu "Exit" item
DetectHiddenWindows, On 
PostMessage,0x111,65307,,,Listening_Tray_Indicator.ahk


Of course, this improvisation of a tray indicator icon could be avoided if VM's own icon changed to a bright colour when listening is activated.  That could be a "feature request" I suppose.

 

Finally, see VM Author's own pro tip on how to control VM listening and heeding commands through "wake-up words".

 

P.S., VoiceMacro comes preloaded with a slew of demo macros in its "Demo" profile.  A fast way of learning the ropes is simply by modifying them.  Some of them have "Only when target window active" checked.  Uncheck that when modifying and executing macros intended for any active window/process.



 07/31/2021 06:40 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Now a few generic comments from the VM Author regarding SAPI and that sort of thing:

 

SAPI compatibility (also see RW's clarification in post following)

More on SAPI

WSR "self-learning"

Dragon will always be better at "free dictation" (which we take for granted)

Author's comment on "pseudo-list commands" (my paraphrase)

 

Moreover, combining group affixes with profile switches / window targeting will likely deliver well-organized application-specific command deployment.  I am speculating here as I haven't tested out the possibilities myself.

 

Just going by the few examples I outlined above, it is easy to conclude that VoiceMacro is CAPABLE.  It goes without saying that VM also has the basic mouse control and coordinate focusing capabilities built-in.

 

In fact, even sans the voice component, VM's keyboard hotkey implementation aspect alone can probably give something like Macro Express a run for its money.  

 

Anyway, I hope by scratching the surface, it helps someone.


My take at the end of the day is simply this: well-crafted portable software helps us all and deserves support, including promotion.



 08/01/2021 06:27 AM
R. Wilke
Top-Tier Member

Posts: 8058
Joined: 03/04/2007

Just one particular, and more random comment on this, leaving it here for you to make the VM community aware of it:

Dragon is compatible with SAPI 4, in certain parts and in some ways, but not with SAPI 5.

-------------------------

 08/01/2021 03:09 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Thanks RW for clarifying dependencies under the hood.


Furthermore, I have updated my posts above with feedback/comments from VM author.



Two general suggestions from the author to anyone interested:

1. If one is using VM for multiple apps, it is highly suggested to use one profile per application/window (as intended) and make use of VM's auto-switch profile feature, this way one does not run into conflicts, or have profiles active in applications that are not supposed to be controlled via VM.

2. Generally, it is not recommended to set the "Dictionary weight" to 0, as per above VM author comment in red.
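
The first suggestion amounts to a lookup from the active application to a profile. Here is a small Python sketch of that idea; VoiceMacro does this through its own auto-switch setting, and the process names, profile names, and function name below are all hypothetical.

```python
# Hypothetical mapping from the foreground process to a command profile.
PROFILE_BY_PROCESS = {
    "memoq.exe": "memoQ",
    "chrome.exe": "Browser",
}

def pick_profile(active_process, default=None):
    """Return the profile for the active process, or the default if none matches."""
    return PROFILE_BY_PROCESS.get(active_process.lower(), default)

print(pick_profile("memoQ.exe"))     # memoQ
print(pick_profile("notepad.exe"))   # None
```

With no default profile, an application that has no entry gets no voice commands at all, which is the "not supposed to be controlled via VM" case in the first suggestion.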


The author is quick to address questions posted to his VM-specific forum.  He also has a Discord channel (https://discord.gg/9UTJqXJ) for plugin developers. 

 

Moreover, author welcomes bug/crash reports through email/forum or preferably VM's built-in crash reporter. 

 

Lastly, the latest builds are usually more stable, incorporating more bug fixes (they are not mere "betas").



 08/01/2021 04:43 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Nice thread, ax! It must have taken a lot of work to create it so that others could learn about VM. Thank you.

-------------------------

Win10/11/x64, AMD Ryzen 7 3700X/3950X, 64/128GB RAM, Dragon 15.3, SP 7 Standard, SpeechStart, Office 365, KB 2017, Dragon Capture, Samson Meteor USB Desk Mic, Amazon YUWAKAYI headset, Klim and JUKSTG earbuds with microphones, excellent Sareville Wireless Mono Headset, 3 BenQ 2560x1440 monitors, Microsoft Sculpt Keyboard and Logitech G502 awesome gaming mouse.

 08/01/2021 05:16 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Glad you appreciate my effort to introduce this, Kevin!

It did take up a good chunk of time - mainly to make sure I don't feel like I am misleading anyone.

My way of "paying it forward", so-to-speak.

As you may know, I have been looking precisely for such a small-footprint "Swiss Army knife" tool for a while ("swiss army knife" was VM author's own descriptor, incidentally).  One of my first posts (after a long hiatus) back to this forum was "fishing for some ideas", so as to complement the cloud dragon I have been made to use for work.  VM's portability and capabilities are basically tailor-made for what I need to do.

Hope it helps others as it is helping me.



 12/30/2021 04:21 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Originally posted by: ax

Next up, basic Recognizer Settings:

Wordlist weight: utterance 100% against my custom dictionary (which is simply the list of defined commands in the active profile)

Dictionary weight: utterance 0% (channelling VM author: a low, but not too low digit preferred) against a "standard" dictionary - which is EXACTLY what I need as I am using VM as a Command-ONLY tool.

 

See this VM author comment on why 0% (or too low of a) dictionary weight is undesirable (I have changed mine consequently).  Apparently this whole "weight" thing under WSR could be a wee bit "voodoo" ... thus requiring some trial and error.

 

Recognition threshold: 85 - 90% is suitable for me as I prefer specificity over sensitivity.

 

I choose to uncheck the default 'Process "failed" recognition' - I'd rather it do nothing than do the wrong thing.

 

 

 

A field update after months of production use:

 

Despite what I wrote above with respect to a recommended "non-zero low Dictionary weight", in my case, the 0% "Dictionary weight" specificity still works out the best, in conjunction with 85 to 90% of "Recognition threshold" sensitivity.


Even 1% "Dictionary weight" is inferior to 0% as judged by my own real life usage.


Yes, after a while (weeks to months), if VoiceMacro is left running in an "always-listening" mode, gratuitous noises can be falsely recognized as commands, especially with some of my 2-syllable commands (no, not the best practice to have 2-syllable commands, I acknowledge... but nevertheless it's a matter of striking a balance between efficiency and convenience).

The solution is simply to delete the Speech Recognition Profile in Windows.

 

NB: if the default Windows Speech Recognition Profile, usually named after the current login, is the only one in use, then you can't delete it until you create a second one, which could be named "temp".  After the "temp" profile is created and your default profile deleted (you need to press the "Apply" button for the change to stick), you can go on to re-create another profile, which Windows will name after your login by default.


Because I am using VoiceMacro as a command-only tool, deleting any existing Windows "Speech Recognition Profiles" is rather inconsequential.  At most I would train the new profile once with the standard Windows speech recognizer training screen (for 5 minutes tops), and then be done with it, as the newly reset profile would have the same accuracy as the previous one.

At home, where there is relatively less noise compared to the hospital, I haven't even had to delete/reset the Speech Recognition Profiles yet.  The 0% "Dictionary weight" has served its purpose so far.

 

P.S., the recommendation to use a "non-zero" value for Dictionary Weight by VoiceMacro author was most likely based on user experience such as described here (partly in Deutsch - nothing Google Translate couldn't handle with aplomb).  But that user's mother was quite debilitated and didn't have any "sovereign control" over noises in her environment.  Nor could she be expected to know how to "reset" the Speech Recognition Profile in Windows once in a while as necessary.



 01/13/2022 05:56 PM
Ag
Top-Tier Member

Posts: 1035
Joined: 07/08/2019

I only just noticed, 6 months later, @ax taking my name in vain - or at least my chemical symbol "Ag". :-)

 

Originally posted by: ax 

Pretending to be Ag (except minus that IEEE-certified engineering mindset) for a day, let's ask some elementary questions (from a "certifiable mindset", perhaps):

 

Q: If AR-15 represents the "pinnacle" of small-arms design, why would anyone still carry a Glock?

 

 

A: Seriously, how would one expect a dodo north of the 49th parallel to be able to answer that (and one who has never even beheld either up close)?

 

Hey, @ax, *some* Canadians  were or are in the Army reserve. 

 

Neither the AR-15 nor the Glock is on https://en.wikipedia.org/wiki/List_of_equipment_of_the_Canadian_Army.

 

I was sad when Canada switched to the C7 member of the M-16 family.

 

For my money, as an assault rifle the AK-47 is better suited to Canadian conditions, ranging from Arctic to muskeg to boreal forest.  If you want distance, accuracy, and the ability to stop polar/grizzly/pizzly bears or moose, the good old FN FAL (Canadian C1).  Not as good as a real hunting rifle, but better than the AK-47 and much better than the AR-15.

 

 

:-)

 



-------------------------

DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical), GPU=NVIDIA Quadro RTX 3000 with Max-Q Design.



 01/15/2022 10:25 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Originally posted by: Ag

 

I only just noticed, 6 months later, @ax taking my name in vain - or at least my chemical symbol "Ag". :-)

 

 

Originally posted by: ax 

 

Pretending to be Ag (except minus that IEEE-certified engineering mindset) for a day, let's ask some elementary questions (from a "certifiable mindset", perhaps):

 

 

 

Q: If AR-15 represents the "pinnacle" of small-arms design, why would anyone still carry a Glock?

 

 

 

 

Excusez mon "certifiable mindset", for transgressing your sterling call sign, Capitaine!  Number myself a fan of the Socratic style of discourse.

 

The only alphanumerically-monikered "equipment" that regrettably became a requisite in my daily existence is stamped with "N95" ... not a huge enthusiast of this style of "nom de guerre".

 

At least "Glock" sounds colloquial ... not that I have anything else (knowledge or desire) to add to this subject.



 01/17/2022 09:34 AM
michaelbeijer
Top-Tier Member

Posts: 273
Joined: 12/07/2014

Wow, great post! 

I have been using VoiceMacro extensively (and very happily) for the past few weeks in my actual work, having uninstalled Dragon again recently for the millionth time in disgust. I'm a technical/patent translator, and so use speech recognition mainly to control my translation software (memoQ), and to dictate the occasional bit of text. VM is much lighter/quicker on my computer and doesn't bring things to a crawl like Dragon invariably does.

I haven't had much time to add commands, but here is what I currently have:

(basically all the stuff I need to do when working: add selected terms to termbase, run concordance search, insert matches from termbases/translation memories, search in termbases, etc.)



-------------------------

Dragon Professional 16 + Speech Productivity + KnowBrainer
Win 11 – 64-bit, i9, 64GB RAM
Logitech webcam mic 


 


 



 01/17/2022 04:24 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Originally posted by: michaelbeijer

 

..., and to dictate the occasional bit of text. 

 

 

Nice to hear you were able to dictate some prose with it.  That's something I haven't tried myself.  Here is hoping that the new and improved MS Speech Recognizer on the horizon will "kick it up a notch" in that regard.



 05/06/2022 01:18 PM
michaelbeijer
Top-Tier Member

Posts: 273
Joined: 12/07/2014

Here's a quick test of me dictating some flowing text with VoiceMacro: https://www.youtube.com/watch?v=AE5Y1Pcu5o4




 05/06/2022 06:20 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Your video definitely piqued my interest and indeed the output you showed was quite "not bad".

Questions if you don't mind:

1. It seems that you were using Windows 11's "Voice Access" as a recognizer for VoiceMacro.  Am I wrong?  I mean VM itself doesn't do any recognition but rather harnesses an underlying Redmond recognizer, which is what, "Microsoft Recognizer 8.0" under Windows 10?  But are you "channeling" recognition through Win 11's Voice Access somehow?

2. If indeed, how did you manage to hitch VM (VoiceMacro) to VA (Voice Access) as a recognizer?  In fact in your video, VoiceMacro's recognition history window consistently showed something different from the text spit out by Voice Access.  Or am I missing something?

3. Can you customize/add vocabulary in some way?

What you demonstrated certainly handled generic prose better, by leaps and bounds, than the 1-trick "Dragon Medical Embedded" used by myself would at the same task.

Thanks for sharing Michael!



 06/02/2022 04:00 PM
michaelbeijer
Top-Tier Member

Posts: 273
Joined: 12/07/2014

Hi Ax,

No, my copy of Windows 11 doesn't have Voice Access yet. VoiceMacro is using "Microsoft Speech Recognizer 8.0 for Windows (English - UK)". 

I'm running:

Windows 11 Pro
Version: 21H2
OS build: 22000.675

Note that I am also using the microphone built into my Logitech ("logi") webcam; so I'm not using any fancy microphones.

I am actually finding VoiceMacro insanely useful lately, and am using it in ALL my work, which usually consists of working in my main CAT tool (= translation software), where I use it to do all kinds of crazy things.

I am also using it to do all my dictation, in emails, etc. Bye-bye, big, heavy, cumbersome Dragon!

 

 







 06/02/2022 04:59 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Thanks for sharing the good news and your screenshots, Michael!

Did you then import or construct a dictionary set of your own custom vocabulary?

 06/02/2022 05:51 PM
michaelbeijer
Top-Tier Member

Posts: 273
Joined: 12/07/2014

No, I didn't do any kind of customization at all. In fact, I have no idea how to do it. I don't really understand Windows speech engines, since there seem to be various versions of it. 

 

For example, if I press Win+H, this little thing pops up:

Within VoiceMacro, if I do:

 

Menu > Windows Speech Recogniser > Recognizer dictionary

 

and then: "Add new word"

 

I can train new words.

 

However, any training I do here doesn't seem to have any effect on the Win+H dictation route. It only works if I use the old-fashioned WSR dictation thingee:

 

However, I have VoiceMacro set so I can say: "Wake up", and the Win+H dialogue will appear and I can dictate flowing text.

 

So, apparently – at least until the new Voice Access is finally released – Windows 11 has 2 different speech recognition systems:

 

(1) a device-based speech recognition feature (the old WSR "Listening" dialogue)

(2) cloud-based (online) speech recognition technologies (what appears when you press Win+H)

 

see e.g.: https://privacy.microsoft.com/en-gb/privacystatement#mainspeechinkingtypingmodule




 06/02/2022 06:05 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Now it makes sense, Michael.  I sure as heck don't know the ins and outs of various MS recognizers.  For that we would have to wait for tier 1 experts such as Lindsay to chime in.

 

But I am fairly certain that "Windows-H" brings up NOT the resident Recognizer 8.0, which underpins WSR and VoiceMacro, but it brings up "online dictation", which might just be Cortana's half-sister (or transgendered step-brother).  No wonder it worked out so well for you ... because the prose dictation is handled by a semblance of cloud dragon, in a Redmond sheep's clothing.

 

Interesting and innovative workflow you got, nonetheless!

 

P.S., just as I was posting the above, I see that you came to the same conclusion.



 06/02/2022 06:25 PM
michaelbeijer
Top-Tier Member

Posts: 273
Joined: 12/07/2014

Yes, my current system is a bit of a Frankenstein, but it actually works better than using Dragon + KnowBrainer/Vocola, which is what I used to use. Plus, it's free.




 06/02/2022 06:01 PM
michaelbeijer
Top-Tier Member

Posts: 273
Joined: 12/07/2014

Okay, I think I am starting to figure it out:

The cloud dictation system (Win+H) is much better, but cannot be trained.

The old-fashioned system (WSR "Listening" dialogue) is absolute garbage, but can be trained.

If a command isn’t working in VoiceMacro, you could train the old WSR system it uses to recognise the specific command word(s). Not that I have ever needed to do this, mind you. It works flawlessly at recognising short command phrases without any training whatsoever.

However if you are using Windows+H to dictate flowing text and want to teach it a specific word, you are screwed.




 06/13/2022 05:17 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Yes, the trend for the past decade has been toward non-trainable speech systems that work with a wide variety of speakers. WSR, aka Microsoft Speech Recognizer 8.0, for the desktop, creates a cursory training profile on Windows machines. Then the "new" Microsoft Speech Platform 11.0 joined the desktop and server flavors into one product. You can't train it either, but it has separate recognizers for different languages, so that's a plus for developers of international products. Since 11.0 (circa 2012, I think), Microsoft went to Azure (non-trainable again), Google Cloud, Dragon Anywhere cloud, and so on. Even Lunis recommends not training Dragon, so the new speech recognizers must be getting pretty smart. It helps *a lot* to be working with grammar-supported command utterances. Free-form recognition is much more difficult.


 06/13/2022 08:36 PM
dilligence
Top-Tier Member

Posts: 1650
Joined: 08/16/2010

The Windows cloud dictation system indeed works very well (albeit it's a bit slow). Particularly when it comes to recognizing words/phrases that are not particularly known to Dragon®, but are known on the Internet. Similar to specific recognitions in "Hey Google".

 

However, the big downside is that Select-and-Say capability is not available.... Although I do seem to remember that in one of the previous Windows insider builds there was some brief Select-and-Say capability in the Edge address bar with Voice Access (could not reproduce that at the time in other browsers...). 



-------------------------

https://speechproductivity.eu


Turbocharge your Dragon productivity with 40+ Power Addons

 09/15/2022 11:31 PM
ax
Top-Tier Member

Posts: 695
Joined: 03/22/2012

Originally posted by: ax

 

... VoiceMacro is "EDC" in a manner desktop Dragon in its current incarnation could not claim to be.

...

 

Adding a "smidgen" to my own "sage" assessment.  Using DMO tonight for a bit, and trialing/thinking about my workflow, I realize that even with cloud Dragon's step-by-step command capabilities, it can't replace VoiceMacro in terms of what the latter currently does for me.

As an "always-on" command utility while my DMO/DME is configured in PTT, VoiceMacro's convenience value just can't be superseded by even desktop Dragon, as configurable by an average Joe like myself (notwithstanding that the hands-free programmer crowd have their supa-dupa tools, which do come across as stunning in what seems achievable).

The ability to just randomly utter "2c" and execute reliable double-clicking ... No.  The $100/mth (if one pays the "market rate") DMO can't do that, without causing problems/interference, as I currently reckon.

 

 

......

 

P.S., the attraction of running one voice app as opposed to 2 is just too great.  Plus I couldn't run VoiceMacro on the hospital's VDI system due to some language setting issues.

 

I reluctantly gave up the slick "2C" and embraced DMO's more cumbersome "single/left click" and "double click".  In step-by-step, assigning "to see" to mouse action (through AHK workaround as there is no direct call to mouse click in step-by-step) doesn't lead to command recognition even with the stipulated spoken form.  Ditto "1C".

 

Probably not unexpected.

 

Anyway, that was the price to pay.  Otherwise I transferred the majority of my VM macros to DMO on the 3rd day of using the latter.



 09/16/2022 11:21 AM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

I like that abbreviation for a double click! It reminds me of the 2ic abbreviation for second-in-command.


 09/27/2022 10:47 AM
PG LTU
Top-Tier Member

Posts: 2248
Joined: 03/21/2007

Another excellent use for VoiceMacro for me has been a command to "cancel dictation" and a similar command to "cancel and resume dictation."

It's a little tricky because you either have to anchor the Dragon results box showing preliminary results or otherwise figure out where it is on the screen, but once you know, you just send the mouse there, right click and select the menu choice to cancel recognition.

Cancelling also turns off the mic - and to me anyway, *appears to flush the speech buffer* making canceling and resuming the dictation by turning the mic back on (the 2nd command) the quickest way to resume quick dictation when the preliminary results are showing in the tooltip but not landing on your screen quickly anymore.

Any of you know how to issue that command via the API (so I don't have to mouse move to the tool tip)? There is the Dragon mic option constant called "dgnmicoptionChangeStateImmediately" which, after pausing and resuming the mic, "sets the microphone pause count to zero (cancel all pending pauses)." Does it do the same thing or offer any help? LMK your thoughts, pls and thx,

PG

-------------------------




PG





Remember folks, my comments and this forum are for entertainment value only, please, no wagering or other reliance on the contents herein.  I permit no commercial use of my ideas (whether expressions or embodiments) without my written consent.
