KnowBrainer Speech Recognition
Topic Title: VoiceMacro
Topic Summary: Let's dig
Created On: 07/31/2021 04:39 PM
Status: Post and Reply
 07/31/2021 04:39 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

On this rather glorious Civic Day weekend, time to pull a bit of civic duty.

 

Thanks to the ever-clever PG, I found out about VM - right here where a plethora of useful minds seem to congregate.  VM Author does have his own forum, and generally prefers questions to be posted directly there.

 

That being said, I obtained "permission" (out of courtesy) from VoiceMacro author to review and explore VM here, in large part because this is the only forum from anywhere that I frequent (haven't touched my FB account in 5+ years, even though it still has an "app" on my 6 year-old Blackberry).

 

Pretending to be Ag (except minus that IEEE-certified engineering mindset) for a day, let's ask some elementary questions (from a "certifiable mindset", perhaps):

 

Q: If AR-15 represents the "pinnacle" of small-arms design, why would anyone still carry a Glock?

 

A: Seriously, how would one expect a dodo north of the 49th parallel to be able to answer that (and one who has never even beheld either up close)?  Anyhow, the most he would ever know are the 3 letters: E-D-C.

 

While desktop Dragon +/- Knowbrainer outclasses VoiceMacro + WSR in sheer oomph, as the former represents the definitive, all-encompassing voice-activated RPA tool on Windows, "wieldy" desktop Dragon is often not (so much).  VoiceMacro is "EDC" in a manner desktop Dragon in its current incarnation could not claim to be.

 

VoiceMacro is PORTABLE (!!!!) - in a no-installation-required sense.  Having said that, the program is recommended to be run with Admin credentials in order to "maximize potentials".  I use it portably without any Admin credentials and without installation - just like all my AHK scripts.  It suits my purposes just fine.

 

It is capable.  We shall scratch the surface and see.

 

And it is stable (relatively).  So far it froze once on me when I was running AHK scripts from wrong/phantom directories, whereupon I had to terminate VM ungracefully. [Edit: Author would welcome any crash report, hopefully reproducible, preferably through VM's crash reporter].

 

Otherwise I have now used VoiceMacro for a grand total of 2 weeks.  I was so impressed with it that I donated the second day I used it.  It is "donationware".

 

One "con" I wish to point out, even though it is probably no "con" at all to many others: all things being equal, I would rather have a code-signed ("certified") version.  Signing software has fallen out of vogue with some of the largest portable, free software developers of late, Notepad++ and AutoHotkey being just 2 of them, out of a desire not to coddle an "extortionist culture".  They publish SHA-256 hashes instead.  Absence of a "certification" has not deterred me from using these indispensable portable wares, so far.
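For anyone who wants to actually use such a published hash, here is a minimal Python sketch of the check. The function names are my own, and it assumes only that the download page lists a SHA-256 hex digest for the file; it is not tied to VoiceMacro, Notepad++, or AutoHotkey in any way.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 digest of a file as lowercase hex."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a large download need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, published_hex: str) -> bool:
    """Compare a file's digest with the hex string copied from the download page."""
    # Tolerate stray whitespace and upper-case hex in the pasted value.
    return sha256_of_file(path) == published_hex.strip().lower()
```

A matching hash only shows that the file is the one the publisher hashed; unlike a code-signing certificate, it says nothing about who the publisher is.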

 

But given my own lack of imagination (in a Kierkegaardian sense with respect to the "timid world view" of your average petit bourgeois / worker aristocrat), I would have preferred a separate "VoiceMacro Pro" version - even if identical to the donationware - that charges money for the "privilege" of running a piece of "certified" software, which appeals to suckers like me.  Again, this preference reflects my own professional pusillanimity more than anything else.

 

[Edited]: It turns out not to be accurate to say that VoiceMacro uses "WSR", but rather both share an underlying "Microsoft Speech Recognizer Engine".  Refer to pro explanation here for clarification.



 07/31/2021 04:56 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

First off, let's get "smooth scroll" out of the way.

 

Setting a variable "loop_large_scroll" is probably unnecessary, but doing so allows the scrolling distance to be changed "on-the-fly".  Again, that is not so useful, as you can use a voice command "stop scrolling" to abort all running macros, thereby stopping scrolling.

 

One can add a small pause in front of the mouse scroll increment of +/- 1 (the smallest), making the scroll even slower than in this video.

 

A scroll step of 5 is plenty fast for me.  You can always up that to your heart's content.
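Since the macro itself is built in VM's editor and only shown as a screenshot, here is a rough Python sketch of the same loop logic, just to make the moving parts explicit. Everything in it (function name, parameters, the callbacks) is hypothetical and stands in for VM's loop, pause, mouse-scroll, and abort actions.

```python
import time
from typing import Callable


def smooth_scroll(total_steps: int, step_size: int, pause_s: float,
                  scroll: Callable[[int], None],
                  should_stop: Callable[[], bool]) -> int:
    """Scroll in small increments until done or aborted; return the steps performed."""
    performed = 0
    for _ in range(total_steps):
        if should_stop():      # stands in for a "stop scrolling" voice command
            break
        time.sleep(pause_s)    # a small pause before each increment slows the scroll
        scroll(step_size)      # +/- 1 would be the smallest wheel increment
        performed += 1
    return performed
```

The loop count plays the role of "loop_large_scroll", and a larger step_size gives the faster scroll.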

 

Two caveats about variables:

 

1. They are case sensitive.

 

2. They are "space-sensitive" when you declare them!  I.e., "variable" and "variable " are two different variables!!  Oh well.
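The two caveats behave as if variable names were looked up by exact string match. A tiny Python sketch of that idea, assuming (my assumption, not something I have verified in VM's internals) that the store works like a plain dictionary:

```python
# Hypothetical model of a variable store keyed by the exact name string.
variables = {}
variables["loop_large_scroll"] = 5
variables["loop_large_scroll "] = 50   # trailing space: a different variable
variables["Loop_Large_Scroll"] = 500   # different case: a different variable


def lookup(name: str):
    """Return the stored value, or None if no variable has exactly this name."""
    return variables.get(name)
```

Three assignments, three separate variables, which is exactly the trap when a stray space sneaks into a declaration.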

 

Otherwise VM provides built-in basic conditional structures such as loop and if/else and toggle.  The toggling and setting of "toggle state" (from a separate macro) is quite useful.

 

P.S., I moved the scrolling macro below to a different profile and changed "Abort all running macros" to "Abort all macros from this profile" so as to preserve my listening indicator OSD when I terminate scrolling.  Alternatively, one can probably combine "IgnoreCommands" with "IgnoreExceptions" to use voice to stop scrolling.

 

 



 07/31/2021 05:27 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

Next up, basic Recognizer Settings:

Wordlist weight: utterance 100% against my custom dictionary (which is simply the list of defined commands in the active profile)

Dictionary weight: utterance 0% (channelling VM author: a low, but not too low digit preferred) against a "standard" dictionary - which is EXACTLY what I need as I am using VM as a Command-ONLY tool.

 

See this VM author comment on why 0% (or too low of a) dictionary weight is undesirable (I have changed mine consequently).  Apparently this whole "weight" thing under WSR could be a wee bit "voodoo" ... thus requiring some trial and error.


Recognition threshold: 85 - 90% is suitable for me as I prefer specificity over sensitivity.

 

I choose to uncheck the default 'Process "failed" recognition' - I'd rather it do nothing than do the wrong thing.
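To spell out how I understand the threshold and the "failed" checkbox to interact, here is a small Python sketch. It is a guess at the decision logic, not VM's actual code: in particular, I am assuming that 'Process "failed" recognition' means acting on a command even when its confidence falls below the threshold. All names are made up.

```python
from typing import Optional, Set


def choose_command(heard: str, confidence_pct: float, commands: Set[str],
                   threshold_pct: float = 85.0,
                   process_failed: bool = False) -> Optional[str]:
    """Return the command to run, or None to do nothing."""
    if heard not in commands:
        return None                 # not in the active profile's command list
    if confidence_pct >= threshold_pct:
        return heard                # confident enough: run it
    # Below threshold: only acted on if "failed" recognitions are processed.
    return heard if process_failed else None
```

With process_failed left off, a borderline utterance simply does nothing, which is the "specificity over sensitivity" trade-off described above.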

 

 



 07/31/2021 05:44 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

Carry on - NATO alphabet speller, which is an annoyingly weak link for my otherwise fit-for-purpose cloud Dragon in its designated browser sessions.

 

Here VM shines through.

 

Refer to VM Author's own explanation on this:


1. "RecCommand" is one of VM's reserved variables, if I am not mistaken.  It's a bit like desktop Dragon's "heardword", sort of.

2. My command name for NATO speller is in fact:
[upper-case;][alfadog;bravo;charlie;delta;echo;foxtrot;golfcourse; ... you get the idea]

In a loud hospital environment, even with the recognition threshold set to 85-90%, letters represented by alpha, golf, india, and papa can still get falsely output to the screen due to ambient noise.  So I had to append a syllable to each. [I will also try a low but non-zero dictionary weight, as per feedback, to see whether it makes any difference]
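The logic behind the speller can be sketched in Python as follows. This is not VM macro syntax; it only illustrates turning a recognized phrase (what "RecCommand" would hold) into a letter. The word list stops where my command name above stops, the function name is made up, and reading the trailing semicolon in "[upper-case;]" as "optional prefix" is my interpretation.

```python
from typing import Optional

# Only the code words quoted in the post; the remainder of the list is elided there.
CODE_WORDS = ["alfadog", "bravo", "charlie", "delta", "echo", "foxtrot", "golfcourse"]


def spell(rec_command: str) -> Optional[str]:
    """Map e.g. 'upper-case alfadog' to 'A' and 'bravo' to 'b'; None if unknown."""
    words = rec_command.lower().split()
    upper = bool(words) and words[0] == "upper-case"
    if upper:
        words = words[1:]           # drop the optional prefix
    if len(words) != 1 or words[0] not in CODE_WORDS:
        return None
    letter = words[0][0]            # each code word starts with its own letter
    return letter.upper() if upper else letter
```

Appending a syllable ("alfadog", "golfcourse") does not disturb the first-letter trick, which is why the padded words remain usable.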

 

 



 07/31/2021 05:59 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

Numeral Enunciation:

For desktop Dragon users, this is almost certainly gratuitous.  But I need the functionality outside of DME browser sessions.


VM Author's comment:

 

Prepending with words such as "numeral" increases recognition reliability given that single digits are single-syllable phrases.
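As a rough illustration of the prefix idea, here is a Python sketch that only accepts a digit when it follows the guard word. The guard word "numeral" comes from the comment above; the function and table names are hypothetical, and this is not how the VM macro itself is written.

```python
from typing import Optional

DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def numeral_to_digit(rec_command: str, prefix: str = "numeral") -> Optional[str]:
    """Map 'numeral five' to '5'; ignore bare digits and anything else."""
    parts = rec_command.lower().split()
    if len(parts) == 2 and parts[0] == prefix:
        return DIGITS.get(parts[1])
    return None
```

A bare "five" is deliberately ignored, since a lone single-syllable word is what ambient noise tends to get mistaken for.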

 

No similar compact macros exist for pressing F1 - F12, however, unless one resorts to 12 If statements, which I don't see any advantage of.  As far as I can figure out, one needs a separate macro for each F(unction) key press.  It is doable to combine, say, F1, Shift-F1, and Alt-F1 into one macro through string manipulation of RecCommand and 1 or 2 conditional statements.
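The string manipulation hinted at above could look something like this Python sketch. The spoken phrases ("function one", "shift function one") and all names are hypothetical, and the sketch says nothing about whether VM can send a key name computed at run time, which is the very limitation described above.

```python
from typing import Optional

NUMBER_WORDS = ["one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine", "ten", "eleven", "twelve"]
MODIFIERS = {"shift": "Shift", "alt": "Alt", "control": "Ctrl"}


def parse_function_key(rec_command: str) -> Optional[str]:
    """Map 'shift function one' to 'Shift+F1' and 'function twelve' to 'F12'."""
    words = rec_command.lower().split()
    modifier = None
    if words and words[0] in MODIFIERS:
        modifier = MODIFIERS[words[0]]   # optional leading modifier word
        words = words[1:]
    if len(words) != 2 or words[0] != "function" or words[1] not in NUMBER_WORDS:
        return None
    key = "F" + str(NUMBER_WORDS.index(words[1]) + 1)
    return f"{modifier}+{key}" if modifier else key
```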

 

 



 07/31/2021 06:21 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

VM Control:

That big fat Caps Lock key is basically sitting there wasting prime real estate.  I re-purpose it as much as possible (might have been another one of those tips I gleaned from PG).  It is already the primary push-to-talk key for my Dragon Medical Embedded, programmed through AHK.  I naturally use Shift-CapsLock to toggle listening on/off for VoiceMacro.

 

In the odd times when native CapsLock function gets triggered, pressing "Ctrl-CapsLock" will toggle it off.

 

The "RunOtherMacro" just switches over to a different profile so I can run AHK programs with different, machine-dependent paths.  This is for portability purposes.  Otherwise gratuitous.  

 

My main work profile is named "Production".  By relegating machine-dependent elements to a separate profile, I make sure I can sync my main "Production" profile by exporting and importing the XML file associated with it - among the different machines I use.

 

 

 

 

In the voice command "stop listening", one can use the thoughtfully implemented "Set Toggle State" to reset existing running toggles to an "off" state, so that keyboard toggle doesn't go out of sync.  Finally, a voice prompt on whether the toggle is on or off quickly gets old.  I find it much more useful to customize AHK 2-liners to pop an icon into the tray so I know when VM is listening or not. 
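Why resetting the toggle matters is easier to see in a toy model. The Python sketch below is only my mental picture of it, with made-up names: one flag for whether VM is actually listening and one for what the hotkey's toggle "remembers".

```python
class ListeningState:
    """Toy model: a keyboard toggle and a voice command both control listening."""

    def __init__(self) -> None:
        self.listening = False   # what VM is actually doing
        self.toggle_on = False   # what the hotkey toggle remembers

    def press_hotkey(self) -> None:
        # The hotkey flips its remembered state and applies it.
        self.toggle_on = not self.toggle_on
        self.listening = self.toggle_on

    def voice_stop(self, reset_toggle: bool) -> None:
        # "stop listening" by voice; optionally also reset the toggle to off.
        self.listening = False
        if reset_toggle:
            self.toggle_on = False
```

Without the reset, the first hotkey press after a voice "stop listening" is wasted: the toggle flips from on to off while VM is already off. With the reset, the same press turns listening back on.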

 

(Channelling VM author: an alternative to inserting custom tray icon in order to indicate a listening state is to take advantage of the built-in OSD functionality - which I am now using instead, and I have revised the screenshots accordingly).

 

 

 


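The AHK 2-liner (call it "Listening_Tray_Indicator.ahk") to insert a tray icon looks like this (AutoHotkey v1 syntax, with a #Persistent directive added so that the hotkey-less script stays resident): 

#Persistent  ; keep the script, and thus its tray icon, running
ListeningTrayIconFile := "Name of your tray icon file"  ; path to the icon file
Menu, Tray, Icon, %ListeningTrayIconFile%  ; replace the default tray icon


Exiting (terminating) the tray icon when VM stops listening is simply this: 

DetectHiddenWindows, On  ; a script's main window is hidden
SetTitleMatchMode, 2  ; match the script name anywhere in the window title
PostMessage, 0x111, 65307,,, Listening_Tray_Indicator.ahk  ; WM_COMMAND with the tray menu's Exit ID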


Of course, this improvisation of a tray indicator icon could be avoided if VM's own icon changed to a bright colour when listening is activated.  That could be a "feature request" I suppose.

 

Finally, see VM Author's own pro tip on how to control VM listening and heeding commands through "wake-up words".

 

P.S., VoiceMacro comes preloaded with a slew of demo macros in its "Demo" profile.  A fast way of learning the ropes is simply by modifying them.  Some of them have "Only when target window active" checked.  Uncheck that when modifying and executing macros intended for any active window/process.



 07/31/2021 06:40 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

Now a few generic comments from the VM Author regarding SAPI and that sort of thing:

 

SAPI compatibility (also see RW's clarification in post following)

More on SAPI

WSR "self-learning"

Dragon will always be better at "free dictation" (which we take for granted)

Author's comment on "pseudo-list commands" (my paraphrase)

 

Moreover, combining group affixes with profile switches / window targeting will likely deliver well-organized application-specific command deployment.  I am speculating here as I haven't tested out the possibilities myself.

 

Just going by the few examples I outlined above, it is easy to conclude that VoiceMacro is CAPABLE.  It goes without saying that VM also has the basic mouse control and coordinate focusing capabilities built-in.

 

In fact, even sans the voice component, VM's keyboard hotkey implementation aspect alone can probably give something like Macro Express a run for its money.  

 

Anyway, I hope by scratching the surface, it helps someone.


My take at the end of the day is simply this: well-crafted portable software helps us all and is deserving of support, including promotion.



 08/01/2021 06:27 AM
R. Wilke
Top-Tier Member

Posts: 7809
Joined: 03/04/2007

Just one particular, and more random comment on this, leaving it here for you to make the VM community aware of it:

Dragon is compatible with SAPI 4, in certain parts and in some ways, but not with SAPI 5.

-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage

 08/01/2021 03:09 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

Thanks RW for clarifying dependencies under the hood.


Furthermore, I have updated my posts above with feedback/comments from VM author.



Two general suggestions from the author to anyone interested:

1. If one is using VM for multiple apps, it is highly suggested to use one profile per application/window (as intended) and make use of VM's auto-switch profile feature.  This way one does not run into conflicts or have profiles active in applications that are not supposed to be controlled via VM.

2. Generally, it is not recommended to set the "Dictionary weight" to 0, as per above VM author comment in red.
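Suggestion 1 boils down to a lookup from the active application to a profile. A minimal Python sketch of that idea follows; the process names, profile names, and function are all hypothetical, and VM's real auto-switch feature is configured in its GUI rather than in code.

```python
from typing import Optional

# Hypothetical mapping from a foreground process name to a VM profile name.
PROFILE_BY_PROCESS = {
    "memoq.exe": "memoQ",
    "chrome.exe": "Browser",
}


def profile_for(process_name: str, default: Optional[str] = None) -> Optional[str]:
    """Return the profile for the active application, or the default if none is mapped."""
    return PROFILE_BY_PROCESS.get(process_name.lower(), default)
```

Leaving the default as None models the second half of the suggestion: in an application with no profile of its own, no commands are active at all.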


The author is quick to address questions posted to his VM-specific forum, and he has a Discord channel (https://discord.gg/9UTJqXJ) for plugin developers. 

 

Moreover, author welcomes bug/crash reports through email/forum or preferably VM's built-in crash reporter. 

 

Lastly, the latest builds are usually more stable, incorporating more bug fixes (they are not mere "betas").



 08/01/2021 04:43 PM
kkkwj
Top-Tier Member

Posts: 865
Joined: 11/05/2015

Nice thread, ax! It must have taken a lot of work to create it so that others could learn about VM. Thank you.

-------------------------

Win10/11/x64, AMD Ryzen 7 3700X/3950X, 64/128GB RAM, Dragon 15.3, SP 7 Standard, SpeechStart, Office 365, KB 2017, Dragon Capture, Samson Meteor USB Desk Mic, Amazon YUWAKAYI headset, Klim and JUKSTG earbuds with microphones, 3 BenQ 2560x1440 monitors, Microsoft Sculpt Keyboard and fat mouse

 08/01/2021 05:16 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

Glad you appreciate my effort to introduce this, Kevin!

It did take up a good chunk of time - mainly to make sure I don't feel like I am misleading anyone.

My way of "paying it forward", so-to-speak.

As you may know, I have been looking precisely for such a small-footprint "Swiss Army knife" tool for a while ("swiss army knife" was VM author's own descriptor, incidentally).  One of my first posts (after a long hiatus) back to this forum was "fishing for some ideas", so as to complement the cloud dragon I have been made to use for work.  VM's portability and capabilities are basically tailor-made for what I need to do.

Hope it helps others as it is helping me.



 12/30/2021 04:21 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

Originally posted by: ax

Next up, basic Recognizer Settings:

Wordlist weight: utterance 100% against my custom dictionary (which is simply the list of defined commands in the active profile)

Dictionary weight: utterance 0% (channelling VM author: a low, but not too low digit preferred) against a "standard" dictionary - which is EXACTLY what I need as I am using VM as a Command-ONLY tool.

 

See this VM author comment on why 0% (or too low of a) dictionary weight is undesirable (I have changed mine consequently).  Apparently this whole "weight" thing under WSR could be a wee bit "voodoo" ... thus requiring some trial and error.

 

Recognition threshold: 85 - 90% is suitable for me as I prefer specificity over sensitivity.

 

I choose to uncheck the default 'Process "failed" recognition' - I'd rather it do nothing than do the wrong thing.

 

 

 

A field update after months of production use:

 

Despite what I wrote above with respect to a recommended "non-zero low Dictionary weight", in my case, the 0% "Dictionary weight" specificity still works out the best, in conjunction with 85 to 90% of "Recognition threshold" sensitivity.


Even 1% "Dictionary weight" is inferior to 0% as judged by my own real life usage.


Yes, after a while (weeks to months), if VoiceMacro is left running in an "always-listening" mode, gratuitous noises can be falsely recognized as commands, especially with some of my 2-syllable commands (no, not the best practice to have 2-syllable commands, I acknowledge... but nevertheless it's a matter of striking a balance between efficiency and convenience).

The solution is simply to delete the Speech Recognition Profile in Windows.

 

NB: if the default Windows Speech Recognition Profile, usually named after the current login, is the only one in use, then you can't delete it until you create a second one, which could be named "temp".  After a "temp" profile is created and your default profile deleted (you need to press the "Apply" button for the change to stick), you can go on to re-create another profile, which Windows will by default name after your login again.


Because I am using VoiceMacro as a command-only tool, deleting any existing Windows "Speech Recognition Profiles" is rather inconsequential.  At most I would train the new profile once with the standard Windows speech recognizer training screen (for 5 minutes tops) and then be done with it, as the newly reset profile would have the same accuracy as the previous one.

At home, where there is relatively less noise compared to the hospital, I haven't even had to delete/reset the Speech Recognition Profiles yet.  The 0% "Dictionary weight" has served its purpose so far.

 

P.S., the recommendation by the VoiceMacro author to use a "non-zero" value for Dictionary Weight was most likely based on user experience such as described here (partly in German - nothing Google Translate couldn't handle with aplomb).  But that user's mother was quite debilitated and didn't have any "sovereign control" over noises in her environment.  Nor could she be expected to know how to "reset" the Speech Recognition Profile in Windows once in a while as necessary.



 01/13/2022 05:56 PM
Ag
Top-Tier Member

Posts: 775
Joined: 07/08/2019

I only just noticed,  6 months later,  @ax  taking my name in vain -  or at least my  chemical symbol "Ag". :-)

 

Originally posted by: ax 

Pretending to be Ag (except minus that IEEE-certified engineering mindset) for a day, let's ask some elementary questions (from a "certifiable mindset", perhaps):

 

Q: If AR-15 represents the "pinnacle" of small-arms design, why would anyone still carry a Glock?

 

 

A: Seriously, how would one expect a dodo north of the 49th parallel to be able to answer that (and one who has never even beheld either up close)?

 

Hey, @ax, *some* Canadians  were or are in the Army reserve. 

 

Neither the AR-15 nor the Glock is on https://en.wikipedia.org/wiki/List_of_equipment_of_the_Canadian_Army.

 

I was sad when Canada switched to the C7 member of the M-16 family.

 

For my money, as an assault rifle the AK-47 is better suited to Canadian conditions, ranging from Arctic to muskeg to boreal forest.  If you want distance, accuracy, and the ability to stop polar/grizzly/pizzly bears or moose, the good old FN FAL (Canadian C1).  Not as good as a real hunting rifle, but better than the AK-47 and much better than the AR-15.

 

 

:-)

 



-------------------------

DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical), GPU=NVIDIA Quadro RTX 3000 with Max-Q Design.



 01/15/2022 10:25 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

Originally posted by: Ag

 

I only just noticed,  6 months later,  @ax  taking my name in vain -  or at least my  chemical symbol "Ag". :-)

 

 

Originally posted by: ax 

 

Pretending to be Ag (except minus that IEEE-certified engineering mindset) for a day, let's ask some elementary questions (from a "certifiable mindset", perhaps):

 

 

 

Q: If AR-15 represents the "pinnacle" of small-arms design, why would anyone still carry a Glock?

 

 

 

 

Excusez mon "certifiable mindset", for transgressing your sterling call sign, Capitaine!  Number myself a fan of the Socratic style of discourse.

 

The only alphanumerically-monikered "equipment" that regrettably became a requisite in my daily existence is stamped with "N95" ... not a huge enthusiast of this style of "nom de guerre".

 

At least "Glock" sounds colloquial ... not that I have anything else (knowledge or desire) to add to this subject.



 01/17/2022 09:34 AM
michaelbeijer
Top-Tier Member

Posts: 252
Joined: 12/07/2014

Wow, great post! 

I have been using VoiceMacro extensively (and very happily) for the past few weeks in my actual work, having uninstalled Dragon again recently for the millionth time in disgust. I'm a technical/patent translator, and so use speech recognition mainly to control my translation software (memoQ), and to dictate the occasional bit of text. VM is much lighter/quicker on my computer and doesn't bring things to a crawl like Dragon invariably does.

I haven't had much time to add commands, but here is what I currently have:

(basically all the stuff I need to do when working: add selected terms to termbase, run concordance search, insert matches from termbases/translation memories, search in termbases, etc.)



-------------------------

VoiceMacro / AutoHotkey
Win 11 – 64-bit, i7, 32 GB RAM
Logitech webcam mic 




 01/17/2022 04:24 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

Originally posted by: michaelbeijer

 

..., and to dictate the occasional bit of text. 

 

 

Nice to hear you were able to dictate some prose with it.  That's something I haven't tried myself.  Here is hoping that the new and improved MS Speech Recognizer on the horizon will "kick it up a notch" in that regard.



 05/06/2022 01:18 PM
michaelbeijer
Top-Tier Member

Posts: 252
Joined: 12/07/2014

Here's a quick test of me dictating some flowing text with VoiceMacro: https://www.youtube.com/watch?v=AE5Y1Pcu5o4



-------------------------

VoiceMacro / AutoHotkey
Win 11 – 64-bit, i7, 32 GB RAM
Logitech webcam mic 




 05/06/2022 06:20 PM
ax
Top-Tier Member

Posts: 386
Joined: 03/22/2012

Your video definitely piqued my interest and indeed the output you showed was quite "not bad".

Questions if you don't mind:

1. It seems that you were using Windows 11's "Voice Access" as a recognizer for VoiceMacro.  Am I wrong?  I mean VM itself doesn't do any recognition but rather harnesses an underlying Redmond recognizer, which is what, "Microsoft Recognizer 8.0" under Windows 10?  But are you "channeling" recognition through Win 11's Voice Access somehow?

2. If so, how did you manage to hitch VM (VoiceMacro) to VA (Voice Access) as a recognizer?  In fact, in your video, VoiceMacro's recognition history window consistently showed something different from the text spit out by Voice Access.  Or am I missing something?

3. Can you customize/add vocabulary in some way?

What you demonstrated certainly handled generic prose better, by leaps and bounds, than the 1-trick "Dragon Medical Embedded" used by myself would at the same task.

Thanks for sharing Michael!


