KnowBrainer Speech Recognition
Topic Title: VoiceMacro
Topic Summary: Let's dig
Created On: 07/31/2021 04:39 PM
Status: Post and Reply
- ax | 07/31/2021 04:39 PM
- ax | 07/31/2021 04:56 PM
- ax | 07/31/2021 05:27 PM
- ax | 07/31/2021 05:44 PM
- ax | 07/31/2021 05:59 PM
- ax | 07/31/2021 06:21 PM
- ax | 07/31/2021 06:40 PM
- R. Wilke | 08/01/2021 06:27 AM
- ax | 08/01/2021 03:09 PM
- kkkwj | 08/01/2021 04:43 PM
- ax | 08/01/2021 05:16 PM
- ax | 12/30/2021 04:21 PM
- Ag | 01/13/2022 05:56 PM
- ax | 01/15/2022 10:25 PM
- michaelbeijer | 01/17/2022 09:34 AM
- ax | 01/17/2022 04:24 PM
- michaelbeijer | 05/06/2022 01:18 PM
- ax | 05/06/2022 06:20 PM
On this rather glorious Civic Day weekend, time to pull a bit of civic duty.
Thanks to the ever-clever PG, I found out about VM - right here where a plethora of useful minds seem to congregate. VM Author does have his own forum, and generally prefers questions to be posted directly there.
That being said, I obtained "permission" (out of courtesy) from the VoiceMacro author to review and explore VM here, in large part because this is the only forum anywhere that I frequent (haven't touched my FB account in 5+ years, even though it still has an "app" on my 6-year-old Blackberry).
Pretending to be Ag (except minus that IEEE-certified engineering mindset) for a day, let's ask some elementary questions (from a "certifiable mindset", perhaps):
Q: If AR-15 represents the "pinnacle" of small-arms design, why would anyone still carry a Glock?
A: Seriously, how would one expect a dodo north of the 49th parallel to be able to answer that (and one who has never even beheld either up close)? Anyhow, the most he would ever know are the 3 letters: E-D-C.
While desktop Dragon +/- KnowBrainer outclasses VoiceMacro + WSR in sheer oomph, being the definitive, all-encompassing voice-activated RPA tool on Windows, "wieldy" is something desktop Dragon often is not (so much). VoiceMacro is "EDC" in a way that desktop Dragon in its current incarnation cannot claim to be.
VoiceMacro is PORTABLE (!!!!) - in a no-installation-required sense. Having said that, the author recommends running the program with Admin credentials in order to "maximize potentials". I use it portably without any Admin credentials and without installation - just like all my AHK scripts. It suits my purposes just fine.
It is capable. We shall scratch the surface and see.
And it is stable (relatively). So far it froze once on me when I was running AHK scripts from wrong/phantom directories, whereupon I had to terminate VM ungracefully. [Edit: Author would welcome any crash report, hopefully reproducible, preferably through VM's crash reporter].
Otherwise I have now used VoiceMacro for a grand total of 2 weeks. I was so impressed with it that I donated the second day I used it. It is "donationware".
One "con" I wish to point out, even though it is probably no "con" at all to many others: all things being equal, I would rather have a certified (code-signed) version. Signing software has fallen out of vogue with some of the largest portable, free software developers of late, Notepad++ and AutoHotkey being just 2 of them, out of a desire not to coddle an "extortionist culture". They do this SHA-256 hash thing instead. Absence of a "certification" has not deterred me from using these indispensable portable wares, so far.
But given my own lack of imagination (in a Kierkegaardian sense with respect to the "timid world view" of your average petit bourgeois / worker aristocrat), I would have preferred a separate version of "VoiceMacro Pro" - even if identical to the donationware, but one that charges money for the "privilege" of running a piece of "certified" software, which appeals to suckers like me. Again, this preference reflects my own professional pusillanimity more than anything else.
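For anyone unfamiliar with the "SHA-256 hash thing" mentioned above, here is a minimal Python sketch of how such a check generally works: hash the downloaded file and compare the result with the hex string the developer publishes. The function names are made up for illustration, and nothing here is specific to how VoiceMacro, Notepad++ or AutoHotkey publish their hashes.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large downloads need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, published_hex: str) -> bool:
    """Compare against a published hash, ignoring case and stray whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

A matching hash only shows that the file is the one the publisher hashed; unlike a code-signing certificate, it says nothing about who the publisher is.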
[Edited]: It turns out not to be accurate to say that VoiceMacro uses "WSR", but rather both share an underlying "Microsoft Speech Recognizer Engine". Refer to pro explanation here for clarification.
|
|
|
![]() |
|
First off, let's get "smooth scroll" out of the way.
Setting a variable "loop_large_scroll" is probably unnecessary, but doing so lends itself to "on-the-fly" changes of scrolling distance. Again, that is not so useful, as you can use a voice command "stop scrolling" to abort all running macros, thereby stopping scrolling.
One can add a small pause in front of the mouse scroll increment of +/- 1 (the smallest) to make it even slower than in this video.
A scroll step of 5 is plenty fast for me. You can always up that to your heart's content.
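The logic of such a scrolling macro can be sketched in Python as a loop with a step, a pause and an abort signal. This is only a model of the idea, not VoiceMacro's own macro syntax: `smooth_scroll`, `send_scroll` and `max_ticks` are hypothetical names, and the actual mouse-wheel call is left to whatever the caller passes in.

```python
import threading
from typing import Callable


def smooth_scroll(send_scroll: Callable[[int], None],
                  step: int,
                  pause_s: float,
                  stop: threading.Event,
                  max_ticks: int = 10_000) -> int:
    """Send `step` repeatedly until `stop` is set; return the number of ticks sent.

    `send_scroll` stands in for the real mouse-wheel action (an assumption).
    """
    ticks = 0
    while not stop.is_set() and ticks < max_ticks:
        send_scroll(step)
        ticks += 1
        # Event.wait doubles as the pause and returns early once stop is set,
        # which plays the role of a "stop scrolling" command.
        stop.wait(pause_s)
    return ticks
```

A step of +/- 1 with a longer pause gives the slowest scroll; a larger step, such as 5, with a short pause gives the faster one. Setting the event from another thread ends the loop at the next check.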
Two caveats about variables:
1. They are case sensitive.
2. They are "space-sensitive" when you declare them! I.e., "variable" and "variable " are two different variables!! Oh well.
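Both caveats behave like exact string keys in a lookup table. The Python sketch below is an analogy only and does not describe VoiceMacro's internals: it shows why "variable" and "variable " end up as separate entries, and `find_lookalike_names` is a hypothetical helper for spotting names that differ only by case or surrounding spaces.

```python
from typing import Dict, List, Tuple

# Exact-match keys: a trailing space or a different case is a different name.
variables: Dict[str, int] = {}
variables["variable"] = 5
variables["variable "] = 1   # trailing space: a second, separate entry
variables["Variable"] = 9    # different case: a third, separate entry


def find_lookalike_names(names: List[str]) -> List[Tuple[str, str]]:
    """Return pairs of distinct names that collide once trimmed and lower-cased."""
    seen: Dict[str, str] = {}
    clashes: List[Tuple[str, str]] = []
    for name in names:
        key = name.strip().lower()
        if key in seen and seen[key] != name:
            clashes.append((seen[key], name))
        else:
            seen[key] = name
    return clashes
```

Running `find_lookalike_names(list(variables))` on the three names above reports two clashes, both against the first spelling.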
Otherwise VM provides built-in basic conditional structures such as loop and if/else and toggle. The toggling and setting of "toggle state" (from a separate macro) is quite useful.
P.S., I moved the scrolling macro below to a different profile and changed "Abort all running macros" to "Abort all macros from this profile" so as to preserve my listening indicator OSD when I terminate scrolling. Alternatively, one can probably combine "IgnoreCommands" with "IgnoreExceptions" to use voice to stop scrolling.
|
|
|
|
![]() |
|
Next up, basic Recognizer Settings:
See this VM author comment on why a dictionary weight of 0% (or one that is too low) is undesirable (I have changed mine consequently). Apparently this whole "weight" thing under WSR could be a wee bit "voodoo" ... thus requiring some trial and error.
I choose to uncheck the default 'Process "failed" recognition' - I'd rather it do nothing than do the wrong thing.
|
|
|
|
![]() |
|
Carry on - NATO alphabet speller, which is an annoyingly weak link for my otherwise fit-for-purpose cloud Dragon in its designated browser sessions.
Here VM shines through.
Refer to the VM author's own explanation on this.
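Independently of the author's explanation, the core of any NATO speller is a word-to-letter lookup. Here is a minimal Python sketch of that idea; the names `NATO_TO_LETTER` and `spell` are hypothetical, and the choice to ignore a phrase containing any unknown word is an assumption of this sketch, not documented VoiceMacro behaviour.

```python
# The 26 code words; each maps to its own first letter.
NATO_WORDS = (
    "alfa bravo charlie delta echo foxtrot golf hotel india juliett kilo lima "
    "mike november oscar papa quebec romeo sierra tango uniform victor whiskey "
    "xray yankee zulu"
).split()

NATO_TO_LETTER = {word: word[0] for word in NATO_WORDS}
# Common alternative spellings a recogniser might return.
NATO_TO_LETTER.update({"alpha": "a", "juliet": "j", "x-ray": "x"})


def spell(phrase: str) -> str:
    """Turn recognised NATO words into letters; return "" if any word is unknown."""
    words = phrase.lower().split()
    if any(word not in NATO_TO_LETTER for word in words):
        return ""  # do nothing rather than type a wrong letter
    return "".join(NATO_TO_LETTER[word] for word in words)
```

For example, `spell("Victor Mike")` gives `"vm"`, while a phrase with an unrecognised word gives an empty string.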
|
|
|
|
![]() |
|
Numeral Enunciation:
Prepending with words such as "numeral" increases recognition reliability given that single digits are single-syllable phrases.
No similarly compact macro exists for pressing F1 - F12, however, unless one resorts to 12 If statements, in which I see no advantage. As far as I can figure out, one needs a separate macro for each F(unction) key press. It is doable to combine, say, F1, Shift-F1, and Alt-F1 into one macro through string manipulation of RecCommand and 1 or 2 conditional statements.
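The string manipulation involved can be sketched in Python. The sketch treats the recognised phrase as a plain string (the role RecCommand plays in the text above), and the spoken forms "numeral five" and "shift function one" are assumptions for illustration. It models the parsing logic only and is not VoiceMacro macro syntax.

```python
from typing import Optional

DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}

FKEY_NUMBERS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
                "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
                "twelve": 12}

MODIFIERS = {"shift", "alt", "control"}


def numeral_to_digit(rec_command: str, prefix: str = "numeral") -> Optional[str]:
    """'numeral five' -> '5'; anything without the prefix is ignored."""
    words = rec_command.lower().split()
    if len(words) == 2 and words[0] == prefix:
        return DIGITS.get(words[1])
    return None


def function_key_chord(rec_command: str) -> Optional[str]:
    """'shift function one' -> 'Shift+F1'; 'function twelve' -> 'F12'."""
    words = rec_command.lower().split()
    if len(words) < 2 or words[-2] != "function" or words[-1] not in FKEY_NUMBERS:
        return None
    modifiers = words[:-2]
    if any(word not in MODIFIERS for word in modifiers):
        return None
    keys = [word.capitalize() for word in modifiers]
    keys.append("F" + str(FKEY_NUMBERS[words[-1]]))
    return "+".join(keys)
```

Requiring the prefix word means a stray single-syllable "five" on its own is never acted on, which is the reliability point made above.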
|
|
|
|
![]() |
|
VM Control:
In the odd times when native CapsLock function gets triggered, pressing "Ctrl-CapsLock" will toggle it off.
The "RunOtherMacro" just switches over to a different profile so I can run AHK programs with different, machine-dependent paths. This is for portability purposes. Otherwise gratuitous.
My main work profile is named "Production". By relegating machine-dependent elements to a separate profile, I make sure I can sync my main "Production" profile by exporting and importing the XML file associated with it - among the different machines I use.
In the voice command "stop listening", one can use the thoughtfully implemented "Set Toggle State" to reset existing running toggles to an "off" state, so that keyboard toggle doesn't go out of sync. Finally, a voice prompt on whether the toggle is on or off quickly gets old. I find it much more useful to customize AHK 2-liners to pop an icon into the tray so I know when VM is listening or not.
(Channelling VM author: an alternative to inserting custom tray icon in order to indicate a listening state is to take advantage of the built-in OSD functionality - which I am now using instead, and I have revised the screenshots accordingly).
Of course, this improvisation of a tray indicator icon could be avoided if VM's own icon changed to a bright colour when listening is activated. That could be a "feature request", I suppose.
Finally, see VM Author's own pro tip on how to control VM listening and heeding commands through "wake-up words".
P.S., VoiceMacro comes preloaded with a slew of demo macros in its "Demo" profile. A fast way of learning the ropes is simply by modifying them. Some of them have "Only when target window active" checked. Uncheck that when modifying and executing macros intended for any active window/process.
|
|
|
![]() |
|
Now a few generic comments from the VM Author regarding SAPI and that sort of thing:
- SAPI compatibility (also see RW's clarification in post following)
- Dragon will always be better at "free dictation" (which we take for granted)
- Author's comment on "pseudo-list commands" (my paraphrase)
Moreover, combining group affixes with profile switches / window targeting will likely deliver well-organized application-specific command deployment. I am speculating here as I haven't tested out the possibilities myself.
Just going by the few examples I outlined above, it is easy to conclude that VoiceMacro is CAPABLE. It goes without saying that VM also has the basic mouse control and coordinate focusing capabilities built-in.
In fact, even sans the voice component, VM's keyboard hotkey implementation aspect alone can probably give something like Macro Express a run for its money.
Anyway, I hope this scratching of the surface helps someone.
|
|
|
|
![]() |
|
Just one particular, and more random comment on this, leaving it here for you to make the VM community aware of it:
Dragon is compatible with SAPI 4, in certain parts and in some ways, but not with SAPI 5.
|
|
|
|
![]() |
|
Thanks RW for clarifying dependencies under the hood.
Moreover, author welcomes bug/crash reports through email/forum or preferably VM's built-in crash reporter.
Lastly, the latest builds are usually more stable, incorporating more bug fixes (they are not mere "betas").
|
|
|
![]() |
|
Nice thread, ax! It must have taken a lot of work to create it so that others could learn about VM. Thank you.
-------------------------
Win10/11/x64, AMD Ryzen 7 3700X/3950X, 64/128GB RAM, Dragon 15.3, SP 7 Standard, SpeechStart, Office 365, KB 2017, Dragon Capture, Samson Meteor USB Desk Mic, Amazon YUWAKAYI headset, Klim and JUKSTG earbuds with microphones, 3 BenQ 2560x1440 monitors, Microsoft Sculpt Keyboard and fat mouse
|
|
|
![]() |
|
Glad you appreciate my effort to introduce this, Kevin!
|
|
|
![]() |
|
See this VM author comment on why a dictionary weight of 0% (or one that is too low) is undesirable (I have changed mine consequently). Apparently this whole "weight" thing under WSR could be a wee bit "voodoo" ... thus requiring some trial and error.
Recognition threshold: 85 - 90% is suitable for me as I prefer specificity over sensitivity.
I choose to uncheck the default 'Process "failed" recognition' - I'd rather it do nothing than do the wrong thing.
A field update after months of production use:
Despite what I wrote above with respect to a recommended "non-zero low Dictionary weight", in my case, the 0% "Dictionary weight" specificity still works out the best, in conjunction with 85 to 90% of "Recognition threshold" sensitivity.
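The trade-off behind a high recognition threshold can be illustrated with a small Python sketch. It assumes a recogniser hands back a phrase plus a confidence between 0 and 1, and that anything below the threshold is dropped. The `dispatch` function and the command table are hypothetical, and this is not how VoiceMacro or the Microsoft engine computes or applies confidence internally.

```python
from typing import Callable, Dict, Optional


def dispatch(phrase: str,
             confidence: float,
             commands: Dict[str, Callable[[], str]],
             threshold: float = 0.85) -> Optional[str]:
    """Run a command only for a known phrase at or above the threshold.

    Anything else returns None, i.e. the sketch does nothing rather than
    guess, mirroring the preference for specificity over sensitivity.
    """
    if confidence < threshold:
        return None
    action = commands.get(phrase)
    return action() if action is not None else None
```

Raising the threshold means fewer wrong commands fire at the cost of having to repeat oneself more often; lowering it does the opposite.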
NB: if the default Windows Speech Recognition Profile, usually named after the current login, is the only one in use, then you can't delete it until you create a second one, which could be named "temp". After a "temp" profile is created and your default profile deleted (you need to press the "Apply" button for the change to stick), you can go on to re-create another profile, whose name Windows will default back to your login name.
P.S., the recommendation by the VoiceMacro author to use a "non-zero" value for Dictionary Weight was most likely based on user experience such as described here (partly in Deutsch - nothing Google Translate couldn't handle with aplomb). But that user's mother was quite debilitated and didn't have any "sovereign control" over noises in her environment. Nor could she be expected to know how to "reset" the Speech Recognition Profile in Windows once in a while as necessary.
|
|
|
![]() |
|
I only just noticed, 6 months later, @ax taking my name in vain - or at least my chemical symbol "Ag". :-)
Pretending to be Ag (except minus that IEEE-certified engineering mindset) for a day, let's ask some elementary questions (from a "certifiable mindset", perhaps):
Q: If AR-15 represents the "pinnacle" of small-arms design, why would anyone still carry a Glock?
A: Seriously, how would one expect a dodo north of the 49th parallel to be able to answer that (and one who has never even beheld either up close)?
Hey, @ax, *some* Canadians were or are in the Army reserve.
Neither AR-15 nor Glock are on https://en.wikipedia.org/wiki/List_of_equipment_of_the_Canadian_Army.
I was sad when Canada switched to the C7 member of the M-16 family.
For my money, as an assault rifle the AK-47 is better suited to Canadian conditions, ranging from Arctic to muskeg to boreal forest. If you want distance, accuracy, and the ability to stop polar/grizzly/pizzly bears or moose, the good old FN FAL (Canadian C1). Not as good as a real hunting rifle, but better than the AK-47 and much better than the AR-15.
:-)
-------------------------
DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical), GPU=NVIDIA Quadro RTX 3000 with Max-Q Design.
|
|
|
![]() |
|
I only just noticed, 6 months later, @ax taking my name in vain - or at least my chemical symbol "Ag". :-)
Pretending to be Ag (except minus that IEEE-certified engineering mindset) for a day, let's ask some elementary questions (from a "certifiable mindset", perhaps):
Q: If AR-15 represents the "pinnacle" of small-arms design, why would anyone still carry a Glock?
Excusez mon "certifiable mindset", for transgressing your sterling call sign, Capitaine! Number myself a fan of the Socratic style of discourse.
The only alphanumerically-monikered "equipment" that has regrettably become a requisite in my daily existence is stamped with "N95" ... not a huge enthusiast of this style of "nom de guerre".
At least "Glock" sounds colloquial ... not that I have anything else (knowledge or desire) to add to this subject.
|
|
|
![]() |
|
Wow, great post! I have been using VoiceMacro extensively (and very happily) for the past few weeks in my actual work, having uninstalled Dragon again recently for the millionth time in disgust. I'm a technical/patent translator, and so use speech recognition mainly to control my translation software (memoQ), and to dictate the occasional bit of text. VM is much lighter/quicker on my computer and doesn't bring things to a crawl like Dragon invariably does.
I haven't had much time to add commands, but here is what I currently have: (basically all the stuff I need to do when working: add selected terms to termbase, run concordance search, insert matches from termbases/translation memories, search in termbases, etc.)
-------------------------
VoiceMacro / AutoHotkey
|
|
|
![]() |
|
..., and to dictate the occasional bit of text.
Nice to hear you were able to dictate some prose with it. That's something I haven't tried myself. Here is hoping that the new and improved MS Speech Recognizer on the horizon will "kick it up a notch" in that regard.
|
|
|
![]() |
|
Here's a quick test of me dictating some flowing text with VoiceMacro: https://www.youtube.com/watch?v=AE5Y1Pcu5o4
-------------------------
VoiceMacro / AutoHotkey
|
|
|
![]() |
|
Your video definitely piqued my interest and indeed the output you showed was quite "not bad".