KnowBrainer Speech Recognition
Topic Title: What software to use for PROGRAMMING?
Topic Summary:
Created On: 05/19/2020 05:40 AM
 What software to use for PROGRAMMING?   - tripoF - 05/19/2020 05:40 AM  
 What software to use for PROGRAMMING?   - kkkwj - 05/19/2020 10:02 AM  
 What software to use for PROGRAMMING?   - PG LTU - 05/19/2020 11:33 AM  
 What software to use for PROGRAMMING?   - tripoF - 05/19/2020 01:26 PM  
 What software to use for PROGRAMMING?   - Lunis Orcutt - 05/19/2020 10:12 PM  
 What software to use for PROGRAMMING?   - Mav - 05/25/2020 02:44 AM  
 What software to use for PROGRAMMING?   - R. Wilke - 05/25/2020 01:48 PM  
 What software to use for PROGRAMMING?   - PG LTU - 05/25/2020 02:09 PM  
 What software to use for PROGRAMMING?   - Mav - 05/26/2020 03:20 AM  
 What software to use for PROGRAMMING?   - kkkwj - 05/20/2020 09:47 AM  
 What software to use for PROGRAMMING?   - MDH - 05/20/2020 11:33 AM  
 What software to use for PROGRAMMING?   - kkkwj - 05/20/2020 01:48 PM  
 What software to use for PROGRAMMING?   - MDH - 05/20/2020 03:01 PM  
 What software to use for PROGRAMMING?   - kkkwj - 05/27/2020 03:09 PM  
 What software to use for PROGRAMMING?   - Lunis Orcutt - 05/27/2020 03:49 PM  
 What software to use for PROGRAMMING?   - MDH - 05/27/2020 04:42 PM  
 What software to use for PROGRAMMING?   - Mav - 06/02/2020 08:58 AM  
 What software to use for PROGRAMMING?   - Ag - 06/02/2020 12:40 PM  
 What software to use for PROGRAMMING?   - Lunis Orcutt - 06/02/2020 02:12 PM  
 What software to use for PROGRAMMING?   - Ag - 06/08/2020 02:51 PM  
 What software to use for PROGRAMMING?   - kkkwj - 06/02/2020 03:47 PM  
 What software to use for PROGRAMMING?   - PG LTU - 06/02/2020 04:38 PM  
 What software to use for PROGRAMMING?   - kkkwj - 06/02/2020 06:05 PM  
 What software to use for PROGRAMMING?   - Mphillipson - 06/04/2020 01:34 PM  
 What software to use for PROGRAMMING?   - kkkwj - 06/08/2020 12:32 PM  
 What software to use for PROGRAMMING?   - R. Wilke - 06/08/2020 02:35 PM  
 What software to use for PROGRAMMING?   - kkkwj - 06/10/2020 06:38 PM  
 What software to use for PROGRAMMING?   - Ag - 06/08/2020 03:27 PM  
 What software to use for PROGRAMMING?   - kkkwj - 06/10/2020 06:48 PM  
 What software to use for PROGRAMMING?   - kkkwj - 06/10/2020 06:35 PM  
 What software to use for PROGRAMMING?   - alexander - 06/26/2020 03:20 PM  
 What software to use for PROGRAMMING?   - kkkwj - 06/29/2020 10:56 AM  
 What software to use for PROGRAMMING?   - benTalks - 09/06/2020 10:35 AM  
 05/19/2020 05:40 AM
tripoF
New Member

Posts: 2
Joined: 05/19/2020

Hi,

I'm a professional programmer (Java & Python) and can't type much anymore, so I have to switch to speech recognition.

I tried WSR on Win 7, which worked pretty well (language: German). Now I've switched to Win 10 and also changed the language to English (I'd prefer to keep using English from now on).

I've encountered these problems:

- Even with training the recognition is very bad, much worse than in Win 7 German. 

- I have a lot of macros defined, like: lee lee 2, lee lee 3, ..., lee lee 100 --> press the LEFT key 2/3/100 times. Now WSR recognizes the numbers terribly! Very often I say "lee lee 4" and it does "lee lee 40" or even "lee lee 97"?! Or even "righ ree 40" (the analogous command for the RIGHT key). I just can't program efficiently this way, it's impossible! (With Win 7 German it worked much better.)

- Very often it doesn't recognize a command but tries to "detect" some dictation. For example, when I say "lee lee" it often recognizes something like "he's free"...

- It appends words I never said. For example, I dictate "hello you" and it writes "hello you there" or "hello you will". This also happens very often.
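
For illustration, the repeat-key macros described above could be expressed as one parameterized rule instead of 99 separate commands. This is only a minimal sketch in plain Python: the function name `expand_repeat_command` and the key names are hypothetical, it is not tied to WSR, Dragon, or any other engine, and it assumes the recognizer hands over the number as digits.

```python
import re

# Spoken prefixes from the post mapped to illustrative key names (assumption).
PREFIX_TO_KEY = {"lee lee": "LEFT", "righ ree": "RIGHT"}

# One pattern covers "lee lee 2" ... "lee lee 100" and the RIGHT equivalent.
COMMAND_RE = re.compile(r"^(lee lee|righ ree) (\d+)$")


def expand_repeat_command(utterance, max_count=100):
    """Return the key names to press, or None if the text is not a command."""
    match = COMMAND_RE.match(utterance.strip().lower())
    if match is None:
        return None  # e.g. a misrecognition such as "he's free"
    prefix, count = match.group(1), int(match.group(2))
    if not 2 <= count <= max_count:
        return None  # outside the 2..100 range used in the post
    return [PREFIX_TO_KEY[prefix]] * count


print(expand_repeat_command("lee lee 4"))
```

A sketch like this does nothing about misrecognized numbers; it only shows how a single rule with a numeric slot might replace a long list of near-identical macros.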

So I don't know what the reason is. Is it the switch to Windows 10? Does WSR work better on Win 7? Is it the language switch? Is German maybe a "clearer" language?

Anyway, what I need is:

- Self defined (so I can use my own "made up language", it's faster) commands to navigate (like LEFT, LEFT+CTRL, LEFT+SHIFT, ..., CTRL+C and so on)

- I need to program my own commands, e.g. for camel-casing variable names. I'd like to do that in JavaScript (but that's not important)

- So I need reliable and FAST self-defined commands (for both InsertText and PressKeys). (I've heard Dragon's self-defined commands are slow - is that true? That would be very bad in my case)

- Also, Dictation (in English)

But, the self defined commands are certainly more important than the dictation!

---

So, what do you recommend I use? I'd be willing to pay some money, it's urgent.

Is maybe Dragon good for that? Maybe a mix of WSR and Dragon?

 

Thank you very very much!!

 05/19/2020 10:02 AM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

Yes, I think Dragon (with scripting) would be a much better experience for you than WSR. With KnowBrainer, you can even define your own "language" commands by voice. Your other alternative is to try the Natlink approach, where you define your own language grammar too (not just your own commands). There is a paper floating around on "the state of the art of voice programming" (or something like that). It was written a few years ago and discusses all the alternatives and their costs/benefits. Sorry, but I don't have a link for it. I mention it because it did a pretty good job.

-------------------------

Win10/x64, AMD Ryzen 7 3700X, 64GB RAM, Dragon 15.3, SP 6 PRO, SpeechStart, Office 365, KB 2017, Dragon Capture, Samson Meteor USB Desk Mic, Klim and JUKSTG earbuds with microphones

 05/19/2020 11:33 AM
PG LTU
Top-Tier Member

Posts: 2072
Joined: 03/21/2007

Python, hmmm. Have you looked into Vocola for WSR? I can't say how well it is supported on Win 10, but it used to be eminently helpful . . .
http://vocola.net/v3/

Dragon is better than WSR, but mostly for command and control of applications and desktop environment (and some better auto-formatting and custom word abilities) but I'd hazard you'd be fine with WSR and vocola.

Hth,

 

EDIT: I see elsewhere on the Forum a claim that WSR has not been developed by MS since 2007? If that is true, I take back the qualifications about where Dragon excels over WSR and would simply say that, in general, Dragon is better.



-------------------------




PG





Remember folks, my comments and this forum are for entertainment value only, please, no wagering or other reliance on the contents herein.  I permit no commercial use of my ideas (whether expressions or embodiments) without my written consent.



 05/19/2020 01:26 PM
tripoF
New Member

Posts: 2
Joined: 05/19/2020

Thank you for the answers!

 

To state more clearly what I need:

I'd define classic things I need for Python/Java on my own with self defined commands: like Integer, System.out.print, String, ...

And I need commands for navigation self defined: like "go up 5 times"

And very important: self-defined commands for variable names, like: "camel case temp iterator" --> tempIterator (I'd do this with JavaScript)

 

With Win 7 and WSR all of this worked pretty well. Now, with Win 10 & English, not anymore. The recognition of my self-defined commands is very, very bad.

 

So I need to know what would be the best solution for the mentioned things.

I think in Dragon you can of course also define commands yourself, right? But I've heard that they execute slowly, more slowly than in WSR. Does anyone know if that's true?

As far as I know, Dragon is far better for general dictation.

But I also need to know whether Dragon is better for self-defined commands.

Especially concerning these things: execution speed, recognition of the self-defined commands, and the ability to define the commands (e.g. can you use JavaScript just as well as with WSR?).

 

Thanks a lot!



 05/19/2020 10:12 PM
Lunis Orcutt
Top-Tier Member

Posts: 38002
Joined: 10/01/2006

Welcome (See Mission Statement)


Dragon 15.3 Advanced-Scripting command execution can take as much as 10 times longer because of Nuance's refusal to retire the miserable SAX Basic scripting engine, which should have been discontinued in 2006. In many situations, especially noticeable in Outlook, Dragon commands will simply give up after taking as long as a minute or 2. It's also just as likely to take down Dragon and Outlook. We converted over 3000 Advanced-Scripting commands into DVC scripting, which wasn't quite as fast as KnowBrainer but significantly more stable. This is why KnowBrainer switched to WinWrap Basic about a decade ago. It is far more stable and significantly faster. Here's the rub. When Nuance released Ver. 15.5 for medical and group versions of Dragon, they introduced WinWrap Basic. If you were using DPG 15.5, you would be able to improve your situation, but if you are stuck with DPI 15.3, you'll have to wait until Ver. 16 is released.

Dragon already includes "move up five" but it is deliberately designed to deploy slowly. In certain situations, such as when moving through files in File Explorer, you would need to slow down this command, especially if you have Preview enabled. KnowBrainer 2017 includes a similar slowed-down version but also includes a much faster version: <Up/Down/Left/Right> <1 to 1000>. For example, if you say "Down 132", it will seem like the movement was instant. This is also doable in Dragon 15.5 but not Ver. 15.3/4.

Microsoft essentially gave up on WSR in 2006. However, they continue to update WSR (when absolutely necessary) so that it will work in each Windows release. 

PS: KnowBrainer also includes VerbalBasic which is a patented verbal toolbox for creating personal commands, like the creation of a Vocola toolbox.

 



-------------------------

Forum Mission Statement
Trial Downloads
Dragon/Sales@KnowBrainer.com 
(615) 884-4558 ext 1

 05/25/2020 02:44 AM
Mav
Advanced Member

Posts: 200
Joined: 10/02/2008

From what I understand you seem to try programming by voice (i.e. speaking Python code instead of typing).

 

While dictation itself undoubtedly is way better with Dragon than WSR and also voice commands are very flexible, one key aspect of this whole setup will be very difficult to come by IMO:

 

Dictating program code is very different to dictating emails or letters because the vocabulary as well as the contexts for each of the words are different.

 

Basic type names can be added as regular words, if your language is strongly typed and doesn't infer the type automatically.

 

As soon as you introduce a variable, problems start.
Most of the time your variable's names won't be single plain words from the standard vocabulary but a combination of several words which have to be concatenated without spaces and a special casing (i.e. CamelCase), which Dragon (or basically every speech recognition trimmed for conversation dictation) isn't prepared to do.

 

There is a command to keep Dragon from inserting a blank before the following word and also a command to capitalize a word, but you'd have to speak every omitted space and each capitalization, making your dictation very painful and error-prone.

 

I've tried to simply get "Integer myCustomerAccount" by saying "my no space cap customer no space cap account", but this failed most of the time and requires a lot of concentration to get things right.

 

Once you've introduced variable names or class names or function names, you of course would want to dictate them. But all these names are not part of the vocabulary, only the single words they are built from. Once again you'd have to painstakingly speak exactly the right spacing and capitalization or you get a syntax error.

 

I strongly doubt that's practical without some major overhaul of the standard way conversational speech recognition works.

 

Perhaps one could program editor support for writing programs (e.g. create a special mode where all the words in a single utterance are concatenated and camel cased automatically or adding variable/function/class names as temporary words to your vocabulary so that you can reference them easily, something like intellisense for SR...), but I'm afraid this will require a lot of effort.
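
The "temporary words" idea above could start from something like the following. This is only a rough sketch in plain Python: `spoken_form` and `build_vocabulary` are hypothetical helper names, and nothing here talks to Dragon or any editor. It scans source text for identifiers and maps a speakable phrase to each multi-word name, so that "my customer account" could be resolved back to `myCustomerAccount`.

```python
import re

# Candidate identifiers: a letter or underscore followed by word characters.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Word pieces inside an identifier: acronyms, capitalized words, digit runs.
WORD_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def spoken_form(identifier):
    """Split camelCase or snake_case into lowercase, space-separated words."""
    return " ".join(part.lower() for part in WORD_RE.findall(identifier))


def build_vocabulary(source_code):
    """Map spoken phrases to the multi-word identifiers found in the code."""
    vocabulary = {}
    for name in IDENTIFIER_RE.findall(source_code):
        phrase = spoken_form(name)
        if " " in phrase:  # single words are already in a normal vocabulary
            vocabulary[phrase] = name
    return vocabulary


code = "int myCustomerAccount = 0; temp_iterator += myCustomerAccount;"
print(build_vocabulary(code))
```

If two identifiers share a spoken form (say `myAccount` and `my_account`), the later one silently wins in this sketch; a real tool would need to ask which one was meant.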

 

 05/25/2020 01:48 PM
R. Wilke
Top-Tier Member

Posts: 7236
Joined: 03/04/2007

"I strongly doubt that's practical without some major overhaul of the standard way conversational speech recognition works."

 

Mav,

 

Just for your information. The people in the voice coding community have covered all this already long ago, and I wonder why none of them has already shown up here.

 

However, I don't envy anyone using voice recognition to write code to be honest, not even with the best of the enhancements which they have created.

 



-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage



 05/25/2020 02:09 PM
PG LTU
Top-Tier Member

Posts: 2072
Joined: 03/21/2007

Originally posted by: Mav ... I've tried to simply get "Integer myCustomerAccount" by saying "my no space cap customer no space cap account", but this failed most of the time and requires a lot of concentration to get things right.

 

This is not how to do it if you have commands available.  Either dictate it normally as one utterance "my customer account" and then say "camel case that" or just say it all at once "camel case my customer account" as a command utterance.

 

Then, your "camel case [dictation]" command first checks if your ListVar1 is "that" and if it is, emulates "select that" and "copy that" commands to copy the "my customer account" part (and leave it selected), or else the ListVar1 already contains that phrase.  Then use some string processing to get what you want (caps and no caps, underscores or hyphens between words, whatever works for your language).  Then either replace the selection (in the first case) with the result or simply type out the resultant processed phrase.
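
The string-processing step described above might look roughly like this. It is a sketch in plain Python rather than Vocola or Advanced Scripting: `format_words` and `run_style_command` are hypothetical names, and `get_selection` is a stand-in for whatever the "select that" / "copy that" emulation would return in a real command.

```python
def format_words(words, style):
    """Join words in the requested identifier style."""
    words = [w.lower() for w in words]
    if not words:
        return ""
    if style == "camel":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    if style == "pascal":
        return "".join(w.capitalize() for w in words)
    if style == "snake":
        return "_".join(words)
    if style == "kebab":
        return "-".join(words)
    raise ValueError("unknown style: " + style)


def run_style_command(style, dictation, get_selection):
    """Handle '<style> case that' and '<style> case <dictation>'.

    Returns the formatted text plus a flag saying whether it should
    replace the current selection (True) or simply be typed out (False).
    """
    replace_selection = dictation.strip().lower() == "that"
    phrase = get_selection() if replace_selection else dictation
    return format_words(phrase.split(), style), replace_selection


# "camel case my customer account" spoken as one command utterance:
print(run_style_command("camel", "my customer account", lambda: ""))
# "snake case that" with "my customer account" already dictated and selected:
print(run_style_command("snake", "that", lambda: "my customer account"))
```

Keeping the formatting in one function means each new variable style is a few extra lines rather than a new command.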

 

Many folks here have examples of these types of commands for many variable styles or other kinds of declarations.  Vocola (or advanced scripting) makes this quite easy.

 

Hth,




 05/26/2020 03:20 AM
Mav
Advanced Member

Posts: 200
Joined: 10/02/2008

Thanks for pointing this out.

I was vaguely aware that there was a coding by voice community somewhere but because it seemed so strenuous I didn't care to dig any deeper without actual need.

 

But for the op this could indeed be a better place to discuss the topic.

 

Thanks,

mav

 05/20/2020 09:47 AM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

IMHO, you definitely should use Dragon for its recognition. Even Natlink, Vocola, DragonFly, and their friends use Dragon for that. Once you have the recognition engine, the recognized words can be used by everything from Vocola to KB to whatever. Despite its problems, Dragon is the dominant and best recognizer in the marketplace, for sure.

Probably your most significant choice is whether to use DPI 15.3 or the DPG 15.5(?) product that is supposed to be more stable. After that, your next decision is what to use for scripting: Natlink, Vocola, or DragonFly (most use Python) for custom language grammars, Dragon DVC/Scripting for scripts, or KB for scripts and faster execution. Although KB is much faster than Dragon scripting, I haven't found the speed difference to be that exciting. Most of your time is wasted waiting for Dragon to finish its recognition and dump the result into your document or the script engine.

I think KB does a superior job at adding power (access to full .NET capabilities) and convenience (some really nice commands for defining new commands, adding stuff to the vocabulary, moving around, and so on). Dragon + KB is a top-end solution - it's hard to imagine having something better than that, IMHO. Then the rest is up to you (for custom scripting, etc.). Oh yes, and PG has many tips on using AHK for tricky things. Good luck with your upgrade!


 05/20/2020 11:33 AM
MDH
Top-Tier Member

Posts: 2163
Joined: 04/02/2008

"you definitely should use Dragon for its recognition"

 

On a number of occasions I have tested MMODAL. It is clearly at least as accurate in terms of recognition as Dragon and, by my observations, slightly more so. However, its command writing functionality is PRIMITIVE compared to Dragon's, which, for the way I use speech recognition, makes it a non-starter. That said, most physicians, and probably most non-physicians, do not create custom commands. So probably for 95% of people using speech recognition, MMODAL is as good as, if not superior to, Dragon for straight dictation purposes. But if one uses speech recognition to its full potential, creating custom commands to improve workflow and reduce clicks in click-intensive applications (read here: EMRs), then stick with Dragon.

 

MDH



-------------------------
 05/20/2020 01:48 PM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

That's interesting - I could be wrong, but the MMODAL website says that it is embedded into EHRs. (So, not available for common people.) Do they use a voice recognition engine other than Dragon's? That would be most interesting. Maybe the MMODAL guys don't see the "common man" market as worth pursuing.


 05/20/2020 03:01 PM
MDH
Top-Tier Member

Posts: 2163
Joined: 04/02/2008

MMODAL is not embedded in either Greenway or Epic, which are the previous and current EMRs that our hospital/clinics have used. They work closely together, and MMODAL has made a command set for Epic (which sucks), but MMODAL is not integral to Epic. I use Dragon with Epic. They would be smart to go for the non-medical market. As an example, the other day, for fun, I was "playing" with MMODAL to see if their command functionality had improved with time. I dictated 500 words and had a misrecognition on just one word = 99.8% accuracy! It doesn't get better than that. Too bad they have such inferior command functionality in comparison. But most users don't care at all about creating commands.
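
For anyone repeating this kind of informal test, the 99.8% figure is simply the share of correctly recognized words. A tiny sketch of that arithmetic (the function name `word_accuracy` is made up for illustration):

```python
def word_accuracy(total_words, misrecognized_words):
    """Percentage of dictated words that were recognized correctly."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * (total_words - misrecognized_words) / total_words


# 500 dictated words with 1 misrecognition, as in the test described above.
print(round(word_accuracy(500, 1), 1))
```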

 

And yes, I am certain they use a speech engine other than Dragon's, since Dragon would never allow a competitor to use its engine. As Lunis will tell you, Dragon is suing MMODAL for patent infringements, but I don't think the speech engine is the issue.

 

MDH



-------------------------


 05/27/2020 03:09 PM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

After reading your post, I went back to the MMODAL site and tried to find anything that looked like I could try it out. Nothing. It's clearly aimed at the medical market and talks about using a cloud recognizer. It looks like they support almost a dozen different medical applications from dictation to coding, but nothing for the common man that I could find. I guess Nuance / Dragon is the only real alternative for non-medical applications.


 05/27/2020 03:49 PM
Lunis Orcutt
Top-Tier Member

Posts: 38002
Joined: 10/01/2006

M*Modal has no interest in anything other than Medical use because that's where the money is and it wouldn't be possible for them to compete with Dragon on any level.




 05/27/2020 04:42 PM
MDH
Top-Tier Member

Posts: 2163
Joined: 04/02/2008

"M*Modal has no interest in anything other than Medical use because that's where the money is and it wouldn't be possible for them to compete with Dragon on any level."

 

Not quite so based on KLAS Research:

 

https://www.salesforce.com/content/dam/web/en_us/www/assets/pdf/datasheets/KLAS-Best-in-KLAS-Software-and-Services-2018.pdf

 

 

MDH



-------------------------


 06/02/2020 08:58 AM
Mav
Advanced Member

Posts: 200
Joined: 10/02/2008

Please note that KLAS is comparing M*Modal's cloud-based speech recognition to Nuance's Dragon Medical One, which is also cloud-based and cannot be compared to DMPE or DPG/DLG/DPI, all of which run on a fat client without outsourcing recognition to a server somewhere else.

 

 06/02/2020 12:40 PM
Ag
Top-Tier Member

Posts: 449
Joined: 07/08/2019

First, a question for PG_LTU and kkkwj: after KnowBrainer, which would you pick up next: DragonFly or Vocola?


I am more comfortable with having a full programming language via Python, whereas Vocola seems to be yet another scripting language of limited capability. But I keep hearing knowledgeable people like PG_LTU speak highly of Vocola. (For all I know, Vocola may easily interface with Python.)


I somehow have the impression that DragonFly and the other Python NatLink based scripting are more actively maintained than Vocola. Is this incorrect?


I am distinctly concerned about the possibility of investing time in a speech recognition/command system that will become orphaned. I'm worried about Dragon, but more so about NatLink interface to Dragon used by Vocola and DragonFly. Both Vocola and DragonFly seem to be trying to become independent of Dragon/NatLink, both with WSR and DragonFly also trying to use Kaldi.


I have benefited from the commands that KnowBrainer has already written, although I am sure that I use less than 10% of them, even with the software products that I actively use (Chrome, Edge, Firefox, Thunderbird, OneNote, Outlook, Excel, ... and then EMACS for my programming). Nevertheless, it is always nice to find that somebody has already written the command, so I don't need to write it. (Although it's frequently a pain to try to find what the command is...)


Q: can DragonFly and/or Vocola coexist with KnowBrainer commands? Even if both define the same commands - does one take priority cleanly? Obviously it is possible to have 2 command/grammar systems sitting on the speech input stream - e.g. when I say "RESTART Dragon" both SpeechStart+ and KnowBrainer throw up a command at the same time. But it would be better if one took priority over the other.


--


Generically, I am thinking of creating a GitHub project to share various scripts that I have written. Possibly also to collect scripts posted by others on this forum. Before I do so, has anybody in the speech community set up the equivalent of Perl's CPAN or Python's PyPI for speech recognition tools in general? It might be KnowBrainer or Dragon scripting specific, but there might be value in collecting or indexing multiple speech command systems in the same place. One-stop shopping, as it were, especially if it is possible to use multiple command interpreters at the same time.


Also, if I set up a GitHub project it would probably have restricted access, whereas if somebody has already figured out a sharing system whereby multiple people can contribute, that would be good. (I know, send a pull request, but it is remarkable how often the same software is available in multiple GitHub projects. Not just full project forks, but also libraries that happen to get used by multiple projects, but which have not yet been created as independent projects of their own.)


One of the advantages I imagine about using Python-based DragonFly, etc., is that it might be possible to use the existing PyPI distribution system.

-------------------------

DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical), GPU=NVIDIA Quadro RTX 3000 with Max-Q Design.

 06/02/2020 02:12 PM
Lunis Orcutt
Top-Tier Member

Posts: 38002
Joined: 10/01/2006

Originally posted by: Ag First, a question for PG_LTU and kkkwj: can DragonFly and/or Vocola coexist with KnowBrainer commands? Even if both defined the same commands - does one take priority cleanly? Obviously it is possible to have 2 command/grammar systems sitting on the speech input stream - e.g. when I say "RESTART Dragon" by both SpeechStart+ and KnowBrainer throw up a command at the same time.


KnowBrainer does not include a Restart Dragon command, so you are only looking at SpeechStart+ for this function. As for what takes precedence: Dragon is designed so that any third-party utility will take precedence over a Dragon command, but note that global third-party commands will not take precedence over Dragon application-specific commands. A third-party command only takes precedence over a Dragon command when both have the same scope (both window-specific, both application-specific, or both global). To our knowledge, all third-party command utilities work well together, but only one command with a given name will be deployed. For example, if KnowBrainer and Vocola both include the same global command, one or the other will take precedence. We do not know which, but no 2 commands with the same name will deploy simultaneously.
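
To make the precedence rules above concrete, here is a toy model in plain Python. It is not how Dragon is implemented: the `Command` class and `pick_command` are hypothetical, and the ordering window > application > global is an assumption (the post only says that a global third-party command loses to an application-specific Dragon command).

```python
from dataclasses import dataclass

# Assumed specificity ranking; a higher number means a more specific scope.
SCOPE_RANK = {"global": 0, "application": 1, "window": 2}


@dataclass(frozen=True)
class Command:
    name: str
    source: str  # "dragon" or the name of a third-party utility
    scope: str   # "global", "application" or "window"


def pick_command(candidates):
    """Pick the one command that deploys among same-named candidates."""
    def priority(cmd):
        # A more specific scope wins first; at equal scope, third-party wins.
        return (SCOPE_RANK[cmd.scope], cmd.source != "dragon")
    return max(candidates, key=priority)


same_name = [
    Command("restart dragon", "knowbrainer", "global"),
    Command("restart dragon", "dragon", "application"),
]
print(pick_command(same_name).source)
```

When two third-party utilities define the same command at the same scope, the post says the winner is unknown; this sketch just returns whichever candidate comes first in the list.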

Note that the previous post pictures are somewhat large and may prevent some end-users from stretching the browser Window far enough to view the Reply menu, which unfortunately does not include a horizontal bar. However, everyone has access to the Quick Reply field.






 06/08/2020 02:51 PM
Ag
Top-Tier Member

Posts: 449
Joined: 07/08/2019

Originally posted by: Lunis Orcutt
Originally posted by: Ag Obviously it is possible to have 2 command/grammar systems sitting on the speech input stream - e.g. when I say "RESTART Dragon" by both SpeechStart+ and KnowBrainer throw up a command at the same time.
KnowBrainer does not include a Restart Dragon command so you are only looking at SpeechStart+ for this function. 

I wrote my own KnowBrainer "RESTART Dragon" command, which displays a message box, as part of debugging startup problems with SpeechStart+. I'm still running it.

I regularly see both my KnowBrainer "RESTART Dragon" command and the SpeechStart+ "RESTART Dragon" command run at the same time when I say "RESTART Dragon". I just saw it right now. Obviously SpeechStart+ does not take "ownership" of the "RESTART Dragon" utterance: either it passes it through (if SpeechStart+ is layered underneath Dragon) or both SpeechStart+ and Dragon receive the audio stream concurrently.

If the latter, confusion may arise when different speech recognition engines recognize the same command at the same time with different purposes. This is not much of a problem for SpeechStart+, because when running "RESTART Dragon" and "Microphone On" in SpeechStart+ it doesn't matter what Dragon/KnowBrainer are doing: in the "Microphone On" case, Dragon/KnowBrainer are doing nothing, whereas "RESTART Dragon" is about to kill Dragon, so as long as the conflicting command doesn't inflict any damage while Dragon is being shut down, it's okay.

I suppose the same holds even if the former.

 

--

 

BTW, in some old post I described the SpeechStart+ startup problems. It looked like there was a race condition at boot. I have worked around this race condition by not starting KnowBrainer/Dragon at startup, instead letting SpeechStart+ and all the rest of my startup settle down first.

 




 06/02/2020 03:47 PM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

Originally posted by: Ag First, a question for PG_LTU and kkkwj: after KnowBrainer, which would you pick up next: DragonFly or Vocola?

 

Hi Ag, I'll try to answer you as best I can. 

 

Bottom line: If it was me, choosing between your two alternatives, I would choose DragonFly. But in real life, I chose to walk away from both of them.

 

Details: I looked into the very same question within the past year. Like you, I run a mix of the same programs that you name (although I haven't used Emacs much in the past year; my automated Lisp tools are not yet installed on the new machine I built over Christmas).

 

THE OVERVIEW PAPER, THE EMACS DEMO

 

There's a paper out there somewhere titled something like "State of Voice Programming." It was written a few years ago by someone (I forget who). He wrote a few pages on the same things I'm writing about in this post. But he actually used Vocola or DragonFly, whereas I abandoned that path before wasting more time. I didn't learn much from the paper, although a newbie might. I doubt you would learn much from it either.

 

There's also a semi-interesting video out there where a guy uses Dragon and a variety of guttural sounds and squeaks to program in Emacs. He hurt his hands in an accident and had time to learn and program the Lisp code. After his hands healed, he said he still uses it 40% (?) of the time for convenience.

 

NATLINK, VOCOLA, DRAGONFLY, and FRIENDS

 

Vocola, DragonFly, and their friends all use the NatLink interface. That's scary because Nuance could shut it off at any time in their next Dragon version. But they haven't yet, so that's good news. In contrast, I think KB (and maybe DragonCapture) use a more documented interface into Dragon (Lunis and Rudiger can provide more info).

 

I found that Vocola was a tad more documented than DragonFly (and there was another one too, but I forget the name of it), but was limited in scope. Not only for the existing commands, but for future headroom. I could not imagine myself investing effort along that tool chain.

 

DragonFly had more headroom, but was another fringe project. The scope of implemented commands, like Vocola, was limited. 

 

The problem with all those tools is that if you want to do something new, the incremental cost of adding a new function is pretty high. You need to write a new grammar rule, make sure that it does not conflict with anything else, test it, and then document the details of the rule(s)/tokens and backing code, lest you forget.
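
Part of the "make sure that it does not conflict" step can be automated. Below is a minimal sketch in plain Python, independent of NatLink, Vocola, or DragonFly: `find_conflicts` is a hypothetical helper, and the rule tables are made-up examples mapping spoken forms to action names.

```python
def normalize(spoken_form):
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(spoken_form.lower().split())


def find_conflicts(existing_rules, new_rules):
    """Return the spoken forms of new rules that are already defined."""
    existing = {normalize(form) for form in existing_rules}
    return sorted(
        normalize(form) for form in new_rules if normalize(form) in existing
    )


existing_rules = {"Restart Dragon": "kb_restart", "down <n>": "press_down"}
new_rules = {"restart  dragon": "my_restart", "camel case <text>": "camel_case"}
print(find_conflicts(existing_rules, new_rules))
```

This only catches exact duplicates after normalization. Rules with variable slots can overlap in subtler ways (for example `down <n>` against a literal `down five`), which still needs testing by voice.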

 

They are probably all reasonable tools for what they were intended to do. But will they satisfy your requirements? What exactly do you need to do that KB can't do?

 

MY EVENTUAL CONCLUSIONS

 

I concluded that NONE of the NatLink family (Vocola, DragonFly, etc.) were ANYWHERE CLOSE to KB in terms of existing commands, headroom (KB has full .NET access), stability of Dragon interface (not risky like NatLink), documentation, and superb-Lunis + forum support. So, my recommendation is to stop investing your brainpower searching for (low-hanging) tools that could help you.

 

I spent probably a good month doing that research, trying to find info (piecemeal, old, incomplete), documentation (there was none), or quick implementations to study (there were a few tiny grammar examples, but not much; you would have to study the Python and grammar code to figure out what was going on). Although I walked away satisfied because I knew that I had investigated that route, I wish I had read a posting like this one to save me the month of effort.

 

Even if you installed those tools (always with some difficulty), you'd still have to spend many hours writing and debugging custom grammars and code to implement the grammar operations. In contrast, you can create a new KB command in seconds, with Lunis helping you all the way in custom red ink to help you see the spoken words.

 

POSSIBLE ROUTES FOR YOU

 

If you really need to scratch an itch to write something yourself, I would suggest you write some little AHK scripts/PGEmulate scripts or little utilities like PG does and then call them from Dragon. Or write MacroExpress expansions for injecting pathnames like Alan explained - that seemed like a great way to ease the pain of moving around with the FileExplorer.

 

Or if you still need more, write some Emacs Lisp functions to do interesting things and call them from Dragon. Emacs will be there forever. The problem with Emacs (I looked at doing that too) is that Emacs is great for editing but has no easy access to the OS (think SendKeys, for example). Again, I walked away from that route too. Investing in Dragon or KB scripting is a far better alternative, IMHO.

 

A GITHUB PROJECT

 

Speaking of GitHub, Mark Phillipson already has a GitHub project for his tools. He has kindly provided all the C# (Yay Mark!) source code for his tools. I keep telling myself I should download them to figure out what he's got going there. He has a custom intellisense program, a snippet program, and some other tools that I can't remember. He uses them all the time, and so you could ask him if you had a problem.

 

You might find it more productive to contribute to his tools/project rather than starting a new one for similar tools (if your tools vision overlaps with his; maybe it doesn't). At any rate, you could easily fork his project and continue it on your own.

 

You might also be further ahead to spin the forum transactions for all the nice scripts that Edgar and PG and others have posted over the years. I consider that to be a semi-wasted resource because they are so hard to find.

 

I have often wondered if there is a way to combine the knowledge/tools/scripts of all the senior forum guys into one place, hopefully with some documentation to guide newbies (like you and me) who are looking for ways and tools to improve their productivity. Edgar and PG and others have a HUGE, HUGE amount of knowledge and tools that you just never get to see except in little snippets. Maybe you can think of a way to do that. I would help on a project like that.

 

Collecting and making visible the forum knowledge would be helpful to new people, I think. Maybe it never happens because everyone who is capable just rolls their own custom solutions, and everyone who cannot program or invest the time just does without (or picks up a tip script here and there).

 

And last, you could start your own Github project with a manifesto to see where it goes.

 

Hopefully, my long scribbles will give you some perspective and save you some time. Saving time and effort is what I wrote it for. Maybe it will save some other newbie (or senior person) some time and effort and angst by convincing them to just use Dragon and/or KB scripting and/or Mark's C# tools.

 

Cheers, Kevin



-------------------------

Win10/x64, AMD Ryzen 7 3700X, 64GB RAM, Dragon 15.3, SP 6 PRO, SpeechStart, Office 365, KB 2017, Dragon Capture, Samson Meteor USB Desk Mic, Klim and JUKSTG earbuds with microphones



 06/02/2020 04:38 PM
PG LTU
Top-Tier Member

Posts: 2072
Joined: 03/21/2007

+1 Kevin.

Another one to look into is Caster, which has language-specific grammars. For me, I started with Vocola because it worked with Premium editions of DNS (old days) and I still use a few grammars from back then, but once I got a Dragon Pro version, I only used DVC and DAS scripting (plus AutoHotkey) and never went back to Vocola. But I don't code for a living.

The sample I provided here is quite flexible. In my full implementation I have a function that checks spacing (a coding-specific version of pgCheckNewPara that works with Sublime's auto spacing) and also looks for the sole word "that" in the dictation ListVar2, so I can first say the variable names and then "camel case that" to do the conversion.

Note, I often break up the command, and it still works quite naturally, so I can say "let camel case my variable name equals" to get Let myVariableName =

Then I can dictate whatever it equals, or say "string what it equals" to get:

Let myVariableName = 'what it equals'

 

Or I can say "what it equals" and then "string that" because my "string [dictation]" command also looks for "that" in the ListVar2. For that matter, I can also say "equals string what it equals" because "string" is actually a list that includes "equals string", and the command adds the " = " when the command name starts with "equals" (like I handle Dim and Let).



Or I can say "let camel case my variable name equals string" to get Let myVariableName = '' with the cursor between the two ' apostrophes. Then I can dictate whatever it equals. Saying "move" moves me to the right of the trailing ' apostrophe.

So, it takes less brain power than coming up with the full statement all at once, but naturally, I could if I thought about it in advance:
"let camel case my variable name equals string what it equals" to get this all at once:

Let myVariableName = 'what it equals'
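
For readers who want the logic spelled out, here is a rough Python sketch of the command shapes described above. It is not PG's actual DVC/DAS script: the `PATTERN` regular expression and the `camel_case` and `expand` helpers are hypothetical names made up for illustration, and the sketch only covers the "let/dim camel case ... equals string ..." forms, not the "that" handling, the spacing check, or the cursor placement.

```python
import re

# Hypothetical spoken pattern (an assumption, not PG's real command definition):
#   "<let|dim> camel case <name words> [equals [string [<text>]]]"
PATTERN = re.compile(
    r"^(?P<kw>let|dim) camel case (?P<name>.+?)"
    r"(?: (?P<eq>equals)(?: (?P<str>string)(?: (?P<text>.+))?)?)?$"
)


def camel_case(words: str) -> str:
    """Join spoken words into camelCase: 'my variable name' -> 'myVariableName'."""
    parts = words.lower().split()
    if not parts:
        return ""
    return parts[0] + "".join(p.capitalize() for p in parts[1:])


def expand(utterance: str) -> str:
    """Turn one spoken utterance into the line of code it stands for."""
    match = PATTERN.match(utterance.strip().lower())
    if match is None:
        raise ValueError("utterance does not fit the sketched pattern")
    keyword = match.group("kw").capitalize()
    out = keyword + " " + camel_case(match.group("name"))
    if match.group("eq"):
        out += " ="
    if match.group("str"):
        # An empty pair of quotes when nothing was dictated after "string".
        out += " '" + (match.group("text") or "") + "'"
    return out


if __name__ == "__main__":
    print(expand("let camel case my variable name equals"))
    print(expand("let camel case my variable name equals string"))
    print(expand("let camel case my variable name equals string what it equals"))
```

Because the optional parts nest, the same function handles the short form, the "equals string" form, and the all-at-once form. One limitation of this sketch: a variable name that itself contains the word "equals" would be split at the first "equals".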

Hth,



-------------------------




PG





Remember folks, my comments and this forum are for entertainment value only, please, no wagering or other reliance on the contents herein.  I permit no commercial use of my ideas (whether expressions or embodiments) without my written consent.



 06/02/2020 06:05 PM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

PG, .. Caster! That was the NatLink guy that I couldn't remember with DragonFly. Thank you for the reminder.

When I read your camel case implementation it made me smile. I thought it was brilliant. And the post just above where you demonstrate dictating a line of code is also excellent. I especially like the two-command approach because doing it all in one command requires lining up a long line of words in your brain and speaking them clearly in one go.

**I think this is important.** It seems to me that the user models are the key aspect to getting these little tools and commands to work properly. If you don't have a decent user speaking / use case model that users can speak, then you end up with special case commands that don't fit together well, are hard to remember, and probably hard to dictate. I think a document describing the models that PG and Edgar use to get things done in particular situations (use cases) would be every bit as valuable as having the scripts themselves.

Just my two bits. (But of course, it's challenging to think up and document good user models for everything that happens in programming.)


 06/04/2020 01:34 PM
Mphillipson
Top-Tier Member

Posts: 230
Joined: 09/22/2014

This is the KnowBrainer script I use for dictating variables:

https://www.screencast.com/t/PdQCZCZPN

 

 






-------------------------

Thanks Mark


 


 


 


Dragon Professional Advanced Scripting/KnowBrainer Scripts
Video Examples of Coding by Voice

 06/08/2020 12:32 PM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

I saw a couple of voice programming videos on the weekend. I am adding links to them to this thread to put the information in one place.

TALON

Talon runs on Mac computers (Windows in Beta), currently version 0.0.77 or something. It ships with a non-Dragon voice recognition engine, but is most commonly hooked up to Dragon for Mac v4 or v5. (Nuance discontinued Dragon for Mac, so Talon users search eBay for old copies of it.) I don't know about the quality of the default non-Dragon recognition engine. Maybe it uses some Siri-like engine.

It's obviously not a mature product, but the demo girl does a good job of low-level editing Perl code by voice in a Vim/Sublime/iTerm text editor. Talon is written in Python and requires grammar rules and recognition objects, so it operates more or less at the level of NatLink in Dragon.

The default in Talon is that utterances are commands, not dictation text. If you watch the video, each time she wants to insert some straight text she precedes the utterance with "phrase xxxx." The phrase keyword switches Talon into what we would call dictation mode for the length of the utterance. Then by default it goes back to command mode and interprets utterances as commands until the next phrase keyword.
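
To make the command-by-default behaviour concrete, here is a tiny Python sketch of that dispatch rule. It is only a model of what the video shows, not Talon's API: the `COMMANDS` table, its keystroke strings, and the `interpret` function are hypothetical names invented for illustration.

```python
# Hypothetical command table (an assumption): spoken command -> keystroke to send.
COMMANDS = {
    "save file": "ctrl-s",
    "go line start": "home",
}


def interpret(utterance, commands=COMMANDS):
    """Classify one utterance the way the post describes.

    Utterances are commands by default. The leading keyword "phrase"
    switches to dictation for the rest of that single utterance only.
    """
    stripped = utterance.strip()
    head, _, rest = stripped.partition(" ")
    if head.lower() == "phrase":
        return ("dictation", rest)
    key = stripped.lower()
    if key in commands:
        return ("command", commands[key])
    # Not a known command and no "phrase" prefix: nothing is typed.
    return ("unrecognized", stripped)


if __name__ == "__main__":
    for spoken in ["save file", "phrase hello world", "hello world"]:
        print(spoken, "->", interpret(spoken))
```

Note that no state is kept between calls, which mirrors the "back to command mode after each utterance" behaviour described above.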

FWIW, my take is that the free Talon is like NatLink/DragonFly/Caster -- a low-level system for programmers to program their own grammars AND command-action code. This will limit the generic usefulness and UI of the system for a long time because you must be a Python programmer to achieve anything of consequence.

Search for "talon voice programming" on YouTube for a pile of videos
"Voice Driven Development: Who needs a keyboard anyway?" by Emily Shea
https://www.youtube.com/watch?v=YKuRkGkf5HU  

VOICE ATTACK

Voice Attack runs with Windows speech recognition, is written in Python again, and provides a user interface that lets you define your own commands. The default interface is kind of a Dragon step-by-step interface where you build up your command scripts by selecting predefined steps. One of the steps allows you to call out to .NET libraries (C# was demonstrated), so in theory the system is fully extensible in any .NET language.

The video demonstrates several small low-level editing commands that send keystrokes to the foreground app. The most complex commands he demonstrated were a command to insert underscores in variable names and one to copy text to the clipboard, edit it and replace it, then paste it into his editor buffer.


Recognition accuracy with Windows Speech Recognition was an issue.

FWIW, my take is that the $10 VoiceAttack is a big step above NatLink and DragonFly / Caster because VoiceAttack has a UI. Clearly, non-programmers could use it. (It was originally developed to give voice commands in video games.) You could clearly use this system on Windows to send keystrokes and do things that are preprogrammed into it. And it is fully extensible to make .NET library calls, so it has unlimited range, in theory. Probably the biggest limitations of this system are the recognition accuracy of WSR (which might be solved a decade from now) and the lack of Dragon-like intelligence about normal dictation and voice control of everything. From what I could see, VoiceAttack still requires lots of keyboard and mouse interaction with the UI to program commands.


Search for VoiceAttack or Voice Attack on YouTube
Coding by Voice with Voice Attack: a Practical Guide for Programmers
https://www.youtube.com/watch?v=U-NZjzDj-Xk






 06/08/2020 02:35 PM
R. Wilke
Top-Tier Member

Posts: 7236
Joined: 03/04/2007

Kevin,

You did a bang-up job reading up on voice coding in this forum, although not getting it all correct, and then went on to look it up further on Google, again not getting it altogether correct, and I hate to say it, but your time and effort was probably wasted, for the most part.

Just wait until one of the real protagonists in that scene comes along to put you straight, and I would feel very sorry for you if and when it happens.

Then again, it might never happen, because they are just not interested in reading this forum any longer, it being a waste of time from their perspective, and who could disagree with them about it.



-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage



 06/10/2020 06:38 PM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

Originally posted by: R. Wilke Kevin, You did a bang-up job reading up on voice coding in this forum, although not getting it all correct, and then went on to look it up further on Google, again not getting it altogether correct, and I hate to say it, but your time and effort was probably wasted, for the most part. Just wait until one of the real protagonists in that scene comes along to put you straight, and I would feel very sorry for you if and when it happens. Then again, it might never happen, because they are just not interested in reading this forum any longer, it being a waste of time from their perspective, and who could disagree with them about it.

 

 

Not to worry. Hopefully they would post something informative that we could all benefit from while they were trashing my sincere but uninformed efforts. :-) I took the time to post mostly for Ag, who is always hunting for other tools and better options, and because I had gone down the same road long ago. So, my hope is that the next guy after Ag might not have to follow in our footsteps so long if they manage to find this post.






 06/08/2020 03:27 PM
Ag
Top-Tier Member

Posts: 449
Joined: 07/08/2019

First, thanks Kevin for your writeup.

Second, Rudiger, I can only hope that one of the real protagonists will come along. But why in the world should you "feel very sorry for you if and when it happens", for either Kevin or me? There's no shame in being wrong, especially if we can't find proper descriptions of any hypothetical better way.



--

Finally (for this post), some of the things I am interested in DragonFly for, or for anything where I can add my own grammar, include:


* I rather like the way that DragonFly can chain commands. More and more I would like to create composable commands, like the equivalent of vim "4dd" or "/abc{ret}dd" - e.g. things that combine the object specification verbs (movement, selection) and the actual command verb (e.g. delete, highlight, etc.). In KnowBrainer this is done by writing a command for each composition, whereas in vi or Emacs they are defined separately, thus avoiding combinatoric explosion. With my Perl and Python scripts that generate commands I can inflict the combinatoric explosion on Dragon/KnowBrainer and not on myself (except when browsing the KnowBrainer sidebar - I really need to create a better Command Browser), but I cannot help fearing that adding a combinatoric explosion of command names will negatively impact Dragon/KnowBrainer performance.


* I would really like to make the command recognition occasionally postfix rather than prefix. I can certainly do that in a grammar, although I suspect that Dragon's underlying HMM search is prefix.

E.g. I would love it if Could be placed at the beginning of the command rather than at the end. Although unfortunately I don't think that can be done with Dragon.


* Most important, I would like to have better error handling and/or dynamic and/or context-sensitive command definition and discovery.

I doubt that emacs could be made "fully speech ready" anytime soon.

But obviously emacs makes it very easy to discover what emacs commands are available, both in the very large space of commands, and in the much smaller set that is currently bound in any particular emacs mode.

I have already been able to exercise emacs command completion, by saying things like "press {escape} foo {press tab}" to emulate the keyboard driving emacs. I can imagine speech commands to make this easier, but (a) certainly not with my convention of adding "PUFF" before KnowBrainer commands, and (b) probably not at all in the Dragon/KnowBrainer system, where context is only application or window name. I have already set up my emacs to put the mode name in the window title to make it easier for Dragon to handle. But in command completion the context depends on exactly how far along you are. I suppose I could modify emacs command completion to reflect that into the window title. But the main issue is that I need my speech commands to be able to get feedback from emacs (or other target program).
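
As an illustration of the composable-command wish in the first bullet above, here is a small Python sketch. It is not DragonFly or KnowBrainer code: the `ACTIONS` and `TARGETS` tables, the keystroke names, and both functions are hypothetical and exist only to show the difference between defining verbs and objects separately and generating one flat command per combination.

```python
# Hypothetical tables (assumptions for illustration only).
# A target says how to select something; an action says what to do with it.
TARGETS = {
    "line": ["home", "shift-end"],
    "word": ["ctrl-shift-right"],
    "paragraph": ["ctrl-up", "ctrl-shift-down"],
}
ACTIONS = {
    "delete": ["delete"],
    "copy": ["ctrl-c"],
    "cut": ["ctrl-x"],
}


def compose(utterance):
    """Compose '<action> <target>' at recognition time from the two tables."""
    action, _, target = utterance.strip().lower().partition(" ")
    if action not in ACTIONS or target not in TARGETS:
        raise ValueError("unknown action or target: " + utterance)
    # Select the target first, then apply the action to the selection.
    return TARGETS[target] + ACTIONS[action]


def enumerate_flat_commands():
    """What a generator script has to emit when every combination needs
    its own named command: one entry per action/target pair."""
    return [a + " " + t for a in ACTIONS for t in TARGETS]


if __name__ == "__main__":
    print(compose("delete line"))
    print(len(ACTIONS) + len(TARGETS), "definitions vs",
          len(enumerate_flat_commands()), "flat commands")
```

With three actions and three targets the composed form needs six definitions while the flat form needs nine commands, and the gap widens quickly as counts, directions, and more verbs are added.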

... Blah blah blah ...

-------------------------

DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical), GPU=NVIDIA Quadro RTX 3000 with Max-Q Design.

 06/10/2020 06:48 PM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

Originally posted by: Ag First, thanks Kevin for your writeup. Second, Rudiger, I can only hope that one of the real protagonists will come along. But why in the world should you "feel very sorry for you if and when it happens", for either Kevin or me? There's no shame in being wrong, especially if we can't find proper descriptions of any hypothetical better way.

 

(Smiling) Assuming that I was wrong, of course. (But I don't think I was far off the mark.)

 

The only thing that was not explained on the video was how Talon connected to Dragon. Since it does not connect through Dragon Scripting, and since Talon works with grammars, I thought it was a good bet that it connected through NatLink the same way that Vocola/DragonFly/Caster do. Presumably if you join their forum chat group or read their Python code the interface would be clear.

 

VoiceAttack doesn't work with grammars and doesn't even use Dragon, so it's not a NatLink interface for sure. Cheers




 06/10/2020 06:35 PM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

Hi Ag, may I ask what you mean in your wish list by 1) "better error handling" and 2) "dynamic command definition", exactly? Thank you


 06/26/2020 03:20 PM
alexander
Senior Member

Posts: 156
Joined: 07/31/2016

Read through the following thread

http://www.knowbrainer.com/forums/forum/messageview.cfm?catid=25&threadid=30660&highlight_key=y

And especially the following links

http://vocola.net/programming-by-voice-FAQ.html
http://explosionduck.com/wp/tag/voice-programming/

Also, rather than focusing on just the grammar, I would give some thought to the environment you are using. This document discusses a number of items that you can think about with regard to getting around quickly:
https://docs.google.com/document/d/e/2PACX-1vSGRicRTJ2iv7rzLnwYxGnUb39usUk_5o2KPxJ5YE91qv-W_lWHD1C7S4syAHM61VAheR5lQ6hoE55W/pub
Here's another document that you can use as a reference which, while a bit dated, has a number of grammars to think about:
https://docs.google.com/spreadsheets/d/1pk2gwTFbMebgYSsrxIFsZ-QvpEPWCybF8ypdeBvfBsg/pubhtml

There also seems to be a lot of recent development on dragonfly/caster, and there is a Gitter. It's worth checking out some of these rooms:
https://gitter.im/dictation-toolbox/home



 06/29/2020 10:56 AM
kkkwj
Top-Tier Member

Posts: 497
Joined: 11/05/2015

Alexander, thank you for all the links to read. As you say, they make it clear that the voice picture is much bigger than just Dragon dictation into documents. There are several worlds of concepts -> utterances -> grammars -> code -> actions (keys, clicks, and calls) between users and their outputs. I have been poking away at the conceptual architecture of the problem and technologies for quite a while as part of my Sunday night reading. Every time that I see a new Natlink / Vocola / Unimacro / Caster / Dragonfly / Talon / VoiceAttack technology, or a WSR / Kaldi voice recognizer, I try to understand where it fits in the big picture and what problem it's trying to solve.




 09/06/2020 10:35 AM
benTalks
Junior Member

Posts: 24
Joined: 04/27/2020

I believe this is a perpetual problem. And every so often, someone undertakes a giant project in an attempt to solve it. Many of those projects are now abandoned.

For my part, I need to code in HTML, CSS, JavaScript, and a tag language called Twig. I started working with Talon, and it's pretty good, although it is still in beta for Windows. I'm impressed by the hard work of the creator.

The main drawback to Talon is the need to create your own configurations. But I suppose this is true of any software.

Talon works pretty well with Dragon, and you can turn it on and off with a simple voice command "Talon Sleep", "Talon Wake".