KnowBrainer Speech Recognition
Topic Title: Automatically create speech commands from UNIX-like manpages?
Created On: 03/25/2022 01:45 PM
Status: Post and Reply
- Ag, 03/25/2022 01:45 PM
- mdl, 04/04/2022 09:30 PM
- mdl, 04/04/2022 09:31 PM
- Ag, 04/16/2022 02:22 AM
- mdl, 04/04/2022 09:32 PM
- mdl, 04/04/2022 09:36 PM
- alexander, 04/08/2022 01:44 PM
- alexander, 04/08/2022 01:46 PM
- Ag, 04/16/2022 09:40 PM
Ag, 03/25/2022 01:45 PM:
Q: Has anybody written a tool to automatically produce speech commands or grammars from the man pages for UNIX commands?
I often create custom words and sometimes speech commands to make it easier to utter UNIX-like command lines in places like a terminal window running a UNIX shell, emacs shell mode, makefiles, etc.
Up until now I've been handcrafting them on a case-by-case basis.
For example, I have separate custom word entries (in Dragon's written\spoken form) for various git commands like "git diff" and "git diff -U\git diff U" and "git diff -U\git diff unified". These are custom words because they were some of the first things I set up for Dragon.
I know that commands and grammars in tools like DragonFly and Vocola help, especially when I know many UNIX commands by heart and don't necessarily want human-friendly things like "git diff -U\git diff unified", but instead want "git diff {minus,dash}U".
But DOH! It just occurred to me that it might be possible to automatically generate speech commands or grammars for any command that has a man page, so I would not need to write even the grammars by hand.
Q: has anybody done this?
Obviously it does not need to be restricted to UNIX man pages. Any system where there is consistent documentation for commands might be able to have this, whether the online documentation is in Windows PowerShell, Perldoc, pydoc, emacs elisp docstrings, etc.
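As a rough illustration of the idea, here is a minimal Python sketch that pulls option flags out of man-page-style text with a regular expression. The function name `extract_options` and the sample text (loosely modeled on a git diff OPTIONS section) are hypothetical, and real man pages vary enough in layout that this pattern would certainly miss cases.

```python
import re

# Matches a short option (-U) or a GNU-style long option (--patch) that is not
# glued to a preceding word character or dash.
OPTION_RE = re.compile(r"(?<![\w-])(--?[A-Za-z][\w-]*)")

def extract_options(man_text):
    """Return the option flags found on lines that begin with a dash."""
    options = []
    for line in man_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("-"):
            continue  # skip headings and description lines
        # Assumption: the option list ends at the first wide gap of spaces.
        head = re.split(r"\s{2,}", stripped)[0]
        for flag in OPTION_RE.findall(head):
            if flag not in options:
                options.append(flag)
    return options

# Made-up sample text, not copied from a real man page.
SAMPLE = """\
OPTIONS
       -p, -u, --patch
           Generate patch.
       -U<n>, --unified=<n>
           Generate diffs with <n> lines of context.
       --patience
           Generate a diff using the "patience diff" algorithm.
"""

print(extract_options(SAMPLE))
```

The resulting list of flags would then be the input to whatever produces spoken forms, whether that is a Vocola grammar, Dragon custom words, or an AutoHotKey script.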
---
This was prompted by @mdl in https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=4&threadid=34117&highlight_key=y#192966 mentioning a Vocola command := ( git | grit ); diff with = Command("git diff $2 $3");
Now of course I can define my own commands, whether in Dragon Basic or my ahk scripting or Vocola or DragonFly.
But it sure would be nice not to have to define my own commands at all - at least for the large number of UNIX/... programs that I don't use often enough to spend the time adding custom vocabulary or grammars or speech commands for Dragon, but which I nevertheless still want to dictate.
I am quite enjoying having smart command completion in emacs, automatically creating reasonably pleasant ways of saying emacs commands that I have not added special Dragon support for... and I would like to extend this.
Now, most emacs commands have reasonably human-friendly names, and most interactive emacs commands have reasonably human-friendly prompts.
Most UNIX commands have names that are... perhaps not human unfriendly, but certainly not part of standard English, although they are often pronounced as they are spelled.
Original UNIX command options are quite unfriendly: a dash or minus followed by a single letter, sometimes multiple letters concatenated. But I am so used to these that I often know them by heart, and am used to saying them to people that I am trying to help. One of the most petty annoyances that I've had with Dragon is that sometimes I say minus and sometimes I say dash for options like -U. That is trivially handled in a speech command, and is common to all UNIX commands. A tool would be able to handle this easily.
Option case: I nearly always say "git diff minus U", not "minus cap U", and certainly not "minus letter U". That is ambiguous between -u and -U, but even the git diff man page indicates that the git diff -u option is deprecated, with -p/--patch preferred.
Moreover, saying "minus U 4" must always disambiguate to -U4, since -u does not take a numeric argument.
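The minus/dash alternation and the -U4 disambiguation could both live in one small normalizer. The sketch below is only an illustration in Python: the tables `NUMERIC_OPTS` and `DEFAULT_CASE` are hypothetical stand-ins for what a generator would read from a man page, and it assumes the recognizer delivers numbers as digits.

```python
import re

# Hypothetical per-command knowledge: which spelling of a letter accepts a
# numeric argument (for git diff, -U<n> does and -u does not).
NUMERIC_OPTS = {"git diff": {"u": "-U"}}

# Hypothetical default spelling when no number follows the letter.
DEFAULT_CASE = {"git diff": {"u": "-U"}}

def normalize_short_option(command, utterance):
    """Turn '(minus|dash) <letter> [<number>]' into a written option, or None."""
    m = re.fullmatch(r"(?:minus|dash) ([a-z])(?: (\d+))?",
                     utterance.strip().lower())
    if m is None:
        return None
    letter, number = m.groups()
    if number is not None:
        written = NUMERIC_OPTS.get(command, {}).get(letter)
        if written is None:
            return None  # neither case of this letter takes a number
        return written + number
    # No number: fall back to the per-command default, else plain lowercase.
    return DEFAULT_CASE.get(command, {}).get(letter, "-" + letter)

print(normalize_short_option("git diff", "minus U 4"))
print(normalize_short_option("git diff", "dash p"))
```

Because "minus" and "dash" are handled in one pattern, the same function would work for any command; only the two lookup tables are command specific.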
The "newer" GNU command line options like --patch and --patience are much more human friendly, and hence more pronounceable. Whether such a tool might require me to say "git diff dash dash patch" or just "git diff patch" is an exercise for the implementer, but certainly not requiring the user to pronounce the dashes in --patch-with-stat is trivial.
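For long options, one possible approach is to generate several spoken forms per option and map all of them to the same written form. This is a sketch under that assumption; `spoken_forms` and `build_lookup` are hypothetical names, not part of any existing tool.

```python
def spoken_forms(long_option):
    """Return spoken forms for a GNU-style long option such as --patch-with-stat."""
    words = long_option.lstrip("-").replace("-", " ")
    # Bare words first, then the variants that pronounce the leading dashes.
    return [words, "dash dash " + words, "minus minus " + words]

def build_lookup(command, long_options):
    """Map '<command> <spoken form>' to the written command line."""
    lookup = {}
    for opt in long_options:
        for spoken in spoken_forms(opt):
            lookup[f"{command} {spoken}"] = f"{command} {opt}"
    return lookup

lookup = build_lookup("git diff", ["--patch", "--patience", "--patch-with-stat"])
print(lookup["git diff patch with stat"])
```

Offering the bare-word form alongside the "dash dash" form lets the speaker choose, at the cost of more phrases to match.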
---
What I talk about here, automatically generating speech commands from a UNIX man page, is very much like what I asked for in "Looking for tools to generate KnowBrainer/dragon/....commands from list of keyboard shortcuts". I regret to admit that, although I have started writing such a tool several times, I have never "finished" one because (a) there is quite a lot of variability in webpages that list keyboard shortcuts, and (b) such lists of commands inevitably require my human intervention and editing, if only to choose the speech command names. The webpages that list such keyboard shortcuts far too often have really verbose names that are not practical for speech commands. Nevertheless, I semi-automatically convert webpage lists of keyboard shortcuts into speech commands, with the help of some emacs commands and global edits.
I suspect that automatically converting UNIX man pages into speech commands can more easily be fully automated.
-------------------------
DPG15.6 (also DPI 15.3) + KB, Sennheiser MB Pro 1 UC ML, BTD 800 dongle, Windows 10 Pro, MS Surface Book 3, Intel Core i7-1065G7 CPU @ 1.3/1.5GHz (4 cores, 8 logical, GPU=NVIDIA Quadro RTX 3000 with Max-Q Design.
mdl, 04/04/2022 09:30 PM:
Too many choices hurt recognition, so converting every option in a man page is probably not a great idea.
You can get a lot of mileage from a few vocabulary words; for example:
-a:option Alpha
-b:option Bravo
...
-A:big option Alpha
-B:big option Bravo
...
--:long option:p # for GNU's --XXX options
-HUP:option hangup
...
ln -s: Unix symlink
ln: Unix link
ls: Unix LS
ls: Unix list
...
Then many UNIX commands are just straight vocabulary
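A list like this is regular enough to generate rather than type. The following Python sketch emits the single-letter entries in the same written:spoken layout; the exact spoken words are an assumption based on the examples above.

```python
import string

# Spoken letter names, following the "Alpha", "Bravo" examples in the list above.
NATO = ("Alpha Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliet Kilo "
        "Lima Mike November Oscar Papa Quebec Romeo Sierra Tango Uniform "
        "Victor Whiskey Xray Yankee Zulu").split()

def option_vocabulary():
    """Yield 'written:spoken' lines for -a..-z and -A..-Z."""
    for letter, word in zip(string.ascii_lowercase, NATO):
        yield f"-{letter}:option {word}"
    for letter, word in zip(string.ascii_uppercase, NATO):
        yield f"-{letter}:big option {word}"

lines = list(option_vocabulary())
print("\n".join(lines[:3]))
```

That is 52 vocabulary entries that cover the short options of every command, which keeps the vocabulary small compared with one entry per command and option pair.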
mdl, 04/04/2022 09:31 PM:
e.g., 'ls -l' is "unix list option lima" -- just pure dictation
Ag, 04/16/2022 02:22 AM:
I was wondering about this, but then I realized that what you said is true when commands are entered into Dragon's or NatSpeak's (or Vocola's or whatever) grammars. But when I write commands purely in AutoHotKey, forwarded by a ?? <dictation> stub in Dragon, I'm not adding any grammar complexity, so the choices do not affect recognition accuracy. They only affect the performance of my AHK command recognizers, although those should be rewritten to be O(length of utterance). I suppose that adding vocabulary/custom word entries will hurt recognition. But for the most part I don't do that when I add commands, unless a command is frequently misrecognized.
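One common way to get matching time proportional to the length of the utterance, rather than to the number of commands, is a word-level trie. The sketch below is in Python, not AutoHotKey, and is not the recognizer described above; `build_trie` and `match` are hypothetical names.

```python
_ACTION = object()  # sentinel key marking the end of a command phrase

def build_trie(commands):
    """Build a word-level trie from {spoken phrase: action} pairs."""
    root = {}
    for phrase, action in commands.items():
        node = root
        for word in phrase.lower().split():
            node = node.setdefault(word, {})
        node[_ACTION] = action
    return root

def match(trie, utterance):
    """Return (action, leftover words) for the longest matching prefix, else None."""
    words = utterance.lower().split()
    node = trie
    best_action, best_end = None, 0
    for i, word in enumerate(words):
        node = node.get(word)  # one dictionary lookup per spoken word
        if node is None:
            break
        if _ACTION in node:
            best_action, best_end = node[_ACTION], i + 1
    if best_action is None:
        return None
    return best_action, words[best_end:]

commands = {"git diff": "git diff ", "git diff unified": "git diff -U",
            "unix list": "ls "}
trie = build_trie(commands)
print(match(trie, "git diff unified 4"))
```

Adding thousands of generated commands makes the trie larger but does not add work per utterance, since each spoken word costs a single lookup.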
mdl, 04/04/2022 09:32 PM:
The trickier part is not the options but the filenames and things like git branch names.
alexander, 04/08/2022 01:44 PM:
For filenames I use a voice-recognition alphabet with mostly one-syllable letters (arch brov char dale etch fomp ....), and then I have commands which use Tab.
In Vocola the commands look like:
C Dab <_anything> = "cd " Terse($2) {Tab}; # in this one I can say anything
or just
Dab <_anything> = Terse($1) {Tab};
If capitalization is important I use
( C Pass ) <_anything> = "cd " PascalCase($2) {Tab};
alexander, 04/08/2022 01:46 PM:
If you are going to do this a lot, by the way, I highly recommend memorizing a voice-friendly alphabet. You can look at the Rosetta sheet for ideas (just search for alphabet).
Ag, 04/16/2022 09:40 PM:
Being the sort of guy I am, I will try providing Soundex filename matching first.
I prefer to adapt the computer to the human rather than the reverse.
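For reference, a minimal sketch of what Soundex filename matching could look like, using the standard American Soundex rules. The helper `match_filenames` and the example filenames are hypothetical; a real version would need to deal with digits, punctuation, and many files sharing one code.

```python
import os

# Standard Soundex digit groups; vowels, y, h and w have no code.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)

def soundex(word):
    """Return the 4-character Soundex code of a word, or '' if it has no letters."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "hw":  # h and w do not separate letters with the same code
            prev = code
    return (result + "000")[:4]

def match_filenames(spoken, filenames):
    """Return the filenames whose stem has the same Soundex code as the spoken text."""
    target = soundex(spoken)
    return [name for name in filenames
            if soundex(os.path.splitext(name)[0]) == target]

print(match_filenames("make file", ["Makefile", "README.md", "main.c"]))
```

Soundex was designed for surnames, so it is a coarse filter at best; in a directory with many similar names it would likely need to be combined with something like Tab completion to pick one candidate.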