KnowBrainer Speech Recognition
Topic Title: From C# - turn the Dragon® microphone OFF then back ON
Created On: 05/19/2021 10:07 PM
 05/19/2021 10:07 PM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

I use this function ( private static void TextToSpeech(string pSpeakThis) ) in many of my applications. Here's what I have so far:

      private static void TextToSpeech(string pSpeakThis) {
         try {
            engineControl.RecognitionMimic("go to sleep", 0);
            Thread.Sleep(700);
            SpeechSynthesizer speechSynthesizerObj = new SpeechSynthesizer();
            speechSynthesizerObj.Speak(pSpeakThis);
            speechSynthesizerObj.Dispose();
         }
         catch (Exception exception) {
            MessageBox.Show("Dragon®’s DgnEngineControl failed when putting the microphone to sleep. The error message is:" +
               Environment.NewLine + exception.Message, "Dragon® ERROR", MessageBoxButtons.OK, MessageBoxIcon.Error);
         }
         try {
            engineControl.RecognitionMimic("wake up", 0);
         }
         catch (Exception exception) {
            MessageBox.Show("Dragon®’s DgnEngineControl failed when waking the microphone up. The error message is:" +
               Environment.NewLine + exception.Message, "Dragon® ERROR", MessageBoxButtons.OK, MessageBoxIcon.Error);
         }
      }

Usually, RecognitionMimic("go to sleep") and RecognitionMimic("wake up") work just fine. Unfortunately, the text being read aloud sometimes contains something that Dragon® recognizes as "wake up", and from there on it gets ugly. How can I actually turn Dragon®’s microphone completely off and then turn it back on?



-------------------------

-Edgar
DPI 15.3, 64-bit Windows 10 Pro, OpenOffice & Office 365, Norton Security, Shure X2U XLR to USB mic adapter with Audio Technica DB135 vocal mic, Asus X299-Deluxe Prime, Intel Core i9-7940X (14 core, 4.3 GHz overclocked to 4.9 GHz), G.SKILL TridentZ Series 64GB (4 x 16GB) DDR4 3333 (PC4 26600) F4-3333C16Q-64GTZ, NVIDIA GIGABYTE GeForce GTX 1060 GV-N1060G1 GAMING-6GD REV 2.0 6GB graphics card with 3 1920x1080 monitors

 05/19/2021 11:21 PM
dilligence
Top-Tier Member

Posts: 1649
Joined: 08/16/2010

Turning the microphone off:

 

engineControl.RecognitionMimic("microphone off", 0); 

 

Turning it back on requires either first creating a new voice command "microphone on" (SetMicrophone 1) and then:

 

engineControl.RecognitionMimic("microphone on", 0); 

 

or creating a VBS, AHK, etc. script that performs that last mimic and calling it with System.Diagnostics.Process.Start
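
A rough sketch of the external-script route, for illustration only. It assumes a hypothetical script file MicOn.vbs that performs the "microphone on" mimic; the path, the file name and the class name MicScriptLauncher are placeholders, and running the script through wscript.exe is just one possible choice.

```csharp
using System;
using System.Diagnostics;

static class MicScriptLauncher {
   // Hypothetical location of a script that mimics "microphone on".
   private const string ScriptPath = @"C:\Scripts\MicOn.vbs";

   public static void RunMicOnScript() {
      ProcessStartInfo startInfo = new ProcessStartInfo {
         FileName = "wscript.exe",              // Windows Script Host runs .vbs files
         Arguments = "\"" + ScriptPath + "\"",  // quote the path in case it contains spaces
         UseShellExecute = false,
         CreateNoWindow = true
      };
      using (Process process = Process.Start(startInfo)) {
         // Wait up to five seconds so the caller knows whether the script finished.
         if (process != null && !process.WaitForExit(5000)) {
            Console.Error.WriteLine("The microphone script did not finish in time.");
         }
      }
   }
}
```

Calling MicScriptLauncher.RunMicOnScript() after the speech has finished would then take the place of the second RecognitionMimic call.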



-------------------------

https://speechproductivity.eu


Turbocharge your Dragon productivity with 40+ Power Addons



 05/20/2021 02:17 AM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

Originally posted by: dilligence Turning it back on requires either first creating a new voice command "microphone on" (SetMicrophone 1) and then: 

 

engineControl.RecognitionMimic("microphone on", 0);  

 

or create a vbs, ahk etc. script that does that last mimic and call it with System.Diagnostics.Process.Start

There must be a better way than creating an external script. I suspect that both of these are exposed through something other than "RecognitionMimic".



 05/20/2021 12:50 PM
Mphillipson
Top-Tier Member

Posts: 311
Joined: 09/22/2014

This seems to be mentioned in the Dragon NaturallySpeaking API as per the following screenshot:

https://www.screencast.com/t/9G4M214BJ7



-------------------------

Thanks Mark


 


Dragon Professional Advanced Scripting/KnowBrainer Scripts
Video Examples of Coding by Voice

 05/21/2021 08:31 AM
TomDonovan
Power Member

Posts: 43
Joined: 08/22/2020

Here is an AutoHotkey script I wrote that does something along the lines of what you want. It shouldn't be too hard to translate.

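PlayText(Text){
    ; Connect to Dragon's microphone button and remember its current state
    DgnMic := ComObjCreate("Dragon.MicBtn")
    DgnMic.Register()
    MicState := DgnMic.MicState
    DgnMic.MicState := 1    ; presumably "off": keeps Dragon from hearing the spoken text

    ; Speak the text asynchronously (flag 1) through SAPI, then wait up to 20 seconds
    voice := ComObjCreate("SAPI.SpVoice")
    ComObjConnect(voice, "SP_")
    voice.Rate := 3
    voice.Speak(Text, 1)
    voice.WaitUntilDone(20000)

    ; Restore the previous microphone state and release the COM objects
    DgnMic.MicState := MicState
    Sleep, 100
    DgnMic.UnRegister()
    DgnMic :=
    ;Dgn.UnRegister(0)
    ;Dgn :=
    voice :=
    return
}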



-------------------------

Tom

 05/21/2021 12:41 PM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

Thanks Tom, that was enough of a push to get me over the hump!

Currently I have this defined outside the actual method:
      public static DgnMicBtn gDgnMic;

although I doubt that it is necessary to define it externally as it will never be used anywhere but in the method. Here is the method:

      private static void TextToSpeech(string pSpeakThis) {
         gDgnMic = new DgnMicBtn();
         gDgnMic.Register(0);
         try {
            ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOff;
            SpeechSynthesizer speechSynthesizerObj = new SpeechSynthesizer();
            speechSynthesizerObj.Speak(pSpeakThis);
            speechSynthesizerObj.Dispose();
         }
         catch (Exception exception) {
            TimedMessage("Dragon®’s DgnEngineControl failed when turning the microphone off. The error message is:" +
               Environment.NewLine + exception.Message, "Dragon® ERROR");
         }
         try {
            ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOn;
         }
         catch (Exception exception) {
            TimedMessage("Dragon®’s DgnEngineControl failed when turning the microphone on. The error message is:" +
               Environment.NewLine + exception.Message, "Dragon® ERROR");
         }
         gDgnMic.UnRegister();
      }

"TimedMessage" is just a speech friendly message box with an optional timeout. This version turns the microphone off instantly before reading then turns it right back on.
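
A minimal sketch of one possible variation, not a tested replacement. It remembers the microphone state before speaking (as Tom's AutoHotkey script does) and uses try/finally so that the state is restored and the button unregistered even if speech synthesis throws. It assumes the same Dragon COM reference as the method above and that the generated interop namespace is DNSTools; the names SafeSpeech and SpeakWithMicOff are made up for illustration.

```csharp
using System.Speech.Synthesis;
using DNSTools;   // assumption: interop namespace generated for the Dragon COM reference

static class SafeSpeech {
   public static void SpeakWithMicOff(string pSpeakThis) {
      DgnMicBtn dgnMic = new DgnMicBtn();
      dgnMic.Register(0);
      // Remember the current state so a microphone that was already off stays off afterwards.
      DgnMicStateConstants previousState = ((IDgnMicBtn)dgnMic).MicState;
      try {
         ((IDgnMicBtn)dgnMic).MicState = DgnMicStateConstants.dgnmicOff;
         using (SpeechSynthesizer synthesizer = new SpeechSynthesizer()) {
            synthesizer.Speak(pSpeakThis);   // blocks until the text has been read
         }
      }
      finally {
         // Runs whether or not Speak threw an exception.
         ((IDgnMicBtn)dgnMic).MicState = previousState;
         dgnMic.UnRegister();
      }
   }
}
```

Exceptions are left to propagate here, so the caller can decide whether to show a message box such as TimedMessage.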



 05/21/2021 02:51 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Yay! Thank you Edgar - we now have some C# code that works with the microphone, which is a start. I sure wish Nuance would convert their C++ API documentation to C#. Maybe they will now, after the acquisition. Maybe some Microsoft guys will produce an API that matches their standard C# "going forward" language. (Haha, hope, hope, :-))

-------------------------

Win10/11/x64, AMD Ryzen 7 3700X/3950X, 64/128GB RAM, Dragon 15.3, SP 7 Standard, SpeechStart, Office 365, KB 2017, Dragon Capture, Samson Meteor USB Desk Mic, Amazon YUWAKAYI headset, Klim and JUKSTG earbuds with microphones, excellent Sareville Wireless Mono Headset, 3 BenQ 2560x1440 monitors, Microsoft Sculpt Keyboard and Logitech G502 awesome gaming mouse.

 05/21/2021 04:14 PM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

I used to be on Microsoft's Accessibility Team. I was hoping to be able to influence the way their applications were designed in order to be more accessible to folks with disabilities. This did not pan out. I've been thinking about contacting them and offering my "work from home" services - just in the realm of their acquisition of Nuance and specifically Dragon Professional Individual. I wonder if any of the Microsoft Dragon software team will be/is monitoring this forum?

 05/21/2021 08:33 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

What a great idea. If you can find the right people, maybe they would be interested. I suppose you have to find the team that is/becomes responsible for Dragon. You would be a great addition to their team, even as a part-time consultant!

 05/21/2021 08:56 PM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

The real problem is the commute - back then I only had to go in about once a month and they always scheduled it for off-peak traffic hours; still, it was always well over an hour and sometimes over two hours!

 05/22/2021 08:18 PM
Handsfreecoder
New Member

Posts: 18
Joined: 08/10/2018

Originally posted by: Edgar

 

         catch (Exception exception) {     

TimedMessage("Dragon®’s DgnEngineControl failed when turning the microphone off. The error message is:" + 

               Environment.NewLine + exception.Message, "Dragon® ERROR");

         }

 

I hate to tell you this but you are not even using an engine control in your code above so the engine control wouldn't trigger an exception.

 

Originally posted by: kkkwj I sure wish Nuance would convert their C++ API documentation to C#. Maybe they will now, after the acquisition. Maybe some Microsoft guys will produce an API that matches their standard C# "going forward" language. (Haha, hope, hope, :-))

 

There are more VB examples in the API than C++ which are easily convertible to C#? Why would Microsoft produce an API for Dragon that matches their standard C# when it already exists?

 

Originally posted by: Edgar I used to be on Microsoft's Accessibility Team. I was hoping to be able to influence the way their applications were designed in order to be more accessible to folks with disabilities. This did not pan out…

 

Which Microsoft's accessibility team would that be? I have never seen applications so accessible for the disabled as Microsoft applications. I wish other developers went to the same lengths as Microsoft did. Are you actually aware of the number of APIs available for Microsoft applications that make them accessible for the disabled? What exactly isn't accessible?

 

 05/22/2021 08:45 PM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

 

      private static void TextToSpeech(string pSpeakThis) {
         DgnMicBtn gDgnMic = new DgnMicBtn();
         gDgnMic.Register(0);
         try {
            ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOff;
            SpeechSynthesizer speechSynthesizerObj = new SpeechSynthesizer();
            speechSynthesizerObj.Speak(pSpeakThis);
            speechSynthesizerObj.Dispose();
         }
         catch (Exception exception) {
            TimedMessage("Dragon®’s DgnMicBtn failed when turning the microphone off. The error message is:" +
               Environment.NewLine + exception.Message, "Dragon® ERROR");
         }
         try {
            ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOn;
         }
         catch (Exception exception) {
            TimedMessage("Dragon®’s DgnMicBtn failed when turning the microphone on. The error message is:" +
               Environment.NewLine + exception.Message, "Dragon® ERROR");
         }
         gDgnMic.UnRegister();
      }



 05/22/2021 08:48 PM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

Originally posted by: Handsfreecoder 

 

Originally posted by: Edgar I used to be on Microsoft's Accessibility Team.

 

 

Which Microsoft's accessibility team would that be?

https://news.microsoft.com/on-the-issues/2019/09/25/accessibility-supportability-anne-taylor/



 05/23/2021 02:33 AM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

@Handsfree, >There are more VB examples in the API than C++ which are easily convertible to C#? Why would Microsoft produce an API for Dragon that matches their standard C# when it already exists?

Yes, there are many such examples. But they're in VB and C++, not C#. I'm sure the argument-calling conventions (types, casts, etc.) could be worked out for C (in the DllCall APIs), C++, and so on. But it would sure be easier if there were some C# examples around. Credit to Edgar for biting the bullet, spending some time, and posting some C# examples. I suppose for precision, I should say that I'm not asking MS/Nuance to create a brand new API to replace the one that exists. I'm wishing for better documentation and examples, which do not exist. If you know all these answers and can easily show some C# examples of doing things, please feel free to post some to get the rest of us started. :-)


 05/24/2021 10:41 AM
Handsfreecoder
New Member

Posts: 18
Joined: 08/10/2018

So just to be clear on this you don't know how to integrate Dragon objects (which after all are implemented using a Microsoft object model/interfaces) into a Microsoft .net language C# application and you are unaware of the existing accessibility APIs available with Microsoft applications. Yet you, or one of you, are going to offer Microsoft consultancy on how to make their applications more accessible for disabled users including those using Dragon?

 

Wow.

 

Then for good measure you want me to post some C# code showing you how to do this to save you doing your own homework when there are pages and pages of relevant information and examples available on Google where you just need to join the dots?

 

 

 

 

 

 

 05/24/2021 11:02 AM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

Originally posted by: Handsfreecoder So just to be clear […] you, are going to offer Microsoft consultancy on how to make their applications more accessible for disabled users including those using Dragon?

 

 

[…] you want me to post some C# code […] there are pages and pages of relevant information and examples available on Google where you just need to join the dots?

For me, that sums it up in a nutshell. But, just to clarify the issue, if Microsoft® were to ask me to rejoin the Accessibility Team (which, from the looks of it, may no longer exist) it would be in the sense of me offering my experience as a disabled user of both Dragon® and Windows. Think of me as a "highly(?) experienced" beta tester.

As for the "pages and pages of relevant information…", I beg to differ. I believe that I am a reasonably adept and persistent googler. I have not been able to find a lot of information online in this area. When it comes to posting code I believe that I have paid forward in great measure.



 05/24/2021 02:09 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

I agree with Edgar's C# contributions, especially over the past year. I recognize that he posts things "at the right level" and "with a suitable standalone architecture" so that they can be used by people who can compile C# programs. Thanks Edgar!

 05/24/2021 02:31 PM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

Oh, just so folks don't get the wrong impression, working on Microsoft®'s Accessibility Team was virtually a volunteer position. We were not paid but got significant perks.

 04/22/2022 10:01 PM
Pranav Lal
Top-Tier Member

Posts: 221
Joined: 10/02/2006

Hi Edgar,

If I may ask, what is the assembly name you used to get access to dragon functionality in .net? Did you need to have the SDK from nuance installed?

 

Pranav

 04/23/2022 12:21 PM
tar
Power Member

Posts: 53
Joined: 05/15/2019

No, you don't need the SDK.

In Visual Studio, open the Reference Manager, go to COM, and select "Dragon NaturallySpeaking ActiveX Controls".
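
For anyone who wants to check whether the Dragon COM component is present before adding the reference, here is a hedged sketch using late binding. The ProgID "Dragon.MicBtn" comes from the AutoHotkey script earlier in this thread; the class name DragonProbe is invented, and the number printed is simply whatever MicState value Dragon reports.

```csharp
using System;

static class DragonProbe {
   // Returns true when a COM class is registered under the given ProgID.
   public static bool IsRegistered(string progId) {
      return Type.GetTypeFromProgID(progId) != null;
   }

   static void Main() {
      const string progId = "Dragon.MicBtn";   // ProgID taken from the AutoHotkey script
      if (!IsRegistered(progId)) {
         Console.WriteLine("Dragon's microphone button is not registered on this machine.");
         return;
      }
      // Late binding: no interop assembly is needed, calls are resolved at run time.
      dynamic micButton = Activator.CreateInstance(Type.GetTypeFromProgID(progId));
      micButton.Register(0);
      try {
         Console.WriteLine("Current MicState value: " + micButton.MicState);
      }
      finally {
         micButton.UnRegister();
      }
   }
}
```

The COM reference described above remains the more convenient route for real code, since it gives compile-time checking and names such as DgnMicStateConstants.dgnmicOff.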



-------------------------

Tom


 


Programmer Of SP 7 PRO (speechproductivity.eu)

 04/23/2022 03:14 PM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

Originally posted by: Pranav Lal what is the assembly name you used to get access to dragon functionality in .net? Did you need to have the SDK from nuance installed?

 

Hi Pranav!

As Tom points out, you do not need the SDK (I do not have it). If Tom's answer didn't give you enough information please post here and we will try to step you through the process. Please make sure to let us know which language (C++, C# etc.) and IDE (Microsoft Visual Studio 2019 etc.) you are using.



 04/23/2022 12:49 PM
Lunis Orcutt
Top-Tier Member

Posts: 40716
Joined: 10/01/2006

Just a quick note on Dragon SDK. It is no longer available and hasn't been for quite some time.

-------------------------

Change "No" to "Know" w/KnowBrainer 2022
Trial Downloads
Dragon/Sales@KnowBrainer.com 
(615) 884-4558 ex 1

 05/25/2022 12:48 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

In the past few days, I implemented some C# code to turn the Dragon mic on and off. Here are some tips for those who might read this someday.

 

For example, Microsoft shows an example here https://docs.microsoft.com/en-us/dotnet/api/system.speech.recognition.speechrecognitionengine?view=netframework-4.8

 

I used a different Microsoft example to get started, but I can't find the link anymore. Go figure.

 

I had trouble defining the required reference to the speech libraries and Dragon libraries. I could not find them in the usual places in the Add Reference dialogs in Visual Studio.

 

For the Windows speech libraries/platforms, I found these. What a mess. The System.Speech NuGet package is the easiest to work with - install the package in VStudio and go. The Microsoft Speech SDK requires downloading an SDK. I do not understand the benefits of the SDK since the APIs are almost exactly the same. Maybe the SDK has access to more languages or something.

 

(1) System.Speech.dll (2022 NuGet package with 2.3M downloads)

(doc from 2015: https://docs.microsoft.com/en-us/archive/msdn-magazine/2014/december/voice-recognition-speech-recognition-with-net-desktop-applications) Install this as the NuGet package 'System.Speech'. This is the one that I used to get my program (and Ed's code above) working.

 

The System.Speech libraries are installed through the NuGet package mechanism (System.Speech), which is why I could not find them in the Add Reference dialog box as installed type libraries. I ended up using the System.Speech package by accident, I think.
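
A small sketch for confirming that the System.Speech reference is wired up, assuming the System.Speech package (or the .NET Framework assembly of the same name) is referenced. The class name SpeakDemo and the spoken sentence are arbitrary.

```csharp
using System;
using System.Speech.Synthesis;   // provided by the System.Speech package

static class SpeakDemo {
   static void Main() {
      // 'using' disposes the synthesizer even if Speak throws.
      using (SpeechSynthesizer synthesizer = new SpeechSynthesizer()) {
         synthesizer.SetOutputToDefaultAudioDevice();
         synthesizer.Rate = 2;   // valid range is -10 (slowest) to 10 (fastest)
         foreach (InstalledVoice voice in synthesizer.GetInstalledVoices()) {
            Console.WriteLine("Installed voice: " + voice.VoiceInfo.Name);
         }
         synthesizer.Speak("The System.Speech reference is working.");
      }
   }
}
```

If this compiles and speaks, the same reference should be enough for the SpeechSynthesizer calls in Ed's method.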

 

(2) Microsoft Speech Platform - Software Development Kit (SDK) (Version 11)

Speech Platform features compared with SAPI

The Speech Platform API and runtime are derived from the Speech API (SAPI) and the speech runtime in Windows, but are primarily intended to support speech applications running as standalone services. As a result, there are some important differences. The following table lists key functional areas of the Speech Platform and describes the principal differences between the Speech Platform and the speech functionality in Windows, as provided by SAPI.

The Microsoft.Speech library is installed by downloading the SDK from the official Microsoft download center. I downloaded the x64_MicrosoftSpeechPlatformSDK.msi file. After installing it, open the Add Reference dialog and browse to c:/program files/microsoft sdks/speech/ and select the dll. Then you can say 'using Microsoft.Speech' in your C# code.

 

The APIs in this SDK are very close to the System.Speech APIs and I was able to use my System.Speech code by changing the using statements. (But I still found an old chunk of code on the net for the SpeechRecognizedEventArgs that uses fields like e.Error that do not appear in either the System.Speech API or the Microsoft.Speech API.)
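
To illustrate what "changing the using statements" can look like, here is a sketch rather than a verified port. USE_MICROSOFT_SPEECH is a hypothetical compilation symbol, the two command phrases are arbitrary, and the Microsoft.Speech branch may additionally need a matching runtime language pack installed.

```csharp
#if USE_MICROSOFT_SPEECH              // hypothetical symbol defined in the project settings
using Microsoft.Speech.Recognition;   // needs the Speech Platform SDK reference
#else
using System.Speech.Recognition;      // needs the System.Speech package
#endif
using System;

static class CommandListener {
   static void Main() {
      // A small fixed grammar; both libraries offer Choices and GrammarBuilder.
      Choices commands = new Choices("start reading", "stop reading");
      Grammar grammar = new Grammar(new GrammarBuilder(commands));

      using (SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine()) {
         recognizer.LoadGrammar(grammar);
         recognizer.SetInputToDefaultAudioDevice();
         recognizer.SpeechRecognized += OnSpeechRecognized;
         recognizer.RecognizeAsync(RecognizeMode.Multiple);   // keep listening until stopped

         Console.WriteLine("Listening. Press Enter to quit.");
         Console.ReadLine();
         recognizer.RecognizeAsyncStop();
      }
   }

   private static void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e) {
      // e.Result carries the recognized phrase and a confidence score between 0 and 1.
      Console.WriteLine(e.Result.Text + " (confidence " + e.Result.Confidence + ")");
   }
}
```

Only the using directive differs between the two builds; the rest of the code relies on type names that both namespaces share.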

 

Here is an old (2014) comment from https://docs.microsoft.com/en-us/archive/msdn-magazine/2014/december/voice-recognition-speech-recognition-with-net-desktop-applications

 

Figure 9 Microsoft.Speech vs System.Speech (circa 2014)

Microsoft Speech is a downloadable SDK in 2022

System.Speech is a NuGet package in 2022, usable in C# as managed code

 

Microsoft.Speech.dll       System.Speech.dll
Must install separately    Part of the OS (Windows Vista+)
Can package with apps      Cannot redistribute
Must construct Grammars    Uses Grammars or free dictation
No user training           Training for specific user
Managed code API (C#)      Native code API (C++)

 

(3) Microsoft Speech SDK 5.1 (circa 2022) - looks like this uses SAPI API calls

 

The Microsoft Speech SDK 5.1 adds Automation support to the features of the previous version of the Speech SDK. You can now use the Win32 Speech API (SAPI) to develop speech applications with Visual Basic ®, ECMAScript and other Automation languages.

Version: 5.1
File Name: SpeechSDK51.exe

msttss22L.exe

sapi.chm

Sp5TTIntXP.exe

SpeechSDK51LangPack.exe

SpeechSDK51MSM.exe

 

See the history here https://en.wikipedia.org/wiki/Microsoft_Speech_API

 

(4) .NET Speech SDK (circa 2022, a paid Azure cloud service)

 

The Speech software development kit (SDK) exposes many of the Speech service capabilities you can use to develop speech-enabled applications. The Speech SDK is available in many programming languages and across all platforms.

 

(from Microsoft Cognitive Services) https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-sdk

The .NET Speech SDK is available as a NuGet package and implements .NET Standard 2.0. For more information, see Microsoft.CognitiveServices.Speech.

 

Here is a nice FAQ page for the MS Speech Service (it uses the cloud for speech recognition at scale for telephony apps, and requires a paid service). https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/

 

(5) Windows.Media.Speech - forget it; this has nothing to do with general speech recognition

 

From a project on GitHub (https://github.com/zoomicon/SpeechLib): There is an issue with the Windows Speech API I think regarding recognition quality for commands. The Windows Speech library has less languages, but user can train it from the control panel (there is a speech item there). There is an alternative API called Microsoft Speech that is very similar (only the SGML syntax needed some small changes to support both of those). That one has its own SDK (Kinect runtime installs it) and has more generic recognition (less tuned to a specific user) with more languages. So if you get wrong recognition, maybe you should switch to the Microsoft Speech that Kinect team suggests instead.

 

The Microsoft.Speech.dll libraries (I read somewhere from 2015?) are the best place to start because they are somewhat simpler and work with fixed-length, word-specific grammars. I think the Microsoft.Speech libraries are redistributable too, whereas the System.Speech libraries are not.

 

In contrast, the System.Speech.dll libraries are supposed to be more complex and have extra features for complex dictations.

 

I tried to find out how to do open-ended using WSR doc and examples, but I failed. Does anyone know of examples to do that in WSR? Or if it is even possible? (I could find no documentation on the topic either way.) Update: I eventually found a comment on the CodeProject that said, "Forget about training in WSR in your own code. The best you can do is run the OS training app for speech recognition in the Accessibility settings."

 

For working with Dragon, I thought I should look for COM libraries that started with "dgnXXX" that you see around the forum and that you can see in the list of libraries shown in the Add Reference dialog box. But I could never get any of those to work. Ed helped me out by giving me the secret Dragon Naturally Speaking ActiveX Controls library name, which I found easily. Once I installed that, I was off to the races.

 

You use the Microsoft or System.Speech libraries to do the recognition, and the Dragon library (using Ed's example code above) to turn the Dragon mic on and off.
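
To make that division of labor concrete, here is a minimal sketch, assuming the System.Speech NuGet package on Windows with a working default recognizer and microphone. The setDragonMic delegate is a hypothetical stand-in for the Dragon ActiveX call in Ed's example; it is not a real Dragon API. The two phrases are just examples.

```csharp
using System;
using System.Speech.Recognition;   // NuGet package System.Speech (Windows only)

class MicCommandListener
{
    static void Main()
    {
        // Hypothetical stand-in for the Dragon ActiveX mic call;
        // replace the body with the real on/off code.
        Action<bool> setDragonMic = on =>
            Console.WriteLine(on ? "Dragon mic ON" : "Dragon mic OFF");

        using (var engine = new SpeechRecognitionEngine())
        {
            // A small fixed grammar: exactly these two phrases.
            var builder = new GrammarBuilder(new Choices("microphone on", "microphone off"))
            {
                // Match the grammar language to the installed recognizer.
                Culture = engine.RecognizerInfo.Culture
            };
            engine.LoadGrammar(new Grammar(builder));

            engine.SpeechRecognized += (sender, e) =>
            {
                if (e.Result.Text == "microphone on") setDragonMic(true);
                else if (e.Result.Text == "microphone off") setDragonMic(false);
            };

            engine.SetInputToDefaultAudioDevice();
            engine.RecognizeAsync(RecognizeMode.Multiple);   // keep listening after each result

            Console.WriteLine("Listening. Press Enter to quit.");
            Console.ReadLine();
            engine.RecognizeAsyncCancel();
        }
    }
}
```

The recognizer never touches Dragon itself; it only decides which phrase was spoken and hands off to whatever code actually flips the mic.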



-------------------------

Win10/11/x64, AMD Ryzen 7 3700X/3950X, 64/128GB RAM, Dragon 15.3, SP 7 Standard, SpeechStart, Office 365, KB 2017, Dragon Capture, Samson Meteor USB Desk Mic, Amazon YUWAKAYI headset, Klim and JUKSTG earbuds with microphones, excellent Sareville Wireless Mono Headset, 3 BenQ 2560x1440 monitors, Microsoft Sculpt Keyboard and Logitech G502 awesome gaming mouse.



 05/28/2022 02:43 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

BTW, open-ended recognition in WSR is hopeless. It can't even recognize simple phrases on Windows 11 when I use the System.Speech 'DictationGrammar' object to load a grammar containing common phrases and words. Dragon is far superior. 



-------------------------

 05/25/2022 07:54 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

https://docs.microsoft.com/en-us/previous-versions/office/developer/speech-technologies/dd147134(v=office.14)

I think that is the link where I started. Anyhow, it was one of the links (code examples) that I used as a model.



-------------------------

 05/28/2022 02:01 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Here is a helpful post with more information on what the various libraries do.

 

https://stackoverflow.com/questions/2977338/what-is-the-difference-between-system-speech-recognition-and-microsoft-speech-re

 

Desktop recognizers are designed to run inprocess or shared.

 

Shared recognizers are useful on the desktop where voice commands are used to control any open applications. Server recognizers can only run inproc.

 

Inproc recognizers are used when a single application uses the recognizer or when wav files or audio streams need to be recognized (shared recognizers can’t process audio files, just audio from input devices).

 

Only Desktop speech recognizers include a dictation grammar (a system provided grammar used for free text dictation). The class System.Speech.Recognition.DictationGrammar has no complement in the Microsoft.Speech namespace.

** This means you should use System.Speech if you want to recognize open-ended dictation. **
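
As a rough illustration of that point, here is a minimal sketch that loads the system-provided DictationGrammar into a System.Speech engine and waits for one utterance. It assumes Windows, the System.Speech package, and an installed desktop recognizer; how well the result matches what was said is a separate question.

```csharp
using System;
using System.Speech.Recognition;   // desktop API; DictationGrammar lives here

class DictationDemo
{
    static void Main()
    {
        using (var engine = new SpeechRecognitionEngine())
        {
            engine.LoadGrammar(new DictationGrammar());   // free-text grammar supplied by the system
            engine.SetInputToDefaultAudioDevice();

            Console.WriteLine("Say a sentence...");
            // Blocks for one recognition; returns null if nothing is recognized
            // (for example, only silence for the 10-second initial timeout).
            RecognitionResult result = engine.Recognize(TimeSpan.FromSeconds(10));

            Console.WriteLine(result != null
                ? "Heard: " + result.Text
                : "Nothing recognized.");
        }
    }
}
```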

 



-------------------------

 05/28/2022 02:06 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

https://stackoverflow.com/questions/2977338/what-is-the-difference-between-system-speech-recognition-and-microsoft-speech-re

 

At last, a decent answer. Basically, use System.Speech (with optional free-form DictationGrammar) on Desktops. And use Microsoft.Speech for server-side scalable apps.

 

I guess I got lucky using the easy System.Speech NuGet package by accident. And you could argue that all my time researching the terrible mess of Windows Speech APIs was wasted. Oh well. At least I'm on the right track now, and others who read this thread won't waste as much time.

 

The short answer is that Microsoft.Speech.Recognition uses the Server version of SAPI, while System.Speech.Recognition uses the Desktop version of SAPI.

 

The APIs are mostly the same, but the underlying engines are different. Typically, the Server engine is designed to accept telephone-quality audio for command & control applications; the Desktop engine is designed to accept higher-quality audio for both command & control and dictation applications.

 

You can use System.Speech.Recognition on a server OS, but it's not designed to scale nearly as well as Microsoft.Speech.Recognition.

 

The differences are that the Server engine won't need training, and will work with lower-quality audio, but will have a lower recognition quality than the Desktop engine.

 

 

https://stackoverflow.com/questions/9347346/appenddictation-on-microsoft-speech-platform-11-server

 

For anyone who may come across this in the future -- I've now emailed back and forth with Microsoft, and ultimately received this response:

 

The managed interfaces (Microsoft.Speech and System.Speech) are built on top of the native SAPI interfaces. These interfaces are the same for both the Server engine and the Desktop engine.

 

BUT the engine itself is responsible for implementing dictation, and the Server engine does not do so. Therefore, the call will fail when you load the grammar.

 

 

More comments from this thread back in 2015.

https://stackoverflow.com/questions/12101120/matching-wildcard-dictation-in-microsoft-speech-grammar

 

But in the C# API there is a DictationGrammar and WildcardGrammar. I could achieve my goal if I "hardcode" it. In fact I activate a Dictation grammar for some special case (even if it is bad I agree) – 

 

The C# API works with both the desktop engine and the server engine. The desktop engine supports DictationGrammar and WildcardGrammar; the server engine does not. – 

 

Kinect uses Microsoft.Speech, not System.Speech as it seems, although you could probably grab the audio from Kinect and use it with System.Speech somehow (but I think you need training of the recognition engine if you go with System.Speech) – 

 

Btw, it seems Microsoft united their previously separate installers for their Server (accessed via Microsoft.Speech in .NET) and Client (accessed via System.Speech in .NET) Speech Runtimes into Microsoft Speech Platform Runtime (Version 11), found at microsoft.com/en-us/download/details.aspx?id=27225.

 

The respective SDK is at microsoft.com/en-us/download/details.aspx?id=27226. The speech runtime version that was labeled in the past "for servers" is non-trainable (has setting for acoustic model adaptation on/off though) and doesn't accept free speech, only commands – 

 

Here is another comment from VoiceElements website. He says that MS stopped development on the Microsoft Speech Platform back in 2012.

 

Microsoft stopped development on the Microsoft Speech platform in 2012. Instead of processing text-to-speech (TTS) or speech recognition (SR) on-premises, Microsoft now steers its customers to use their cloud services on Azure. Those services and other similar services on the cloud can provide excellent SR and TTS and can work in conjunction with the Voice Elements platform. However, since there is no charge for the Microsoft Speech Platform, we continue to support it as our go-to default facility for TTS and SR.

 

The Microsoft Speech Platform is comprised of the following:

 

Microsoft Speech Platform Runtime

 

You should have this installed on your server in order to perform speech recognition functions within Voice Elements. Voice Elements has built out support for Microsoft Speech Platform, as long as you use Microsoft compatible grammar files. These are easy to create using the methods outlined in this article: Create Microsoft Speech Compatible Grammar Files

 

The runtime can be downloaded at Microsoft Speech Runtime.

 

Microsoft Speech Language Packs

 

The Microsoft Speech Platform relies on different language packs in order to provide speech recognition capabilities for different languages. Microsoft Speech Platform supports 18 different languages and accents. You can download some of the more popular languages using the links below. For additional options, please contact Inventive Labs Technical Support.



-------------------------

 05/31/2022 10:46 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

A bit more info gleaned from reading the VoiceAttack and VoiceMacro forums and manuals and Github extensions. Both of these programs were (or are) based on the built-in MS Speech Recognizer Engine 8.0 that has been included with Windows ever since Windows XP. VoiceAttack now allows you to download the upgraded MS Speech Platform 11 runtime engine and select it within the VoiceAttack options. I investigated the Speech Recognition / Advanced Options on my Win10 machine, and much to my surprise found a SpeechStart6Profile in my list of trained voice profiles, which means that SpeechStart also uses the built-in MS recognition engine.


As described above somewhere, the APIs are now basically the same for both System.Speech.* and Microsoft.Speech.* because they are both built on the same SAPI interfaces. But the System.Speech engine (for desktops) has an extra feature called a DictationGrammar for doing free-speech recognition. (It's hopeless - at least it was when I tried it out.) So I don't think it would be any big loss to use the MS Speech Recognizer 8.0 or MS Speech Platform 11 recognizers. They are supposed to be redistributable and have more world languages to work with.


If you hear about "changing profiles" with VoiceAttack or VoiceMacro, they mean to hack the registry (or use the checkboxes in the Control Panel) to switch training profiles that you made with different microphones and audio environments. This can make a worthwhile difference if you have different people using voice on the same PC, or if you use different microphone setups for different work/gaming tasks.

 

Note that if you're using Speech Platform 11, there is no applicable training. That only applies to the built-in SAPI engine. Speech Platform 11 is designed to be speaker-independent, and does not need to be/cannot be trained.

 

For my two bits, nothing touches Dragon for recognition accuracy.



-------------------------

 06/01/2022 05:26 AM
monkey8
Top-Tier Member

Posts: 4170
Joined: 01/14/2008

Originally posted by: tar no, you don't need the SDK.

 

Go to the reference manager in Visual Studio, go to COM, then "Dragon naturallyspeaking ActiveX controls"

 

 

Originally posted by: Edgar 

 

 

Hi Pranav!

 

As Tom points out, you do not need the SDK (I do not have it). If Tom's answer didn't give you enough information please post here and we will try to step you through the process.

 

Nothing could be further from the truth. Both statements above are contradictions in terms. 

 

Kevin,

 

Hopefully to help, others if not yourself, the bottom line is that accessing the .NET libraries like System.Speech & System.Speech.Recognition only gives you access to a very small subset of SAPI 5 interfaces albeit they are convenient to use with C# or VB .NET. To unleash the true power of SAPI 5 you need to access the SpeechLib.dll COM libraries and to use that with C# you will need to use COM Interop (homework) or alternatively C++. 

 

Writing a program with .NET to recognise SAPI 5 commands to switch on the Dragon microphone is child's play to a programmer, especially with a shared recogniser. It's all too easy to add voice commands to SAPI 5 that get recognised, but it's not so easy to confine those voice commands to only being recognised with the EXACT phrase spoken when there are multiple commands. In other words, the hard part is tuning the recognition accuracy of the speech engine that you create, which involves things like setting up your recognition grammars, training, and using an inproc recogniser to give you full control of your application while avoiding any integration with WSR.
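
One partial mitigation that System.Speech does expose is the confidence score on each result. The sketch below, assuming Windows and the System.Speech package, ignores matches under a cut-off and logs rejected utterances. The 0.85 threshold is an illustrative guess, not a recommended value, and this is not the grammar tuning or training described above.

```csharp
using System;
using System.Speech.Recognition;

class ConfidenceFilterDemo
{
    // Illustrative cut-off only; the right value has to be found by experiment.
    const float MinConfidence = 0.85f;

    static void Main()
    {
        using (var engine = new SpeechRecognitionEngine())
        {
            var builder = new GrammarBuilder(new Choices("microphone on", "microphone off"))
            {
                Culture = engine.RecognizerInfo.Culture
            };
            engine.LoadGrammar(new Grammar(builder));

            engine.SpeechRecognized += (s, e) =>
            {
                // Confidence is a float between 0 and 1 assigned by the engine.
                if (e.Result.Confidence >= MinConfidence)
                    Console.WriteLine("Accepted: " + e.Result.Text + " (" + e.Result.Confidence + ")");
                else
                    Console.WriteLine("Ignored low-confidence match: " + e.Result.Text);
            };

            // Raised when speech was heard but no grammar phrase matched well enough.
            engine.SpeechRecognitionRejected += (s, e) =>
                Console.WriteLine("Rejected utterance.");

            engine.SetInputToDefaultAudioDevice();
            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Listening. Press Enter to quit.");
            Console.ReadLine();
        }
    }
}
```

Saying near-miss phrases and watching which branch fires is a cheap way to see how often similar-sounding input slips through.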

 

 



-------------------------



 06/01/2022 12:13 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Hi Lindsay, thank you for your comments. I'm sure that you're right - the System.Speech.* and Microsoft Speech libraries totally hide and protect me from the SAPI API complexity! I had the SpeechLib.dll library installed for a while as I tried to figure out the complete mess that is the speech library world. But I ended up on System.Speech in C#, which is working okay so far. (I keep trying to avoid learning C++ - I went from C to C# and jumped over C++; I met with Bjarne Stroustrup the C++ creator a few times at conferences, and he was much like the C++ language... :-))


I also agree that turning the mic on and off is child's play (especially since Edgar posted the C# code for it. :-)) I would love to find some C# examples (or even C++ examples) of how to use an inproc recognizer instead of a shared recognizer, but in all my searching over the past week or two, I couldn't find any on how to do it. Maybe it's buried in the SAPI or Microsoft Speech Platform 11 SDK manuals somewhere. I also think you're right on recognizing the larger problem of choosing phrases and keywords for your command grammars to promote higher-quality recognition. You need to pick words and phrases that are technically workable while still being memorable, pronounceable, and cognitively pleasing (or at least acceptable).
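
For what it's worth, in System.Speech the two kinds of recognizer are separate classes: SpeechRecognitionEngine runs in-process, and SpeechRecognizer attaches to the shared desktop recognizer (WSR). A minimal side-by-side sketch, assuming Windows and the System.Speech package; the phrases are placeholders:

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;

class InprocVsShared
{
    // Builds the same two-phrase grammar for whichever recognizer asks for it.
    static Grammar BuildGrammar(CultureInfo culture)
    {
        var builder = new GrammarBuilder(new Choices("microphone on", "microphone off"))
        {
            Culture = culture
        };
        return new Grammar(builder);
    }

    static void Main()
    {
        // In-process: this application owns the engine and picks the audio source.
        using (var inproc = new SpeechRecognitionEngine())
        {
            inproc.LoadGrammar(BuildGrammar(inproc.RecognizerInfo.Culture));
            inproc.SetInputToDefaultAudioDevice();   // SetInputToWaveFile(path) is also available
            inproc.SpeechRecognized += (s, e) => Console.WriteLine("inproc: " + e.Result.Text);
            inproc.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Inproc engine listening. Press Enter to switch.");
            Console.ReadLine();
        }

        // Shared: uses the Windows Speech Recognition instance on the desktop,
        // so WSR's own listening state decides when results arrive.
        using (var shared = new SpeechRecognizer())
        {
            shared.LoadGrammar(BuildGrammar(shared.RecognizerInfo.Culture));
            shared.SpeechRecognized += (s, e) => Console.WriteLine("shared: " + e.Result.Text);
            Console.WriteLine("Shared recognizer attached. Press Enter to quit.");
            Console.ReadLine();
        }
    }
}
```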


But for now, since I have no C# examples, no inproc recognizer, no solid grammars, and no training ability, I will slog on with the lowly WSR, which seems acceptable for short, fixed-length phrases but is totally useless (IMHO) at free-form phrases. It would be very nice to know how Dragon implements <dictation>. I can see how they use word lists (and how I would use word lists) in building grammars, but the magic of <dictation> escapes me. The closest I've found is using the DictationGrammar object in System.Speech, but it fails miserably for me on my machine at recognizing even the simplest of sentences.


I know that WSR gives you some training scripts, so there is some training going on somewhere, somehow. But I don't know how to do it myself. Do you know of a recent good book on SAPI that would explain all this stuff easily (and that uses C# for examples :-))? That might save me some time on my journey... Cheers

-------------------------

 06/02/2022 01:51 AM
Mav
Top-Tier Member

Posts: 666
Joined: 10/02/2008

Are you sure you want to invest your time in a dead technology like WSR/SAPI?

WSR's accuracy has never been that great for dictation (even though it's working fine for simple grammars most of the time) and customization/adaptation has been very tricky all the time.

If you want up-to-date technology for high accuracy dictation apart from Dragon I can only recommend MS Azure Speech Services.

There's a ton of examples on GitHub, it doesn't cost you if you use the free tier and now that MS has bought Nuance and is integrating them into their cloud business unit, chances are that cloud-based SR will get even better.

 

hth

mav

 06/05/2022 08:08 AM
monkey8
Top-Tier Member

Posts: 4170
Joined: 01/14/2008

Originally posted by: Mav Are you sure you want to invest your time in a dead technology like WSR/SAPI?

 

 

 


 

 

A point I thought I made when recommending using Google cloud services but somehow that was removed in my edit. Yes SAPI 5 is virtually deprecated.



-------------------------

 06/05/2022 12:55 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Hi, yes, all this local speech recognition stuff is old/dead tech; I recall posting somewhere that the doc was all ancient (circa 2006 Windows XP for MS Speech 8.0, and circa 2012 for MS Speech Platform 11). 

 

But Windows still ships with MS Speech 8.0, you can use 11.0 with a download, and it's free, documented (very messy doc overall), and doesn't require yet another account and an internet connection.

 

I suppose you can see that I'm on a learning journey. I freely admit that I might end up in the Cognitive Azure cloud at the end of it.

 

But for now, I'm just trying to build an app like Mark has already built to run beside or instead of Dragon. I'm sure that building an Azure app will be easier after learning this MS speech stuff. It sure looked like there were many examples around (including the word "Azure" in Mark's posted example!)



-------------------------

 06/02/2022 01:55 AM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Hi Mav, thank you for the interesting info! I had heard of the Azure cloud stuff (is that the same as the Cognitive cloud group in Microsoft)? But I had not heard of a free tier nor of the examples on GitHub. I will have a look as time permits. At least for now, I only have a few simple fixed commands that WSR is recognizing okay (microphone on, microphone off, restart dragon, etc.) I think I'll try out the MS Speech Platform 11 stuff when I get a chance too. And then if the examples on GitHub are in C#, maybe I'll even try the cloud services. So much to do, and so little time... :-)

-------------------------

 06/02/2022 08:02 AM
Mav
Top-Tier Member

Posts: 666
Joined: 10/02/2008

 06/02/2022 12:00 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Hi Mav, thank you for the encouragement and for providing some easy-access links. Here is the link for C# Windows Hello World using the Azure Cognitive speech stuff. It looks pretty straightforward and convenient. It is most interesting that it uses free-form dictation and does not require loading up a grammar (although I am sure that it is capable of doing so). Cheers

-------------------------

 06/02/2022 12:49 PM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

@KKKWJ - I don't see a link...

-------------------------

-Edgar
DPI 15.3, 64-bit Windows 10 Pro, OpenOffice & Office 365, Norton Security, Shure X2U XLR to USB mic adapter with Audio Technica DB135 vocal mic, Asus X299-Deluxe Prime, Intel Core i9-7940X (14 core, 4.3 GHz overclocked to 4.9 GHz), G.SKILL TridentZ Series 64GB (4 x 16GB) DDR4 3333 (PC4 26600) F4-3333C16Q-64GTZ, NVIDIA GIGABYTE GeForce GTX 1060 GV-N1060G1 GAMING-6GD REV 2.0 6GB graphics card with 3 1920x1080 monitors

 06/02/2022 05:08 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Ed, you're right! Here it is for the first time. :-) https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/quickstart/csharp/dotnet/from-microphone/helloworld/Program.cs
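
For readers who don't want to click through, a one-shot recognition with the Azure Speech SDK looks roughly like the sketch below. It assumes the Microsoft.CognitiveServices.Speech NuGet package, a default microphone, and an internet connection; the key and region strings are placeholders you would replace with your own values.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;   // NuGet: Microsoft.CognitiveServices.Speech

class AzureHello
{
    static async Task Main()
    {
        // Placeholders: supply your own subscription key and service region.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        using (var recognizer = new SpeechRecognizer(config))
        {
            Console.WriteLine("Say something...");

            // Listens on the default microphone for a single utterance.
            SpeechRecognitionResult result = await recognizer.RecognizeOnceAsync();

            if (result.Reason == ResultReason.RecognizedSpeech)
                Console.WriteLine("Recognized: " + result.Text);
            else
                Console.WriteLine("No usable result: " + result.Reason);
        }
    }
}
```

Note that no grammar is loaded anywhere; the service returns a free-form transcript.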



-------------------------

 06/03/2022 01:50 AM
Mav
Top-Tier Member

Posts: 666
Joined: 10/02/2008

Azure Speech Services is speaker-independent and thus doesn't require a speaker profile.

Regarding grammars I'm unsure if this is supported - haven't looked into it.

But judging from available services I think MS is following a different approach by separating input from intent recognition. I.e. provide a speech recognition engine that's capable of recognizing exactly what was said and provide a transcript of this. The next service detects the intent from the textual natural language input and invokes actions according to the different intents (that's what LUIS does).

It's more of an "engine" for driving voice assistants that can perform different tasks.

 

hth

mav

 06/03/2022 11:49 AM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Yes, I agree with everything you say about Azure Cognitive Services speech stuff. I explored the documentation and it seemed to be all about transcription of incoming speech for larger voice-driven systems - bots, assistants, ordering systems, meetings, etc. (Makes me wonder if someone will make an auto-speech bot for the old porn phone lines that earned people a lot of money. :-))

I also saw the "free" stuff was a limited number of hours (like 5 hours) of free transcription, etc. I couldn't find anything that looked like "give a voice command and then do something on the local machine with the result." But I suppose if the system could "transcribe" my Dragon commands, it could send me back some text and I could parse it and dispatch to a script on my own computer. And then after five hours of commands, I'd have to sign up for a subscription (sometimes hundreds of bucks a month). I probably won't go through the hassle of creating an account and trying the free stuff - you need to provide a $200 deposit to even sign up (as far as I could tell).
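
The "parse it and dispatch" idea needs very little code once a transcript string is in hand. A minimal sketch in plain C#, with no speech library involved; the command table and the actions are hypothetical placeholders for local scripts:

```csharp
using System;
using System.Collections.Generic;

class TranscriptDispatcher
{
    // Hypothetical command table; each action stands in for a local script.
    static readonly Dictionary<string, Action> Commands =
        new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase)
        {
            { "microphone on",  () => Console.WriteLine("-> turn Dragon mic on") },
            { "microphone off", () => Console.WriteLine("-> turn Dragon mic off") },
            { "restart dragon", () => Console.WriteLine("-> restart Dragon") },
        };

    // Strips surrounding whitespace and trailing punctuation, then runs the
    // matching action if there is one. Case is ignored by the dictionary comparer.
    static bool Dispatch(string transcript)
    {
        string key = transcript.Trim().TrimEnd('.', '!', '?');
        if (Commands.TryGetValue(key, out Action action))
        {
            action();
            return true;
        }
        return false;
    }

    static void Main()
    {
        Console.WriteLine(Dispatch("Microphone on."));          // runs the mic-on action, then prints True
        Console.WriteLine(Dispatch("open the pod bay doors"));  // no match, prints False
    }
}
```

Exact-match lookup like this only works for fixed phrases; anything open-ended would need real parsing of the transcript.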

-------------------------

 06/05/2022 02:59 PM
monkey8
Top-Tier Member

Posts: 4170
Joined: 01/14/2008

Originally posted by: kkkwj Hi Lindsay, thank you for your comments. I'm sure that you're right - the System.Speech.* and Microsoft Speech libraries totally hide and protect me from the SAPI API complexity! I had the SpeechLib.dll library installed for a while as I tried to figure out the complete mess that is the speech library world.

 

… I know that WSR gives you some training scripts, so there is some training going on somewhere, somehow. But I don't know how to do it myself. Do you know of a recent good book on SAPI that would explain all this stuff easily (and that uses C# for examples :-))? That might save me some time on my journey... Cheers

 

Kevin, the SpeechLib examples used to be included with the SAPI 5 SDK, which was then moved to the Windows SDK, but whether they are still available I don't know. Unfortunately they are difficult to follow; you need to persevere.

 

 

You can see training of a SAPI 5 profile with SpeechStart+; you will be familiar with the training for MICROPHONE ON and RESTART DRAGON. It is all created from scratch using C#, which was a lot of work.

 

However, I would repeat my advice to try some of the newer technologies, as SAPI 5 recognition is not brilliant, although it is okay with commands. It will soon be deprecated.



-------------------------



 06/05/2022 04:47 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Thank you for your thoughts. Currently, my voice technology world of options has exploded since I can do open-ended <dictation> now. Honestly, I have essentially no interest at all in working with the low-level SAPI interfaces in C++ (or COM, as you pointed out). Too much complexity with no observable benefit that I can see (at least for now). I have about 40-50 commands running with System.Speech (and one open-ended command), and it never misses a beat. I do not know how well it will scale up to thousands of commands, and dozens of open-ended commands. Maybe Azure will be required by then, who knows.

 

One other thought I had - is it better to give Dragon and WSR 1) one grammar containing 1000 choices in a list, or 2) 1000 separate and simple grammars, one for each of the 1000 commands? My thinking is that option 1 could be done by a recognizer with simple Dictionary lookups, whereas with option 2 the recognizer would have to iterate over 1000 small grammars.

 

Do you have any thoughts on which approach would be better for performance or limitations (like Dragon slowing down at 3000 commands, and RW saying his new thing slows down at 5000+ grammars)? Maybe if I put 5000 fixed-length commands into one grammar as choices, I wouldn't run into the "5000" command-count performance ceiling as quickly?
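
On the WSR side, the two layouts are easy to build and time in System.Speech, so the question can be settled by measurement rather than by guessing at the recognizer's internals. A minimal sketch, assuming Windows and the System.Speech package; the three phrases stand in for a real list of 1000 or more, and nothing here says anything about how Dragon behaves.

```csharp
using System;
using System.Diagnostics;
using System.Speech.Recognition;

class GrammarLayouts
{
    static void Main()
    {
        // Stand-in list; substitute the real command phrases.
        string[] phrases = { "microphone on", "microphone off", "restart dragon" };

        using (var engine = new SpeechRecognitionEngine())
        {
            var culture = engine.RecognizerInfo.Culture;

            // Option 1: one grammar whose single Choices element lists every phrase.
            var timer = Stopwatch.StartNew();
            var oneBig = new Grammar(new GrammarBuilder(new Choices(phrases)) { Culture = culture })
            {
                Name = "all-commands"
            };
            engine.LoadGrammar(oneBig);
            Console.WriteLine("Option 1: " + engine.Grammars.Count + " grammar, "
                              + timer.ElapsedMilliseconds + " ms to load");

            // Option 2: one small grammar per phrase; each can be enabled,
            // disabled, or unloaded on its own.
            engine.UnloadAllGrammars();
            timer.Restart();
            foreach (string phrase in phrases)
            {
                var single = new Grammar(new GrammarBuilder(phrase) { Culture = culture })
                {
                    Name = phrase
                };
                engine.LoadGrammar(single);
            }
            Console.WriteLine("Option 2: " + engine.Grammars.Count + " grammars, "
                              + timer.ElapsedMilliseconds + " ms to load");
        }
    }
}
```

Load time is only half the picture; recognition latency and accuracy with each layout would need the same kind of side-by-side test with real speech.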



-------------------------

 06/06/2022 04:51 AM
R. Wilke
Top-Tier Member

Posts: 8058
Joined: 03/04/2007

Originally posted by: kkkwj  

 

Do you have any thoughts on which approach would be better for performance or limitations (like Dragon slowing down at 3000 commands, and RW saying his new thing slows down at 5000+ grammars). Maybe if I put 5000 fixed-length commands into one grammar as choices, I wouldn't run into the "5000" command-count performance ceiling as quickly?

 

 

Just for the record, my "new thing" is more of an "old new thing" actually. It is designed as a Dragon COM add-in and meant to be the underlying custom grammar interface going into the next KnowBrainer generation.

 

Here is a quick and dirty video demonstrating how it basically works:

 

https://rwilke.de/Downloads/KB%20Test%20Driver%20Last%20Stage%20-%20new.mp4

 

The scalable stress testing has been integrated specifically to check out performance issues relative to the number of commands currently loaded, load times, unload times, and the impact the module has on the overall stability of the Dragon process.

 

In the preliminary development, stress testing was the first step in taking the new module for a spin. Creating, loading and activating commands takes about 2 milliseconds per command in this implementation. Unloading takes a little less. Recognition is instantaneous, relative to the overall number of commands currently loaded. Dragon won't hang or crash even with up to 7,500 commands loaded and activated, and it won't hang or crash during shut down.

 

The next step of the development targeted support for custom commands that include lists as well as the "dictation" variable. The video should demonstrate that this has been accomplished. Do note that you can use the "dictation" variable more than once and not just at the end of the command.



Therefore, all the relevant fundamentals have been covered by now so that anything else would just be a matter of routine data handling and UI work. The early working title had been "KnowBrainer 2023", and I guess we could stick with it as far as the timeline to be expected.



-------------------------



 06/06/2022 01:00 PM
monkey8
Top-Tier Member

Posts: 4170
Joined: 01/14/2008

Originally posted by: kkkwj Thank you for your thoughts. Currently, my voice technology world of options has exploded since I can do open-ended  now. Honestly, I have essentially no interest at all in working with the low-level SAPI interfaces in C++ (or COM, as you pointed out). Too much complexity with no observable benefit that I can see (at least for now). I have about 40-50 commands running with System.Speech (and one open-ended command), and it never misses a beat. 

 

 

I'm curious here, Kevin, so please answer me this. Sure, your 40 or 50 commands are recognised and never miss a beat, but I am more interested in the misrecognitions. What I mean by that is: if you have a command called, for example, "Simply Retrain Dragon" and you say "Simply derail Dragon" or "Simply retail Dragon", are those similar phrases then misrecognised as the command?

 

The principal advantage of the speechlib libraries is that they give access to many more audio and speech features than the .NET libraries. However, if you don't need them, there is no point going any further than you have, as you say. Also, are you using the shared recogniser? Does WSR load? In that case I would expect the recognitions to be accurate, but then you have WSR loading every time, which is a bit of a disadvantage in many situations.



-------------------------



 06/07/2022 07:08 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Hi Lindsay, here are answers to your questions. I’m not sure what you and Lunis (or anyone) mean by “misfire” or “misrecognition.” (I learned misfire from one of Lunis’ jokes that made me laugh.) Does misfire mean a false positive recognition of the wrong phrase, or a non-recognition of the right phrase?

On this project, all the command phrases that I speak correctly have had a 100% success rate at being recognized correctly. All rejected phrases should have been rejected. (Usually when this happens, it is because I have uttered a phrase that is missing a whole word.) So, at this point with 50 commands in the pool, I have 0.000 worries about false positive recognitions (accepting the wrong phrase) and false negative rejections (rejecting the right phrase). That may change as I add more commands, of course. I have no idea how it will perform with 1000 commands in the list.

Shared recognizer. No, I’m not using the shared recognizer (although Mark’s example did). I’m using the dedicated inproc engine. I don’t know if that triggers a WSR shared recognizer load operation – I don’t think so, at least visibly. I have “use speech recognition for your computer” turned off in the settings panel. I did fire up WSR / shared recognizer with the top center button/window showing, but I didn’t want it, so I disabled “WSR” in the settings.

As far as I know, my little app just creates its own recognizer and gets on with the job. I’m very happy with it. I’m thinking at this point that it looks like it will scale up for quite a bit before I encounter problems. If it helps any, I have the default confidence level set at .944, and I often see recognitions at .95, .96, .97, and even some at .98+. Once I remember seeing .997. Perhaps that has to do with the words in the commands, the power of the recognizer, or (blush) my superb and articulate diction skills (not!) like Rob and RW.

I hope that answers your questions. If not, just speak up.
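
A minimal sketch of the setup described above, assuming System.Speech on Windows. SpeechRecognitionEngine is the dedicated in-process recognizer, while the separate SpeechRecognizer class is the shared one tied to WSR. The names ConfidenceFilter, Accept, and CreateEngine are invented for this sketch, and 0.944 is simply the threshold quoted in this post, not a recommended value.

```csharp
using System;
using System.Speech.Recognition;

public static class ConfidenceFilter
{
    // Threshold taken from the post above; tune it for your own commands.
    public const float DefaultThreshold = 0.944f;

    // True when a recognition is confident enough to act on.
    public static bool Accept(float confidence, float threshold = DefaultThreshold)
    {
        return confidence >= threshold;
    }

    // Builds an in-process engine that forwards only confident recognitions.
    public static SpeechRecognitionEngine CreateEngine(Grammar grammar, Action<string> onCommand)
    {
        var engine = new SpeechRecognitionEngine();
        engine.LoadGrammar(grammar);
        engine.SpeechRecognized += (sender, e) =>
        {
            if (Accept(e.Result.Confidence))
            {
                onCommand(e.Result.Text);
            }
        };
        return engine;
    }
}
```

To start listening, call SetInputToDefaultAudioDevice() and then RecognizeAsync(RecognizeMode.Multiple) on the returned engine.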



-------------------------


 06/06/2022 10:16 AM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

Wow! That video looks great. With multiply-embedded open-ended dictation this will be a must-have.

-------------------------

-Edgar
DPI 15.3, 64-bit Windows 10 Pro, OpenOffice & Office 365, Norton Security, Shure X2U XLR to USB mic adapter with Audio Technica DB135 vocal mic, Asus X299-Deluxe Prime, Intel Core i9-7940X (14 core, 4.3 GHz overclocked to 4.9 GHz), G.SKILL TridentZ Series 64GB (4 x 16GB) DDR4 3333 (PC4 26600) F4-3333C16Q-64GTZ, NVIDIA GIGABYTE GeForce GTX 1060 GV-N1060G1 GAMING-6GD REV 2.0 6GB graphics card with 3 1920x1080 monitors

 06/06/2022 11:38 AM
R. Wilke
Top-Tier Member

Posts: 8058
Joined: 03/04/2007

Originally posted by: Edgar Wow! That video looks great. With multiply-embedded open-ended dictation this will be a must-have.


Thanks for the feedback. Pointing out facilitating the multiple instance dictation variable was done on purpose, knowing that this would attract attention for sure, but effectively, it will be just the tip of the iceberg.

There will be a lot more additional features coming along with it, and most of all, the overall infrastructure will be a lot more responsive and robust than any previous KnowBrainer implementation.

It will all be thanks to the CFG grammar capabilities provided by the underlying SAPI 4 COM objects, which make it possible to hear it "straight from the horse's mouth" rather than going through the previously used SDK.



-------------------------

 06/07/2022 07:06 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

I'm with Edgar on the RW video - flawless, as usual. I admit that I did not understand probably 75% of the video. But what I could easily see was the software discipline of building a whole infrastructure and app for testing whatever was being tested. And RW's dictation never missed a beat, smoothly dictating content and "click this" and "click that." Very smooth, like Rob's videos - maybe it's a European quality standard that I have not reached yet. Anyhow, it looks impressive. I'm sure the next KB will be smoking fast and reliable if this demo is any indication. RW, may I ask a question? Why did you go with SAPI 4 instead of SAPI 5? (I don't know the difference.) Thanks.

-------------------------


 06/08/2022 06:50 AM
Matt_Chambers
Top-Tier Member

Posts: 756
Joined: 08/09/2018

Originally posted by: kkkwj I'm with Edgar on the RW video - flawless, as usual. I admit that I did not understand probably 75% of the video. But what I could easily see was the software discipline of building a whole infrastructure and app for testing whatever was being tested. And RW's dictation never missed a beat, smoothly dictating content and "click this" and "click that." Very smooth, like Rob's videos - maybe it's a European quality standard that I have not reached yet. Anyhow, it looks impressive. I'm sure the next KB will be smoking fast and reliable if this demo is any indication. RW, may I ask a question? Why did you go with SAPI 4 instead of SAPI 5? (I don't know the difference.) Thanks.

Rüdiger answered your SAPI 4 question a few days ago, in this thread.  https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=12&threadid=36307&enterthread=y  SAPI 4 is what Dragon uses.

 06/09/2022 02:11 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Thanks Matt! I must have missed it. That makes perfect sense, to use the SAPI that Dragon uses.



-------------------------


 06/08/2022 02:52 PM
R. Wilke
Top-Tier Member

Posts: 8058
Joined: 03/04/2007

First off, thanks Kevin, for the positive feedback. I am fully aware that I should have made the video more obvious in terms of providing pointers and hints as to what it is all about, but I didn't, and that was the reason why I called it "a quick and dirty demo", hoping the message might make it through all the same. At the very least, it may have become obvious that the new KnowBrainer and the previous KnowBrainer will be like night and day.

I will try and provide a few more keynotes as I go along.

Matt has been kind enough to test and handle all the details as regards the new development as well as posting about them on this forum, so this is also why he brought up my previous response to the SAPI 4 question. When choosing Matt as my regular sidekick in all this, I did this for a reason, and his overall contribution has already become invaluable.

Now, for starters, what is the video about?

The primary goal is to demonstrate the power and flexibility of the new custom grammar object going into the next KnowBrainer generation. Along with allowing multiple "dictation" variable instances, it will also cater for alternate phrases and optional phrases within a command, thus allowing for defining "natural language commands" the same way they have been implemented into Dragon itself. (Admitting that this has not been a part of the video.)

A secondary goal has been to demonstrate that I wasn't kidding when mentioning that, on my system, it typically takes about 10 seconds to get Dragon up and running, and about the same time to shut it down "cleanly". And by "cleanly" I mean making it disappear completely from the tasks, as demonstrated by the microphone icon disappearing from the system tray. Which has always been the ultimate proof of concept for me.

Now, how does all this come together?

SAPI 5 was published in 2000, at a time when the basic Dragon architecture had already been developed and, very fortunately from my perspective, had been developed along the lines of SAPI 4 and was about 100% compatible with it.

So basically, at the bottom, Dragon runs on SAPI 4, and very fortunately again and quite astonishingly, that never changed.

To that end, if you really want to hack into Dragon, and do this cleanly enough, you will have to go the SAPI 4 route, and do this in COM at best. By the same token, if you want to get out of it cleanly enough, stick with it, and don't go through wrappers, as such add-ons as the previous KB, VC, and SS do.

If you or anyone else wants more evidence, feel free to ask.



-------------------------



 06/09/2022 02:31 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Hi RW, thank you for the extra info above - it was excellent and explained a lot. I have no doubt that the next KB will "do things right" regarding loading in commands with your new grammar object. IMHO, anyone who builds a detailed app for testing (as shown in your video) is very serious about getting things right and measuring performance carefully. I'm also sure that KB users will definitely notice and appreciate the effects of your new and better software architecture.

On another topic of 10-second Dragon loads, I have made some progress on my end. I found that SS had created a 75MB log file, so that might have had something to do with my blues. So I don't use it anymore. I also found that there was another process (UtterCommand) that I should have killed when I killed Dragon for its slowness. UC adds many DVC commands to Dragon and runs a process in the tray to help. I believe that the tray process, being a zombie after I killed Dragon, really messed with the Dragon startup procedure somehow. So now I kill the UC tray process whenever I kill Dragon, and I don't have the infinite Initializing messages from Dragon after a kill/boot cycle. KB is not part of my normal command set, so it had nothing to do with the issue.

Anyway, now I can kill Dragon/UC instantly, and my Dragon startups are better. I will measure them again after a reboot. As I write (with no UC tray running), it takes 17 seconds for the Dragon button to appear, and (much to my surprise) Dragon isn't ready to go until the 58-second mark. For some reason, I thought I remembered faster startups on fresh boot operations. (Keep in mind that Dragon is still loading a pile of DVC commands under the hood, my vocabulary, and a couple of hundred simple script commands.) I will post here again after a reboot. (I tried looking at the Dragon log file, but it is inscrutable.)

 

UPDATE: After a fresh reboot, with UCTray running from the Start folder, Dragon took about 10-11 seconds for the DragonBar to appear, and another 9-10 seconds for the microphone ready light to appear (that probably includes loading UC DVC commands, my 200 commands, and my profile with vocabulary). 20 seconds is much more acceptable.

 

I wonder what slows it down so much after I kill it and when I have my apps open (Outlook, Chrome (a huge time hole if Dragon starts looking at all those tabs)). I'll have to do more experimenting.

 

UPDATE 2: I exited Dragon after taking those measurements, started Chrome with 50+ tabs, and restarted Dragon again. The DragonBar appeared at 8 seconds, and the ready microphone at 14 seconds. Wow. I'd love to have that all the time!

 

Then I killed Dragon and the UC tray process, leaving Chrome open, opened up VStudio 2022, opened up Outlook, and booted Dragon again. The DragonBar appeared at 8 seconds, and the ready light appeared at 14 seconds. From this I conclude that none of Chrome, VStudio, Outlook, UCTray, or my kill process procedure affect the Dragon startup time. It must be something else. I will watch out and report back if I find out anything new.



-------------------------




 06/09/2022 06:55 PM
R. Wilke
Top-Tier Member

Posts: 8058
Joined: 03/04/2007

Thanks, Kevin, but the ultimate goal is to stop the killing altogether, in the Ukraine, in the US, and on the computer.

-------------------------

 06/10/2022 02:46 AM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Totally true. And by the jabs!

-------------------------


 03/26/2023 05:48 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

To get back to the original topic that Edgar posted long ago, I thought I would add this C# code to show how to talk to the Dragon engine control. It works, sort of.

 

I tried a C# example to hook up to the RecognitionMimic method. It worked, sort of. Looking at the recognition history, Dragon recognized the "wake up" string as a command, and recognized Hello and there as strings that it inserted into Notepad.

 

Dragon completely failed to recognize the multi-word string in my example, although it works for the VBA example above. I looped and split a multi-word into individual words, and Dragon stopped recognizing the words after only FIVE words. No matter what I did, I could not get it to recognize six words. Very strange.

 

I continued to play with the looped words, varying them. It turns out that one of the embedded words was also a DVC command - the sixth word, which explains why Dragon would only recognize and insert the first five words into the Notepad buffer. So that issue was solved.

 

But the string with multiple words in one string still failed completely. (In another post, Chuck said that (historically) HeardWord only took lowercase individual words, which may map on to RecognitionMimic in some way. Maybe that is the problem there.)

 

Anyhow, someone might be able to get something out of this C# example. You need to include the "using DNSTools;" line in your project. If anyone has any ideas or comments on the slowness of this approach in C#, feel free to comment. I ran this code snippet from a tiny Windows Forms project under the Visual Studio debugger.

 

    // At the top of the file:
    using System.Threading;
    using System.Windows.Forms;
    using DNSTools;

    // Inside a method (for example, a button click handler):
    // the mic button state is ignored when using the engine control mimic interface
    var engine = (IDgnEngine) new DgnEngineControl();
    MessageBox.Show(@"Registering DgnEngineControl ...");
    engine.Register();
    MessageBox.Show(@"DgnEngineControl registered! Switch to notepad now!");
    Thread.Sleep(2000); // switch to notepad manually
    engine.RecognitionMode = DgnRecognitionModeConstants.dgnrecmodeNormal;

    string msg;

    // it recognized this as a command to wake up the mic (which was already awake)
    msg = "wake up";
    engine.RecognitionMimic(msg);
    Thread.Sleep(2000);

    // hello and there arrived in notepad
    msg = "Hello";
    engine.RecognitionMimic(msg);
    Thread.Sleep(2000);

    msg = "there";
    engine.RecognitionMimic(msg);
    Thread.Sleep(2000);

    // this was not recognized in the Recognition history
    msg = "Can dragon recognize multiple words?";
    engine.RecognitionMimic(msg);

    // it can see and insert FIVE words only in this loop!
    // (the sixth word turned out to be a DVC command, as explained above)
    var words = "maybe it can see six words or more".Split();
    foreach (var word in words) {
      engine.RecognitionMimic(word);
      Thread.Sleep(1000);
    }

    engine.UnRegister(false);
    MessageBox.Show(@"Engine unregistered!");
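
Since the five-word limit turned out to be a word that was also a command, a quick pre-check on the phrase can save some head scratching. The following is a small sketch under that assumption only: MimicHelper and FirstCommandWordIndex are made-up names, and the set of command words is something you would have to supply yourself, because nothing in this thread shows a way to ask Dragon for it.

```csharp
using System;
using System.Collections.Generic;

public static class MimicHelper
{
    // Returns the zero-based index of the first word in the phrase that is
    // also a known command word, or -1 when no word collides.
    public static int FirstCommandWordIndex(string phrase, ISet<string> commandWords)
    {
        var words = phrase.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        for (var i = 0; i < words.Length; i++)
        {
            if (commandWords.Contains(words[i]))
            {
                return i;
            }
        }
        return -1;
    }
}
```

For example, with a hypothetical command set containing only "words" (built with StringComparer.OrdinalIgnoreCase), the phrase "maybe it can see six words or more" yields index 5, the sixth word, which matches the symptom of only five words arriving in Notepad.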

 

 



-------------------------




 04/24/2023 11:11 PM


BigTech
Senior Member

Posts: 122
Joined: 11/25/2008

Vocola 2 mic off, FWIW...

offer = #turn off microphone
  PlaySound("D:\files\wav\special effects\thumph1.wav")
  HeardWord("microphone","off")
;#turn off microphone

 05/03/2023 11:44 AM


wheels496
Advanced Member

Posts: 197
Joined: 10/01/2008

I have been experimenting with trying to turn the microphone on and off in C#. (It's part of a bigger idea to solve the server busy problem with Visual Studio: using the application for enabling/disabling Select-and-Say in window presentation controls works, but then you cannot dictate in text boxes.)

So in C# I have registered the Dragon ActiveX control DLL and here is my code:

using System;
using System.Windows.Forms;
using DNSTools;

namespace Dragon_API_test
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            var engine = (IDgnEngine)new DgnEngineControl();
            engine.RecognitionMimic("microphone off", 0);
        }
    }
}

However when it gets to the recognition statement, it reports that the class is not registered. Yes I could try registering the DLL with regsvr32 but wanted advice first.

Thanks



-------------------------

DPI 15.6.1

 05/03/2023 12:15 PM
Edgar
Top-Tier Member

Posts: 1375
Joined: 04/03/2009

Originally posted by: wheels496 

 

However when it gets to the recognition statement, it reports that the class is not registered.

from earlier in this thread:

private static void TextToSpeech(string pSpeakThis) {
         DgnMicBtn gDgnMic = new DgnMicBtn();
         gDgnMic.Register(0);
         try {
            ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOff;
            SpeechSynthesizer speechSynthesizerObj = new SpeechSynthesizer();
            speechSynthesizerObj.Speak(pSpeakThis);
            speechSynthesizerObj.Dispose();
         }
         catch (Exception exception) {
            TimedMessage("Dragon®’s DgnMicBtn failed when turning the microphone off. The error message is:" +
               Environment.NewLine + exception.Message, "Dragon® ERROR");
         }
         try {
            ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOn;
         }
         catch (Exception exception) {
            TimedMessage("Dragon®’s DgnMicBtn failed when turning the microphone on. The error message is:" +
               Environment.NewLine + exception.Message, "Dragon® ERROR");
         }
         gDgnMic.UnRegister();
      }

and I think that with Dragon 16 the very last line is no longer valid and needs to be removed/commented out. I think that the two first statements of the method are what you are missing.



-------------------------


 05/04/2023 01:49 AM
Mav
Top-Tier Member

Posts: 666
Joined: 10/02/2008

The error doesn't relate to DgnEngine not being registered in COM.

In order to call (almost) any method on DgnEngine you first have to call its Register() method.
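
The call order can be sketched without Dragon installed by hiding the engine behind a small interface. Everything named here (IMimicEngine, RecordingEngine, MimicRunner) is hypothetical and exists only for this sketch. A real adapter would forward to the calls already shown in this thread: engine.Register(), engine.RecognitionMimic(...), and engine.UnRegister(false) on the IDgnEngine object.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the Dragon engine control.
public interface IMimicEngine
{
    void Register();
    void RecognitionMimic(string words);
    void UnRegister();
}

// Stand-in that only records the call order, so the sketch runs without Dragon.
public sealed class RecordingEngine : IMimicEngine
{
    public List<string> Calls { get; } = new List<string>();
    public void Register() { Calls.Add("Register"); }
    public void RecognitionMimic(string words) { Calls.Add("Mimic:" + words); }
    public void UnRegister() { Calls.Add("UnRegister"); }
}

public static class MimicRunner
{
    // Register first, mimic the phrase, and always unregister afterwards.
    public static void MimicOnce(IMimicEngine engine, string words)
    {
        engine.Register();
        try
        {
            engine.RecognitionMimic(words);
        }
        finally
        {
            engine.UnRegister();
        }
    }
}
```

Calling MimicRunner.MimicOnce(new RecordingEngine(), "microphone off") records Register, then the mimic call, then UnRegister. The try/finally is a design choice of this sketch, so that a failed mimic does not leave the engine registered.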

 

hth

mav

 05/04/2023 04:39 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

For what it is worth, here is the code that I use now to turn the Dragon mic on. Thanks to Ed for getting me started with his example a year ago or more. Thanks Ed!

 

  static void UxDragonMicrophoneOn() {
    // turn the mike on after Dragon is available
    try {
      var mikeButton = (IDgnMicBtn) new DgnMicBtn();
      mikeButton.Register(0);
      mikeButton.MicState = DgnMicStateConstants.dgnmicOn;
      mikeButton.UnRegister();
    }
    catch (Exception ex) {
      var m = $"{ex.Message}";
      Debug.WriteLine(m);
      FormAppendTextError(m);
    }
  }



-------------------------




 05/04/2023 05:02 PM
PG LTU
Top-Tier Member

Posts: 2248
Joined: 03/21/2007

So I can see why all the fuss if you are building a speech-enabled application, but if all you need to do is run a command to turn the mic on or off, or put it to sleep, a simple VBS script that takes the mic state as a parameter should get you all you need:

pgDgnMicControl.vbs

YMMV,



-------------------------




PG





Remember folks, my comments and this forum are for entertainment value only, please, no wagering or other reliance on the contents herein.  I permit no commercial use of my ideas (whether expressions or embodiments) without my written consent.


