KnowBrainer Speech Recognition

Topic Title: From C# - turn the Dragon® microphone OFF then back ON
Created On: 05/19/2021 10:07 PM
Status: Post and Reply

Thread index:
- Edgar - 05/19/2021 10:07 PM
- dilligence - 05/19/2021 11:21 PM
- Edgar - 05/20/2021 02:17 AM
- Mphillipson - 05/20/2021 12:50 PM
- TomDonovan - 05/21/2021 08:31 AM
- Edgar - 05/21/2021 12:41 PM
- kkkwj - 05/21/2021 02:51 PM
- Edgar - 05/21/2021 04:14 PM
- kkkwj - 05/21/2021 08:33 PM
- Edgar - 05/21/2021 08:56 PM
- Handsfreecoder - 05/22/2021 08:18 PM
- Edgar - 05/22/2021 08:45 PM
- Edgar - 05/22/2021 08:48 PM
- kkkwj - 05/23/2021 02:33 AM
- Handsfreecoder - 05/24/2021 10:41 AM
- Edgar - 05/24/2021 11:02 AM
- kkkwj - 05/24/2021 02:09 PM
- Edgar - 05/24/2021 02:31 PM
- Pranav Lal - 04/22/2022 10:01 PM
- tar - 04/23/2022 12:21 PM
- Edgar - 04/23/2022 03:14 PM
- Lunis Orcutt - 04/23/2022 12:49 PM
- kkkwj - 05/25/2022 12:48 PM
- kkkwj - 05/28/2022 02:43 PM
- kkkwj - 05/25/2022 07:54 PM
- kkkwj - 05/28/2022 02:01 PM
- kkkwj - 05/28/2022 02:06 PM
- kkkwj - 05/31/2022 10:46 PM
- monkey8 - 06/01/2022 05:26 AM
- kkkwj - 06/01/2022 12:13 PM
- Mav - 06/02/2022 01:51 AM
- monkey8 - 06/05/2022 08:08 AM
- kkkwj - 06/05/2022 12:55 PM
- kkkwj - 06/02/2022 01:55 AM
- Mav - 06/02/2022 08:02 AM
- kkkwj - 06/02/2022 12:00 PM
- Edgar - 06/02/2022 12:49 PM
- kkkwj - 06/02/2022 05:08 PM
- Mav - 06/03/2022 01:50 AM
- kkkwj - 06/03/2022 11:49 AM
- monkey8 - 06/05/2022 02:59 PM
- kkkwj - 06/05/2022 04:47 PM
- R. Wilke - 06/06/2022 04:51 AM
- monkey8 - 06/06/2022 01:00 PM
- kkkwj - 06/07/2022 07:08 PM
- Edgar - 06/06/2022 10:16 AM
- R. Wilke - 06/06/2022 11:38 AM
- kkkwj - 06/07/2022 07:06 PM
- Matt_Chambers - 06/08/2022 06:50 AM
- kkkwj - 06/09/2022 02:11 PM
- R. Wilke - 06/08/2022 02:52 PM
- kkkwj - 06/09/2022 02:31 PM
- R. Wilke - 06/09/2022 06:55 PM
- kkkwj - 06/10/2022 02:46 AM
- kkkwj - 03/26/2023 05:48 PM
- BigTech - 04/24/2023 11:11 PM
- wheels496 - 05/03/2023 11:44 AM
- Edgar - 05/03/2023 12:15 PM
- Mav - 05/04/2023 01:49 AM
- kkkwj - 05/04/2023 04:39 PM
- PG LTU - 05/04/2023 05:02 PM

Turning the microphone off:

engineControl.RecognitionMimic("microphone off", 0);

Turning it back on requires either first creating a new voice command "microphone on" (containing SetMicrophone 1) and then:

engineControl.RecognitionMimic("microphone on", 0);

or creating a vbs, ahk etc. script that does that last mimic and calling it with System.Diagnostics.Process.Start.
-------------------------
Turbocharge your Dragon productivity with 40+ Power Addons
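
As an illustration of the external-script route just mentioned, a minimal sketch in C# (the script name and path are hypothetical placeholders, not files from this thread):

// Launch an external script that performs the "microphone on" mimic.
// "C:\Scripts\MicOn.vbs" is a placeholder path for illustration only.
System.Diagnostics.Process.Start("wscript.exe", @"C:\Scripts\MicOn.vbs");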

> engineControl.RecognitionMimic("microphone on", 0);
> or create a vbs, ahk etc. script that does that last mimic and call it with System.Diagnostics.Process.Start

There must be a better way than creating an external script. I suspect that both of these are exposed through something other than "RecognitionMimic".
-------------------------
-Edgar

This seems to be mentioned in the Dragon NaturallySpeaking API, as per the following screenshot: (screenshot not preserved)
-------------------------
Thanks Mark
Dragon Professional Advanced Scripting/KnowBrainer Scripts

Here is an AutoHotKey script I wrote that does something along the lines of what you want. It shouldn't be too hard to translate.

PlayText(Text) {
    DgnMic := ComObjCreate("Dragon.MicBtn")
    DgnMic.Register()
    MicState := DgnMic.MicState
    DgnMic.MicState := 1
    voice := ComObjCreate("SAPI.SpVoice")
    ComObjConnect(voice, "SP_")
    voice.Rate := 3
    voice.Speak(Text, 1)
    voice.WaitUntilDone(20000)
    DgnMic.MicState := MicState
    Sleep, 100
    DgnMic.UnRegister()
    DgnMic :=
    ;Dgn.UnRegister(0)
    ;Dgn :=
    Voice :=
    return
}

-------------------------
Tom

Thanks Tom, that was enough to push me over the hump! Currently I have this defined outside the actual method:

private static DgnMicBtn gDgnMic;

although I doubt that it is necessary to define it externally, as it will never be used anywhere but in the method. Here is the method:

private static void TextToSpeech(string pSpeakThis)
{
    gDgnMic = new DgnMicBtn();
    gDgnMic.Register(0);
    try
    {
        ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOff;
        SpeechSynthesizer speechSynthesizerObj = new SpeechSynthesizer();
        speechSynthesizerObj.Speak(pSpeakThis);
        speechSynthesizerObj.Dispose();
    }
    catch (Exception exception)
    {
        TimedMessage("Dragon®’s DgnEngineControl failed when turning the microphone off. The error message is:" + Environment.NewLine + exception.Message, "Dragon® ERROR");
    }
    try
    {
        ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOn;
    }
    catch (Exception exception)
    {
        TimedMessage("Dragon®’s DgnEngineControl failed when turning the microphone on. The error message is:" + Environment.NewLine + exception.Message, "Dragon® ERROR");
    }
    gDgnMic.UnRegister();
}

"TimedMessage" is just a speech-friendly message box with an optional timeout. This version turns the microphone off instantly before reading, then turns it right back on.
-------------------------
-Edgar

Yay! Thank you Edgar - we now have some C# code that works with the microphone, which is a start. I sure wish Nuance would convert their C++ API documentation to C#. Maybe they will now, after the acquisition. Maybe some Microsoft guys will produce an API that matches their standard C# "going forward" language. (Haha, hope, hope, :-))
-------------------------
Win10/11/x64, AMD Ryzen 7 3700X/3950X, 64/128GB RAM, Dragon 15.3, SP 7 Standard, SpeechStart, Office 365, KB 2017, Dragon Capture, Samson Meteor USB Desk Mic, Amazon YUWAKAYI headset, Klim and JUKSTG earbuds with microphones, excellent Sareville Wireless Mono Headset, 3 BenQ 2560x1440 monitors, Microsoft Sculpt Keyboard and Logitech G502 awesome gaming mouse.

I used to be on Microsoft's Accessibility Team. I was hoping to be able to influence the way their applications were designed in order to be more accessible to folks with disabilities. This did not pan out. I've been thinking about contacting them and offering my "work from home" services - just in the realm of their acquisition of Nuance, and specifically Dragon Professional Individual. I wonder if any of the Microsoft Dragon software team will be/is monitoring this forum?
-------------------------
-Edgar

What a great idea. If you can find the right people, maybe they would be interested. I suppose you'd have to find the team that is/becomes responsible for Dragon. You would be a great addition to their team, even as a part-time consultant!
-------------------------

The real problem is the commute - back then I only had to go in about once a month and they always scheduled it for off-peak traffic hours; still, it was always well over an hour and sometimes over two hours!
-------------------------
-Edgar

> catch (Exception exception) { Environment.NewLine + exception.Message, "Dragon® ERROR"); }

I hate to tell you this, but you are not even using an engine control in your code above, so the engine control wouldn't trigger an exception.

There are more VB examples in the API than C++, and they are easily convertible to C#. Why would Microsoft produce an API for Dragon that matches their standard C# when it already exists?

Which Microsoft accessibility team would that be? I have never seen applications as accessible for the disabled as Microsoft applications. I wish other developers went to the same lengths as Microsoft did. Are you actually aware of the number of APIs available for Microsoft applications that make them accessible for the disabled? What exactly isn't accessible?

private static void TextToSpeech(string pSpeakThis)
{
    DgnMicBtn gDgnMic = new DgnMicBtn();
    gDgnMic.Register(0);
    try
    {
        ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOff;
        SpeechSynthesizer speechSynthesizerObj = new SpeechSynthesizer();
        speechSynthesizerObj.Speak(pSpeakThis);
        speechSynthesizerObj.Dispose();
    }
    catch (Exception exception)
    {
        TimedMessage("Dragon®’s DgnMicBtn failed when turning the microphone off. The error message is:" + Environment.NewLine + exception.Message, "Dragon® ERROR");
    }
    try
    {
        ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOn;
    }
    catch (Exception exception)
    {
        TimedMessage("Dragon®’s DgnMicBtn failed when turning the microphone on. The error message is:" + Environment.NewLine + exception.Message, "Dragon® ERROR");
    }
    gDgnMic.UnRegister();
}

-------------------------
-Edgar

> Which Microsoft's accessibility team would that be?

https://news.microsoft.com/on-the-issues/2019/09/25/accessibility-supportability-anne-taylor/
-------------------------
-Edgar

@Handsfree,

> There are more VB examples in the API than C++ which are easily convertible to C#? Why would Microsoft produce an API for Dragon that matches their standard C# when it already exists?

Yes, there are many such examples. But they're in VB and C++, not C#. I'm sure the argument-calling conventions (types, casts, etc.) could be worked out for C (in the DllCall APIs), C++, and so on. But it would sure be easier if there were some C# examples around. Credit to Edgar for biting the bullet, spending some time, and posting some C# examples.

I suppose for precision, I should say that I'm not asking MS/Nuance to create a brand new API to replace the one that exists. I'm wishing for better documentation and examples, which do not exist. If you know all these answers and can easily show some C# examples of doing things, please feel free to post some to get the rest of us started. :-)
-------------------------

So just to be clear on this: you don't know how to integrate Dragon objects (which after all are implemented using a Microsoft object model/interfaces) into a Microsoft .NET language C# application, and you are unaware of the existing accessibility APIs available with Microsoft applications. Yet you, or one of you, are going to offer Microsoft consultancy on how to make their applications more accessible for disabled users, including those using Dragon?

Wow.

Then for good measure you want me to post some C# code showing you how to do this, to save you doing your own homework, when there are pages and pages of relevant information and examples available on Google where you just need to join the dots?

> […] you want me to post some C# code […] there are pages and pages of relevant information and examples available on Google where you just need to join the dots?

For me, that sums it up in a nutshell. But, just to clarify the issue, if Microsoft® were to ask me to rejoin the Accessibility Team (which, from the looks of it, may no longer exist) it would be in the sense of me offering my experience as a disabled user of both Dragon® and Windows. Think of me as a "highly(?) experienced" beta tester.
-------------------------
-Edgar

I appreciate Edgar's C# contributions, especially over the past year. He posts things "at the right level" and "with a suitable standalone architecture" so that they can be used by people who can compile C# programs. Thanks Edgar!
-------------------------

Oh, just so folks don't get the wrong impression: working on Microsoft®'s Accessibility Team was virtually a volunteer position. We were not paid, but got significant perks.
-------------------------
-Edgar

Hi Edgar,

If I may ask, what is the assembly name you used to get access to Dragon functionality in .NET? Did you need to have the SDK from Nuance installed?

Pranav

No, you don't need the SDK. Go to the Reference Manager in Visual Studio, go to COM, then "Dragon NaturallySpeaking ActiveX Controls".
-------------------------
Tom
Programmer Of SP 7 PRO (speechproductivity.eu)
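
For illustration, once that COM reference is added, Visual Studio generates an interop assembly whose namespace is DNSTools (as used later in this thread). A minimal sketch, following the Register/cast pattern from Edgar's code above:

using DNSTools; // interop namespace from the "Dragon NaturallySpeaking ActiveX Controls" COM reference

class MicDemo
{
    static void Main()
    {
        var mic = new DgnMicBtn();                   // coclass from the Dragon ActiveX library
        mic.Register(0);                             // attach to the running Dragon instance
        var state = ((IDgnMicBtn)mic).MicState;      // read the current mic state via the interface cast
        mic.UnRegister();
    }
}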

Hi Pranav! As Tom points out, you do not need the SDK (I do not have it). If Tom's answer didn't give you enough information, please post here and we will try to step you through the process. Please make sure to let us know which language (C++, C#, etc.) and IDE (Microsoft Visual Studio 2019, etc.) you are using.
-------------------------
-Edgar

Just a quick note on the Dragon SDK: it is no longer available, and hasn't been for quite some time.
-------------------------
Change "No" to "Know" w/KnowBrainer 2022

In the past few days, I implemented some C# code to turn the Dragon mic on and off. Here are some tips for those who might read this someday.

For example, Microsoft shows an example here: https://docs.microsoft.com/en-us/dotnet/api/system.speech.recognition.speechrecognitionengine?view=netframework-4.8. I used a different Microsoft example to get started, but I can't find the link anymore. Go figure.

I had trouble defining the required references to the speech libraries and Dragon libraries. I could not find them in the usual places in the Add Reference dialogs in Visual Studio. For the Windows speech libraries/platforms, I found the options below. What a mess. The System.Speech NuGet package is the easiest to work with - install the package in VStudio and go. The Microsoft Speech SDK requires downloading an SDK. I do not understand the benefits of the SDK, since the APIs are almost exactly the same. Maybe the SDK has access to more languages or something.

(1) System.Speech.dll (2022 NuGet package with 2.3M downloads; doc from 2015: https://docs.microsoft.com/en-us/archive/msdn-magazine/2014/december/voice-recognition-speech-recognition-with-net-desktop-applications). Install this as the NuGet package 'System.Speech'. This is the one that I used to get my program (and Ed's code above) working. The System.Speech libraries are installed through the NuGet package mechanism, which is why I could not find them in the Add Reference dialog box as installed type libraries. I ended up using the System.Speech package by accident, I think.

(2) Microsoft Speech Platform - Software Development Kit (SDK), Version 11. The Speech Platform API and runtime are derived from the Speech API (SAPI) and the speech runtime in Windows, but are primarily intended to support speech applications running as standalone services. As a result, there are some important differences between the Speech Platform and the speech functionality in Windows as provided by SAPI. The Microsoft.Speech library is installed by downloading the SDK from the official Microsoft download center. I downloaded the x64_MicrosoftSpeechPlatformSDK.msi file. After installing it, open the Add Reference dialog, browse to c:/program files/microsoft sdks/speech/ and select the dll. Then you can say 'using Microsoft.Speech' in your C# code. The APIs in this SDK are very close to the System.Speech APIs, and I was able to reuse my System.Speech code by changing the using statements. (But I still found an old chunk of code on the net for the SpeechRecognizedEventArgs that uses fields like e.Error that do not appear in either the System.Speech API or the Microsoft.Speech API.)

Here is an old comparison (Figure 9, circa 2014) from https://docs.microsoft.com/en-us/archive/msdn-magazine/2014/december/voice-recognition-speech-recognition-with-net-desktop-applications. Note that Microsoft.Speech is a downloadable SDK in 2022, while System.Speech is a NuGet package in 2022, usable in C# as managed code.

Microsoft.Speech.dll          System.Speech.dll
Must install separately       Part of the OS (Windows Vista+)
Can package with apps         Cannot redistribute
Must construct Grammars       Uses Grammars or free dictation
No user training              Training for specific user
Managed code API (C#)         Native code API (C++)

(3) Microsoft Speech SDK 5.1 (circa 2022) - looks like this uses SAPI API calls. The Microsoft Speech SDK 5.1 adds Automation support to the features of the previous version of the Speech SDK. You can now use the Win32 Speech API (SAPI) to develop speech applications with Visual Basic®, ECMAScript and other Automation languages. Version: 5.1. File name: SpeechSDK51.exe (related files: msttss22L.exe, sapi.chm, Sp5TTIntXP.exe, SpeechSDK51LangPack.exe).

(4) .NET Speech SDK (circa 2022, a paid Azure cloud service). The Speech software development kit (SDK) exposes many of the Speech service capabilities you can use to develop speech-enabled applications. The Speech SDK is available in many programming languages and across all platforms. (From Microsoft Cognitive Services: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-sdk.) The .NET Speech SDK is available as a NuGet package and implements .NET Standard 2.0; for more information, see Microsoft.CognitiveServices.Speech. Here is a nice FAQ page for the MS Speech Service (it uses the cloud for speech recognition at scale for telephony apps, and requires a paid service): https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/

(5) Windows.Media.Speech - forget it; this has nothing to do with general speech recognition.

From a project on GitHub (https://github.com/zoomicon/SpeechLib): there is an issue with the Windows Speech API, I think, regarding recognition quality for commands. The Windows Speech library has fewer languages, but the user can train it from the Control Panel (there is a Speech item there). There is an alternative API called Microsoft Speech that is very similar (only the SGML syntax needed some small changes to support both of those). That one has its own SDK (the Kinect runtime installs it) and has more generic recognition (less tuned to a specific user) with more languages. So if you get wrong recognition, maybe you should switch to the Microsoft Speech that the Kinect team suggests instead.

The Microsoft.Speech.dll libraries (I read somewhere, from 2015?) are the best place to start, because they are somewhat simpler and work with fixed-length, word-specific grammars. I think the Microsoft.Speech libraries are redistributable too, whereas the System.Speech libraries are not. In contrast, the System.Speech.dll libraries are supposed to be more complex and have extra features for complex dictation.

I tried to find out how to do open-ended dictation using the WSR doc and examples, but I failed. Does anyone know of examples of how to do that in WSR? Or if it is even possible? (I could find no documentation on the topic either way.) Update: I eventually found a comment on CodeProject that said, "Forget about training in WSR in your own code. The best you can do is run the OS training app for speech recognition in the Accessibility settings."

For working with Dragon, I thought I should look for COM libraries starting with "dgnXXX", like the ones you see around the forum and in the list of libraries shown in the Add Reference dialog box. But I could never get any of those to work. Ed helped me out by giving me the secret "Dragon NaturallySpeaking ActiveX Controls" library name, which I then found easily. Once I installed that, I was off to the races.

You use the Microsoft.Speech or System.Speech libraries to do the recognition, and the Dragon library (using Ed's example code above) to turn the Dragon mic on and off.
-------------------------
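
To make the survey above concrete, here is a minimal, self-contained System.Speech sketch for fixed command phrases (the class name and phrase list are illustrative, not from the thread):

using System;
using System.Speech.Recognition; // from the System.Speech NuGet package

class CommandListener
{
    static void Main()
    {
        // Inproc desktop recognizer (see the shared-vs-inproc discussion below).
        using (var recognizer = new SpeechRecognitionEngine())
        {
            var commands = new Choices("microphone on", "microphone off", "restart dragon");
            recognizer.LoadGrammar(new Grammar(new GrammarBuilder(commands)));
            recognizer.SetInputToDefaultAudioDevice();
            recognizer.SpeechRecognized += (s, e) =>
                Console.WriteLine($"{e.Result.Text} ({e.Result.Confidence:F3})");
            recognizer.RecognizeAsync(RecognizeMode.Multiple); // keep listening
            Console.ReadLine(); // exit on Enter
        }
    }
}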

BTW, open-ended recognition in WSR is hopeless. It can't even recognize simple phrases on Windows 11 when I use the System.Speech 'DictationGrammar' object to load a grammar containing common phrases and words. Dragon is far superior.
-------------------------

https://docs.microsoft.com/en-us/previous-versions/office/developer/speech-technologies/dd147134(v=office.14)
-------------------------

Here is a helpful post with more information on what the various libraries do.

Desktop recognizers are designed to run in-process (inproc) or shared. Shared recognizers are useful on the desktop, where voice commands are used to control any open applications. Server recognizers can only run inproc. Inproc recognizers are used when a single application uses the recognizer, or when wav files or audio streams need to be recognized (shared recognizers can't process audio files, just audio from input devices).

Only desktop speech recognizers include a dictation grammar (a system-provided grammar used for free-text dictation). The class System.Speech.Recognition.DictationGrammar has no complement in the Microsoft.Speech namespace. ** This means you should use System.Speech if you want to recognize open-ended dictation. **
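
As a sketch of that distinction (the class names are real System.Speech types; the snippet is illustrative only):

using System.Speech.Recognition;

class RecognizerModes
{
    static void Main()
    {
        // Inproc: a private engine for this app; can also read wav files/streams.
        var inproc = new SpeechRecognitionEngine();
        inproc.LoadGrammar(new DictationGrammar()); // free-text dictation, desktop engines only

        // Shared: the desktop-wide WSR recognizer that controls open applications.
        var shared = new SpeechRecognizer();
        shared.LoadGrammar(new DictationGrammar());
    }
}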
-------------------------

At last, a decent answer. Basically: use System.Speech (with the optional free-form DictationGrammar) on desktops, and use Microsoft.Speech for server-side scalable apps. I guess I got lucky using the easy System.Speech NuGet package by accident. And you could argue that all my time researching the terrible mess of Windows speech APIs was wasted. Oh well. At least I'm on the right track now, and others who read this thread won't waste as much time.

From https://stackoverflow.com/questions/9347346/appenddictation-on-microsoft-speech-platform-11-server:

"The short answer is that Microsoft.Speech.Recognition uses the Server version of SAPI, while System.Speech.Recognition uses the Desktop version of SAPI. The APIs are mostly the same, but the underlying engines are different. Typically, the Server engine is designed to accept telephone-quality audio for command & control applications; the Desktop engine is designed to accept higher-quality audio for both command & control and dictation applications. You can use System.Speech.Recognition on a server OS, but it's not designed to scale nearly as well as Microsoft.Speech.Recognition. The differences are that the Server engine won't need training, and will work with lower-quality audio, but will have a lower recognition quality than the Desktop engine."

"For anyone who may come across this in the future -- I've now emailed back and forth with Microsoft, and ultimately received this response: The managed interfaces (Microsoft.Speech and System.Speech) are built on top of the native SAPI interfaces. These interfaces are the same for both the Server engine and the Desktop engine. BUT the engine itself is responsible for implementing dictation, and the Server engine does not do so. Therefore, the call will fail when you load the grammar."

More comments from a related thread back in 2015 (https://stackoverflow.com/questions/12101120/matching-wildcard-dictation-in-microsoft-speech-grammar):

"But in the C# API there is a DictationGrammar and WildcardGrammar. I could achieve my goal if I 'hardcode' it. In fact I activate a dictation grammar for some special case (even if it is bad, I agree)."

"The C# API works with both the desktop engine and the server engine. The desktop engine supports DictationGrammar and WildcardGrammar; the server engine does not."

"Kinect uses Microsoft.Speech, not System.Speech as it seems, although you could probably grab the audio from Kinect and use it with System.Speech somehow (but I think you need training of the recognition engine if you go with System.Speech)."

"Btw, it seems Microsoft united their previously separate installers for their Server (accessed via Microsoft.Speech in .NET) and Client (accessed via System.Speech in .NET) speech runtimes into the Microsoft Speech Platform Runtime (Version 11), found at microsoft.com/en-us/download/details.aspx?id=27225. The respective SDK is at microsoft.com/en-us/download/details.aspx?id=27226. The speech runtime version that was labeled in the past 'for servers' is non-trainable (it has a setting for acoustic model adaptation on/off, though) and doesn't accept free speech, only commands."

Here is another comment, from the VoiceElements website. It says that MS stopped development on the Microsoft Speech Platform back in 2012:

"Microsoft stopped development on the Microsoft Speech Platform in 2012. Instead of processing text-to-speech (TTS) or speech recognition (SR) on-premises, Microsoft now steers its customers to use their cloud services on Azure. Those services and other similar services on the cloud can provide excellent SR and TTS, and can work in conjunction with the Voice Elements platform. However, since there is no charge for the Microsoft Speech Platform, we continue to support it as our go-to default facility for TTS and SR. The Microsoft Speech Platform is comprised of the following. Microsoft Speech Platform Runtime: you should have this installed on your server in order to perform speech recognition functions within Voice Elements. Voice Elements has built out support for the Microsoft Speech Platform, as long as you use Microsoft-compatible grammar files. These are easy to create using the methods outlined in this article: Create Microsoft Speech Compatible Grammar Files. The runtime can be downloaded at Microsoft Speech Runtime. Microsoft Speech Language Packs: the Microsoft Speech Platform relies on different language packs in order to provide speech recognition capabilities for different languages. The Microsoft Speech Platform supports 18 different languages and accents. You can download some of the more popular languages using the links below. For additional options, please contact Inventive Labs Technical Support."
-------------------------

A bit more info, gleaned from reading the VoiceAttack and VoiceMacro forums, manuals, and GitHub extensions. Both of these programs were (or are) based on the built-in MS Speech Recognizer Engine 8.0 that has been included with Windows ever since Windows XP. VoiceAttack now allows you to download the upgraded MS Speech Platform 11 runtime engine and select it within the VoiceAttack options. I investigated the Speech Recognition / Advanced Options on my Win10 machine, and much to my surprise found a SpeechStart6Profile in my list of trained voice profiles, which means that SpeechStart also uses the built-in MS recognition engine.

Note that if you're using Speech Platform 11, there is no applicable training. That only applies to the built-in SAPI engine. Speech Platform 11 is designed to be speaker-independent, and does not need to be/cannot be trained.

For my two bits, nothing touches Dragon for recognition accuracy.
-------------------------

> Go to the reference manager in Visual Studio, go to COM, then "Dragon NaturallySpeaking ActiveX controls"

> Hi Pranav! As Tom points out, you do not need the SDK (I do not have it). If Tom's answer didn't give you enough information please post here and we will try to step you through the process.

Nothing could be further from the truth. Both statements above are contradictions in terms.

Kevin,

Hopefully to help others, if not yourself: the bottom line is that accessing the .NET libraries like System.Speech and System.Speech.Recognition only gives you access to a very small subset of the SAPI 5 interfaces, albeit they are convenient to use with C# or VB .NET. To unleash the true power of SAPI 5 you need to access the SpeechLib.dll COM libraries, and to use those with C# you will need to use COM Interop (homework) or alternatively C++.

Writing a .NET program that recognises SAPI 5 commands to switch on the Dragon microphone is child's play to a programmer, especially with a shared recogniser. It's all too easy to add voice commands to SAPI 5 that get recognised, but it's not so easy to confine those voice commands to only being recognised on the EXACT phrase spoken when multiple commands are loaded. In other words: tuning the recognition accuracy of the speech engine that you create, which involves things like setting up your recognition grammars, training, and using an inproc recogniser to give you full control of your application while avoiding any integration with WSR.
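
For orientation, a minimal sketch of the COM Interop route mentioned above (it assumes a COM reference to the Microsoft Speech Object Library, which Visual Studio imports as the SpeechLib namespace):

using SpeechLib; // COM interop for SAPI 5 (Microsoft Speech Object Library)

class SapiComSketch
{
    static void Main()
    {
        // Shared recognition context (the desktop-wide recognizer).
        var context = new SpSharedRecoContext();
        var grammar = context.CreateGrammar(0);
        grammar.DictationSetState(SpeechRuleState.SGDSActive); // activate the dictation topic
        System.Console.ReadLine(); // keep the context alive while experimenting
    }
}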
------------------------- |

Hi Lindsay, thank you for your comments. I'm sure that you're right - the System.Speech.* and Microsoft.Speech libraries totally hide and protect me from the SAPI API complexity! I had the SpeechLib.dll library installed for a while as I tried to figure out the complete mess that is the speech library world. But I ended up on System.Speech in C#, which is working okay so far. (I keep trying to avoid learning C++ - I went from C to C# and jumped over C++. I met Bjarne Stroustrup, the C++ creator, a few times at conferences, and he was much like the C++ language... :-))

I also agree that turning the mic on and off is child's play (especially since Edgar posted the C# code for it. :-) I would love to find some C# examples (or even C++ examples) of how to use an inproc recognizer instead of a shared recognizer, but in all my searching over the past week or two, I couldn't find any on how to do it. Maybe it's buried in the SAPI or Microsoft Speech Platform 11 SDK manuals somewhere.

I also think you're right on recognizing the larger problem of choosing phrases and keywords for your command grammars to promote higher-quality recognition. You need to pick words and phrases that are technically workable while still being memorable, pronounceable, and cognitively pleasing (or at least acceptable). But for now, since I have no C# examples, no inproc recognizer, no solid grammars, and no training ability, I will slog on with the lowly WSR, which seems acceptable for short, fixed-length phrases but is totally useless (IMHO) at free-form dictation. I know that WSR gives you some training scripts, so there is some training going on somewhere, somehow. But I don't know how to do it myself. Do you know of a recent good book on SAPI that would explain all this stuff easily (and that uses C# for examples :-))? That might save me some time on my journey... Cheers
-------------------------

Are you sure you want to invest your time in a dead technology like WSR/SAPI? WSR's accuracy has never been that great for dictation (even though it's working fine for simple grammars most of the time), and customization/adaptation has always been very tricky. If you want up-to-date technology for high-accuracy dictation apart from Dragon, I can only recommend MS Azure Speech Services. There's a ton of examples on GitHub, it doesn't cost you anything if you use the free tier, and now that MS has bought Nuance and is integrating them into their cloud business unit, chances are that cloud-based SR will get even better.

hth
mav

Hi, yes, all this local speech recognition stuff is old/dead tech; I recall posting somewhere that the doc was all ancient (circa 2006 Windows XP for MS Speech 8.0, and circa 2012 for MS Speech Platform 11). But Windows still ships with MS Speech 8.0, you can use 11.0 with a download, it's free, documented (very messy doc overall), and doesn't require yet another account and an internet connection.

I suppose you can see that I'm on a learning journey. I freely admit that I might end up in the Azure Cognitive cloud at the end of it. But for now, I'm just trying to build an app like Mark has already built, to run beside or instead of Dragon. I'm sure that building an Azure app will be easier after learning this MS speech stuff. It sure looked like there were many examples around (including the word "Azure" in Mark's posted example!)
-------------------------

Hi Mav, thank you for the interesting info! I had heard of the Azure cloud stuff (is that the same as the Cognitive Services group in Microsoft?). But I had not heard of a free tier, nor of the examples on GitHub. I will have a look as time permits. At least for now, I only have a few simple fixed commands that WSR is recognizing okay (microphone on, microphone off, restart dragon, etc.). I think I'll try out the MS Speech Platform 11 stuff when I get a chance too. And then if the examples on GitHub are in C#, maybe I'll even try the cloud services. So much to do, and so little time... :-)
-------------------------

Hi Mav, thank you for the encouragement and for providing some easy-access links. Here is the link for the C# Windows Hello World using the Azure Cognitive speech stuff. It looks pretty straightforward and convenient. It is most interesting that it uses free-form dictation and does not require loading up a grammar (although I am sure that it is capable of doing so). Cheers
-------------------------

@KKKWJ - I don't see a link...
-------------------------
-Edgar

Ed, you're right! Here it is for the first time. :-)

https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/quickstart/csharp/dotnet/from-microphone/helloworld/Program.cs
-------------------------
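
Condensed from that quickstart, the core of the Azure sample looks roughly like this (the subscription key and region are placeholders you must supply yourself):

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech; // Azure Speech SDK NuGet package

class AzureQuickstart
{
    static async Task Main()
    {
        // "YourSubscriptionKey" and "YourServiceRegion" are placeholders.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        using (var recognizer = new SpeechRecognizer(config))
        {
            Console.WriteLine("Say something...");
            var result = await recognizer.RecognizeOnceAsync(); // single utterance
            Console.WriteLine($"Recognized: {result.Text}");
        }
    }
}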

Azure Speech Services is speaker-independent and thus doesn't require a speaker profile. Regarding grammars, I'm unsure if they are supported - I haven't looked into it. But judging from the available services, I think MS is following a different approach by separating input from intent recognition: provide a speech recognition engine that's capable of recognizing exactly what was said and producing a transcript, then have the next service detect the intent from the textual natural-language input and invoke actions according to the different intents (that's what LUIS does). It's more of an "engine" for driving voice assistants that can perform different tasks.

hth
mav

Yes, I agree with everything you say about the Azure Cognitive Services speech stuff. I explored the documentation and it seemed to be all about transcription of incoming speech for larger voice-driven systems - bots, assistants, ordering systems, meetings, etc. (Makes me wonder if someone will make an auto-speech bot for the old porn phone lines that earned people a lot of money. :-))

I also saw that the "free" stuff was a limited number of hours (like 5 hours) of free transcription, etc. I couldn't find anything that looked like "give a voice command and then do something on the local machine with the result." But I suppose if the system could "transcribe" my Dragon commands, it could send me back some text and I could parse it and dispatch to a script on my own computer. And then after five hours of commands, I'd have to sign up for a subscription (sometimes hundreds of bucks a month). I probably won't go through the hassle of creating an account and trying the free stuff - you need to provide a $200 deposit to even sign up (as far as I could tell).
-------------------------

> … I know that WSR gives you some training scripts, so there is some training going on somewhere, somehow. But I don't know how to do it myself. Do you know of a recent good book on SAPI that would explain all this stuff easily (and that uses C# for examples :-))? That might save me some time on my journey... Cheers

Kevin, the SpeechLib examples used to be included with the SAPI 5 SDK, which was then moved to the Windows SDK, but whether they are still available I don't know. Unfortunately they are difficult to follow; you need to persevere.

You can see training of a SAPI 5 profile with SpeechStart+; you will be familiar with the training for MICROPHONE ON and RESTART DRAGON. It is all created from scratch using C# - a lot of work.

However, I would repeat: try using some of the newer technologies, as SAPI 5 recognition is not brilliant, although it is okay with commands. It will soon be deprecated.
-------------------------

Thank you for your thoughts. Currently, my voice technology world of options has exploded since I can do open-ended dictation now. Honestly, I have essentially no interest at all in working with the low-level SAPI interfaces in C++ (or COM, as you pointed out). Too much complexity with no observable benefit that I can see (at least for now). I have about 40-50 commands running with System.Speech (and one open-ended command), and it never misses a beat. I do not know how well it will scale up to thousands of commands and dozens of open-ended commands. Maybe Azure will be required by then, who knows.

One other thought I had - is it better to give Dragon and WSR (1) one grammar containing 1000 choices in a list, or (2) 1000 separate and simple grammars, one for each of the 1000 commands? My thinking is that option 1 could be handled by a recognizer with simple dictionary lookups, whereas with option 2 the recognizer would have to iterate over 1000 small grammars. (A sketch of option 1 follows below.)
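
A sketch of option 1, using the System.Speech types from earlier in the thread (the generated phrases are placeholders for illustration):

using System.Linq;
using System.Speech.Recognition;

static class GrammarFactory
{
    // Build one grammar whose top-level rule is a 1000-way Choices list,
    // instead of loading 1000 separate one-phrase grammars.
    public static Grammar BuildSingleCommandGrammar()
    {
        string[] phrases = Enumerable.Range(1, 1000)
                                     .Select(i => $"command {i}") // placeholder phrases
                                     .ToArray();
        return new Grammar(new GrammarBuilder(new Choices(phrases)));
    }
}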
Do you have any thoughts on which approach would be better for performance or limitations (like Dragon slowing down at 3000 commands, and RW saying his new thing slows down at 5000+ grammars)? Maybe if I put 5000 fixed-length commands into one grammar as choices, I wouldn't run into the "5000" command-count performance ceiling as quickly?
-------------------------

> Do you have any thoughts on which approach would be better for performance or limitations (like Dragon slowing down at 3000 commands, and RW saying his new thing slows down at 5000+ grammars)? Maybe if I put 5000 fixed-length commands into one grammar as choices, I wouldn't run into the "5000" command-count performance ceiling as quickly?

Just for the record, my "new thing" is more of an "old new thing" actually. It is designed as a Dragon COM add-in and meant to be the underlying custom grammar interface going into the next KnowBrainer generation. Here is a quick and dirty video demonstrating how it basically works:

https://rwilke.de/Downloads/KB%20Test%20Driver%20Last%20Stage%20-%20new.mp4

The scalable stress testing has been integrated specifically to check out performance issues relative to the number of commands currently loaded, load times, unload times, and the impact the module has on the overall stability of the Dragon process. In the preliminary development, stress testing was the first step in taking the new module for a spin. Creating, loading and activating commands takes about 2 milliseconds per command in this implementation. Unloading takes a little less. Recognition is instantaneous, relative to the overall number of commands currently loaded. Dragon won't hang or crash even with up to 7,500 commands loaded and activated, and it won't hang or crash during shutdown.

In the next step of the development, facilitating custom commands including lists as well as the "dictation" variable has been targeted. The video should demonstrate that this has been accomplished. Do note that you can use the "dictation" variable more than once, and not just at the end of the command.
-------------------------

I'm curious here, Kevin, so please answer me this. Sure, your 40 or 50 commands are recognised and never miss a beat, but I am more interested in the misrecognitions. What I mean by that is: if you have a command called, for example, "Simply Retrain Dragon" and you say "Simply derail Dragon" or "Simply retail Dragon", are the commands then misrecognised from similar phrases?

The principal advantage of the SpeechLib libraries is that they contain access to many more audio and speech features than the .NET libraries. However, if you don't need them, there is no point going any further than you have, as you say. Also, are you using the shared recogniser? Does WSR load? In which case I would expect them to be accurate, but then you have WSR loading every time, which is a bit of a disadvantage in many situations.
-------------------------

Hi Lindsay, here are answers to your questions.

I'm not sure what you and Lunis (or anyone) mean by "misfire" or "misrecognition". (I learned misfire from one of Lunis' jokes that made me laugh.) Does misfire mean a false positive recognition of the wrong phrase, or a non-recognition of the right phrase? On this project, all the command phrases that I speak correctly have had a 100% success rate at being recognized correctly. All rejected phrases should have been rejected. (Usually when this happens, it is because I have uttered a phrase that is missing a whole word.) So, at this point with 50 commands in the pool, I have 0.000 worries about false positive recognitions (accepting the wrong phrase) and false negative rejections (rejecting the right phrase). That may change as I add more commands, of course. I have no idea how it will perform with 1000 commands in the list.

Shared recognizer: no, I'm not using the shared recognizer (although Mark's example did). I'm using the dedicated inproc engine. I don't know if that triggers a WSR shared recognizer load operation - I don't think so, at least visibly. I have "use speech recognition for your computer" turned off in the settings panel. I did fire up WSR / the shared recognizer with the top center button/window showing, but I didn't want it, so I disabled "WSR" in the settings. As far as I know, my little app just creates its own recognizer and gets on with the job. I'm very happy with it. I'm thinking at this point that it looks like it will scale up quite a bit before I encounter problems.

If it helps any, I have the default confidence level set at .944, and I often see recognitions at .95, .96, .97, and even some at .98+. Once I remember seeing .997. Perhaps that has to do with the words in the commands, the power of the recognizer, or (blush) my superb and articulate diction skills (not!) like Rob and RW. I hope that answers your questions. If not, just speak up.
-------------------------
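
For readers wondering how a confidence threshold like the .944 mentioned above can be applied in code, a minimal System.Speech sketch (the threshold value is simply the one quoted in the post):

using System;
using System.Speech.Recognition;

class ConfidenceFilter
{
    const float MinConfidence = 0.944f; // threshold quoted in the post above

    static void HookRecognizer(SpeechRecognitionEngine recognizer)
    {
        recognizer.SpeechRecognized += (s, e) =>
        {
            if (e.Result.Confidence < MinConfidence)
            {
                Console.WriteLine($"Rejected near-miss: {e.Result.Text} ({e.Result.Confidence:F3})");
                return; // treat low-confidence matches as misfires
            }
            Console.WriteLine($"Accepted: {e.Result.Text} ({e.Result.Confidence:F3})");
        };
    }
}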

Wow! That video looks great. With multiply-embedded open-ended dictation this will be a must-have.
-------------------------
-Edgar

Thanks for the feedback. Pointing out that the dictation variable can appear multiple times was done on purpose, knowing that this would attract attention for sure, but effectively, it will be just the tip of the iceberg. There will be a lot more additional features coming along with it, and most of all, the overall infrastructure will be a lot more responsive and robust than any previous KnowBrainer implementation. It all will be thanks to the CFG grammar capabilities provided by the underlying SAPI 4 COM objects, thus allowing us to talk to the horse's mouth directly rather than going through the previously used SDK.
-------------------------

I'm with Edgar on the RW video - flawless, as usual. I admit that I did not understand probably 75% of the video. But what I could easily see was the software discipline of building a whole infrastructure and app for testing whatever was being tested. And RW's dictation never missed a beat, smoothly dictating content and "click this" and "click that". Very smooth, like Rob's videos - maybe it's a European quality standard that I have not reached yet. Anyhow, it looks impressive. I'm sure the next KB will be smoking fast and reliable if this demo is any indication.

RW, may I ask a question? Why did you go with SAPI 4 instead of SAPI 5? (I don't know the difference.) Thanks.
-------------------------

Rüdiger answered your SAPI 4 question a few days ago, in this thread: https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=12&threadid=36307&enterthread=y

SAPI 4 is what Dragon uses.

Thanks Matt! I must have missed it. That makes perfect sense, to use the SAPI that Dragon uses.
-------------------------

First off, thanks, Kevin, for the positive feedback. I am fully aware that I should have made the video more obvious in terms of providing pointers and hints as to what it is all about, but I didn't, and that was the reason why I called it "a quick and dirty demo", hoping the message might make it through all the same. For the least, it may have become obvious that the new KnowBrainer and the previous KnowBrainer will be like night and day.
-------------------------

Hi RW, thank you for the extra info above - it was excellent and explained a lot. I have no doubt that the next KB will "do things right" regarding loading in commands with your new grammar object. IMHO, anyone who builds a detailed app for testing (as shown in your video) is very serious about getting things right and measuring performance carefully. I'm also sure that KB users will definitely notice and appreciate the effects of your new and better software architecture.

UPDATE: After a fresh reboot, with UCTray running from the Start folder, Dragon took about 10-11 seconds for the DragonBar to appear, and another 9-10 seconds for the microphone ready light to appear (that probably includes loading UC DVC commands, my 200 commands, and my profile with vocabulary). 20 seconds is much more acceptable. I wonder what slows it down so much after I kill it and when I have my apps open (Outlook, Chrome (a huge time hole if Dragon starts looking at all those tabs)). I'll have to do more experimenting.

UPDATE 2: I exited Dragon after taking those measurements, started Chrome with 50+ tabs, and restarted Dragon again. The DragonBar appeared at 8 seconds, and the ready microphone at 14 seconds. Wow. I'd love to have that all the time! Then I killed Dragon and the UCTray process, leaving Chrome open, opened up VStudio 2022, opened up Outlook, and booted Dragon again. The DragonBar appeared at 8 seconds, and the ready light appeared at 14 seconds. From this I conclude that none of Chrome, VStudio, Outlook, UCTray, or my kill-process procedure affects the Dragon startup time. It must be something else. I will watch out and report back if I find out anything new.
-------------------------

Totally true. And by the jabs!
-------------------------

To get back to the original topic that Edgar posted long ago, I thought I would add this C# code to show how to talk to the Dragon engine control. It works, sort of.

I tried a C# example to hook up to the RecognitionMimic method. It worked, sort of. Looking at the recognition history, Dragon recognized the "wake up" string as a command, and recognized Hello and there as strings that it inserted into Notepad. Dragon completely failed to recognize the multi-word string in my example, although it works for the VBA example above. I looped and split a multi-word string into individual words, and Dragon stopped recognizing the words after only FIVE words. No matter what I did, I could not get it to recognize six words. Very strange.

I continued to play with the looped words, varying them. It turns out that one of the embedded words was also a DVC command - the sixth word - which explains why Dragon would only recognize and insert the first five words into the Notepad buffer. So that issue was solved. But the string with multiple words in one string still failed completely. (In another post, Chuck said that (historically) HeardWord only took lowercase individual words, which may map onto RecognitionMimic in some way. Maybe that is the problem there.)

Anyhow, someone might be able to get something out of this C# example (a reconstructed sketch of the loop follows below). You need to include the "using DNSTools;" line in your project. If anyone has any ideas or comments on the slowness of this approach in C#, feel free to comment. I ran this code snippet from a tiny Windows Forms project under the Visual Studio debugger.
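
A hedged reconstruction of the loop described above - not the original snippet, which was not preserved - assuming the DNSTools interop and the Register() requirement mentioned later in this thread (the exact Register argument for IDgnEngine is an assumption):

using DNSTools;

class MimicWords
{
    static void Main()
    {
        var engine = (IDgnEngine)new DgnEngineControl();
        engine.Register(0);  // assumed overload; see mav's note below about Register()

        // Feed a multi-word string to Dragon one lowercase word at a time,
        // mirroring the word-by-word experiment described above.
        foreach (var word in "hello there from the mimic test".Split(' '))
            engine.RecognitionMimic(word.ToLowerInvariant(), 0);

        engine.UnRegister();
    }
}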
-------------------------

Vocola 2 mic off, FWIW... offer = #turn off microphone

I have been experimenting with trying to turn the microphone on and off in C# (it's part of a bigger idea to solve the "server busy" problem with Visual Studio - using the application for enabling/disabling Select-and-Say in Windows Presentation controls works, but then you cannot dictate in text boxes).

using System;
using System.Windows.Forms;
using DNSTools;

namespace Dragon_API_test
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            var engine = (IDgnEngine)new DgnEngineControl();
            engine.RecognitionMimic("microphone off", 0);
        }
    }
}

However, when it gets to the RecognitionMimic statement, it reports that the class is not registered. Yes, I could try registering the DLL with regsvr32, but I wanted advice first.
-------------------------
DPI 15.6.1

> However when it gets to the recognition statement, it reports that the class is not registered.

From earlier in this thread:

private static void TextToSpeech(string pSpeakThis)
{
    DgnMicBtn gDgnMic = new DgnMicBtn();
    gDgnMic.Register(0);
    try
    {
        ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOff;
        SpeechSynthesizer speechSynthesizerObj = new SpeechSynthesizer();
        speechSynthesizerObj.Speak(pSpeakThis);
        speechSynthesizerObj.Dispose();
    }
    catch (Exception exception)
    {
        TimedMessage("Dragon®’s DgnMicBtn failed when turning the microphone off. The error message is:" + Environment.NewLine + exception.Message, "Dragon® ERROR");
    }
    try
    {
        ((IDgnMicBtn)gDgnMic).MicState = DgnMicStateConstants.dgnmicOn;
    }
    catch (Exception exception)
    {
        TimedMessage("Dragon®’s DgnMicBtn failed when turning the microphone on. The error message is:" + Environment.NewLine + exception.Message, "Dragon® ERROR");
    }
    gDgnMic.UnRegister();
}

I think that with Dragon 16 the very last line is no longer valid and needs to be removed/commented out. I think that the first two statements of the method are what you are missing.
-------------------------
-Edgar

The error doesn't relate to DgnEngine not being registered in COM. In order to call (almost) any method on DgnEngine, you first have to call its Register() method.

hth
mav
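
Putting mav's point into the earlier snippet, a sketch (the Register argument mirrors the DgnMicBtn.Register(0) pattern used elsewhere in this thread and is an assumption for IDgnEngine):

using DNSTools;

class MimicWithRegister
{
    static void Main()
    {
        var engine = (IDgnEngine)new DgnEngineControl();
        engine.Register(0);                           // required before (almost) any other call
        engine.RecognitionMimic("microphone off", 0); // should no longer report "class not registered"
        engine.UnRegister();
    }
}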

For what it is worth, here is the code that I use now to turn the Dragon mic on. Thanks to Ed for getting me started with his example a year ago or more. Thanks Ed!

static void UxDragonMicrophoneOn()
{
    // Turn the mic on after Dragon is available.
    try
    {
        var mikeButton = (IDgnMicBtn)new DgnMicBtn();
        mikeButton.Register(0);
        mikeButton.MicState = DgnMicStateConstants.dgnmicOn;
        mikeButton.UnRegister();
    }
    catch (Exception ex)
    {
        var m = $"{ex.Message}";
        Debug.WriteLine(m);
        FormAppendTextError(m);
    }
}

-------------------------

So I can see why all the fuss if you are building a speech-enabled application, but if all you need to do is run a command to turn the mic on, off, or put it in the asleep state, a simple vbs script passing the mic state as a parameter should get you all you need.
-------------------------

FuseTalk Standard Edition v4.0 - © 1999-2023 FuseTalk™ Inc. All rights reserved.