KnowBrainer Speech Recognition
Topic Title: Hands-free coding opportunity
Created On: 11/10/2022 08:21 AM
 11/10/2022 08:21 AM
MDH
Top-Tier Member

Posts: 2321
Joined: 04/02/2008

 01/24/2023 09:56 AM
monkey8
Top-Tier Member

Posts: 4171
Joined: 01/14/2008

Thanks for posting, Mark. I believe this relates to the same project: I was sent an email about a technical preview of voice coding via:


Hey, GitHub! https://githubnext.com/projects/hey-github/


The project team is looking for anyone prepared to take part in the technical preview; just follow the link above.



-------------------------

 01/25/2023 04:50 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Wow, I watched the demo and was amazed. There's almost no obvious correlation between the spoken words and the resulting code tokens in the buffer. I imagine that there is a HUGE amount of AI/machine learning going on behind the scenes. The spoken phrases are extremely high-level, yet they result in very low-level code being written. Most amazing.

FWIW, I also saw one or two demos of ChatGPT writing code. The guy asked for something at the function-description level, and GPT produced a complete coded SQL procedure (I think in Python) to manipulate a database and generate a report.

FWIW again, I remember reading a paper back in 1982 or so, when expert systems, AI, and high-level languages were a thing. The retrospective looked at 25 years of trying to develop higher-level languages so that programmers could get out of the low-level syntax weeds. The paper concluded that the last attempt (fourth-generation languages) failed because the languages were so high-level that they had to write a translator program to turn the fourth-generation syntax and meaning into something a human could understand. I concluded that the best impedance match to the human mind is with the third-generation languages that we have today (C, C#, Python, Basic, JavaScript, etc.; there are many third-generation languages).

Still, the GitHub project looks interesting, for sure.

-------------------------

Win10/11/x64, AMD Ryzen 7 3700X/3950X, 64/128GB RAM, Dragon 15.3, SP 7 Standard, SpeechStart, Office 365, KB 2017, Dragon Capture, Samson Meteor USB Desk Mic, Amazon YUWAKAYI headset, Klim and JUKSTG earbuds with microphones, excellent Sareville Wireless Mono Headset, 3 BenQ 2560x1440 monitors, Microsoft Sculpt Keyboard and Logitech G502 awesome gaming mouse.

 01/25/2023 07:18 PM
Lunis Orcutt
Top-Tier Member

Posts: 40716
Joined: 10/01/2006

Note that the patented KnowBrainer VerbalBasic command collection utilizes WYSWYS (What You See Is What You Say) AI. In our opinion, WYSWYS should be faster and easier in many situations.



-------------------------

Change "No" to "Know" w/KnowBrainer 2022
Trial Downloads
Dragon/Sales@KnowBrainer.com 
(615) 884-4558 ex 1



 01/27/2023 05:18 AM
monkey8
Top-Tier Member

Posts: 4171
Joined: 01/14/2008

Originally posted by: kkkwj Wow, I watched the demo and was amazed.

 

Kevin, I'm hoping someone from the GitHub team will add to this thread very shortly. Until now, most of us have talked about voice coding as a necessity, mainly due to mobility restrictions or impairments like RSI, but it's becoming more of a reality for all coders because of its speed and effectiveness. See the following, for example:

 

https://spectrum.ieee.org/programming-by-voice-may-be-the-next-frontier-in-software-development

 

Anyway, as I say, hopefully someone from the GitHub team will add to the thread shortly.

 

Lindsay



-------------------------



 01/28/2023 12:37 AM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

Hi Lindsay, thank you for the link (and for the Pcbynumbers demo - you sure speak clearly!). I read the IEEE article about Serenade (I looked at it but didn't try it) and Talon (I have tried it). Talon is amazing - I was able to do really low-level character editing at reasonable speed after only a day of reading and practicing. It has about zero latency, so you can see the characters appear as you speak them (unlike Dragon). I even told Talon to use the Dragon recognizer, which worked fine. Ryan has done an amazing multiplatform job with Talon. In the end, I stopped using Talon because I couldn't get it to reliably recognize my couple of thousand non-parameterized commands (which Dragon recognizes easily) and my many dozens of dynamic commands (free-form parameter, _ListVar1_-type commands). It was just not ready (or I wasn't ready) to make the technology join possible.

I can type fine with my hands, so my work has focused on higher-level productivity operations instead of action-by-number operations (like PcByNumber or VoiceComputer, probably the lowest-level ops that are *really* required for hands-free computing). If I don't use my hands, ShowFlags and PcByNumber operations are the first thing that I need. Talon is superb at voice-driven editing operations (I saw Ryan dictate a page of existing code that had been typed, and he could dictate it as fast as it was originally typed). But again, Talon is fairly low-level in that it operates on visible characters, clickable GUI elements, etc.

My work has mostly been to create a customizable, extensible, documentable, xml-file-configurable/sharable, scalable software architecture that can work reliably with static and dynamic (parameterized) commands and plugins. I don't send many keystrokes to apps yet; instead, my plugins interact with Office apps through COM operations that work with the internal app data structures directly. This avoids all the Wait 0.2 statements that you need with scripting, for example.
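
As an illustration only (this is not the plugin code described above), the Python sketch below shows the general idea of registering static and parameterized commands that call an application's object model directly instead of sending keystrokes. Everything named here is hypothetical: the command decorator, dispatch, and the StubApp/StubDocument classes, which stand in for a real application so that the sketch runs anywhere. On Windows, a real Word object could presumably be obtained with pywin32 via win32com.client.Dispatch("Word.Application").

```python
import re

# Illustration only: a hypothetical command registry, not the architecture
# described in the post. StubApp stands in for an application's object model
# so the sketch runs anywhere; with pywin32 on Windows, a real Word instance
# could be obtained via win32com.client.Dispatch("Word.Application").


class StubDocument:
    def __init__(self):
        self.saved = False
        self.text = ""  # stub-only attribute, not part of any real object model

    def Save(self):
        self.saved = True


class StubApp:
    def __init__(self):
        self.ActiveDocument = StubDocument()


COMMANDS = []  # list of (compiled pattern, handler)


def command(phrase):
    """Register a handler; '<name>' in the phrase becomes a named parameter."""
    parts = re.split(r"(<\w+>)", phrase)
    regex = "".join(
        rf"(?P<{p[1:-1]}>.+)" if p.startswith("<") else re.escape(p)
        for p in parts
    )

    def register(handler):
        COMMANDS.append((re.compile(rf"^{regex}$"), handler))
        return handler

    return register


@command("save document")  # static command: no parameters
def save_document(app):
    # Direct object-model call: no keystrokes are sent, so no wait is needed.
    app.ActiveDocument.Save()


@command("insert text <words>")  # dynamic (parameterized) command
def insert_text(app, words):
    app.ActiveDocument.text += words


def dispatch(app, utterance):
    """Run the first command matching the utterance; return True if one ran."""
    for pattern, handler in COMMANDS:
        m = pattern.match(utterance)
        if m:
            handler(app, **m.groupdict())
            return True
    return False
```

Calling dispatch(StubApp(), "insert text hello world") would route the phrase to insert_text with words set to "hello world". Because the handler talks to the object directly, there is no window focus or keystroke timing to wait for, which is the point made about avoiding Wait 0.2 statements.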

I suppose the point of this post is that I see several different areas for voice operations/programming/productivity. First are the really low level operations that depend on GUI elements (PcByNumbers, VoiceComputer). Next are low-level or fine-grained editing operations (for programming, many are character-level operations) in VStudio Code. Third, there are document/writing/word-sentence operations (Dragon, KB). And then there are higher-level scripting operations (Dragon, KB, AutoHotKey, etc.) that span all the previous areas and provide user-customizable operations. And my stuff is even higher-level than those because I work mostly with the internal data structures of apps and mostly address things by name or concept instead of by visible numbers on the screen. For the record, I think numbers are superior (or at least immediately more convenient because no memory is required to remember the name of operations). But I didn't want to spend my programming time reinventing the number wheel, so I spent it on other kinds of development.

I think in the end that true voice productivity spans many different domains, from low-level clicks on operating system objects, to editing characters in code effectively, to using all those tens of apps of Rob's that each solve a specific problem, to scripting larger named operations, to loading big plugins that provide named commands for specific app operations or that solve specific large-scale operations. Where do AI and ChatGPT fit into this picture? Well, you might think of ChatGPT as a kind of lookup engine that finds the piece of code that you're looking for. My system would approach it like a library lookup or a search, as Google would, whereas GPT generates a different document each time. And of course GPT is a generate-once system, not an editing system, and it cannot work with apps or numbered desktop GUI objects, etc.

A voice-computing/programming person could probably benefit from tools at every one of the levels that I listed above. I wonder what the crew here would allocate $2M in venture funding toward, if such funding were available. (In the meantime, we all create our little programs/systems to solve problems that fit within our resource constraints!)

-------------------------


 03/19/2023 02:16 PM
kkkwj
Top-Tier Member

Posts: 1123
Joined: 11/05/2015

I have been playing with Talon recently, using Dragon as the Talon recognizer. It's a pretty wonderful combination, in my view. You can run Talon in command mode and dictate phrases using Dragon's recognition ability. With Dragon as the recognizer, Talon feels somewhat slow compared to its normal superspeed, low-latency response time. But using Dragon as the recognizer solves the biggest drawback of Talon (IMHO), which is that its free-form dictation is not as good as Dragon's. I have also found the Talon community on their Slack channel to be very positive, helpful, and knowledgeable. And in particular, it is really, really simple to define your own commands and key bindings to send keys. Here are some examples:

: key(ctrl-s)
save my document: key(ctrl-s)
save my document as: key(shift-ctrl-s)
press <user.keys>: key(keys) - user.keys is a list of possible keys to press on the keyboard
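
For anyone who has not seen this style of rule before, each example above has the form "spoken phrase: action". The toy Python sketch below is not Talon's parser or API; it only illustrates, under simplifying assumptions, how such lines can be read as a mapping from a phrase to a key chord. The names parse_rules and lookup and the "<keys>" slot syntax are hypothetical.

```python
import re

# Toy illustration only: this is NOT Talon's parser or API.
# parse_rules, lookup and the "<keys>" slot syntax are hypothetical.

RULE_RE = re.compile(r"^(?P<phrase>[^:]+):\s*key\((?P<arg>[^)]+)\)\s*$")


def parse_rules(text):
    """Turn 'phrase: key(chord)' lines into (regex, chord) pairs."""
    rules = []
    for line in text.splitlines():
        m = RULE_RE.match(line.strip())
        if not m:
            continue  # skip blank or malformed lines
        phrase = m.group("phrase").strip()
        # A trailing "<keys>" slot captures the rest of the utterance.
        if phrase.endswith("<keys>"):
            prefix = re.escape(phrase[: -len("<keys>")].strip())
            pattern = re.compile(rf"^{prefix} (?P<keys>.+)$")
        else:
            pattern = re.compile(rf"^{re.escape(phrase)}$")
        rules.append((pattern, m.group("arg").strip()))
    return rules


def lookup(rules, utterance):
    """Return the key chord for an utterance, or None if nothing matches."""
    for pattern, arg in rules:
        m = pattern.match(utterance.strip().lower())
        if m:
            # An action argument of "keys" means: use the captured words.
            return m.groupdict().get("keys", arg) if arg == "keys" else arg
    return None


RULES = parse_rules(
    """
    save my document: key(ctrl-s)
    save my document as: key(shift-ctrl-s)
    press <keys>: key(keys)
    """
)
```

With these rules, lookup(RULES, "save my document as") selects the shift-ctrl-s chord, and an utterance beginning with "press" passes the words that follow it straight through as the chord. A real system would then send that chord to the focused application, which this sketch deliberately leaves out.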

Anyhow, I think Talon is worth a try for anyone who wants to augment their voice experience or push it in a new direction by trying out some new technology. Being more or less open source and free, Talon is a bit thin and disorganized as far as documentation and training products go; but on the other hand, the Slack channel runs 24x7 with helpful, knowledgeable people. And you can't get that kind of support from Nuance or Microsoft, for sure!


-------------------------

