KnowBrainer Speech Recognition
Topic Title: Univoice accuracy
Topic Summary: how can I boost the accuracy of Univoice?
Created On: 08/09/2010 12:10 PM
 Univoice accuracy   - annasgramma - 08/09/2010 12:10 PM  
 Univoice accuracy   - GDS - 08/09/2010 01:30 PM  
 Univoice accuracy   - annasgramma - 08/09/2010 01:57 PM  
 Univoice accuracy   - Chucker - 08/09/2010 11:41 PM  
 Univoice accuracy   - annasgramma - 08/10/2010 08:27 AM  
 Univoice accuracy   - annasgramma - 08/10/2010 11:34 PM  
 Univoice accuracy   - omstefanov - 10/20/2010 10:12 AM  
 Univoice accuracy   - Lunis Orcutt - 08/09/2010 10:49 PM  
 08/09/2010 12:10 PM


annasgramma
Member

Posts: 36
Joined: 01/22/2007

I read somewhere in the forums that Univoice doesn't work so well for physicians. My doctor, who has NOT come to my house and trained his voice with DNS 10 or Univoice, talks very fast. He really speeds up in the physical exam portion of his reports. I have resorted to echo dictating his reports, but that is NOT helping with how long his reports take. I can figure on at least three times the length of the dictation to get my work done. In other words, a 30-minute dictation will take me at least 1 1/2 hours to do. Even straight typing would take a lot longer than 30 minutes! When I boot KB and choose a file for transcription in DNS 10, the accuracy is atrocious. Whole phrases are missed, and sometimes what does get transcribed is hilarious. Totally wrong, but hilarious phrases, nonetheless.

 At this point, even though I have had KB for quite a while (KB 2008 is my latest version) I feel that the program is not very helpful to me and I'm frustrated. I am going to be upgrading to DNS 11 very soon. If I echo dictate his reports with DNS (KB is not running), the accuracy is really great - well over 98%. Am I expecting too much from KB? Please give me some suggestions on what to do. I am totally frustrated at this point and wondering where to go next. 

 Thanks for listening and for helping me out. Your input is appreciated. 

PS I still can't get DNS to type 5 feet 6 inches as opposed to 5'6" and the word degrees as opposed to the degree symbol. GRRRR!!



-------------------------
The early bird may get the worm, but the second mouse gets the cheese in the trap. (Larry The Cable Guy)
 08/09/2010 01:30 PM

GDS
Top-Tier Member

Posts: 747
Joined: 01/16/2009

annasgramma,

Regarding your PS: symbols vs. words is an option set in your formatting preferences. On the DragonBar, select Tools, then Formatting. Uncheck the box for "units of measure." Then Dragon will default to typing feet instead of '. Numbers are formatted depending on your selection in the drop-down menu "Numbers, if greater than or equal to."

I have mine set to 10, in accordance with AP style. For instance, Dragon types "three blind mice" and "12 Angry Men" based on my number formatting preferences.
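The threshold rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Dragon's own logic: the function name `format_number`, the parameter `digits_from`, and the short word list are all assumptions made for the example.

```python
# Hypothetical sketch of a "digits if greater than or equal to N" rule.
# Nothing here is Dragon's API; the names and word list are illustrative only.

SMALL_NUMBER_WORDS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve",
]


def format_number(n: int, digits_from: int = 10) -> str:
    """Spell out n if it is below digits_from, otherwise return digits."""
    # Fall back to digits when the word list does not cover the number.
    if 0 <= n < digits_from and n < len(SMALL_NUMBER_WORDS):
        return SMALL_NUMBER_WORDS[n]
    return str(n)


print(format_number(3), "blind mice")   # three blind mice
print(format_number(12), "Angry Men")   # 12 Angry Men
```

With the threshold at 10, anything from 10 upward stays as digits, which matches the AP-style behavior in the two examples.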

Getting Dragon to default to either 5'6" or "5 feet 6 inches" is a matter of having the right check boxes checked.

Hope this helps!



-------------------------

Eric Wright At work: DNS 12 Pro. At home: DNS 11.5 Pro,  KnowBrainer 2011, and Utter Command by RedStart Systems; Dragon Dictate 3 for Mac


 


Appetite for Dictation - My Blog

 08/09/2010 01:57 PM


annasgramma
Member

Posts: 36
Joined: 01/22/2007

Thanks. The degrees thing is solved. I appreciate your help.


-------------------------
 08/09/2010 11:41 PM

Chucker
Top-Tier Member

Posts: 9667
Joined: 10/10/2006

annasgramma,

First of all, you're using DNS 9 Preferred.  In addition to the fact that you have to distinguish between KnowBrainer and UniVoice as Lunis points out, UniVoice was created because at the time the corpus of speech data that Nuance had was basically based on broadcast news, professional, technical/scientific, and political speech data.  There was very little in the realm of end-user speech data, particularly with regard to the range from teenagers to senior citizens.  At that time it was an excellent, more or less, speaker independent Acoustic Model.  However, much has changed over the years since UniVoice was first released:

1.  Through its acquisitions, as well as by collecting a huge corpus of end-user speech data, Nuance began, starting with DNS 9, to introduce its own more or less speaker independent Acoustic Model.  What started off in the old days as approximately a few thousand speakers and a couple of million associated words has progressed to the point where, with the speech data collected from iPhones and Blackberries along with data collected from end-users through DNS itself, the corpus has grown to several million speakers and many millions of associated text documents comprising tens of millions of words.  The more or less speaker independent Acoustic Model in DNS 10 supports many users of all ages out-of-the-box, with virtually 97 to 99% accuracy, without having to train a user profile.  Some users still have to go through training because of heavy accents or simply because an untrained Acoustic Model doesn't work well for them.  But such users are the exception now rather than the rule.  As a result, UniVoice is not as viable as it was when it was first introduced.

2.  Since DNS 10-10.1 has been in use, Nuance has continued to acquire huge amounts of speech data in addition to what it already has relative to DNS 10.  This leads to the speculation that the speaker independent Acoustic Model will be even better in DNS 11.  However, we'll all just have to wait and see.

3.  Given that you're using the Preferred version of DNS 9, you're lacking a significant amount of medical vocabulary.  This basically means that transcribing any audio file, no matter how clear and good it is, is not going to give you very good accuracy if the speaker is using medical terminology.  That's just a given.  In other words, my guess is that no matter how good your recordings are, your vocabulary and Language Models simply don't support high accuracy when transcribing medical dictation.  I'm not surprised that you're getting the results that you are.  In short, it's like trying to win the Indianapolis 500 with a 100 hp Volkswagen engine.  It just ain't going to happen.

4.  In addition, remember that audio recordings are one step removed from direct dictation using a microphone and a soundcard.  They are going to be anywhere from 2 to 5% less accurate, and sometimes more if the quality of your audio recording and digital recording equipment is less than optimal.

5.  If you're going to transcribe audio recordings from your doctor, then create a DVR profile and train it.  In addition, you need to run a significant number of documents containing the medical terminology through the vocabulary builder so that you can add the medical terms that are necessary for proper transcription of medical dictation.  You're not using the medical version, so you don't have either the Language Model or the vocabulary that the medical version has.  In fact, the vocabulary that you're using, regardless of whether the profile is DVR-based or standard, is a general vocabulary, which is very good for dictating e-mails and performing standard dictation, but not particularly good when you're dealing with medical transcription.

The bottom line is that if you want audio transcription to give you good accuracy, then you need the following:

1.  You need to have the best DVR recording equipment.  This is one area where you cannot cut corners or scrimp or buy cheap.  If you do, continue to expect poor results.

2.  You have to set up and properly train a digital voice recorder user profile.  Using a standard dictation user profile isn't going to cut it, with or without training.  You might get reasonably good results if the equipment that you're using is the best you can get.  That is, if your profession relies on accurate transcription from audio recordings, then you've got to have the best equipment.  Otherwise you're spitting in the wind.

3.  You need to make sure that you have a good list of medical terms and vocabulary that you need in order to have them properly recognized.  If words don't exist in the Active Vocabulary, they're not going to get recognized properly, and the misrecognitions will likely be as bizarre as you have been getting.  You need to add those words to the vocabulary and you need to analyze documents for writing style to make sure that the Language Model conforms to the type of audio transcription that you're doing.
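As a rough illustration of building that list of medical terms, here is a minimal Python sketch that scans sample reports for words missing from a known word list. It is not how Dragon's vocabulary builder works internally: the function `candidate_terms`, the `known_words` set, and the sample sentences are all invented for the example.

```python
import re
from collections import Counter

# Hypothetical sketch: find words in sample reports that are not in a known
# word list, so they can be reviewed and added to the vocabulary by hand.


def candidate_terms(documents, known_words, min_count=2):
    """Return (word, count) pairs for unknown words, most frequent first."""
    known = {w.lower() for w in known_words}
    counts = Counter()
    for text in documents:
        # Treat runs of letters (allowing apostrophes and hyphens) as words.
        for word in re.findall(r"[A-Za-z][A-Za-z'-]*", text):
            if word.lower() not in known:
                counts[word.lower()] += 1
    return [(w, c) for w, c in counts.most_common() if c >= min_count]


# Invented sample data, for illustration only.
reports = [
    "The patient has paraspinal tenderness.",
    "Paraspinal spasm noted; the patient denies radiculopathy.",
]
general_vocabulary = {"the", "patient", "has", "tenderness",
                      "spasm", "noted", "denies"}

print(candidate_terms(reports, general_vocabulary, min_count=1))
```

Sorting by frequency puts the terms the speaker uses most often at the top of the review list, which is where adding vocabulary is likely to pay off first.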

If you don't follow these guidelines, then the only alternative you have is to continue echo dictating.  So, the choice is yours.  How much would you like to increase your productivity and your accuracy with regard to transcribing your doctor's audio recordings?  If you follow the guidelines above, you should be able to get at least 95% accuracy when transcribing the doctor's audio recordings.  Otherwise, you're just spinning your wheels.

Having been part of the original DNS development team and having worked with QA on developing profiles for digital voice recording, I can tell you that there are many things you can do to improve the accuracy of audio transcription.  For example, when training a DVR profile, if you have a recording made by the doctor that is at least 30 minutes long, you can use that as a training script.  You don't have to follow the training scripts provided when creating such a profile.  The only thing that DNS is doing when it creates an Acoustic Model for a DVR user profile is listening to the speech and creating a voice pattern (frequency spectrogram) coupled with the underlying pronunciations of each word (triphones).  This is what is used to transcribe any dictation from a digital voice recorder.  It doesn't matter what the script is; it only matters that it is the speaker for whom you will be transcribing audio files and that it is at least 30 minutes long.  I've demonstrated this to hundreds of clients over the years.  Just follow the instructions for creating such a profile and, when it comes time to play the recording, use the recordings made by your doctor and forget about any specific script recommended or suggested by DNS.  You don't have to go that route, and it works exceedingly well.  For example, I took a very high quality recording of one of Dr. Deepak Chopra's one-hour lectures, created a new DVR user profile, selected Indian English as what is now called the accent (the language in previous versions), and used his lecture to train that profile even though I had selected Dave Barry in Cyberspace as the script.  After creating that profile, I ran that audio recording back through it and the accuracy was virtually 93%.  After correcting and updating the profile, I got an accuracy level of 97.6% when retranscribing that same audio recording.
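For anyone who wants to put a number on accuracy the same way, one common approach is to compare the corrected transcript with the raw recognized text word by word. The sketch below uses word-level edit distance. It is an assumption about how such a figure could be computed, not a description of how the percentages above were measured, and the function names are made up for the example.

```python
# Hypothetical sketch: word accuracy from a corrected transcript (reference)
# and the raw recognized text (hypothesis), using word-level edit distance.


def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, deletions, and insertions."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_accuracy(corrected_text, recognized_text):
    """Return accuracy as a fraction between 0.0 and 1.0."""
    reference = corrected_text.lower().split()
    hypothesis = recognized_text.lower().split()
    if not reference:
        raise ValueError("corrected_text must contain at least one word")
    errors = word_errors(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


acc = word_accuracy("forwarded to the office for review",
                    "forward the office for review")
print(f"{acc:.1%}")  # 66.7%
```

In the short example, one substituted word and one dropped word out of six reference words give two errors, hence 66.7%. Over a full report the same calculation gives a figure comparable to the percentages quoted in this thread.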

The bottom line is that if you're going to continue trying to swat flies with a sledgehammer, you're not going to be very successful.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC
Home of VoicePower® Ultimate

Education is when you read the fine print. Experience is what you get if you don't. - Pete Seeger



-------------------------

 08/10/2010 08:27 AM


annasgramma
Member

Posts: 36
Joined: 01/22/2007

Thanks, Chucker, for your reply. It really was very helpful. I have several recordings from my doctor that are well over 30 minutes long. I'll work with it and see what I can do. I appreciate your explanation and helpful suggestions. 

-------------------------
 08/10/2010 11:34 PM


annasgramma
Member

Posts: 36
Joined: 01/22/2007

Dear Chucker, you are the BEST!! You provided just the support and instruction that I needed. I took your suggestion and created my doctor's profile as a DVR user and the results were nothing short of phenomenal! I trained two dictations, one 38 minutes and one 58 minutes long. Then I corrected them and updated the user file. I printed out the corrected file and then re-transcribed the longer of the two files and compared the two for accuracy. There are minor flaws, such as DNS not hearing words that are barely audible or that he always skips over (The claimant) and (forwarded to the office - he says forward the office and I fill in the blanks), etc. Spinal levels (C2-C3) are a problem too, as he will say C23 and DNS transcribes what it hears. However, these are minor things that I can live with. In time, maybe DNS will pick up on these nuances and begin putting them in. The long and short of it is that the reply you sent to me was tremendously helpful and I will be forever grateful. I now feel like I'm using DNS to its fullest capabilities, but I also get the feeling that I've only scratched the surface. My hands thank you too. Carpal is now at bay. Blessings on you, my friend. Thanks again.

PS I ordered DNS 11 today from KnowBrainer. I have been using 10.1, but I always try to get the latest and greatest when I can. 



-------------------------
 10/20/2010 10:12 AM


omstefanov
Junior Member

Posts: 6
Joined: 11/12/2006

Dear annasgramma,

If your doctor says things like C23 (spoken as "C-two-three") when you would want to see "C2-C3", you could add a "word" to the DVR user's vocabulary with the written form [C2-C3] and a spoken form matching whatever your doctor tends to say, for example [see to three] or [C two three]. You can do similar things with any other non-standard "shortcuts" your doctor tends to make. You will find, over time, that if he tends to use such shortcuts repetitively, this could further cut down the time you take to transcribe his dictation.
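The written-form/spoken-form idea can be pictured as a simple lookup table. The Python sketch below applies such a table to already-transcribed text as a post-processing step. It only illustrates the mapping and is not how Dragon's vocabulary works internally; the table contents and the function name `apply_written_forms` are assumptions for the example.

```python
import re

# Hypothetical spoken-form -> written-form table, mirroring the [C2-C3] example.
SPOKEN_TO_WRITTEN = {
    "see to three": "C2-C3",
    "C two three": "C2-C3",
    "C23": "C2-C3",
}


def apply_written_forms(text, mapping=None):
    """Replace each spoken form found in text with its written form."""
    if mapping is None:
        mapping = SPOKEN_TO_WRITTEN
    # Try longer spoken forms first so a short one cannot split a longer one.
    for spoken in sorted(mapping, key=len, reverse=True):
        pattern = r"\b" + re.escape(spoken) + r"\b"
        written = mapping[spoken]
        # A function replacement inserts the written form literally.
        text = re.sub(pattern, lambda _m, w=written: w, text,
                      flags=re.IGNORECASE)
    return text


print(apply_written_forms("Disc bulge at C23 and at see to three."))
# Disc bulge at C2-C3 and at C2-C3.
```

Adding the pair to the vocabulary itself, as described above, is the better fix because the recognizer then produces the written form directly. A table like this is just a convenient way to keep track of the doctor's shortcuts.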

Olaf-Michael Stefanov (omstefanov)

Dictated with Dragon NaturallySpeaking Professional version 11.0, via the "Dictation Box" using a Philips SpeechMike Air connected to an Acer TravelMate 6292 equipped with an Intel® Core™2 Duo CPU  T8300 @ 2.4 GHz, with 3 GB RAM, running Vista Ultimate, SP2.

 08/09/2010 10:49 PM

Lunis Orcutt
Top-Tier Member

Posts: 22580
Joined: 10/01/2006

Quote:
When I boot KB and choose a file for transcription in DNS 10, the accuracy is atrocious. Whole phrases are missed and sometimes what does get transcribed is hilarious.


It sounds like you might be confusing UniVoice with KnowBrainer 2008 (soon to be KnowBrainer 2010), which is strictly command and control software that has nothing to do with accuracy. UniVoice is a pretrained user profile whose usefulness has come and gone. We are in the process of retiring UniVoice, as we don't plan on releasing a Ver. 11 since we don't believe it will add anything to the latest version of NaturallySpeaking. It would appear that Nuance has managed to build their own version of UniVoice, at least to some degree.

-------------------------

