KnowBrainer Speech Recognition
Topic Title: Question about new dictation source
Topic Summary: Is writing style adaptation shared by a new dictation source?
Created On: 04/12/2008 05:44 PM
 04/12/2008 05:44 PM


pmaddern
Senior Member

Posts: 104
Joined: 10/29/2006

I'm helping someone get started with DNS Professional 9.5 on a Windows XP Professional PC.

He has created a headset profile. Next week, I will guide him through setting up a profile for an Olympus DS 4000 digital recorder.

If we create a profile for the Olympus DS 4000 as a "new dictation source" for the headset profile (selecting "digital recorder using sound files - wav., mp3, wma - on disk"), Dragon will prompt the user to read at least 15 minutes of recorded text and then he will be all set.

 I understand that the advantage of doing this is that if new vocabulary and commands are added to the headset profile, they are automatically updated into the digital recorder profile. And I understand that the reverse is true -  if new vocabulary and commands are added to the DS 4000 profile, they are automatically updated into the headset profile.

But what about when we run the tool for the headset profile to "Add words from your documents to the vocabulary", which finds new words and adapts to the user's writing style? Is the adaptation to the user's writing style updated and "shared" by the Olympus DS 4000 recorder profile? And is the reverse true if this tool is run in the Olympus DS 4000 profile, i.e. is any adaptation to the user's writing style from processing documents through the DS 4000 profile updated and "shared" by the headset profile?

If so, it would seem to me to be beneficial to operate this way i.e. setting up the DS 4000 as a new dictation source for the headset profile, rather than maintaining two independent user profiles.

Peter



-------------------------
DNS Professional 12 UK English version, Windows 8 64-bit with 8 Gb RAM plus 8 Gb ReadyBoost, Audio Technica ATH-COM2 headset microphone/Buddy 7G USB sound adapter, VoicePower Ultimate. Skype user name peter.maddern www.speechempoweredcomputing.co.uk
 04/12/2008 08:48 PM
Lunis Orcutt
Top-Tier Member

Posts: 22623
Joined: 10/01/2006

To specifically answer your questions, the answer is YES to all; including running the Vocabulary Editor for adapting the writing style to the DS-4000 digital recorder user profile. However, keep reading...
 
One of the advantages of DNS Pro (which is not available in DNS Preferred) is that you can export the vocabulary, via the Manage Vocabularies menu, from your microphone user profile. Since the microphone user profile will probably see more action than the digital recorder profile, we recommend saving steps by running the Vocabulary Editor on the microphone user profile, exporting the vocabulary and then importing it into the digital recorder user profile. You can repeat this process from time to time to keep up to date. Note that "importing" a vocabulary is a bit of a misnomer. When you import a vocabulary (as a .top file) you are completely overwriting the existing vocabulary. You are not merging two vocabularies. One of the advantages of exporting and importing the same vocabulary is being able to maintain special properties and keep removed words properly removed. This can eliminate a good deal of your work when making vocabulary changes to both user files.
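To make the overwrite-versus-merge distinction concrete, here is a minimal sketch using plain Python sets. It does not touch any Dragon file or API; the function names and the sample words are hypothetical and only illustrate why an overwriting import keeps removed words removed.

```python
def import_vocabulary(existing, exported):
    """Overwrite semantics, as described in the post: the imported
    vocabulary replaces the existing one wholesale."""
    return set(exported)


def merge_vocabularies(existing, exported):
    """Hypothetical merge, shown only for contrast; the post says
    importing does NOT behave this way."""
    return set(existing) | set(exported)


# Microphone profile: a custom word was added and an unwanted word removed.
microphone_vocab = {"dictation", "transcription", "KnowBrainer"}
# Recorder profile: still contains the unwanted word "teh".
recorder_vocab = {"dictation", "transcription", "teh"}

after_import = import_vocabulary(recorder_vocab, microphone_vocab)
after_merge = merge_vocabularies(recorder_vocab, microphone_vocab)

print(sorted(after_import))  # the unwanted word is gone
print(sorted(after_merge))   # a merge would have kept it
```

With overwrite semantics the recorder profile ends up with exactly the microphone profile's word list, which is why repeating the export/import keeps the two in step.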
 
One last recommendation: With the release of the DS2 (the new improved DSS Pro algorithms) technology for the Philips 9600 and a soon-to-be-released Olympus DS-5000, if it isn't too late, we recommend that your end user return his Olympus DS-4000. The reason we bring this up is that the newer DS2 technology is so accurate that it is no longer necessary to maintain a separate digital recorder user profile. Granted, we still think a separate digital recorder user profile is slightly more accurate, but unless you're doing a great deal of digital transcription, it's simply a lot easier to maintain a single user profile, and the latest crop of professional digital recorders works well right over a standard microphone user profile.


-------------------------


Click KB 2012 REV D to Download a 30 Day Evaluation of KnowBrainer 2012 


 


 


 

 04/13/2008 04:26 AM


pmaddern
Senior Member

Posts: 104
Joined: 10/29/2006

Well, if it is the case that when you add a new Dictation Source, your custom vocabulary, custom commands and language models are the same as the base user dictation source, then this seems to me to be a strong argument to set up a new recorder profile as a new dictation source rather than create a brand new, stand-alone recorder user profile.

Apart from the initial need to give data to the new recorder profile for its new acoustic model by reading at least 15 minutes of your printed-out enrollment text, we agree that the new recorder profile gets the "benefit" of custom words/commands and (very importantly) the language model improvements from the base headset profile, and vice versa.

This inter-profile synchronisation between a "new dictation source" for a new digital recorder and its base headset profile looks like a real plus and looks like it's the way to go.

Peter

 



-------------------------
 04/13/2008 06:46 AM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Quote:

Well, if it is the case that when you add a new Dictation Source, your custom vocabulary, custom commands and language models are the same as the base user dictation source, then this seems to me to be a strong argument to set up a new recorder profile as a new dictation source rather than create a brand new, stand-alone recorder user profile.

Apart from the initial need to give data to the new recorder profile for its new acoustic model by reading at least 15 minutes of your printed-out enrollment text, we agree that the new recorder profile gets the "benefit" of custom words/commands and (very importantly) the language model improvements from the base headset profile, and vice versa.

This inter-profile synchronisation between a "new dictation source" for a new digital recorder and its base headset profile looks like a real plus and looks like it's the way to go.

Peter,

Your assumptions are only partially correct.

First, each dictation source creates its own Acoustic Model.  The Acoustic Model from the standard microphone input user and that from the digital voice recorder dictation source do not mix and match.  They are distinct and never the twain shall meet.  DNS does this because the developers know that you can't mix Acoustic Models.

Second, the two dictation sources will share a common custom Vocabulary, but they will not share the same Language Models.  It simply doesn't happen.  You can also share the same MyCmds.dat, but each dictation source retains its own Acoustic Model and Language Models.

Third, when using the DVR dictation source, you can't use your microphone to correct your transcriptions.  The standard microphone/headset input is not available to the DVR dictation source.  You still have to switch dictation sources.

Try it out and take a look at your user profile and you will see.

On the other hand, I agree with you that, in terms of what you want to share, the two dictation sources will share those things that they have in common and can have in common.  They will not share those components that cannot be shared or are not compatible with one another.  This makes it a more efficient way of dealing with the standard user profile vs. creating a separate DVR user profile, but you will not get the crossovers where you think you will.  This also works very nicely in situations where the user is using DNS Preferred because, as Lunis points out, you can't export and import vocabularies in DNS Preferred.  Nevertheless, two dictation sources created in this manner will not share apples and oranges.  They will only share apples and apples, or oranges and oranges.
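As a rough illustration of the sharing described in this post, the sketch below models one user profile with two dictation sources. It is a conceptual picture only, not Dragon's real data layout: the class names, field names and placeholder strings are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DictationSource:
    """Per-source components: per the post, these are never shared."""
    name: str
    acoustic_model: str
    language_model: str


@dataclass
class UserProfile:
    """Components common to every dictation source in the profile."""
    vocabulary: set = field(default_factory=set)  # custom words
    commands: set = field(default_factory=set)    # stands in for MyCmds.dat
    sources: dict = field(default_factory=dict)

    def add_source(self, name):
        # Each new source gets its own acoustic and language model.
        self.sources[name] = DictationSource(
            name=name,
            acoustic_model=f"acoustic model for {name}",
            language_model=f"language model for {name}",
        )
        return self.sources[name]


profile = UserProfile()
headset = profile.add_source("headset")
recorder = profile.add_source("DS-4000 recorder")

# A word added while using either source is visible to both,
# because the vocabulary lives on the profile, not on the source.
profile.vocabulary.add("KnowBrainer")

print(headset.acoustic_model != recorder.acoustic_model)
print(headset.language_model != recorder.language_model)
```

In this picture, running a writing-style tool while one source is active would change only that source's `language_model`, which matches the "apples and apples" point.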

Chuck Runquist
Former Dragon NaturallySpeaking SDK & Senior Technical Solutions PM for DNS

"It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so." -- Mark Twain



-------------------------

 04/13/2008 07:00 AM


pmaddern
Senior Member

Posts: 104
Joined: 10/29/2006

Chuck

I understood (as stated in my post) that the new recorder doesn't share the acoustic model of the headset base user.

I also understood that the recorder "new dictation source" shared the custom vocabulary and commands created in the headset base user. 

I guess I thought that the language model in the headset base user profile (from processing your concatenated documents so it adapts to your writing style) was also shared. This was my original question and you have pointed out that the language model is NOT shared.

So, in other words, from what you've said, I assume that anyone who creates a new user as a "new dictation source" piggybacking onto an existing user profile should still run the "Add words from your documents to the vocabulary" tool in the new recorder profile.

Thanks for the clarification.

 Peter 

 



-------------------------
 04/13/2008 08:49 AM
R. Wilke
Top-Tier Member

Posts: 4391
Joined: 03/04/2007

Chuck,

on this occasion I would like to ask two more questions.

First, from what I understood, the vocabulary doesn't contain information with regard to the Acoustic/Language models. The vocabulary is kept in those files you can export and import in the Pro version, 6 files with different extensions (.top being the one referenced via import), right?

If my understanding of this is correct, in which files are the Acoustic/Language models or parts of them stored?

Second, when it comes to what is called "user file corruption", is it a matter of something going wrong with only one or two of these models (and which one), or is the vocabulary also affected? When starting a new user from scratch and importing the vocabulary from the previous one (the whole vocabulary as described above, and not just the user-defined words), can we be sure that the faults are not transferred?

This is something I've been wishing to know for so long, and I'm looking forward to an answer so much.

Rüdiger Wilke

 



-------------------------

Well, it's past the point where we can make any changes in the code, but we can still make changes to the Easter Egg!

 04/13/2008 09:47 AM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Quote:

First, from what I understood, the vocabulary doesn't contain information with regard to the Acoustic/Language models. The vocabulary is kept in those files you can export and import in the Pro version, 6 files with different extensions (.top being the one referenced via import), right?

If my understanding of this is correct, in which files are the Acoustic/Language models or parts of them stored?

None of them.  These files are pure Vocabulary.  There is no acoustic or language model information contained in any of these files.  The Acoustic Model is located in the following folder:

C:\Documents and Settings\All Users\Application Data\Nuance\NaturallySpeaking9\Users\\current\voice

The Acoustic Model is "acoustic.enh" along with its associated pronunciation tables.  This file also incorporates any general training.  In previous versions, you could not create a new user with the "Skip initial training for this user" option, because on installation there is no acoustic.enh file until you perform general training.  In DNS 9, Nuance has provided a speaker-independent acoustic.enh, which can then be trained further, but which is based on Nuance's speech data.  When you send your speech data out to Nuance using the data collection tool, what you get back is an updated set of Vocabulary files, along with a new acoustic.enh file and new Language Model files if they are updated using the data that you send to Nuance.  Nuance's data collection adaptation process is much more sophisticated than what is currently available in DNS via the Acoustic and Language Model Optimizer.  Nevertheless, this is also why some people report dramatic improvement and others report no improvement at all.

The Language Model files are located in the following folder:

C:\Documents and Settings\All Users\Application Data\Nuance\NaturallySpeaking9\Users\\current\General__container

lmo.slt = middle slot Language Model (professional versions only)
usr.slt = user slot Language Model
index.xml = index of documents analyzed for writing style using voctool (professional series) or Vocabulary Optimizer (Accuracy Center) in all other versions.
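For anyone who wants to check their own profile against this layout, here is a minimal sketch that reports which of the files named above are present. The folder and file names come from this post; the function name is hypothetical, and the profile's "current" folder is passed in as an argument because the user-name part of the path is not shown here.

```python
from pathlib import Path

# Relative locations as given in the post; the "current" folder of the
# profile is supplied by the caller.
EXPECTED_FILES = {
    "acoustic model": Path("voice") / "acoustic.enh",
    "middle slot language model": Path("General__container") / "lmo.slt",
    "user slot language model": Path("General__container") / "usr.slt",
    "writing-style document index": Path("General__container") / "index.xml",
}


def report_profile_files(current_dir):
    """Return {description: True/False} for each expected file under current_dir."""
    root = Path(current_dir)
    return {label: (root / rel).is_file() for label, rel in EXPECTED_FILES.items()}
```

Call `report_profile_files()` with the path to your own profile's "current" folder. A missing lmo.slt is expected on non-professional versions, since the post lists it as professional only.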

When you export and import the Vocabulary using one of the professional versions, only the Vocabulary is exported and imported.  Some of the files that you refer to contain your custom words, and some of them contain the pronunciation tables for written form/spoken form.  However, the pronunciation tables for your Acoustic Model are located in the associated files alongside the Acoustic Model file.  This is where any training that you do for individual words in the Vocabulary Editor is stored (i.e., along with the acoustic.enh).  Training that you do in the Correction window or Spell dialog is stored in dra files that are associated with the Acoustic and Language Model Optimizer, but which are also incorporated into the Acoustic Model when you save your user(s).

PS, I should have added that one of the Vocabulary files is the linguistic pronunciation table, which is based on the standard linguistics (pronunciations) for each specific language (i.e., English contains all of the lexicon phonetics that you can find in any dictionary, stored in a standard format that you can find on the web using the IPA pronunciation format, e.g. /'laɪni/). If you add a new word that does not exist in either the active Vocabulary or the background dictionary, DNS uses the standard IPA pronunciation table to assign a pronunciation to it. If you train a new word that is not in either the active Vocabulary or the background dictionary, then DNS uses the IPA pronunciation table to add your pronunciation to that word and, in addition, also adds it to your Acoustic Model so that DNS can identify it when you pronounce it the way you trained it.  For example, the initial word "liney" would be entered with the IPA pronunciation table entry "/'laɪni/", but if you train it, for example, as "lina", then DNS would append that pronunciation ("/'laɪni//'laɪna/") and add it to your Acoustic Model, so that when you speak the word the way that you want to, DNS recognizes the word via the appended pronunciation.  If the word already exists in either the active Vocabulary or the background Vocabulary/dictionary, then DNS only adds the standard IPA pronunciation table entry.  Any training that you do under this condition is only added to the Acoustic Model and linked to the IPA pronunciation.  If I go into any greater detail, I will be asked so many questions about how this works.  So, please don't ask, because I've given you enough to understand how it works.

Chuck Runquist
Former DNS SDK & Senior Technical Solutions PM for DNS

If you hear the sound of hoofbeats, think horses not zebras.
Law of Parsimony (Occam's razor)

 



-------------------------

 04/13/2008 10:15 AM
R. Wilke
Top-Tier Member

Posts: 4391
Joined: 03/04/2007

Thank you very much for the quick response.

Rüdiger Wilke



-------------------------


 04/13/2008 10:38 AM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Roger,

I appended a PS to the previous post that gives a little bit more detail on pronunciations and the Acoustic Model/Vocabulary.  Take a look at the addition.  I think it will help to explain a couple of things, but as I say, please don't ask me for any more because I'm not going to get any more technical than that.  I couldn't do it anyway under NDA with Nuance.

Chuck Runquist
Former Dragon NaturallySpeaking SDK & Senior Technical Solutions PM for DNS

I know that you believe you understand what you think I said, but, I am not sure you realize that what you heard is not what I meant..



-------------------------

 04/13/2008 01:27 PM
matthewls
Top-Tier Member

Posts: 601
Joined: 10/01/2006

Quote:
None of them. These files are pure Vocabulary. There is no acoustic or language model information contained in any of these files. The Acoustic Model is located in the following folder: C:\Documents and Settings\All Users\Application Data\Nuance\NaturallySpeaking9\Users\\current\voice the Acoustic Model is "acoustic.enh" along with its associated pronunciation tables.

In my ..\Users\Matthew\current\ directory are several numbered folders (e.g. 0_1, 0_10, 0_17) that contain different acoustic.enh files (comp shows mismatches). I'm not sure how to relate these directory names/numbers to the names in the "Manage users" list, but it seems they're listed in order in the acoustic.ini file.

 

 04/13/2008 02:02 PM
R. Wilke
Top-Tier Member

Posts: 4391
Joined: 03/04/2007

Quote:
In my ..\Users\Matthew\current\ directory are several numbered folders (e.g. 0_1, 0_10, 0_17) that contain different acoustic.enh files (comp shows mismatches). I'm not sure how to relate these directory names/numbers to the names in the "Manage users" list, but it seems they're listed in order in the acoustic.ini file.

From what I remember when having different dictation sources within one user profile, this is the way the folders corresponding with the sources are numbered. And from what Chuck pointed out further above, each source has its own acoustic model, hence different  .enh files, I presume.
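The comparison Matthew did with comp can also be scripted. The following is a minimal sketch, assuming only what is described above: that the profile's "current" folder contains sub-folders (such as 0_1 or 0_10) which each hold an acoustic.enh file. The function names are hypothetical, and the script only reads the files.

```python
import hashlib
from pathlib import Path


def acoustic_model_digests(current_dir):
    """Map each sub-folder holding an acoustic.enh file to that file's SHA-256 digest."""
    digests = {}
    for enh in sorted(Path(current_dir).glob("*/acoustic.enh")):
        digests[enh.parent.name] = hashlib.sha256(enh.read_bytes()).hexdigest()
    return digests


def all_distinct(digests):
    """True when no two folders hold byte-identical acoustic.enh files."""
    return len(set(digests.values())) == len(digests)
```

If `all_distinct()` returns True for a profile with several dictation sources, that is consistent with each source keeping its own acoustic model, although it does not by itself show which folder belongs to which source.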

Rüdiger Wilke



-------------------------


 04/13/2008 02:06 PM
matthewls
Top-Tier Member

Posts: 601
Joined: 10/01/2006

My understanding exactly.
 04/13/2008 03:39 PM
Lunis Orcutt
Top-Tier Member

Posts: 22623
Joined: 10/01/2006

Quote:
Third, when using the DVR dictation source, you can't use your microphone to correct your transcriptions.  The standard microphone/headset input is not available to the DVR dictation source.
 
Actually... if you want to use microphone correction you can hack your options.ini file. You'll find a step-by-step on this process when you look up “How to Turn the Microphone on in a Digital Recorder User Profile” in our Quick Tips. Please bear in mind that there is a reason why Nuance defeated this control. Using your microphone to make corrections is fine but Nuance would prefer that you avoid training.


-------------------------


 04/13/2008 04:54 PM
R. Wilke
Top-Tier Member

Posts: 4391
Joined: 03/04/2007

Lunis,

I know it works, and I've done so following your workaround two years ago when I needed input from headset and digital recorder occasionally, but what does it mean for the acoustic models, are they getting mixed up? Apples and oranges?

Rüdiger Wilke

 



-------------------------


 04/13/2008 06:07 PM
Lunis Orcutt
Top-Tier Member

Posts: 22623
Joined: 10/01/2006

If you're only exporting and importing your vocabulary via the Manage Vocabularies feature, you are not actually mixing your acoustic models, BUT we use a much simpler approach. In our case we are using UniVoice, but we believe that any standard microphone user profile will work almost equally well with the two top digital recorders, which are the Olympus DS-5000 (not yet available) and the Philips 9600. It's much easier to use one profile for everything. Yes, we can obtain slightly higher accuracy by maintaining a separate digital recorder user profile, but the slight accuracy gain just doesn't outweigh the convenience for us.
 
Unfortunately, our NDA will not permit us to discuss the new Olympus DS-5000 in detail other than to say it will be very impressive when used with DNS.


-------------------------

