KnowBrainer Speech Recognition
Topic Title: Medical/Legal Vocabularies.
Topic Summary: Anyone tried these?
Created On: 06/20/2011 05:52 PM
 06/20/2011 05:52 PM
Rag
Top-Tier Member

Posts: 202
Joined: 06/16/2011

Hi

The above are available on eBay and similar sites. Has anyone tried them? Are they any good?

Would like to hear from people.

R

 06/21/2011 07:39 AM
Stephan Kuepper
Top-Tier Member

Posts: 470
Joined: 10/04/2006

Rag,

could you provide us with a link to ebay? 

Speciality vocabularies are usually created by Dragon resellers, using the Dragon VocTool. IMHO, only these have a chance to be any good. Obviously, a lot depends on the skill of the maker.

A vocabulary is not just a word list but contains information on how these words are used in context. Therefore it takes a huge amount of text to create a good speciality vocab. Prices are accordingly high, but the increase in efficiency is worth it.

I'm writing this as a Dragon reseller/distributor who has both sold and created speciality vocabs in various areas. Personally, I'd be extremely wary of any Dragon stuff on eBay.

Stephan



-------------------------

www.egs-vertrieb.de - Speech Recognition Blog - Forum: www.immer-eine-Nuance-besser.de

 06/21/2011 08:39 AM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Quote:
A vocabulary is not just a word list but contains information on how these words are used in context. Therefore it takes a huge amount of text to create a good speciality vocab. Prices are accordingly high, but the increase in efficiency is worth it.

I'm writing this as a Dragon reseller/distributor who has both sold and created speciality vocabs in various areas. Personally, I'd be extremely wary of any Dragon stuff on eBay.

Stephan,

While I agree 100% with your last statement, allow me to correct you about the difference between vocabulary and context.

The vocabulary is a lexicon. That is, as far as DNS is concerned it only contains the list of words and phrases that are actively stored in memory (Active Vocabulary). It does not contain any context information.

Context is contained in the Language Model.
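
To make the distinction concrete, here is a toy sketch in Python. It is only an illustration of the general idea (a word set versus word-pair statistics), not a description of how DNS actually stores or uses its models; the sample text, the `lexicon` set, and the `pick_by_context` helper are all made up for the example.

```python
from collections import Counter

# Toy "lexicon": only the set of known words, with no usage information.
lexicon = {"the", "patient", "patience", "has", "is", "stable", "a", "virtue"}

# Toy "language model": bigram (word-pair) counts gathered from sample text.
sample_text = "the patient is stable . patience is a virtue . the patient has patience"
tokens = sample_text.split()
bigram_counts = Counter(zip(tokens, tokens[1:]))


def pick_by_context(previous_word, candidates):
    """Choose the in-lexicon candidate seen most often after previous_word."""
    known = [w for w in candidates if w in lexicon]
    return max(known, key=lambda w: bigram_counts[(previous_word, w)])


# Two similar-sounding words: the lexicon alone cannot choose between them,
# but the bigram counts can.
best_after_the = pick_by_context("the", ["patient", "patience"])
best_after_has = pick_by_context("has", ["patient", "patience"])
print(best_after_the, best_after_has)
```

The lexicon answers "is this a word I know?", while the bigram counts answer "which known word is likely here?".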

When someone uses Voctool.exe, which is no longer available in DNS 11, it is possible to create a Middle slot Language Model containing specialty contexts. However, I would agree with you that the creator has to know what they're doing or any specialty vocabulary created in this manner won't provide any additional viable contexts (specialty Language Model). It does require skill and a thorough knowledge of how to use the Middle slot.

In addition, there will also be a significant change in DNS 11.5 regarding this issue. At this point, that's all I can say. However, keep in mind that under the EULA (End User License Agreement) for DNS 11-11.5, it's a violation to do this and sell the result anywhere, let alone on eBay.

Regardless, don't confuse the Vocabulary with the Language Model. They are not equivalent, and the Active Vocabulary does not contain any context information. Most vocabularies that are still valid, because they were created prior to the changes in the EULA and DNS 11, do not contain a Language Model. They only contain vocabulary words and phrases. While it certainly doesn't hurt to use these, end-users would be better off in the long run creating their own vocabularies by simply collecting a word list and a unique selection of documents pertinent to those words and using the appropriate features and functions in the Accuracy Center. It's cheaper and it's just as effective. If the end-user doesn't have a collection of texts that can be analyzed for writing style (Language Model – contexts), obtaining or creating a specialty list of words and phrases based on the user's needs is generally sufficient. DNS will learn new contexts and adapt existing contexts over time.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC
Home of VoicePower®: Simply powerful, powerfully simple

"If the automobile had followed the same development cycle as the computer, a Rolls-Royce would today cost $100, get a million miles per gallon, and explode once a year, killing everyone inside." -- Robert X. Cringely



-------------------------

 06/21/2011 09:24 AM
Stephan Kuepper
Top-Tier Member

Posts: 470
Joined: 10/04/2006

Chuck,

absolutely correct. From a sales perspective, however, a "vocabulary" is, at least in these parts of the world, a set of .to* files containing words, alternative forms, rules for the use of alternative forms, and a language model, and usually created using the VocTool (which is still available for certified resellers).  A word list is just that - a .txt file containing words and, if necessary, their spoken forms. 
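
For anyone who wants to build such a .txt word list from their own data, here is a minimal Python sketch. It assumes the one-entry-per-line layout with a backslash between written and spoken form (`written\spoken`); please check that against the import documentation for your Dragon version before relying on it. The `format_word_list` helper and the sample entries are invented for illustration.

```python
def format_word_list(entries):
    """Return word-list text, one entry per line.

    entries: iterable of (written_form, spoken_form_or_None) pairs.
    Assumed layout: "written" or "written\\spoken" on each line.
    """
    lines = []
    for written, spoken in entries:
        written = written.strip()
        if not written:
            continue  # skip blank entries
        if spoken:
            lines.append(written + "\\" + spoken.strip())
        else:
            lines.append(written)
    return "\n".join(lines) + "\n"


entries = [
    ("tachycardia", None),
    ("q.i.d.", "four times a day"),  # written form with a spoken form
    ("   ", None),                   # blank entry, dropped
]
word_list_text = format_word_list(entries)
print(word_list_text, end="")
```

The resulting text can be saved as a plain .txt file. Note that it carries no context information at all, which is exactly the difference from a full vocabulary.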

As far as pricing is concerned, I have a strong notion that most users wouldn't even know what they can achieve with Dragon and a good custom vocabulary. These users will, in all probability, never read this forum, either ;-) 

Of course you can do it all yourself. The question is a) how well you can do it and b) how much time it will cost. A lawyer spending 6 hours on a special vocabulary effectively loses money, rates being what they are...

Stephan



-------------------------

www.egs-vertrieb.de - Speech Recognition Blog - Forum: www.immer-eine-Nuance-besser.de

 06/21/2011 10:31 AM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Quote:
absolutely correct. From a sales perspective, however, a "vocabulary" is, at least in these parts of the world, a set of .to* files containing words, alternative forms, rules for the use of alternative forms, and a language model, and usually created using the VocTool (which is still available for certified resellers).  A word list is just that - a .txt file containing words and, if necessary, their spoken forms.

Stephan,

Actually, the files that you're talking about are created when you use Manage Vocabularies... to export all of that portion of your user profile. Only part of that is the actual Active Vocabulary. Yes, that does contain the language models, but the use of "vocabulary" in that sense is actually a misnomer. What is contained in the Active Vocabulary is what you see in the Vocabulary Editor. So, you're talking about the complete user vocabulary, not the lexicon itself. The Acoustic Model and the Language Model are basically compared to the Active Vocabulary when transcribing speech to text. This is why users get confused by some of the terms that Nuance uses. Strictly speaking, there are three speech models: (a) the Active Vocabulary or lexicon, (b) the Acoustic Model, and (c) the Language Model. Vocabulary as you refer to it in your post combines the background dictionary (i.e., the full 450,000 words stored on the hard drive), the Active Vocabulary (i.e., including the base vocabulary and any custom words, properties, spoken forms, ITN rules, etc.), and the base Language Model and adapted Language Model contexts.

These are all copied from the General and the General_container. The General_container contains any user slot and middle slot Language Models. These are located at the following locations:

C:\Documents and Settings\All Users\Application Data\Nuance\NaturallySpeaking10\Users\<your user name>\current\General__container\lmo.slt

C:\Documents and Settings\All Users\Application Data\Nuance\NaturallySpeaking10\Users\<your user profile>\current\General__container\usr.slt

Of course, the path is different if you're using Windows Vista or Windows 7, but the file locations are still in the same place in the user profile whether you're using DNS 10 or DNS 11.
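
If you want to locate these two files from a script, the following Python sketch builds the paths quoted above. Only the Windows XP base directory comes from this post; the base directory is left as a parameter because the Vista/Windows 7 location is not spelled out here. The `slot_files` helper and the profile name are hypothetical.

```python
from pathlib import PureWindowsPath

# Base directory as quoted above for Windows XP. On Vista / Windows 7 the
# base differs, so pass your own value for `base` there.
XP_BASE = PureWindowsPath(r"C:\Documents and Settings\All Users\Application Data")


def slot_files(profile_name, base=XP_BASE, version_dir="NaturallySpeaking10"):
    """Return the paths of lmo.slt and usr.slt for a given user profile."""
    container = (
        base / "Nuance" / version_dir / "Users"
        / profile_name / "current" / "General__container"
    )
    return [container / "lmo.slt", container / "usr.slt"]


paths = slot_files("example_profile")
for p in paths:
    print(p)
```

`PureWindowsPath` only manipulates the path text, so the sketch runs on any machine; it does not check that the files exist.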

Nevertheless, all of these files from the General and the General_container are included in the:

General_-_Large.dat
General_-_Large.To1
General_-_Large.To2
General_-_Large.To3
General_-_Large.tog
General_-_Large.toi
General_-_Large.Top
General_-_Large.tot

Regardless, this is the complete set of files associated with the user profile. The active vocabulary is only part of this collection of files. This is why I distinguish between lexicon and vocabulary. If you use the standard export and import of a list of words and phrases, that is the vocabulary sans properties etc. That is, words and phrases, spoken forms, and their underlying pronunciations, which you can obviously no longer view in DNS 11.

Quote:
As far as pricing is concerned, I have a strong notion that most users wouldn't even know what they can achieve with Dragon and a good custom vocabulary. These users will, in all probability, never read this forum, either

I don't find that to be true at all. Less experienced users may not completely understand, but they do frequent the forum because I've had numerous private messages and even e-mails from users asking for an explanation of these. I think you would be surprised at the number of users who read what we post here regardless of whether they understand it or not.

Quote:
Of course you can do it all yourself. The question is a) how well you can do it and b) how much time it will cost. A lawyer spending 6 hours on a special vocabulary effectively loses money, rates being what they are...

You're preaching to the choir here. Obviously it takes time, effort, and a certain amount of understanding and skill. Nevertheless, more users than you think actually do this on their own.

Also, it may take time to compile a list of words (vocabulary), but it doesn't require 10,000,000 documents being analyzed and adapted to your writing style in order to get a good adapted Language Model. In fact, using more than 100 or 200 documents doesn't improve the overall Language Model unless these documents are unique in terms of their contexts. It actually only takes 20 or 30 documents that are completely unique, properly configured as text files, and properly proofread in terms of writing style to create an adequate Language Model adaptation. Too many users think that bigger is better when, in actuality, bigger is nothing more than redundant and doesn't necessarily improve context analysis.
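
A rough way to see the redundancy point is to count how many new word pairs each additional document contributes. The Python sketch below uses bigrams as a stand-in for "contexts"; that is a simplification chosen for illustration, not how DNS analyzes documents, and the `new_contexts_per_document` helper and sample sentences are made up.

```python
def new_contexts_per_document(documents):
    """For each document, in order, count bigrams not seen in earlier ones."""
    seen = set()
    gains = []
    for text in documents:
        tokens = text.lower().split()
        bigrams = set(zip(tokens, tokens[1:]))
        gains.append(len(bigrams - seen))  # contexts this document adds
        seen |= bigrams
    return gains


docs = [
    "the patient denies chest pain",
    "the patient denies chest pain",        # verbatim duplicate
    "the patient reports mild chest pain",  # partly new wording
]
gains = new_contexts_per_document(docs)
print(gains)
```

A document that repeats earlier wording adds little or nothing, which is why a small set of genuinely different documents can do as much as a large pile of near-duplicates.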

Chuck Runquist
Technical Project Manager
VoiceTeach LLC
Home of VoicePower®: Simply powerful, powerfully simple

"The least questioned assumptions are often the most questionable." -- Paul Broca



-------------------------

 06/21/2011 02:16 PM
Matt Chambers
Top-Tier Member

Posts: 379
Joined: 10/02/2006

Quote:
When someone uses Voctool.exe, which is no longer available in DNS 11, it is possible to create a Middle slot Language Model containing specialty contexts. However, I would agree with you that the creator has to know what they're doing or any specialty vocabulary created in this manner won't provide any additional viable contexts (specialty Language Model). It does require skill and a thorough knowledge of how to use the Middle slot.

I'm using 11, and still have voctool.exe. 



-------------------------
 06/21/2011 04:31 PM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Matt,

Either you have a VAR version of DNS 11, or you're using the DNS 10-10.1 Voctool.exe. The retail versions did not ship with Voctool.exe.

Nuance made that quite clear prior to the release of DNS 11.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC
Home of VoicePower®: Simply powerful, powerfully simple

"A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila." -- Mitch Ratliffe



-------------------------

 06/21/2011 04:33 PM
monkey8
Top-Tier Member

Posts: 1988
Joined: 01/14/2008

Or you have the Legal version of DNS 11, which did ship with Voctool.exe, at least in some countries.

Lindsay

-------------------------


www.pcbyvoice.com
www.pcbyvoice.co.uk

 06/21/2011 04:52 PM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Lindsay,

Thanks for reminding me of this. I remember a reference from my contacts regarding the Legal version of DNS 11 and Voctool.exe.

This time around, I didn't bother with the Legal version, so that would be easy to miss for me. My only interest is going to be in the Medical version for DNS 11.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC
Home of VoicePower®: Simply powerful, powerfully simple

"Kindness is the language which the deaf can hear and the blind can see." -- Mark Twain



-------------------------

 06/21/2011 05:18 PM
monkey8
Top-Tier Member

Posts: 1988
Joined: 01/14/2008

Well, I have to thank Rüdiger, as he was the one who reminded me just last week, I think. I believe he uses Legal.

Lindsay


-------------------------


www.pcbyvoice.com
www.pcbyvoice.co.uk

 06/22/2011 09:03 AM
Matt Chambers
Top-Tier Member

Posts: 379
Joined: 10/02/2006

That explains it, as I have the Legal version.

-------------------------
 06/23/2011 06:30 PM
supee
Senior Member

Posts: 140
Joined: 10/28/2006

In conclusion, does a word list help at all?

-------------------------

Dell Inspiron 7520 SE, 2 GHz Core i7 (Ivy Bridge), 8 GB RAM, 750 GB Seagate Momentus XT SSD/HD hybrid hard disk. SpeechWare 3 in 1, Sennheiser ME3, Buddy USB 6G, Windows 7 Ultimate, Office 2007.

 06/23/2011 06:41 PM
Lunis Orcutt
Top-Tier Member

Posts: 22623
Joined: 10/01/2006

Yes

-------------------------


 06/21/2011 02:35 PM
Rag
Top-Tier Member

Posts: 202
Joined: 06/16/2011

I got about 1000 medical words from a med receptionist and added them to Dragon as custom vocabulary. It was just a word list, nothing more, nothing less. I then tested Dragon's accuracy, and it recognised 90% of the newly added words. I hadn't trained them either. From what I can see, adding just a word list does work. Maybe not as well as a proper vocab, but it is easy and effective nonetheless.
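
For anyone who wants to repeat this kind of check, here is a small Python sketch of one way to score it: dictate a passage containing the added words, then count how many of them show up in the recognized text. The `recognition_rate` helper, the word list, and the transcript are invented for illustration, and the sketch only handles single-word entries without punctuation.

```python
def recognition_rate(added_words, transcript):
    """Fraction of added words that appear in the recognized transcript."""
    recognized = set(transcript.lower().split())
    added = {w.lower() for w in added_words}
    if not added:
        return 0.0
    return len(added & recognized) / len(added)


rate = recognition_rate(
    ["tachycardia", "bradycardia", "stenosis", "ischemia"],
    "patient shows tachycardia and stenosis with possible ischemia",
)
print(f"{rate:.0%} of the added words were recognized")
```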

Just my two cents worth.

R

 06/23/2011 06:44 PM
Rag
Top-Tier Member

Posts: 202
Joined: 06/16/2011

In my experience and that of my colleagues, the short answer is yes. I have added a very large list of terms that I bought, just a list, not a language model. My recognition of these words, even without training, is quite good. Same with my colleagues.