KnowBrainer Speech Recognition
Topic Title: iPhone dictation system versus Dragon
Topic Summary: When I use Siri iPhone it appears that it is almost perfect in its dictation. It appears that it is better voice recognition than Dragon. Is this Dragon voice recognition or another system. It seems to me that the iPhone dictates very well in fact better
Created On: 04/10/2012 02:03 PM
 04/10/2012 02:03 PM
chas
Power Member

Posts: 60
Joined: 10/16/2006

When I use Siri on the iPhone, its dictation appears to be almost perfect. Its voice recognition seems better than Dragon's. Is this Dragon voice recognition or another system?

It seems to me that the iPhone dictates very well, in fact with better recognition than the downloaded Dragon dictation system. Has anyone had any experience with this?
 04/10/2012 03:51 PM
Rag
Top-Tier Member

Posts: 202
Joined: 06/16/2011

Apparently it's the DNS 11 engine with a massive vocabulary and a voice-independent model. Nuance, it seems, has its fingers in many pies. You have to be connected to the net to use it, which could be one downside. I'm sure that will change in years to come.

R

 04/10/2012 05:53 PM
Lunis Orcutt
Top-Tier Member

Posts: 22623
Joined: 10/01/2006

Note that you can use any existing analog microphone on your iPad or iPhone to increase your accuracy when you add the new iPad/iPhone Adapter.

 04/11/2012 06:45 AM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Quote:
When I use Siri on the iPhone, its dictation appears to be almost perfect. Its voice recognition seems better than Dragon's. Is this Dragon voice recognition or another system? It seems to me that the iPhone dictates very well, in fact with better recognition than the downloaded Dragon dictation system. Has anyone had any experience with this?

chas,

First, be careful about mixing apples and oranges. Siri uses the Dragon recognizer via access to the Nuance Dragon NaturallySpeaking server, but it is a query application, which is much simpler in terms of accurate recognition of your queries than simply dictating text using a large vocabulary continuous speech recognition application. The latter is many times more complex because there are many more unknown outcomes that Dragon NaturallySpeaking has to interpret correctly. Querying for information is a much simpler process.

Second, you do not have to be connected to the Internet to use Siri. Siri uses either a wireless connection or 3G/4G, and the Siri app automatically sends the query to the Nuance server over that connection. Also, what you say has limitations in terms of the length of your query. Dragon NaturallySpeaking has no such limitations. That is, you can dictate for as long as you want. But with Siri, if you ask a very long question, you may not get what you expect, which is not a question of accuracy so much as a matter of understanding what it is that you're asking in terms of querying the Siri database.

There will come a time in the future when smartphones will be obsolete because we will carry our computers around in our pockets, and such technology will be many times more sophisticated and powerful than even the current desktop/laptop technologies.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC
Home of VoicePower®: We don't make Dragon NaturallySpeaking, We make it better!

"The least questioned assumptions are often the most questionable." -- Paul Broca



-------------------------



 04/11/2012 08:37 AM
Matt Chambers
Top-Tier Member

Posts: 379
Joined: 10/02/2006

Siri is not simply a "query application".  On the iPhone 4S, you can use it to enter text into documents, e-mails, and text messages.  There is a special key on the keyboard that takes dictation.  There is an article today in the Wall Street Journal in the Personal Technology column about its efficacy.

It is correct that the speech recognition engine is supplied by Nuance.  I'm sure that it is very similar to the engine that we use in Dragon NaturallySpeaking, but the recognition is performed at central servers, as Chuck said.  That should allow a lot more processing power to be applied.

 



-------------------------
 04/11/2012 09:16 AM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Matt,

Thank you for bringing me up to date. I ditched my previous iPhone because it was only 3G. I'm waiting until next month when I get my new iPhone with 4G. So, I only know Siri through Nuance and their development team that worked on Siri SR. I knew that they were using a central server. However, I didn't know that Siri was more extensive in terms of being able to accept dictation. It appears that Siri is similar to FlexT9, which I have on my Android smartphone and am using temporarily.

Question for you. How long can you dictate with Siri? It takes a lot of bandwidth to transmit a significant amount of audio information to the Nuance backend server even if the audio stream is compressed. FlexT9 breaks it up into utterances. Does Siri do the same? Is it capable of doing the same?

Thanks,

Chuck Runquist
Technical Project Manager
VoiceTeach LLC
Home of VoicePower®: We don't make Dragon NaturallySpeaking, We make it better!

A creative man is motivated by the desire to achieve, not by the desire to beat others. Ayn Rand



-------------------------

 04/11/2012 09:55 AM
Matt Chambers
Top-Tier Member

Posts: 379
Joined: 10/02/2006

Chuck,

I don't really know how long the utterances can be with Siri.  I haven't used it a lot, for two reasons.  First, as you say, it uses a ton of bandwidth and data, so I don't want to use it very often unless I am connected by Wi-Fi.  Second, there is no easy way to add custom words and no way to customize your vocabulary, so I find it less accurate than dictating on my computer.

I do find that it is pretty accurate for short text messages and e-mails, as long as you're using fairly standard language and not much jargon.

Maybe I will have to experiment some more.

Matt

-------------------------
 04/11/2012 02:20 PM
monkey8
Top-Tier Member

Posts: 1988
Joined: 01/14/2008

Chuck,

I don't know which Android phone you are using but if you can update to Android 4 you are no longer restricted to short length utterances and the Google servers can now deal with any length (more or less) of utterance.  It's starting to get difficult to keep up.

Lindsay



-------------------------


www.pcbyvoice.com
www.pcbyvoice.co.uk

 04/11/2012 04:57 PM
GDS
Top-Tier Member

Posts: 749
Joined: 01/16/2009

Quote:
Chuck, I don't know which Android phone you are using but if you can update to Android 4 you are no longer restricted to short length utterances and the Google servers can now deal with any length (more or less) of utterance. It's starting to get difficult to keep up.

That, and Google's done something remarkable: the transcription of dictated text is accurate and instant. I know those of us with flawless dictation styles and supercomputers have been taking this for granted in Dragon NaturallySpeaking for a while, but the bottom line is that instantaneous transcription "from my mouth to the screen" is available to the average end user for the first time.

I'm long on Nuance specifically, and in general I'm long on speech as the "future" and best method of human-computer interaction. I've got some quibbles with Nuance as a company -- mostly in how it strategically leverages its technology. But that technology is best in class. That said, Google will be a serious competitor. It's no coincidence that the core of Google's speech team is made up of former Nuance staffers (and patent holders).

Quote:
I'm waiting until next month when I get my new iPhone with 4G.

You might have to wait longer than that. Apple will be studied for generations as a masterclass in marketing, business, vertical and horizontal integration, yada yada yada. But for all of Apple's deserved kudos in marketing, they're confusing as hell sometimes. The iPhone 4S does not have 4G cellular data speeds. The only Apple device that can access 4G data networks is the new iPad (which is officially referred to as The New iPad).

I don't watch Apple all that closely. I don't know when we'll see a 4G iPhone. But I suspect that it's a few months out, yet... mostly because refreshes of the MacBooks are coming before the end of this month, its new operating system is coming in June, and it likes to stagger its major consumer releases.



-------------------------

Eric Wright At work: DNS 12 Pro. At home: DNS 11.5 Pro,  KnowBrainer 2011, and Utter Command by RedStart Systems; Dragon Dictate 3 for Mac


 


Appetite for Dictation - My Blog

 04/11/2012 05:28 PM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Eric,

I don't know what other vendors are doing relative to the iPhone 4, but Verizon has given me a guarantee in writing that I will have mine by the first week in May. The guarantee says that if they don't deliver, I get it free. I hope they don't make it.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC
Home of VoicePower®: We don't make Dragon NaturallySpeaking, We make it better!

"In theory, there is no difference between theory and practice. But in practice, there is." - Yogi Berra



-------------------------

 04/12/2012 11:03 AM
bmac
Top-Tier Member

Posts: 478
Joined: 10/02/2006

Quote:
I don't know what other vendors are doing relative to the iPhone 4

Chuck - I believe you meant iPhone 5...



-------------------------

Bill
DNS Pro v12.5, KB 2012, Mtech Desktop PC (i7 960 3.2 GHz with 12 GB RAM), Windows 7 Pro 64-bit, 240 GB SSD, Philips SpeechMike 3500, SpeechWare 3-in-1 TableMike, Philips SpeechMike 5274 Classic, MS Office 2010 Professional

 04/11/2012 10:31 AM
Chucker
Top-Tier Member

Posts: 9671
Joined: 10/10/2006

Matt et al.,

Here is an article from Speech Technology Magazine, today's issue, that should clarify many of the questions that anyone has about Siri. The link requires a subscription, which is free, but rather than burdening everyone with the process of obtaining a subscription before being able to read the article, I'm reproducing it here with full credit to the author.

Most consumers have encountered speech recognition largely in call center automation, where the speech recognition can be annoyingly overstructured. Callers might feel they are being prevented from speaking to an agent by the automated system, a further source of annoyance. Part of the negative view of speech recognition has also been its limitations compared to human speech recognition.

Today, the technology seems to be at a tipping point, with both the perception of it and capabilities rapidly moving speech recognition toward an everyday experience. Apple's Siri is a big part of the attitude change. The model of a friendly personal assistant, easily available, seemingly always with you in the form of a mobile phone, and apparently responding to unstructured speech (natural language), has changed perceptions both of how useful the technology can be and how far speech recognition has come.

The friendly part of the perception is in part due to Apple's marketing genius. When the company emphasized the naturalness of the interaction rather than reminding users they were talking to a computer, I initially thought that the natural language model would encourage pushing the service beyond its capabilities. But Apple foresaw this issue, and made it an advantage. They put in canned clever answers to many of the testing questions that Siri might be asked (from "What is the meaning of life?" to "Will you marry me?"). As a marketing tool and confidence builder, this insight is proving tremendously effective.

The speech technology in Siri is remarkable. In speech recognition, the mobile phone environment is one of the most difficult, with background noise a typical issue. The iPhone includes noise cancellation, which helps. Beyond that, the speech recognition accuracy for unconstrained speech with very few context restrictions is remarkable. What accounts for this apparent quantum leap in the capabilities of speech recognition?

Part of the accuracy can be attributed to the speech recognition itself, and part to the natural language processing of the transcribed speech. The transcription of the speech to text is displayed so one can see what Siri "heard," and it is remarkably accurate (based on personal experience and the reaction of the marketplace). What adds to the experience, however, is the post-processing, which can compensate for recognition errors. For example, in a personal experience, the iPhone responded to one spoken request with the text interpretation, "Fries electronics near here," but then, without further interaction, displayed the location of a Fry's Electronics store nearby, the correct interpretation of intent. The natural language processing either made its own match to similar-sounding words or is working from output of the recognizer that includes more than the highest-scoring option.
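
(Illustration only, not part of the article.) The kind of post-processing described above can be sketched in a few lines of Python. This is a guess at the general idea, not how Siri or the Nuance servers actually work: it assumes the recognizer hands over a short list of alternative transcriptions (an "n-best" list) and that there is a directory of known business names to compare against. The names `KNOWN_BUSINESSES`, `similarity`, and `resolve_business` are hypothetical, and plain string similarity stands in for a real "sounds like" comparison.

```python
from difflib import SequenceMatcher

# Hypothetical directory; a real service would query a large, updated database.
KNOWN_BUSINESSES = ["Fry's Electronics", "Best Buy", "Five Guys"]


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve_business(hypotheses, businesses=KNOWN_BUSINESSES, threshold=0.8):
    """Return the known business closest to any hypothesis, or None.

    `hypotheses` is the assumed n-best list from the recognizer.
    Matches scoring below `threshold` are ignored.
    """
    best_name, best_score = None, threshold
    for text in hypotheses:
        for name in businesses:
            score = similarity(text, name)
            if score >= best_score:
                best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    # The top transcription is wrong, but it is close to a known name.
    print(resolve_business(["Fries electronics", "Freeze electronics"]))
```

The threshold keeps unrelated requests from being forced onto a business name; where to set it is a tuning question this sketch does not try to answer.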

Another aspect of this performance is the infrastructure used. The speech recognition and natural language processing are done in the network, so they can use the processing power and memory resources of a server, rather than the limitations of the small device. The processing also has access to constantly updated large databases, e.g., local businesses. The core speech recognition probably uses more than a pure statistical language model, with entries such as business names or street addresses represented by a list within the statistical language model, making it easy to update without rebuilding the entire model.
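
(Illustration only, not part of the article.) The author is only speculating about a list embedded in the statistical language model, but the idea resembles what is usually called a class-based language model. Below is a toy Python sketch under that assumption: word-pair counts are learned over sentences in which business names are collapsed into a single placeholder token, so the list of names can change without retraining. The class name `ClassBasedBigramModel`, its methods, and the `<BUSINESS>` token are all hypothetical.

```python
from collections import Counter


class ClassBasedBigramModel:
    """Toy bigram model where listed phrases collapse into a class token."""

    def __init__(self):
        self.bigrams = Counter()
        self.classes = {}  # class token -> set of lowercase phrases

    def set_class(self, token, phrases):
        # Updating a list touches only this dict, never the bigram counts.
        self.classes[token] = {p.lower() for p in phrases}

    def _tokens(self, sentence):
        text = sentence.lower()
        for token, phrases in self.classes.items():
            # Replace longer phrases first so overlapping names do not clash.
            for phrase in sorted(phrases, key=len, reverse=True):
                text = text.replace(phrase, token)
        return ["<s>"] + text.split() + ["</s>"]

    def train(self, sentences):
        for sentence in sentences:
            toks = self._tokens(sentence)
            self.bigrams.update(zip(toks, toks[1:]))

    def covers(self, sentence):
        """True if every adjacent token pair was seen during training."""
        toks = self._tokens(sentence)
        return all(self.bigrams[pair] > 0 for pair in zip(toks, toks[1:]))


if __name__ == "__main__":
    model = ClassBasedBigramModel()
    model.set_class("<BUSINESS>", ["Fry's Electronics"])
    model.train(["find Fry's Electronics near here"])

    print(model.covers("find Best Buy near here"))  # name not in the list yet
    model.set_class("<BUSINESS>", ["Fry's Electronics", "Best Buy"])
    print(model.covers("find Best Buy near here"))  # list updated, no retraining
```

A production recognizer would use probabilities and smoothing rather than a seen/unseen check; the point here is only that the name list lives outside the trained counts.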

Beyond the significant technology advances, this tipping point is supported by consumer enthusiasm for the personal assistant model of interaction. This attitude change is important because it leads to consumer tolerance of inaccurate responses when they do occur, and a willingness to repeat or restate a request.

A subsidiary effect of the personal assistant model is that call centers will face even more resistance to automation if they don't adopt a less-structured natural language approach in their operations, since consumers know now that it is possible. Conversely, they will witness more acceptance of automation when they do adopt it. Companies have a chance to build on this major change in attitudes by recognizing a paradigm shift and adopting the assistant model.

Article from Speech Technology by William Meisel, Ph.D., president of TMA Associates

Finally, whether it be Apple or Nuance, or both, someone has finally grasped what is important and how to properly market speech recognition so as to move it in the direction of mainstream technology. This is how speech recognition should be marketed. It would appear that there is hope after all!!!

Chuck Runquist
Technical Project Manager
VoiceTeach LLC
Home of VoicePower®: We don't make Dragon NaturallySpeaking, We make it better!

Management is doing things right; leadership is doing the right things. - Peter Drucker (1909 - 2005)



-------------------------

 08/06/2012 04:48 PM
wgoren
New Member

Posts: 2
Joined: 08/06/2012

I will be getting an iPhone 4S this week. What will DNS do for me? Does it help with hands-free use?

 

Bill

 08/06/2012 10:06 PM
Lunis Orcutt
Top-Tier Member

Posts: 22623
Joined: 10/01/2006

We haven't heard any rumblings for a while, but supposedly the iPhone 5 is due to be released any day now. You might want to check on the ETA before purchasing the soon-to-be-dated iPhone 4S technology.

Although DNS includes a speech app for both the iPhone and Android phones, it seems about the same quality as Siri, which we found to be faster than typing on those miserable little keyboards that only teenagers seem to master. Of course, neither Siri nor the Dragon app is a substitute for a real voice recognition application, and while we wouldn't consider using it for any kind of professional work, Siri is very well suited to searches and short emails.



 08/07/2012 12:27 AM
NeuroDoc
Junior Member

Posts: 40
Joined: 09/05/2010

I am impressed with the quality of transcription using Siri or the corresponding function on the Nexus 7. The Nexus 7 will work in slo-mo if there is no connection to the servers.

What is most surprising to me is the fact that despite the time devoted to training, the masses of documents fed to the program for analysis, and the body of corrections available, Dragon does not perform discernibly better.

I wonder what advantage, if any, the backend servers could have. I am running Dragon on machines with up to 12 GB of RAM, SSDs, and fast 4-core processors, and on one overclocked machine.

 
