Speech technology vendors are heavily marketing the advantages that Automatic
Speech Recognition (ASR) has over its forebear, ‘touch-tone
IVR’. Many of these advantages are self-evident, although
speech applications may require a bigger up-front investment in
licenses and application development. ICR provides solutions which
utilise both ASR and touch-tone; with no technology vendor’s
axe to grind, our philosophy is to use the most appropriate technology
for each application. We have therefore produced this guide to help
you decide where ASR is essential, and where touch-tone may still have
a role to play.
Speak or press buttons – which would you prefer?
Given the choice, most people would rather speak than press buttons
on a keypad. Hence ASR’s major advantage – it enables
the creation of a much more natural dialogue between the customer
and your organisation. With well-designed applications this leads
to higher user satisfaction ratings, better transaction completion
rates, and a positive reinforcement of brand values. In addition
to this emotional aspect, in many cases speaking is simply far more
practical; many of us have tried to use the touch-tone input method
on a mobile phone, and found that it’s just too difficult.
ASR removes this practical hindrance and again provides a better
customer experience.
Simplify the input with ASR
The sophistication of input options available with ASR means call
durations and effort needed from the customer can often be substantially
reduced compared to touch-tone systems. The use of ASR can help
flatten the complex menu structures in some touch-tone applications
which are often the cause of much customer frustration. Recent developments
utilising so-called ‘say anything’ technology
allow the customer simply to state the service required at the beginning
of a call, and the system will process the request appropriately.
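The idea of replacing a menu tree with a single spoken request can be illustrated with a small sketch. It assumes the speech engine has already turned the caller's words into text, and it uses plain keyword matching as a stand-in; commercial ‘say anything’ engines rely on far more sophisticated language models. The keywords, service names and the route_utterance function are all hypothetical and do not come from any particular product.

```python
# Hypothetical mapping from spoken keywords to services.
# None of these names come from a real product.
ROUTES = {
    "balance": "account_balance",
    "statement": "account_balance",
    "brochure": "brochure_request",
    "order": "order_tracking",
    "delivery": "order_tracking",
}


def route_utterance(transcript: str, default: str = "operator") -> str:
    """Return the service for the first recognised keyword, else the default."""
    for word in transcript.lower().split():
        # Strip trailing punctuation so "order?" still matches "order".
        service = ROUTES.get(word.strip(".,?!"))
        if service is not None:
            return service
    # Nothing recognised: hand the caller to a person (an assumed fallback).
    return default
```

In this sketch a caller who says "Where is my order?" is routed straight to order tracking in one step, rather than working through several layers of numbered menus.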
ASR provides wider scope for automation
ASR engines are opening up new opportunities to automate processes
which simply could not be addressed utilising touch-tone input.
Consider a fulfilment system that sends a brochure to a caller’s
address: the data capture aspect can now be automated using ASR,
capturing the caller’s name, address and postcode, none
of which would be practical using touch-tone input. Many applications
fit this scenario, for example automated switchboards, password
processing, timetable information and ticket booking, catalogue
ordering and order tracking.
Accuracy - ASR vs Touch-Tone
Past promises of ASR being ‘the next big thing’ largely
failed to materialise. This in turn may have deterred some organisations
from investigating ASR. Now the quality of speech engines meets,
and often surpasses, the expectations of many, and decision makers
are frequently impressed at the accuracy levels of modern ASR engines.
Having said that, for simple input, accuracy is not, and is never
likely to be, at the almost 100% level achieved in processing
touch-tone input.
But this is not the whole story. In addition to processing accuracy,
it is important to consider the scope for touch-tone user error.
For example, consider a user who is asked to input a 13-digit credit
card number; it is quite difficult for users to do this using touch-tone
simply because the number string is so long. It is much easier to
read out the numbers from the card. Similarly, errors may be introduced
by the user’s inability to remember which numbered option
they need from a long menu – it is much easier simply to speak
what it is they need.
Privacy and Fast-Tracking
As the use of touch-tone IVR systems becomes more common, two factors
come into play. Firstly, users become more expert,
and secondly, the system may be used in a public place or workplace. Research
shows that when customers use an IVR system regularly, they
can fast-track through menus and input data very quickly.
For these users, speech input may become a hindrance (although some
speech applications can be provided with a touch-tone option). Additionally,
for some applications such as banking, the relative secrecy of touch-tone
input may be seen as an advantage when inputting account numbers
and PINs.
Cost implications
The costs of both touch-tone and ASR systems have fallen quite considerably
over the last couple of years, largely because of increased competition
with the advent of open standards such as VXML. Generally the initial
costs of an ASR system will be greater than those of a touch-tone system.
This is compounded by the greater effort needed to research
the user profile, and to design and tune an ASR deployment.
However, this needs to be weighed against the potentially greater
savings which ASR can deliver – and organisations such as
ICR can assist in evaluating the overall Return on Investment which
may be possible.
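As a rough illustration of that weighing-up, the sketch below estimates how many months it takes for ASR's extra savings to repay its extra up-front cost. It is a deliberately simple payback calculation, not a full ROI model; the payback_months function and any figures fed into it are hypothetical, and a real evaluation would also account for tuning, licence renewals and changes in call volume.

```python
import math


def payback_months(extra_upfront_cost: float, extra_monthly_saving: float) -> float:
    """Months until ASR's extra savings repay its extra up-front cost.

    Both arguments are differences relative to a touch-tone system and are
    assumed inputs, not figures taken from this guide.
    """
    if extra_monthly_saving <= 0:
        # With no additional saving, the extra investment is never repaid.
        return math.inf
    return extra_upfront_cost / extra_monthly_saving
```

For example, if an ASR deployment were to cost 60,000 more up front and save 5,000 more per month than touch-tone, payback_months(60000, 5000) gives a payback period of 12 months.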
Conclusions
Speech recognition technology continues to mature, and ever more
effective applications are being deployed. However, ASR generally
requires a greater capital investment than touch-tone solutions
and decisions can only be made on a case-by-case basis. In some
cases ASR can be the only viable solution and can enable entirely
new service propositions. Organisations also need to assess customer
service improvements, technical feasibility, image and brand values
and ROI. Finally, the new open standards are making it easier to
implement a migration strategy from touch-tone systems to ASR.