Browser technology is
changing quickly, and we are moving from the visual paradigm to the
voice paradigm. The voice browser is the technology that enables this shift. A voice
browser is a “device which interprets a (voice) markup language and is capable
of generating voice output and/or interpreting voice input, and possibly other
input/output modalities.” This paper describes the requirements for two
forms of character-set grammar, offered as a matter of preference or implementation:
one is more easily read by (most) humans, while the other is geared toward
machine generation.
The definition of a voice
browser, above, is a broad one. The
fact that the system deals with speech is obvious given the first word of the
name, but what makes a software system that interacts with the user via speech
a "browser"? The answer is that the information the system uses (for either domain
data or dialog flow) is dynamic and comes from somewhere on the Internet. From an
end-user's perspective, the impetus is to provide a service similar to what
graphical browsers of HTML and related technologies provide today, but on devices
that are not equipped with full browsers or even the screens to support them.
This situation is only exacerbated by the fact that much of today's content
depends on the ability to run scripting languages and third-party plug-ins to
work correctly.
Much of the effort
concentrates on using the telephone as the first voice browsing device. This is
not to say that it is the preferred embodiment for a voice browser, only that
the number of access devices is huge and that the telephone sits at the opposite end of
the graphical-browser continuum, which highlights the requirements that make a
speech interface viable. By the first meeting it was clear that this
scope-limiting was also needed in order to make progress, given that there are
significant challenges in designing a system that uses or integrates with
existing content, or that automatically scales to the features of various
access devices.
There is enormous
economic potential for all players on the Voice Web. This section highlights
the business relationships and revenue opportunities that are being forged.
Business-to-Business Opportunity: Enterprise and e-commerce companies will use
the Voice Web to expand their client base and grow revenue at lower cost. The
Voice Web gives them the opportunity to sell their products and services
through portal providers and to forgo costly advertising campaigns and expensive
call centers for handling transactions over the phone. Payment to portal
providers will range from simple lead referrals to a percentage of the total
commerce transaction.
Portal providers will
have supplier relationships with infrastructure and content players to purchase
“Voice Web Ready™” content and services; enabling technologies and products;
hosting services; professional services; and wholesale network transport.
Application Service Providers will host voice sites and voice portal services
and sell product suites to merchants that make their retail applications Voice
Web Ready. Startup portal providers will earn most of their revenues from
V-Commerce transactions as well as from advertising, sponsorship, and
third-party distribution contracts with network service providers.
Business-to-Consumer Opportunity: Voice portal providers will market directly
to the public and manage the customer relationship with individuals and
businesses that sign up for their service. To drive viral adoption, some of
these portal providers will offer their initial services free of charge,
thereby deriving the majority of their revenue from V-Commerce transactions,
network usage, and advertising.
If a voice browser is
to converse with the user, then a description, either explicit or derived and
implicit, must exist for the underlying system to "render" into a
dialog. Ultimately, it will be up to solution providers to take an inventory of
the existing content (if any), development tools, data-access requirements,
deployment platforms, and application goals such as cost, security, richness,
and robustness before they can decide what technology to use. More likely than
not, for the time being, multiple content types will be required to deliver the
most natural experience on each type of browsing device. This is both a
technical limitation and a consequence of users who expect the
latest-and-greatest attributes of each modality to be featured in their
applications.
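
To make the idea of "rendering" a description into a dialog more concrete, here is a minimal Python sketch. It is an illustration only, not any particular voice browser's method: the Field structure, the PIZZA_ORDER description, and the run_dialog function are all hypothetical names invented for this example, and typed strings stand in for recognized speech.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """One slot the dialog must fill (hypothetical structure, for illustration)."""
    name: str
    prompt: str
    choices: tuple


# Hypothetical explicit dialog description; a real system might derive this
# from markup or from existing content instead.
PIZZA_ORDER = (
    Field("size", "What size would you like?", ("small", "medium", "large")),
    Field("topping", "Which topping?", ("cheese", "mushroom")),
)


def run_dialog(description, heard):
    """Render a description into a dialog.

    `heard` is a sequence of strings standing in for recognized utterances.
    Returns (transcript, slots); stops early if the user input runs out.
    """
    transcript = []
    slots = {}
    answers = iter(heard)
    for field in description:
        # Keep asking until this field has an acceptable value.
        while field.name not in slots:
            transcript.append(f"SYSTEM: {field.prompt}")
            answer = next(answers, None)
            if answer is None:
                return transcript, slots
            transcript.append(f"USER: {answer}")
            normalized = answer.strip().lower()
            if normalized in field.choices:
                slots[field.name] = normalized
            else:
                options = ", ".join(field.choices)
                transcript.append(f"SYSTEM: Sorry, please say one of: {options}.")
    return transcript, slots


if __name__ == "__main__":
    lines, filled = run_dialog(PIZZA_ORDER, ["Large", "pineapple", "cheese"])
    print("\n".join(lines))
    print(filled)
```

The point of the sketch is the separation of concerns: the description holds what must be asked, while the rendering loop decides how the conversation proceeds, including re-prompting when the input does not match.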