Beta phase will broaden the program beyond the early adopter trials that have been in progress for the past six months.

On July 7, Microsoft formally announced the next phase in its long-anticipated rollout of its speech applications software, Microsoft Speech Server. Microsoft began its public rollout with the distribution of the first release of the Microsoft Speech Applications Developers Kit in October 2002.
Because Microsoft’s entry into the speech field has long been anticipated, most affected vendors have already incorporated it into their plans. The SALT Forum, formed by Cisco, Comverse, Intel, Microsoft, Philips, and SpeechWorks to develop a standard markup language for combining speech and graphical applications, tackled that problem sooner than the more widespread VoiceXML did.
Microsoft’s timing is typical: entering a market once it has matured, offering a lower price, and buying share from existing players. Although speech applications went mainstream in the past two years, they did not see the typical market expansion, largely because of the weak economy.
Microsoft has stated the following benefits of Speech Server:
· Lower costs: Microsoft has yet to share any pricing framework but has repeatedly listed lower costs as a key attribute. We believe Microsoft will target a price point roughly 30 percent below competing systems for its software. The fallacy here is Microsoft’s claim that higher costs stem from solutions incorporating a wide array of proprietary hardware, telephony software, and tools. Speech Server does little to change that, because almost all vendors have already adopted general-purpose servers, operating systems such as Windows or Linux, and generally available telephony boards.
· Easier integration with applications: IVR and speech vendors already provide Web-based tools, especially in adopting VoiceXML, that deliver the same level of application integration that Speech Server does with Web servers. The ability to share application logic and databases across both voice and graphical applications already exists, so Speech Server’s use of the SALT specification brings no new level of application integration to the table.
· Standard development environment expands who can write speech applications: In promoting SALT as a standard, Microsoft is touting that the millions of developers already familiar with Visual Studio .NET can now readily speech-enable any Web application. We agree that both SALT and VoiceXML are advancing the industry by providing a more common set of HTML/XML semantics for developers to deploy speech. But it is dangerous to imply that any Web developer can speech-enable applications well, because not all have the proper training and experience in best practices for dialogue design. Speech-enabling applications is not easy, and the industry does not need to repeat the problems experienced with the poorly designed IVR applications of the past.
· New multimodal (speech plus graphical user interface) speech applications: SALT has very effectively integrated the constructs necessary to support speech input, speech output, and graphical interactions on devices such as PDAs, Tablet PCs, and Pocket PCs. As the market for these types of applications begins to emerge, the SALT markup language is well positioned to drive Speech Server sales. But the discipline of properly designing applications that track all the ways a user can touch, type, speak, and listen is a long way from maturity.
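To make the multimodal pattern concrete, the following is a minimal sketch of a SALT-style page. It is an illustration, not Microsoft sample code: the element IDs, grammar file name, and XPath binding path are hypothetical assumptions, while the element names (prompt, listen, grammar, bind) follow the SALT 1.0 specification, which embeds speech tags directly in ordinary HTML and binds recognition results back to GUI controls.

```
<html xmlns:salt="http://www.saltforum.org/2002/SALT">
<body>
  <!-- Ordinary graphical control; the user may type here or speak instead -->
  <input id="txtCity" type="text" />

  <!-- Speech output: a spoken prompt for the same field -->
  <salt:prompt id="askCity">Which city would you like?</salt:prompt>

  <!-- Speech input: recognize against a grammar (city.grxml is hypothetical)
       and bind the recognized value into the text box above -->
  <salt:listen id="recoCity">
    <salt:grammar src="city.grxml" />
    <salt:bind targetelement="txtCity" value="//city" />
  </salt:listen>

  <script type="text/javascript">
    // Activation is script-driven in SALT: start the prompt, then listen
    askCity.Start();
    recoCity.Start();
  </script>
</body>
</html>
```

The point of the sketch is the design choice the bullet describes: speech and GUI share one page, one set of controls, and one application state, rather than living in a separate voice-only document as in VoiceXML.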