Computer program reads math text aloud for the visually impaired
By Bill Steele
A computer program written by a Cornell University graduate student to help him read his mathematics texts is now helping visually impaired students across the country with their studies. Eventually it may speed the process of recording books for the blind and perhaps lead to an audio browser for the World Wide Web.
The program, called AsTeR, reads text stored in computer files aloud. There's nothing new in that, but AsTeR adds vocal emphasis and pauses to make the meaning clear, and allows the user to move forward and backward through the text, even jumping from one chapter of a book to another. "AsTeR produces an oral analogue of a print book," explained inventor T.V. Raman, who is himself visually impaired. "A print book has a lot of nice features: you can hold it in your hand, flip through the pages, go back and look at something again. AsTeR allows you to do the same thing with synthesized speech."
Raman, who is now employed at Adobe Systems in California, wrote AsTeR as part of his Ph.D. dissertation in computer science at Cornell and designed it particularly to make mathematical language understandable when read aloud. "A plus B divided by C minus D" is ambiguous when you can't see it as a formula. Is that the sum of A and B, divided by the difference of C and D, or the sum of A and B, divided by C, with D subtracted from the result, or one of several other possibilities? AsTeR reads the formula with pauses and emphasis, just as a human reader would, and sometimes adds words to clarify the meaning. This example might come out as "The quantity A plus B, divided by the quantity C minus D." AsTeR sometimes reads better than a human would, Raman said. "Humans vary in the way they speak," he pointed out. "They pause for different lengths of time, for example. To make the meaning clear, talking books end up having to use a lot of extra words, like 'The fraction whose numerator is A plus B, end numerator, divided by the fraction A minus B, end denominator.'" AsTeR's "audio formatting" often makes this unnecessary, he said, because of the flexibility and precise control built into a voice synthesizer. In ordinary text, the program also adds emphasis to italicized or boldfaced words, and can even provide a background hum to indicate underlining.
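The disambiguation described above can be illustrated with a short Python sketch. It is not AsTeR's code, and the names `Var`, `BinOp` and `speak` are hypothetical. It assumes the formula has already been parsed into a tree, and it marks every compound operand with the words "the quantity" so that the grouping survives without parentheses. The pauses and emphasis a voice synthesizer would add are not modeled here.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Var:
    """A single symbol such as A or B."""
    name: str


@dataclass
class BinOp:
    """A binary operation; op is one of '+', '-', '/'."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Var, BinOp]

# Spoken form of each operator (illustrative choice of words).
WORDS = {"+": "plus", "-": "minus", "/": "divided by"}


def _operand(expr: Expr) -> str:
    """Speak an operand, announcing a grouped sub-expression explicitly."""
    if isinstance(expr, Var):
        return expr.name
    return "the quantity " + speak(expr)


def speak(expr: Expr) -> str:
    """Return an unambiguous spoken rendering of an expression tree."""
    if isinstance(expr, Var):
        return expr.name
    left = _operand(expr.left)
    right = _operand(expr.right)
    # A comma after a grouped left operand stands in for a spoken pause.
    pause = "," if isinstance(expr.left, BinOp) else ""
    return f"{left}{pause} {WORDS[expr.op]} {right}"


a, b, c, d = Var("A"), Var("B"), Var("C"), Var("D")

# (A + B) / (C - D)
print(speak(BinOp("/", BinOp("+", a, b), BinOp("-", c, d))))
# the quantity A plus B, divided by the quantity C minus D

# A + B / (C - D)
print(speak(BinOp("+", a, BinOp("/", b, BinOp("-", c, d)))))
# A plus the quantity B divided by the quantity C minus D
```

The same four symbols produce two different spoken strings, which is the point: the grouping that a sighted reader takes from the layout of the printed formula is carried by added words instead.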
But the ability to move forward and backward in the text is even more important than the use of vocal emphasis, Raman said, pointing out that you can't do that easily with a tape. "You never read a textbook cover to cover," Raman explained. "With AsTeR you can jump from chapter to chapter and section to section. You can say, 'Read me the equations,' then if you hear one you like you ask for the text around it. You can do the same with tables or a list of figures." The user can add notations to the computer file to make it easy to return to important places, or to clarify the reading. For example, most texts will state an equation once and then use phrases like "as shown in equation 5.4." The user of AsTeR can set the program to replace that with the name of the equation, or to restate the equation if necessary.
AsTeR works by reading tags embedded in text that has been formatted for printing. The current version is designed for use with computer text markup languages called TeX and LaTeX, which use special formatting tags for mathematical formulas. These languages are widely used by scientists and engineers. However, Raman said, AsTeR could easily be adapted to work with almost any word processing or desktop publishing program, or with HTML, the language of the World Wide Web.
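To give a rough idea of how formatting tags can drive navigation, here is a minimal Python sketch that scans a LaTeX fragment for `\section` headings and `equation` environments and lists each equation with the section it belongs to. It is an illustration only, not AsTeR's implementation. The `SAMPLE` text and the `equations` function are made up for this example, and a simple regular expression like this one handles only the two tags shown.

```python
import re

# A made-up LaTeX fragment used only for this illustration.
SAMPLE = r"""
\section{Fractions}
Consider the following.
\begin{equation}
\frac{A+B}{C-D}
\end{equation}
\section{Sums}
\begin{equation}
A + \frac{B}{C-D}
\end{equation}
"""

# Match either a section heading or a complete equation environment.
TOKEN = re.compile(
    r"\\section\{(?P<title>[^}]*)\}"
    r"|\\begin\{equation\}(?P<body>.*?)\\end\{equation\}",
    re.DOTALL,
)


def equations(source):
    """Return (section title, equation source) pairs in document order."""
    found = []
    section = None  # title of the most recent section heading, if any
    for match in TOKEN.finditer(source):
        if match.group("title") is not None:
            section = match.group("title")
        else:
            found.append((section, match.group("body").strip()))
    return found


# A crude stand-in for "Read me the equations".
for section, body in equations(SAMPLE):
    print(f"In section {section}: {body}")
```

Because each equation is stored with its section title, a reader built on such a list could jump from an equation back to the surrounding text, which is the kind of browsing Raman describes.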
The software has been distributed free to several other universities for use by visually impaired students and has been licensed at no charge to the national organization Recordings for the Blind.
According to David Gries, the Cornell computer science professor with whom Raman studied for his Ph.D., it can take up to a year to record a technical book using human readers. AsTeR could cut that down to a few days, Gries said.
Officially, AsTeR stands for "Audio System for Technical Readings," but the program was actually named for Raman's guide dog Aster, a nine-year-old Labrador retriever. The first book recorded by Recordings for the Blind using AsTeR was Raman's own Ph.D. thesis, which describes the software. In the printed thesis, a picture of Aster appears on the dedication page; in the audio version, you hear the sound of a dog panting.
Samples of AsTeR readings may be found on the World Wide Web at http://www.cs.cornell.edu/Info/People/raman/raman.html