Blind engineering student 'reads' color-scaled weather maps using Cornell software that converts color into sound
By David Brand
ITHACA, N.Y. -- A melody of staccato piano notes sings out from the speakers of Victor K. Wong's desktop computer. But it is not a melody made by Bach, or Liberace, or even Alicia Keys. It is the melody of color.
Wong, a Cornell University graduate student from Hong Kong who lost his sight in a road accident at age seven, is helping to develop innovative software that translates color into sound. "Color is something that does not exist in the world of a blind person," explains Wong. "I could see before, so I know what it is. But there is no way that I can think of to give an exact idea of color to someone who has never seen before."
He helped develop the software in Cornell's Department of Electrical and Computer Engineering (ECE) with undergraduate engineering student Ankur Moitra and research associate James Ferwerda from the Program of Computer Graphics.
The inspiration for using image-to-sound software came in early 2004, when Wong realized how difficult it was for him to read color-scaled weather maps of the Earth's upper atmosphere -- a task that is a necessary part of his doctoral work in Professor Mike Kelley's ECE research group.
It is a field dubbed "space weather," which attempts to predict weather patterns high over the equator for use by Global Positioning System and other satellite communications. A space weather map might show altitude in the vertical direction (along the "y" axis), time in the horizontal direction (along the "x" axis), and represent density with different colors.
As a scientist, Wong needs to know more than just the general shape of an image. He needs to explore minute fluctuations and discern the numerical values of the pixels so that he can create mathematical models that match the image. "Color is an extra dimension," explains Wong.
At first, the team tried everything from having Kelley verbally describe the maps to Wong to attempting to print the maps in Braille. When none of those methods provided the detail and resolution Wong needed, he and Ferwerda began investigating software. Moitra later became their project programmer.
"We started with the basic research question of how to represent a detailed color-scaled image to someone who is blind," recalls Ferwerda. "The most natural approach was to try sound, since color and pitch can be directly related and sensitivity to changes in pitch is quite good."
Over the summer of 2004, Moitra wrote Java code that could translate images into sound, and in August he unveiled a rudimentary software program capable of converting pixels of various colors into piano notes of various tones.
Wong test-drove the software by exploring a color photograph of a parrot. He used a rectangular Wacom tablet and stylus -- a computer input device used as an alternative to the mouse -- which gives an absolute reference to the computer screen, with the bottom left-hand corner of the tablet always corresponding to the bottom left-hand corner of the screen.
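The absolute tablet-to-screen correspondence described above can be illustrated with a short sketch. It is written in Python for brevity and is not the team's Java code. The function name `tablet_to_pixel`, its parameters, and the assumption that image rows are counted from the top (as in most image formats) are all hypothetical choices made for illustration.

```python
def tablet_to_pixel(tx, ty, tablet_w, tablet_h, img_w, img_h):
    """Map a stylus position to an image pixel (hypothetical helper).

    Tablet coordinates are assumed to start at the bottom left-hand corner,
    matching the article's description. Image rows are assumed to be counted
    from the top, so the vertical axis is flipped.
    """
    # Normalize the stylus position to the range 0..1 and clamp it
    u = min(max(tx / tablet_w, 0.0), 1.0)
    v = min(max(ty / tablet_h, 0.0), 1.0)

    # Scale to pixel indices, keeping the far edges inside the image
    col = min(int(u * img_w), img_w - 1)
    row_from_bottom = min(int(v * img_h), img_h - 1)

    # Flip so that the bottom of the tablet lands on the last image row
    row = img_h - 1 - row_from_bottom
    return col, row
```

Because the mapping is absolute rather than relative, the same stylus position always lands on the same pixel, which is what lets a user build a stable mental picture of the image.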
As Wong guided the stylus about the tablet, piano notes began to sing out. The full range of keys on a piano was employed, allowing color resolution in 88 gradations, ranging from blue for the lowest notes to red for the highest.
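One way such a color-to-note mapping could work is sketched below in Python. The article does not say how the software derives a note from a pixel, so the use of hue, the 240-degree and 0-degree endpoints, and the names `rgb_to_key` and `key_to_frequency` are assumptions for illustration only. The frequency formula is the standard equal-temperament tuning, in which key 49 is the A above middle C at 440 Hz.

```python
import colorsys

NUM_KEYS = 88      # full piano keyboard, as described in the article
BLUE_HUE = 240.0   # assumed hue (degrees) for the lowest note
RED_HUE = 0.0      # assumed hue (degrees) for the highest note


def rgb_to_key(r, g, b):
    """Map an 8-bit RGB pixel to a piano key number, 1..88 (assumed mapping)."""
    h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue_deg = h * 360.0
    # Hues past blue (purples, magentas) are clamped to the blue end
    hue_deg = min(hue_deg, BLUE_HUE)
    # 0.0 at pure blue, 1.0 at pure red
    fraction = (BLUE_HUE - hue_deg) / (BLUE_HUE - RED_HUE)
    return 1 + round(fraction * (NUM_KEYS - 1))


def key_to_frequency(key):
    """Equal-temperament pitch in Hz for piano key 1..88 (key 49 = 440 Hz)."""
    return 440.0 * 2 ** ((key - 49) / 12)
```

In this sketch a pure blue pixel maps to key 1 (27.5 Hz) and a pure red pixel to key 88. Gray pixels have no defined hue and fall at the red end here, so a real tool reading a scientific color scale would more likely look up each pixel against the map's own color bar.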
The software also has an image-to-speech feature that reads aloud the numerical values of the x and y coordinates as well as the value associated with a color at any given point on the image. "In principle I could turn off the music and just have the software read out the value of each point. I would know what the gradient is in a more absolute sense, but it would get annoying after some time. It keeps reading out 200.1, 200.8, 200.5, and so on," says Wong.
One of the biggest challenges of the project is the so-called "land-and-sea" problem. "Sometimes I just want to know where is the land and where is the sea," says Wong -- meaning that he would like to have an idea where the major boundaries in an image lie, such as the boundary between the parrot and the background. The problem hinges on shape recognition, which for Wong can be difficult.
In the simplest situation, the right half of an image would be completely blue and the left half completely red. To find the boundary, Wong has to move the stylus continuously back and forth from one color to the next along the length of the tablet, which is both time-consuming and error-prone.
To solve the land-and-sea problem, Wong, Moitra and Ferwerda tried printing the major boundary lines of an image in Braille and then laying the printed sheet over the Wacom tablet, combining both audio and tactile detection. However, they are still working to develop software that can effectively pick out the important boundaries in an image so that it can be printed.
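For the simplest case, an image that is red on one side and blue on the other, boundary detection can be sketched as a search for large jumps between neighboring notes. The Python below is only an illustration of that idea, not the team's method, which the article says is still under development. The function name `find_boundaries` and the one-octave threshold of 12 keys are hypothetical.

```python
def find_boundaries(keys, min_jump=12):
    """Return the positions where the note jumps sharply along one image row.

    `keys` is a list of piano key numbers (1..88), one per pixel.
    A position i is reported when the note at pixel i differs from the
    note at pixel i - 1 by at least `min_jump` keys (assumed threshold).
    """
    return [
        i
        for i in range(1, len(keys))
        if abs(keys[i] - keys[i - 1]) >= min_jump
    ]


# Left half red (highest key), right half blue (lowest key)
row = [88] * 4 + [1] * 4
boundaries = find_boundaries(row)
```

A smooth gradient produces no boundaries under this rule, while the red-and-blue row yields a single boundary at its midpoint. Real images such as the parrot photograph would need something more robust, since noise and fine texture also produce large local jumps.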
"It is also important that there is no time delay between notes," says Moitra. "That is something we need to improve. Otherwise the image will become shifted and distorted in Victor's mind."
One of the major issues facing the project is funding. "The initial work was done on a shoestring as a side project to grants Kelley and I have received," says Ferwerda, who is preparing a proposal to the National Science Foundation to extend this work and explore other ideas for making images and other technical content accessible to blind scientists and engineers.
Says Wong: "Tackling complex color images is only one problem out of many that blind scientists are facing. But I think this is a pretty important idea."
Reported and written by graduate student Thomas Oberst, a science writer intern with Cornell News Service.