It takes a village – of computers – to deliver your e-mail

E-mail may be easier to understand if you think of the Internet as a huge field dotted with small ponds, connected by a network of canals. To send a message, you write it on the side of a fish and toss it into your pond. In the middle of every pond stands a guy in hip boots who grabs each fish going by, reads the address written on it and tosses it into the canal most likely to lead to its destination.

Now zoom in on the pond labeled "Cornell," where some 20,000 people wait around the bank for their messages. Where the canal comes in (from Syracuse, actually), a bunch of guys in hip boots frantically grab fish and toss them hand over hand until each one reaches the right person.

Enough with the metaphor. The fish are packets of data and the sorters are computers – powerful servers from Sun Microsystems with some 10 terabytes of disk storage and redundant power supplies – operated by Cornell Information Technologies (CIT). They shuffle messages among themselves and to and from local and worldwide users.

According to Jim Howell, CIT's messaging systems manager, a network of 15 Cornell computers processes about 2.5 million messages a day to and from 35,000 campus accounts (some people have more than one), forwards mail to 48,000 alumni and distributes it to 4,100 e-mail lists. Some mail is handed off to departments that run their own e-mail systems, and a few messages go into a system called Microsoft Exchange that serves BlackBerry devices and certain users in Day Hall, Alumni Affairs and the Johnson School.

Counting attachments, all that adds up to about 100 gigabytes of data each day, Howell says, the equivalent of about 100,000 novels. Maybe 50,000 Stephen King novels.

Mail arriving from the outside world first goes to one of six computers known as external mail hubs. They begin by blocking as much spam and virus traffic as possible. A program called Traffic Control dramatically slows down reception from spam sources identified by the volunteer Spamhaus service, letting legitimate mail go faster and often causing spammers to give up. Another program called PureMessage removes most attached viruses and scans messages for common spam features.
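For readers who think in code, the slowing-down idea can be sketched in a few lines of Python. This is only an illustration, not how Traffic Control itself works: the blocklist, the sample addresses and the 30-second delay are all assumptions made up for the example.

```python
# Hypothetical blocklist standing in for sources flagged by Spamhaus.
# 203.0.113.7 comes from an address range reserved for documentation.
FLAGGED_SOURCES = {"203.0.113.7"}


def reception_delay(sender_ip: str, slow_seconds: float = 30.0) -> float:
    """Seconds to stall before accepting mail from this sender (assumed value)."""
    return slow_seconds if sender_ip in FLAGGED_SOURCES else 0.0


def order_of_acceptance(sender_ips):
    """Return senders sorted by how soon their mail would be accepted."""
    return sorted(sender_ips, key=reception_delay)
```

Unflagged senders see no delay and move to the front of the line, while a flagged source waits long enough that giving up may look attractive.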

Overall, Howell says, Cornell rejects about 40 million spam messages a month out of 50 million incoming messages. Yes, about 80 percent of incoming mail is spam.

Incoming messages for cornell.edu addresses go to two internal mail hubs that query the Cornell directory – on yet another server – to find out which of five "post offices" the recipient uses. Mail with departmental addresses like cs.cornell.edu (computer science) bypasses the internal hubs and goes directly to departmental mail servers. Requests to manage e-mail lists go to four computers running the Lyris mailing list software.
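The sorting decision described above might look roughly like this in Python. It is a sketch under stated assumptions: the directory contents, the Net IDs and the post office names are invented for illustration, and the real hubs query a directory server rather than a dictionary.

```python
# Hypothetical directory: maps a Net ID to one of the five post offices.
DIRECTORY = {"abc1": "postoffice1", "xyz9": "postoffice4"}

# Departments that run their own mail servers (only cs is named in the article).
DEPARTMENT_DOMAINS = {"cs.cornell.edu"}


def route_incoming(address: str) -> str:
    """Decide where an incoming message should be delivered."""
    local, _, domain = address.rpartition("@")
    domain = domain.lower()
    if domain in DEPARTMENT_DOMAINS:
        # Departmental mail bypasses the internal hubs entirely.
        return f"departmental server for {domain}"
    if domain == "cornell.edu":
        # The internal hub asks the directory which post office to use.
        post_office = DIRECTORY.get(local.lower())
        if post_office is None:
            return "reject: unknown recipient"
        return post_office
    return "reject: not a Cornell address"
```

For example, `route_incoming("abc1@cornell.edu")` consults the directory, while `route_incoming("someone@cs.cornell.edu")` skips it.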

Outgoing messages from Cornell users first pass through two computers addressed as authusersmtp.cornell.edu, which require senders' Net IDs and passwords, authenticated through yet another computer, the Kerberos server. This prevents virus-infected computers on campus from automatically sending out more viruses or spam with faked return addresses. Messages for the outside world are sent on their way, while local messages go to the internal mail hubs, which route them to the post offices. There each message is stored on a disk until a mail client like Eudora or Thunderbird comes to download it, or a request comes through Webmail.
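The outgoing path can be sketched the same way. Again, this is an assumption-laden illustration: the password table stands in for the Kerberos check, the Net ID and password are made up, and treating only cornell.edu addresses as "local" is a simplification.

```python
# Hypothetical credential store standing in for the Kerberos server.
USERS = {"abc1": "correct-horse"}


def submit_outgoing(net_id: str, password: str, recipient: str) -> str:
    """Accept a message only from an authenticated sender, then pick its path."""
    expected = USERS.get(net_id)
    if expected is None or expected != password:
        # Unauthenticated senders, such as infected machines, are turned away.
        return "refused: authentication failed"
    domain = recipient.rpartition("@")[2].lower()
    if domain == "cornell.edu":
        # Local mail is handed to the internal hubs for post office delivery.
        return "internal mail hub"
    return "sent to the outside world"
```

The check comes first on purpose: a machine that cannot present a valid Net ID and password never gets as far as choosing a route.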

And you just point and click and read. Simple, wasn't it?

 
