Cybersecurity and Cyberwar

Authors: Peter W. Singer and Allan Friedman


And soon the mainstream media started to wake up to the fact that something big was happening online. As the New York Times reported in 1994 (in a printed newspaper, of course!), “Increasing commercialization of the Internet will accelerate its transformation away from an esoteric communications system for American computer scientists and into an international system for the flow of data, text, graphics, sound and video among businesses, their customers and their suppliers.”

Lo and behold indeed.

How Does the Internet Actually Work?

For a few hours in February 2008, Pakistan held hostage all the world's cute cat videos.

The situation came about when the Pakistani government, in an attempt to prevent its own citizens from accessing what it decided was offensive content, ordered Pakistan Telecom to block access to the video-sharing website YouTube. To do so, Pakistan Telecom falsely informed its customers' computers that the most direct route to YouTube was through Pakistan Telecom and then prevented Pakistani users from reaching the genuine YouTube site. Unfortunately, the company's network shared this false claim of identity beyond its own network, and the false news of the most direct way to YouTube spread across the Internet's underlying mechanisms. Soon over two-thirds of all the world's Internet users were being misdirected to the fake YouTube location, which, in turn, overwhelmed Pakistan Telecom's own network.

The effects were temporary, but the incident underscores the importance of knowing how the Internet works. The best way to gain this understanding is to walk through how information gets from one place to another in the virtual world. It's a bit complex, but we'll do our best to make it easy.

Suppose you wanted to visit the informative and—dare we say—entertaining website of the Brookings Institution, the think tank where we work. In essence, you have asked your device to talk to a computer controlled by Brookings in Washington, DC. Your machine must learn where that computer is and establish a connection to enable communication.

The first thing your computer needs to know is how to find the servers that host the Brookings web page. To do that, it will use the Internet Protocol (IP) number that serves as the address for endpoints on the Internet. Your machine was most likely automatically assigned an IP address by your Internet service provider or whatever network you are using. It also knows the address of its router, or the path to the broader Internet. Finally, your computer knows the address of a Domain Name System server.

The Domain Name System, or DNS, is the protocol and infrastructure through which computers connect domain names (human memorable names like Brookings.edu) to their corresponding IP addresses (machine data like 192.245.194.172). The DNS is global and decentralized. Its architecture can be thought of as a tree. The “root” of the tree serves as the orientation point for the Domain Name System. Above that are the top-level domains. These are the country codes such as .uk, as well as other domains like .com and .net. Each of these top-level domains is then subdivided. Many countries have specific second-level domains, such as co.uk and ac.uk, to denote business and academic institutions, respectively.
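Purely as an illustration, here is a minimal Python sketch that asks the operating system's resolver to translate a name into an address; the address returned today may well differ from the one cited above.

```python
import socket

# Ask the operating system's resolver, which speaks DNS on our behalf,
# for the IP address behind a human-memorable domain name.
print(socket.gethostbyname("brookings.edu"))
```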

Entry into the club of top-level domains is controlled internationally through the Internet Corporation for Assigned Names and Numbers (ICANN), a private, nonprofit organization created in 1998 to run the various Internet administration and operations tasks that had previously been performed by US government organizations.

Each top-level domain is run by a registry that sets its own internal policies about domains. Organizations, such as Brookings or Apple or the US Department of State, acquire their domains through intermediaries called registrars. These registrars coordinate with each other to ensure the domain names in each top-level domain remain unique. In turn, each domain manages its own subdomains, such as mail.yahoo.com.

To reach the Brookings domain, your computer will query the DNS system through a series of resolvers. The basic idea is to go up the levels of the tree. Starting with the root, it will be pointed to the record for .edu, which is managed by Educause. Educause is the organization of some 2,000 educational institutions that maintains the list of every domain registered in .edu. From this list, your computer will then learn the specific IP address of Brookings's internal name server. This will allow it to address specific queries about content or applications from inside the Brookings domain. Then, the Brookings name server will direct your computer to the specific content it is looking for, by returning the IP address of the machine that hosts it.

In reality, this process is a little more complex. For example, servers often store data locally in caches for future use, so that not every query has to go to the root, and the protocol includes specific error conditions so that failures are handled predictably. The rough outline above, however, gives a sense of how it all works.
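As an illustration only, here is a toy, in-memory model of that walk up the tree; the server names are hypothetical, and real resolution involves network queries, caching, and error handling.

```python
# A toy model of iterative DNS resolution: each "name server" is just a
# dict that either answers with an address or refers us to the next
# server to ask, mirroring the root -> .edu -> brookings.edu walk.

ROOT = {"edu": "ns.educause"}  # the root refers .edu queries to its registry
SERVERS = {
    "ns.educause": {"brookings.edu": "ns.brookings"},          # .edu registry
    "ns.brookings": {"www.brookings.edu": "192.245.194.172"},  # Brookings's own name server
}

def resolve(name: str) -> str:
    """Follow referrals: root -> top-level domain -> the domain's name server."""
    labels = name.split(".")
    tld_server = ROOT[labels[-1]]                # step 1: ask the root about .edu
    domain = ".".join(labels[-2:])
    domain_server = SERVERS[tld_server][domain]  # step 2: ask .edu about brookings.edu
    return SERVERS[domain_server][name]          # step 3: ask Brookings for the host's address

print(resolve("www.brookings.edu"))  # -> 192.245.194.172
```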

Now that your computer has the location of the data, how will that data get to your computer? The server at Brookings needs to know that it should send data to your machine, and the data needs to get there. Figure 1.1 illustrates how your computer requests a web page by breaking down the request into packets and sending them across the Internet. First, at the “layer” of the application, your browser interprets the click of your mouse as a command in the HyperText Transfer Protocol (HTTP), which defines how to ask for and deliver content. This command is then passed down to the transport and network layers. Transport is responsible for breaking the data down into packet-sized chunks and making sure that all of the chunks arrive free of error and are reassembled in the correct order for the application layer above. The network layer is responsible for trying its best to navigate the packets across the Internet. If you think of the data you are trying to send and receive as a package of information, the transport layer is responsible for packing and receiving the packages, while the network is responsible for moving them from source to destination. Once at the destination, the packets are reassembled, checked, and then passed back up to the application—in this case, a web server sending you the web content you requested.
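To make the layering concrete, here is a short Python sketch in which the application layer writes an HTTP request while the operating system's TCP/IP stack quietly handles the packetizing, routing, and reassembly. The host is the one from our example; the exact response you get today may be a redirect rather than the page itself.

```python
import socket

# Application layer: compose an HTTP request by hand. The OS's TCP
# (transport) splits it into packets and reassembles the reply in
# order, while IP (network) routes each packet across the Internet.
host = "www.brookings.edu"
with socket.create_connection((host, 80)) as sock:
    request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    sock.sendall(request.encode("ascii"))  # hand the message to TCP
    reply = b""
    while chunk := sock.recv(4096):        # TCP delivers the bytes in order
        reply += chunk

print(reply.split(b"\r\n", 1)[0].decode())  # status line, e.g. "HTTP/1.1 301 ..."
```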

Figure 1.1

But how do the packets know how to get across the Internet to their destination? Like the DNS that helped your computer find the website it was looking for, the organization of Internet networks can be thought of as a hierarchy. Each computer is part of a network, like the network connecting all the customers of an Internet service provider (ISP). ISPs are essentially the organizations that provide access to the Internet, as well as other related services like e-mail or hosting websites. Most ISPs are private, for-profit companies, including a number of the traditional telephone and cable TV firms that began offering Internet access when the field took off, while others are government or community owned.

Those networks, in turn, form nodes called Autonomous Systems (AS) in the global Internet. Autonomous Systems define the architecture of Internet connections. Traffic is routed locally through the AS and controlled by the policies of that organization. Each AS has a set of contiguous blocks of IP addresses and forms the “home” for these destinations. All have at least one connection to another AS, while large ISPs might have many. So routing to a particular IP address is simply a matter of finding its AS.

There's a problem, though: The Internet is big. There are over 40,000 AS nodes on the Internet today, and their interconnections are changing and shifting over time. Given this scale, a global approach to routing everything the same way is impossible.

Instead, the Internet uses a dynamic, distributed system that does not maintain a permanent vision of what the network looks like at any given time. The principle of routing is fairly simple: at each point in the network, a router looks at the address of an incoming packet. If the destination is inside the network, it keeps the packet and sends it to the relevant computer. Otherwise, it consults a routing table to determine the best next step to send the packet closer to its destination.
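A toy version of that forwarding decision, with made-up prefixes and neighbor names, might look like this in Python:

```python
import ipaddress

# "Inside our network" and the routing table entries are hypothetical.
LOCAL_PREFIX = ipaddress.ip_network("192.245.194.0/24")
ROUTING_TABLE = {
    ipaddress.ip_network("203.0.113.0/24"): "peer AS 64500",
    ipaddress.ip_network("0.0.0.0/0"): "upstream ISP",  # default route
}

def forward(dst: str) -> str:
    addr = ipaddress.ip_address(dst)
    if addr in LOCAL_PREFIX:
        return "deliver locally"
    # Of the matching routes, the most specific (longest prefix) wins.
    matches = [net for net in ROUTING_TABLE if addr in net]
    return ROUTING_TABLE[max(matches, key=lambda n: n.prefixlen)]

print(forward("192.245.194.172"))  # deliver locally
print(forward("203.0.113.9"))      # peer AS 64500
print(forward("8.8.8.8"))          # upstream ISP (default route)
```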

The devil is in the details of this routing table's construction. Since there is no global address book, the nodes in the network have to share key information with other routers, like which IP addresses they are responsible for and what other networks they can talk to. This process happens separately from the Internet routing process on what is known as the “control plane.” Routers also pass along information to their neighbors, sharing up-to-date news about the state of the network and who can talk to whom. Each router then constructs its own internal, temporary model of how best to route the traffic coming through. This new model, in turn, is shared so that a router's neighbors now know how it will pass along new traffic.
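The following sketch mimics that gossip on a tiny, hypothetical three-network topology: each node repeatedly shares its best-known path lengths with its neighbors until nothing changes. Real BGP exchanges full paths and applies business policies, but the share-until-stable pattern is the same.

```python
# A toy "control plane": Bellman-Ford-style distance-vector exchange.
LINKS = {  # hypothetical AS-to-AS connections
    "AS-A": ["AS-B"],
    "AS-B": ["AS-A", "AS-C"],
    "AS-C": ["AS-B"],
}

tables = {asn: {asn: 0} for asn in LINKS}  # each AS starts knowing only itself

changed = True
while changed:  # keep gossiping until no table changes
    changed = False
    for asn, neighbors in LINKS.items():
        for nb in neighbors:
            for dest, dist in tables[nb].items():
                if dist + 1 < tables[asn].get(dest, float("inf")):
                    tables[asn][dest] = dist + 1  # learned a shorter route via nb
                    changed = True

print(tables["AS-A"])  # {'AS-A': 0, 'AS-B': 1, 'AS-C': 2}
```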

If this sounds complex, it's because it is! In just a few pages, we've summed up what it took decades of computer science research to create. The takeaway for cybersecurity is that the entire system is based on trust. It is a system that works efficiently, but it can be broken, either by accident or by maliciously feeding the system bad data.

The Pakistan example shows what happens when that trust is abused. The government censors “broke the Internet” by falsely claiming to have direct access to the IP address that serves YouTube. This was a narrow, local, politically motivated announcement. But because of how the Internet works, soon every ISP in Asia was trying to route all their YouTube traffic to Pakistan, solely because they believed it was closer than the real intended destination. The models they were building were based on false information. As more networks did this, their neighbors also came to believe that YouTube was the Pakistani IP address. The whole mess wasn't resolved until Google engineers advertised the correct routes aggressively across the network.
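One detail helps explain why the false route spread so effectively: routers generally prefer the most specific matching prefix. The sketch below uses the prefixes widely reported for the 2008 incident, though they should be treated as illustrative rather than authoritative.

```python
import ipaddress

# Why the bogus route won: the hijacker announced a more specific
# (longer) prefix than YouTube's legitimate block.
legit = ipaddress.ip_network("208.65.152.0/22")   # YouTube's reported block
hijack = ipaddress.ip_network("208.65.153.0/24")  # Pakistan Telecom's claim

dst = ipaddress.ip_address("208.65.153.12")       # an address inside both blocks
best = max((n for n in (legit, hijack) if dst in n), key=lambda n: n.prefixlen)
print(best)  # -> 208.65.153.0/24: traffic flows toward the hijacker
```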

In sum, understanding the Internet's basic decentralized architecture provides two insights for cybersecurity. It offers an appreciation of how the Internet functions without top-down coordination. But it also shows the importance of the Internet's users and gatekeepers behaving properly, and how certain built-in choke points can create great vulnerabilities if they don't.

Who Runs It? Understanding Internet Governance

In 1998, a computer researcher and respected leader in the networking community named Jon Postel sent an innocuous-sounding e-mail to eight people. He asked them to reconfigure their servers so that they would direct their Internet traffic using his computer at the University of Southern California rather than a computer in Herndon, Virginia. They did so without question, as Postel (who had been part of the team that set up the original ARPANET) was an icon in the field who served as a primary administrator for the network's naming system.

With that one e-mail, Postel committed the first coup d'état of the Internet. The people he had e-mailed ran eight of the twelve organizations that controlled all the name servers—the computers ultimately responsible for translating a domain name such as “Brookings.edu” into a computer-addressable IP address. And the computer in Virginia that he had steered two-thirds of the Internet's root servers away from was controlled by the US government. While Postel would later say he had only seized control of a majority of the Internet's root servers as a “test,” others think that he did so in protest, showing the US government “that it couldn't wrest control of the Internet from the widespread community of researchers who had built and maintained the network over the previous three decades.”

Postel's “coup” illustrates the crucial role of governance issues even for a technical space. As the Internet has grown from a small research network to the global underpinning of our digital society, questions of who runs it have become more and more important. Or, as Eric Schmidt (who went on to become the CEO of a little firm known as Google) told a 1997 programmers conference in San Francisco, “The Internet is the first thing that humanity has built that humanity doesn't understand, the largest experiment in anarchy that we have ever had.”

Since digital resources are not “scarce” like traditional resources, the Internet's governance questions are a bit different. That is, the main questions of Internet governance are about interoperability and communication rather than the classic issue of distribution, which has consumed political thinkers from Socrates to Marx. However, even in a digital world of seemingly endless resources, traditional issues of governance also arise in cyberspace, including representation, power, and legitimacy. Key decision chokepoints revolve around the technical standards for interoperability, the distribution of the IP numbers that give computers the addresses allowing them to send and receive packets, and the management of the Internet's naming system. Interestingly enough, it is this final category, the intersection of the technical and nontechnical aspects of naming, that has produced the most conflict.
