Sunday, June 15, 2008

The OSI Reference Model

2. THE OSI REFERENCE MODEL

The concept of how a modern-day network operates can be understood by dissecting it into seven layers. This seven-layer model is known as the OSI Reference Model and describes how the vast majority of the digital networks on earth function. OSI stands for Open Systems Interconnection, an effort begun by the International Organization for Standardization in the late 1970s, with the reference model itself published as a standard in 1984. Its goal was to produce a standard reference model for the hardware and software connection of digital equipment. The important concept to realize about the OSI Reference Model is that it does not define a network standard, but rather provides guidelines for the creation of network standards. The OSI Model has become so widely accepted that almost all major network standards in use today can be described in terms of its seven layers. Though seven layers may at first appear to make a network seem overly complex, the seven-layer OSI Model has proven over the past twenty years to be one of the most efficient and effective ways to understand this extremely complex subject.

2.1 OSI LAYER 1: THE PHYSICAL LAYER

The first and foundational layer of a network is the Physical Layer. The Physical Layer is literally what its name implies: the physical infrastructure of a network. This includes the cabling or other transmission medium and the network interface hardware placed inside computers and other devices that enables them to connect to the transmission medium. The purpose of the Physical Layer is to take binary¹ information from higher layers, translate it into a transmission signal or frequency, transmit it across the transmission medium, receive it at the destination, and finally translate it back into binary before passing it up to higher layers. Transmission signals or frequencies vary between network standards and can be as simple as pulses of electricity over copper wiring or as complex as flickers of light on optical lines or amplified radio frequency transmissions. The information that enters and exits the Physical Layer must be bits: 0s and 1s. The higher layers are responsible for providing the Physical Layer with binary information. Since nearly all information inside a computer is already digital², this is not difficult to achieve. The Physical Layer does not examine the binary information, nor does it validate or change it. The Physical Layer is simply intended to transport the binary information between higher layers located at points A and B.
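The round trip described above, from binary to a signal and back to binary, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the "medium" is a stand-in that returns the bits unchanged, where a real network would use electrical, optical, or radio signalling.

```python
def bytes_to_bits(data: bytes) -> list[int]:
    """Flatten bytes into a list of 0/1 values, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits: list[int]) -> bytes:
    """Rebuild bytes from a list of bits whose length is a multiple of 8."""
    if len(bits) % 8 != 0:
        raise ValueError("bit count must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def transmit(bits: list[int]) -> list[int]:
    """Stand-in for the medium: deliver the bits without examining them."""
    return list(bits)


if __name__ == "__main__":
    sent = bytes_to_bits(b"Hi")
    received = bits_to_bytes(transmit(sent))
    print(sent)
    print(received)
```

Note that `transmit` never looks at what the bits mean, which mirrors the point that the Physical Layer neither validates nor alters the data it carries.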

2.2 OSI LAYER 2: THE DATA LINK LAYER

The second layer in the OSI Model is the Data Link Layer. The primary focus of the Data Link Layer is revealed in its common nickname, the Physical Address Layer. The only layer in the OSI Model that specifically addresses both hardware and software, the Data Link Layer receives information on its software side from higher layers, places this information inside a "frame", and finally gives this frame to the Physical Layer, Layer 1, for transmission as pure binary. A frame essentially takes the information passed down from a higher layer and surrounds it with Physical Addressing information. This information is important to the Data Link Layer on the receiving end of the transmission. When the frame, in binary form, arrives at the destination node³, it is passed from the transmission medium to the Data Link Layer (Layer 2) by the Physical Layer (Layer 1). The Data Link Layer on the receiving node then checks the frame surrounding the received information to see whether its destination Physical Address matches the node's own. If the Physical Address does not match, the frame and its encapsulated data are discarded. If the Physical Address is a match, the information is removed from the frame and passed up to the next highest layer in the OSI Model. The Physical Address check is obviously not of much use if there are only two nodes on a network, but it becomes extremely valuable when three or more nodes exist. The Physical Addressing system allows multiple nodes to share the same network medium while retaining the ability to address a transmission to one specific node. On the simplest networks, all nodes receive every frame transmitted on the network, but discard frames not specifically addressed to them.
The Physical Address used in the Data Link Layer's Physical Addressing system is known as a MAC⁴ address and is embedded physically into the node's Network Interface Card (NIC) during manufacturing. Every NIC's MAC address is intended to be unique in order to prevent addressing conflicts. It is this relationship that causes the Data Link Layer to be known as the only layer that addresses both hardware and software. This layer is where the information on the network makes the move from the physical infrastructure of the network into the software realm. The remaining layers of the OSI Reference Model are entirely software.
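The accept-or-discard check described above can be sketched as follows. The `Frame` class, the `receive` function, and the MAC addresses are hypothetical and chosen only for illustration. Real frame formats carry more fields than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    destination: str  # Physical (MAC) Address of the intended receiver
    source: str       # Physical (MAC) Address of the sender
    payload: bytes    # information handed down from the higher layer


def receive(frame: Frame, own_address: str) -> Optional[bytes]:
    """Return the payload if the frame is addressed to this node, else None."""
    if frame.destination.lower() != own_address.lower():
        return None  # not for this node: discard frame and payload
    return frame.payload  # unwrap and pass up to the next layer


if __name__ == "__main__":
    # Three nodes on one shared medium; every node sees every frame.
    nodes = ["00:11:22:33:44:01", "00:11:22:33:44:02", "00:11:22:33:44:03"]
    frame = Frame(destination=nodes[2], source=nodes[0], payload=b"hello")
    for address in nodes:
        print(address, receive(frame, address))
```

Every node runs the same check, but only the node whose address matches passes the payload upward.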

2.3 OSI LAYER 3: THE NETWORK LAYER

OSI Layer 3 is known as the Network Layer. The first layer to deal entirely in software, the Network Layer exists to direct network traffic to a destination node whose Physical Address is not known. This is achieved through a system known as Logical Addressing. Logical Addresses are software addresses assigned to a node at Layer 3 of the OSI Model. Because these addresses are defined by software rather than being arbitrary and permanent like Physical Addresses, Logical Addresses can be hierarchical, which makes extremely large networks possible. With Layers 1 and 2 alone, only small networks are practical, since every transmission is delivered to every node. This works fine until more than one node attempts to transmit at once, at which point a data "collision" occurs. While OSI Layer 4 protocols may attempt to compensate for a collision by retransmitting data until it reaches the destination node intact, this degrades network performance sharply as the number of nodes on the network grows. OSI Layer 3 takes on this problem with its Logical Addressing system and a concept known as routing.
The Oxford American Dictionary defines routing as "Sending or directing along a specified course". Layer 3 routing on a network takes this foundational definition and puts it to use to enable millions of computers, rather than just a handful, to communicate at once without interference. This is achieved by having a smart device working at Layer 3 that handles network signals from each node directly, rather than nodes just blindly repeating packets at Layer 1 until they happen to reach their destination. Such a device is known as a network router. A network router sits in the center of a network with all nodes having a direct link to it rather than being linked to each other. This strategic position allows the router to intercept and direct all traffic on the network. A routed network can be illustrated by a star formation, as shown in Diagram 1. On a routed network, Layer 3 packets are no longer broadcast to all nodes, but are received by the router and passed on only to the appropriate node. This is a valuable concept because it allows for the collision-free transport of packets across a network.

As well as being linked directly to all nodes in a local network, a router can be linked directly to other routers. This allows groups of nodes separated by distance to communicate with each other in a practical way. It would not be practical to have nodes separated by a great distance all connect to a single router. The amount of cabling required would be immense, and depending on the number of nodes involved, the router may not possess the required number of physical connections. Placing a router at each group of nodes and running a single line from router to router, however, is quite practical. Routers can be chained in a line or, as shown in Diagram 2, connected by a central router. This concept is virtually infinitely scalable and is very efficient.
When a node starts a transmission, the OSI Layer 3 protocol takes the information passed down from higher layers and encapsulates it, together with the logical address of the destination node, in a unit called a packet. This packet then passes through the remaining lower layer protocols and is transmitted over the network medium from the node to the router. The router reads the logical address that the packet contains and compares it to a list of the logical addresses of nodes directly connected to it. If the packet's destination address matches an entry in this list, the packet is transmitted on the line that leads straight to the destination node. If the router does not know of a direct connection to the destination node, the packet is transmitted on a line leading to another router. That router then treats the packet much as the first router did: the packet's logical address is checked for matches against the list of logical addresses belonging to nodes directly connected to the router. If the packet reaches a router with connections only to other routers, as shown in Diagram 2, the router uses the orderly numbering scheme of the logical address to try to determine the closest router to the destination node and then transmits the packet to that router.
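The forwarding decision described above can be sketched as a small simulation. Everything here is an assumption made for illustration: the `Router` and `Packet` classes, the router names, the example addresses, the use of address prefixes to stand in for the "orderly numbering scheme", and the hop limit that stops a packet from circulating forever. Real routers use more elaborate routing tables.

```python
from typing import Dict, List, Optional, Set


class Packet:
    def __init__(self, destination: str, payload: bytes) -> None:
        self.destination = destination  # logical address of the destination node
        self.payload = payload


class Router:
    def __init__(self, name: str, local_nodes: Optional[Set[str]] = None,
                 uplink: Optional["Router"] = None) -> None:
        self.name = name
        self.local_nodes = local_nodes or set()  # directly connected nodes
        self.routes: Dict[str, "Router"] = {}    # address prefix -> next router
        self.uplink = uplink                     # router to try when nothing matches

    def forward(self, packet: Packet, hop_limit: int = 8) -> List[str]:
        """Return the names of the routers visited, ending with the one that delivers."""
        if hop_limit <= 0:
            raise LookupError("hop limit exceeded")
        if packet.destination in self.local_nodes:
            return [self.name]  # direct line to the destination node
        # Use the address hierarchy: the longest matching prefix wins.
        matches = [p for p in self.routes if packet.destination.startswith(p)]
        if matches:
            next_hop = self.routes[max(matches, key=len)]
        elif self.uplink is not None:
            next_hop = self.uplink
        else:
            raise LookupError(f"{self.name} has no route to {packet.destination}")
        return [self.name] + next_hop.forward(packet, hop_limit - 1)


def build_demo_network() -> Router:
    """Two edge routers joined by a central router; returns the first edge router."""
    central = Router("central")
    edge1 = Router("edge-1", local_nodes={"66.1.0.5"}, uplink=central)
    edge2 = Router("edge-2", local_nodes={"66.2.0.9"}, uplink=central)
    central.routes = {"66.1.": edge1, "66.2.": edge2}
    return edge1


if __name__ == "__main__":
    edge1 = build_demo_network()
    print(edge1.forward(Packet("66.2.0.9", b"hello")))
```

A packet sent from a node on `edge-1` to `66.2.0.9` is handed to the central router, which uses the address prefix to pick `edge-2`, which in turn delivers it on its direct line.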






IP, undoubtedly the world's most used Layer 3 protocol, provides an excellent example of how this system works. In IP, a logical address is written as four numbers of up to three digits each, separated by periods.⁵ Diagram 3 shows an example of an IP address. IP addresses are ordered on four levels, from left to right. The first section of the IP address refers to a top-level router, that is, a router at the highest level of this particular branch of the network. In Diagram 3, the first number is 66, so all IP addresses between 66.0.0.1 and 66.255.255.255 are managed by this router. Only one router is required in a routed network, but more may exist. In this scheme a router may have a maximum of 255 nodes attached, each of which may be either an ordinary node or another router. This effectively means that each branch of a network, that is, a group of nodes that share the first number of their IP address, could theoretically have over sixteen million end nodes and still operate with near peak efficiency.⁶
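The "over sixteen million" figure can be checked with Python's standard `ipaddress` module. The address `66.10.20.30` is a hypothetical example, not the one in Diagram 3, and the `/8` notation (fix the first number, leave the other three free) is an assumption used here to express "all addresses beginning with 66". The module counts every value from 0 to 255 in each position, so its total is slightly higher than a count that excludes some values.

```python
import ipaddress


def octets(address: str) -> tuple:
    """Split a dotted IPv4 address into its four numbers, left to right."""
    return tuple(ipaddress.ip_address(address).packed)


def branch_size(first_number: int) -> int:
    """Count the addresses that share the given first number."""
    network = ipaddress.ip_network(f"{first_number}.0.0.0/8")
    return network.num_addresses


def in_branch(address: str, first_number: int) -> bool:
    """Report whether an address belongs to the branch with that first number."""
    network = ipaddress.ip_network(f"{first_number}.0.0.0/8")
    return ipaddress.ip_address(address) in network


if __name__ == "__main__":
    print(octets("66.10.20.30"))
    print(branch_size(66))  # 256 * 256 * 256
    print(in_branch("66.10.20.30", 66))
    print(in_branch("67.10.20.30", 66))
```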

As we can now see, Layer 3 of the OSI Reference Model is one of the most complex, but most functionally important, parts of the modern-day network. IP stands for Internet Protocol, and it is the Layer 3 protocol handling virtually all traffic on the internet today. The way in which Layer 3 protocols connect computers in a star-shaped, extensible network is much of the reason the internet is commonly called the "web".

2.4 OSI LAYER 4: THE TRANSPORT LAYER

OSI Layer 4 is known as the Transport Layer. Since we are now above Layer 3, all information transferred is assumed to be at the correct destination node and is being passed up to Layer 4. The Transport Layer is responsible for the reliability of the link between two end users and for dividing up the data being transmitted by assigning port numbers to its Layer 4 packages, known as segments. Ports can be thought of as virtual destination mailboxes or outlets. When information reaches a Layer 4 protocol, the segment is examined to determine the destination port of the data it contains. Once the port is determined, just as in all of the lower layers, the wrapper is discarded and the payload data is passed up to the next layer's protocol.
Ports allow more than one set of Layer 5-7 protocols to exist on a single node. This is important if the node has more than one purpose. Modern home computers utilize many ports during everyday use, because the modern computer user demands that a computer serve many purposes at once. Higher layer protocols that provide services such as email, web browsing, text chat, file transfer and more each operate on their own unique Layer 4 port, allowing all of these protocols to be operated at once without interference.
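The way ports keep several services apart on one node can be sketched as a lookup from port number to handler. The `Segment` and `TransportLayer` classes and the handler functions are hypothetical. Ports 25 and 80 are the conventional ports for SMTP and HTTP and are used here only as familiar examples.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Segment:
    destination_port: int  # which "mailbox" on the node the data is for
    payload: bytes         # data for the higher layer protocol


class TransportLayer:
    def __init__(self) -> None:
        self._listeners: Dict[int, Callable[[bytes], str]] = {}

    def listen(self, port: int, handler: Callable[[bytes], str]) -> None:
        """Register the higher layer protocol that owns a port."""
        if port in self._listeners:
            raise ValueError(f"port {port} is already in use")
        self._listeners[port] = handler

    def deliver(self, segment: Segment) -> str:
        """Discard the wrapper and hand the payload to the port's owner."""
        handler = self._listeners.get(segment.destination_port)
        if handler is None:
            raise LookupError(f"nothing is listening on port {segment.destination_port}")
        return handler(segment.payload)


def mail_service(payload: bytes) -> str:
    return "mail got " + payload.decode("ascii")


def web_service(payload: bytes) -> str:
    return "web got " + payload.decode("ascii")


if __name__ == "__main__":
    transport = TransportLayer()
    transport.listen(25, mail_service)
    transport.listen(80, web_service)
    print(transport.deliver(Segment(80, b"GET /")))
    print(transport.deliver(Segment(25, b"HELO")))
```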
On the reliability front, Transport Layer protocols can be capable of running a checksum⁷ on the payload data they carry. This allows the protocol to determine the integrity of incoming payload data. If the data has been corrupted or its integrity compromised, the Layer 4 protocol will request that the segment be retransmitted.
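A checksum comparison of this kind might look like the sketch below. The text does not name a particular algorithm, so this uses the 16-bit ones' complement sum found in the TCP/IP family as an assumed example. The function names are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """Compute a 16-bit ones' complement checksum over the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold any carry out of the low 16 bits back into the sum.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def is_intact(payload: bytes, expected_checksum: int) -> bool:
    """Recompute the checksum on receipt and compare it with the one sent."""
    return internet_checksum(payload) == expected_checksum


if __name__ == "__main__":
    payload = b"hello"
    sent_checksum = internet_checksum(payload)
    print(is_intact(b"hello", sent_checksum))  # arrived unchanged
    print(is_intact(b"hellp", sent_checksum))  # one byte corrupted in transit
```

When `is_intact` returns false, the receiving protocol would ask for the segment to be sent again rather than pass the damaged data upward.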

2.5 OSI LAYER 5: THE SESSION LAYER

Although it is optional in most protocol stacks today, OSI Layer 5, known as the Session Layer, still serves a purpose in the OSI Reference Model. The Session Layer draws the outline for protocols that manage the combination and synchronization of data from two separate higher layer sources. Layer 5 protocols are responsible for ensuring that the data is synchronized and consistent before it is transmitted. A good example is the streaming of live audio and video, where near perfect synchronization between video and audio is desired.

2.6 OSI LAYERS 6 & 7: THE PRESENTATION AND APPLICATION LAYERS

The sixth and seventh layers in the OSI Reference Model are the Presentation Layer and the Application Layer. The primary purpose of these layers is to facilitate the movement of formatted information between applications interacting with end users on nodes by way of the lower layer protocols. Commonly used top layer protocols are HTTPS (used for the secure transfer of web pages and related files), FTP (File Transfer Protocol), SMTP (Simple Mail Transfer Protocol, used for sending email messages), and SSH (Secure Shell, used for secure remote shell⁸ access to a computer operating system).

2.7 SUMMARY OF THE OSI REFERENCE MODEL CONCEPT


The OSI Reference Model exists not to make hard rules or to shape the industry, but to provide a logical, well-researched, and tested model after which the world's best communication protocol "stacks" are modeled. A protocol stack is a set of two or more protocols that stack on top of each other, following the lead of the OSI Reference Model's layered format. The TCP/IP stack is very well known for being the driving force behind most of the internet, and represents the third (IP) and fourth (TCP) layers of the OSI Model. Every layer in the OSI Model is a reference for a protocol that must facilitate communication with both the layer above and the layer below. The "U-shaped" example shown in Diagram 4 provides a visual concept of how two users may be linked on a given network in reference to the OSI Model. Data starts and ends with the user. From the Application Layer of the first user, it must travel down through Layers 7 to 1, across the transmission medium, then back up Layers 1 to 7 to be presented at the Application Layer to the user on the other end of the transmission. Diagram 4, of course, only shows an example of a path between two nodes. On node diagrams such as Diagram 1 and Diagram 2, each node is assumed to be operating some stack of OSI-based protocols. Protocols defined by this reference are dependent on the next lowest layer protocol. So, for example, one could not run an Application Layer protocol on a node without Layer 1 through 6 protocols also being in use on that node. A node could, however, operate with protocols for only the lowest three layers if it just needed to interact with information in Network Layer (Layer 3) packets. An example of such a node would be a router, since routers do not need to decipher payload data from layers any higher than Layer 3, the layer which carries the routing information. This stackable concept allows nodes to operate on a scalable range of complexity and capability.
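The trip down one stack and back up the other can be sketched by letting each layer add a wrapper on the way down and remove its own wrapper on the way up. The bracketed text headers are purely illustrative assumptions. Real protocols use binary headers with real addressing and control fields, and Layer 1 is represented here only by the act of handing the finished bytes across.

```python
# Layers 7 down to 2; Layer 1 simply carries the resulting bytes.
LAYERS = ["Application", "Presentation", "Session",
          "Transport", "Network", "Data Link"]


def send(message: str) -> bytes:
    """Wrap the message once per layer, from Layer 7 down to Layer 2."""
    data = message.encode("utf-8")
    for name in LAYERS:
        header = f"[{name}]".encode("ascii")
        data = header + data  # each lower layer wraps what it was handed
    return data  # ready for the Physical Layer to transmit


def receive(data: bytes) -> str:
    """Unwrap in the opposite order, from Layer 2 back up to Layer 7."""
    for name in reversed(LAYERS):
        header = f"[{name}]".encode("ascii")
        if not data.startswith(header):
            raise ValueError(f"missing {name} wrapper")
        data = data[len(header):]  # discard the wrapper, pass the rest up
    return data.decode("utf-8")


if __name__ == "__main__":
    on_the_wire = send("hello")
    print(on_the_wire)
    print(receive(on_the_wire))
```

The outermost wrapper belongs to the lowest layer, which is why a node such as a router can stop unwrapping at Layer 3 without ever touching the payload of the layers above.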

1 Binary; Expressed in a system of numerical notation that has 2 rather than 10 as a base.

2 Digital; Here meaning 'in binary'.

3 Node; A piece of equipment, such as a computer or other device, attached to a network.

4 MAC; Media Access Control. The lower sub-layer of the OSI data link layer. The upper and lower sub-layers of the Data Link Layer are not covered in this document.

5 IPv4 is the standard used for this example. The forthcoming IPv6 uses a different logical addressing scheme, to the same end.

6 IPv4 is the standard used for this example. These figures may not specifically apply in the forthcoming IPv6 standard.

7 Checksum; A digit or digits representing the sum of the correct digits in a piece of stored or transmitted digital data, against which later comparisons can be made to detect errors in the data.

8 The shell of an operating system is a program that presents an interface to various operating system functions and services. The shell is so called because it is an outer layer of interface between the user and the innards of the operating system, the kernel. Source: Wikipedia