The Internet protocol suite, commonly known as TCP/IP, is a framework for organizing the set of communication protocols used in the Internet and similar computer networks according to functional criteria. The foundational protocols in the suite are the Transmission Control Protocol (TCP), the User Datagram Protocol (UDP), and the Internet Protocol (IP). Early versions of this networking model were known as the Department of Defense (DoD) model because the research and development were funded by the United States Department of Defense through DARPA.
The Internet protocol suite provides end-to-end data communication specifying how data should be packetized, addressed, transmitted, routed, and received. This functionality is organized into four abstraction layers, which classify all related protocols according to each protocol's scope of networking. An implementation of the layers for a particular application forms a protocol stack. From lowest to highest, the layers are the link layer, containing communication methods for data that remains within a single network segment (link); the internet layer, providing internetworking between independent networks; the transport layer, handling host-to-host communication; and the application layer, providing process-to-process data exchange for applications.
The technical standards underlying the Internet protocol suite and its constituent protocols are maintained by the Internet Engineering Task Force (IETF). The Internet protocol suite predates the OSI model, a more comprehensive reference framework for general networking systems.
Initially referred to as the DOD Internet Architecture Model, the Internet protocol suite has its roots in research and development sponsored by the Defense Advanced Research Projects Agency (DARPA) in the late 1960s.[1] After DARPA initiated the pioneering ARPANET in 1969, Steve Crocker established a "Networking Working Group" which developed a host-host protocol, the Network Control Program (NCP). In the early 1970s, DARPA started work on several other data transmission technologies, including mobile packet radio, packet satellite service, local area networks, and other data networks in the public and private domains. In 1972, Bob Kahn joined the DARPA Information Processing Technology Office, where he worked on both satellite packet networks and ground-based radio packet networks, and recognized the value of being able to communicate across both. In the spring of 1973, Vinton Cerf joined Kahn with the goal of designing the next protocol generation for the ARPANET to enable internetworking.[2] [3] They drew on the experience from the ARPANET research community, the International Network Working Group, which Cerf chaired, and researchers at Xerox PARC.[4] [5]
By the summer of 1973, Kahn and Cerf had worked out a fundamental reformulation, in which the differences between local network protocols were hidden by using a common internetwork protocol, and, instead of the network being responsible for reliability, as in the existing ARPANET protocols, this function was delegated to the hosts. Cerf credits Louis Pouzin and Hubert Zimmermann, designers of the CYCLADES network, with important influences on this design.[6] [7] The new protocol was implemented as the Transmission Control Program in 1974 by Cerf, Yogen Dalal and Carl Sunshine.
Initially, the Transmission Control Program (the Internet Protocol did not then exist as a separate protocol) provided only a reliable byte stream service to its users, not datagrams.[8] Several versions were developed through the Internet Experiment Note series. As experience with the protocol grew, collaborators recommended division of functionality into layers of distinct protocols, allowing users direct access to datagram service. Advocates included Bob Metcalfe and Yogen Dalal at Xerox PARC;[9] [10] Danny Cohen, who needed it for his packet voice work; and Jonathan Postel of the University of Southern California's Information Sciences Institute, who edited the Request for Comments (RFCs), the technical and strategic document series that has both documented and catalyzed Internet development.[11] Postel stated, "We are screwing up in our design of Internet protocols by violating the principle of layering." Encapsulation of different mechanisms was intended to create an environment where the upper layers could access only what was needed from the lower layers. A monolithic design would be inflexible and lead to scalability issues. In version 4, written in 1978, Postel split the Transmission Control Program into two distinct protocols, the Internet Protocol as connectionless layer and the Transmission Control Protocol as a reliable connection-oriented service.[12] [13] [14] [15]
The design of the network included the recognition that it should provide only the functions of efficiently transmitting and routing traffic between end nodes and that all other intelligence should be located at the edge of the network, in the end nodes. This end-to-end principle was pioneered by Louis Pouzin in the CYCLADES network,[16] based on the ideas of Donald Davies.[17] [18] Using this design, it became possible to connect other networks to the ARPANET that used the same principle, irrespective of other local characteristics, thereby solving Kahn's initial internetworking problem. A popular expression is that TCP/IP, the eventual product of Cerf and Kahn's work, can run over "two tin cans and a string." Years later, as a joke in 1999, the IP over Avian Carriers formal protocol specification was created and successfully tested two years later. 10 years later still, it was adapted for IPv6.
DARPA contracted with BBN Technologies, Stanford University, and the University College London to develop operational versions of the protocol on several hardware platforms.[19] During development of the protocol the version number of the packet routing layer progressed from version 1 to version 4, the latter of which was installed in the ARPANET in 1983. It became known as Internet Protocol version 4 (IPv4) as the protocol that is still in use in the Internet, alongside its current successor, Internet Protocol version 6 (IPv6).
In 1975, a two-network IP communications test was performed between Stanford and University College London. In November 1977, a three-network IP test was conducted between sites in the US, the UK, and Norway. Several other IP prototypes were developed at multiple research centers between 1978 and 1983.[20]
A computer called a router is provided with an interface to each network. It forwards network packets back and forth between them. Originally a router was called gateway, but the term was changed to avoid confusion with other types of gateways.[21]
In March 1982, the US Department of Defense declared TCP/IP as the standard for all military computer networking.[22] [23] [24] In the same year, NORSAR/NDRE and Peter Kirstein's research group at University College London adopted the protocol.[25] The migration of the ARPANET from NCP to TCP/IP was officially completed on flag day January 1, 1983, when the new protocols were permanently activated.[26]
In 1985, the Internet Advisory Board (later Internet Architecture Board) held a three-day TCP/IP workshop for the computer industry, attended by 250 vendor representatives, promoting the protocol and leading to its increasing commercial use. In 1985, the first Interop conference focused on network interoperability by broader adoption of TCP/IP. The conference was founded by Dan Lynch, an early Internet activist. From the beginning, large corporations, such as IBM and DEC, attended the meeting.[27]
IBM, AT&T and DEC were the first major corporations to adopt TCP/IP, this despite having competing proprietary protocols. In IBM, from 1984, Barry Appelman's group did TCP/IP development. They navigated the corporate politics to get a stream of TCP/IP products for various IBM systems, including MVS, VM, and OS/2. At the same time, several smaller companies, such as FTP Software and the Wollongong Group, began offering TCP/IP stacks for DOS and Microsoft Windows.[28] The first VM/CMS TCP/IP stack came from the University of Wisconsin.[29]
Some of the early TCP/IP stacks were written single-handedly by a few programmers. Jay Elinsky and Oleg Vishnepolsky of IBM Research wrote TCP/IP stacks for VM/CMS and OS/2, respectively. In 1984 Donald Gillies at MIT wrote a ntcp multi-connection TCP which runs atop the IP/PacketDriver layer maintained by John Romkey at MIT in 1983–84. Romkey leveraged this TCP in 1986 when FTP Software was founded.[30] [31] Starting in 1985, Phil Karn created a multi-connection TCP application for ham radio systems (KA9Q TCP).[32]
The spread of TCP/IP was fueled further in June 1989, when the University of California, Berkeley agreed to place the TCP/IP code developed for BSD UNIX into the public domain. Various corporate vendors, including IBM, included this code in commercial TCP/IP software releases. For Windows 3.1, the dominant PC operating system among consumers in the first half of the 1990s, Peter Tattam's release of the Trumpet Winsock TCP/IP stack was key to bringing the Internet to home users. Trumpet Winsock allowed TCP/IP operations over a serial connection (SLIP or PPP). The typical home PC of the time had an external Hayes-compatible modem connected via an RS-232 port with an 8250 or 16550 UART which required this type of stack. Later, Microsoft would release their own TCP/IP add-on stack for Windows for Workgroups 3.11 and a native stack in Windows 95. These events helped cement TCP/IP's dominance over other protocols on Microsoft-based networks, which included IBM's Systems Network Architecture (SNA), and on other platforms such as Digital Equipment Corporation's DECnet, Open Systems Interconnection (OSI), and Xerox Network Systems (XNS).
Nonetheless, for a period in the late 1980s and early 1990s, engineers, organizations and nations were polarized over the issue of which standard, the OSI model or the Internet protocol suite, would result in the best and most robust computer networks.[33] [34] [35]
The technical standards underlying the Internet protocol suite and its constituent protocols have been delegated to the Internet Engineering Task Force (IETF).[36] [37]
The characteristic architecture of the Internet protocol suite is its broad division into operating scopes for the protocols that constitute its core functionality. The defining specifications of the suite are RFC 1122 and 1123, which broadly outlines four abstraction layers (as well as related protocols); the link layer, IP layer, transport layer, and application layer, along with support protocols. These have stood the test of time, as the IETF has never modified this structure. As such a model of networking, the Internet protocol suite predates the OSI model, a more comprehensive reference framework for general networking systems.
The end-to-end principle has evolved over time. Its original expression put the maintenance of state and overall intelligence at the edges, and assumed the Internet that connected the edges retained no state and concentrated on speed and simplicity. Real-world needs for firewalls, network address translators, web content caches and the like have forced changes in this principle.[38]
The robustness principle states: "In general, an implementation must be conservative in its sending behavior, and liberal in its receiving behavior. That is, it must be careful to send well-formed datagrams, but must accept any datagram that it can interpret (e.g., not object to technical errors where the meaning is still clear)." "The second part of the principle is almost as important: software on other hosts may contain deficiencies that make it unwise to exploit legal but obscure protocol features."
Encapsulation is used to provide abstraction of protocols and services. Encapsulation is usually aligned with the division of the protocol suite into layers of general functionality. In general, an application (the highest level of the model) uses a set of protocols to send its data down the layers. The data is further encapsulated at each level.
An early pair of architectural documents, and, titled Requirements for Internet Hosts, emphasizes architectural principles over layering. RFC 1122/23 are structured in sections referring to layers, but the documents refer to many other architectural principles, and do not emphasize layering. They loosely defines a four-layer model, with the layers having names, not numbers, as follows:
See main article: article and Link layer. The protocols of the link layer operate within the scope of the local network connection to which a host is attached. This regime is called the link in TCP/IP parlance and is the lowest component layer of the suite. The link includes all hosts accessible without traversing a router. The size of the link is therefore determined by the networking hardware design. In principle, TCP/IP is designed to be hardware independent and may be implemented on top of virtually any link-layer technology. This includes not only hardware implementations but also virtual link layers such as virtual private networks and networking tunnels.
The link layer is used to move packets between the internet layer interfaces of two different hosts on the same link. The processes of transmitting and receiving packets on the link can be controlled in the device driver for the network card, as well as in firmware or by specialized chipsets. These perform functions, such as framing, to prepare the internet layer packets for transmission, and finally transmit the frames to the physical layer and over a transmission medium. The TCP/IP model includes specifications for translating the network addressing methods used in the Internet Protocol to link-layer addresses, such as media access control (MAC) addresses. All other aspects below that level, however, are implicitly assumed to exist and are not explicitly defined in the TCP/IP model.
The link layer in the TCP/IP model has corresponding functions in Layer 2 of the OSI model.
See main article: article and Internet layer. Internetworking requires sending data from the source network to the destination network. This process is called routing and is supported by host addressing and identification using the hierarchical IP addressing system. The internet layer provides an unreliable datagram transmission facility between hosts located on potentially different IP networks by forwarding datagrams to an appropriate next-hop router for further relaying to its destination. The internet layer has the responsibility of sending packets across potentially multiple networks. With this functionality, the internet layer makes possible internetworking, the interworking of different IP networks, and it essentially establishes the Internet.
The internet layer does not distinguish between the various transport layer protocols. IP carries data for a variety of different upper layer protocols. These protocols are each identified by a unique protocol number: for example, Internet Control Message Protocol (ICMP) and Internet Group Management Protocol (IGMP) are protocols 1 and 2, respectively.
The Internet Protocol is the principal component of the internet layer, and it defines two addressing systems to identify network hosts and to locate them on the network. The original address system of the ARPANET and its successor, the Internet, is Internet Protocol version 4 (IPv4). It uses a 32-bit IP address and is therefore capable of identifying approximately four billion hosts. This limitation was eliminated in 1998 by the standardization of Internet Protocol version 6 (IPv6) which uses 128-bit addresses. IPv6 production implementations emerged in approximately 2006.
See also: Transport layer. The transport layer establishes basic data channels that applications use for task-specific data exchange. The layer establishes host-to-host connectivity in the form of end-to-end message transfer services that are independent of the underlying network and independent of the structure of user data and the logistics of exchanging information. Connectivity at the transport layer can be categorized as either connection-oriented, implemented in TCP, or connectionless, implemented in UDP. The protocols in this layer may provide error control, segmentation, flow control, congestion control, and application addressing (port numbers).
For the purpose of providing process-specific transmission channels for applications, the layer establishes the concept of the network port. This is a numbered logical construct allocated specifically for each of the communication channels an application needs. For many types of services, these port numbers have been standardized so that client computers may address specific services of a server computer without the involvement of service discovery or directory services.
Because IP provides only a best-effort delivery, some transport-layer protocols offer reliability.
TCP is a connection-oriented protocol that addresses numerous reliability issues in providing a reliable byte stream:
The newer Stream Control Transmission Protocol (SCTP) is also a reliable, connection-oriented transport mechanism. It is message-stream-oriented, not byte-stream-oriented like TCP, and provides multiple streams multiplexed over a single connection. It also provides multihoming support, in which a connection end can be represented by multiple IP addresses (representing multiple physical interfaces), such that if one fails, the connection is not interrupted. It was developed initially for telephony applications (to transport SS7 over IP).
Reliability can also be achieved by running IP over a reliable data-link protocol such as the High-Level Data Link Control (HDLC).
The User Datagram Protocol (UDP) is a connectionless datagram protocol. Like IP, it is a best-effort, unreliable protocol. Reliability is addressed through error detection using a checksum algorithm. UDP is typically used for applications such as streaming media (audio, video, Voice over IP, etc.) where on-time arrival is more important than reliability, or for simple query/response applications like DNS lookups, where the overhead of setting up a reliable connection is disproportionately large. Real-time Transport Protocol (RTP) is a datagram protocol that is used over UDP and is designed for real-time data such as streaming media.
The applications at any given network address are distinguished by their TCP or UDP port. By convention, certain well-known ports are associated with specific applications.
The TCP/IP model's transport or host-to-host layer corresponds roughly to the fourth layer in the OSI model, also called the transport layer.
QUIC is rapidly emerging as an alternative transport protocol. Whilst it is technically carried via UDP packets it seeks to offer enhanced transport connectivity relative to TCP. HTTP/3 works exclusively via QUIC.
The application layer includes the protocols used by most applications for providing user services or exchanging application data over the network connections established by the lower-level protocols. This may include some basic network support services such as routing protocols and host configuration. Examples of application layer protocols include the Hypertext Transfer Protocol (HTTP), the File Transfer Protocol (FTP), the Simple Mail Transfer Protocol (SMTP), and the Dynamic Host Configuration Protocol (DHCP).[43] Data coded according to application layer protocols are encapsulated into transport layer protocol units (such as TCP streams or UDP datagrams), which in turn use lower layer protocols to effect actual data transfer.
The TCP/IP model does not consider the specifics of formatting and presenting data and does not define additional layers between the application and transport layers as in the OSI model (presentation and session layers). According to the TCP/IP model, such functions are the realm of libraries and application programming interfaces. The application layer in the TCP/IP model is often compared to a combination of the fifth (session), sixth (presentation), and seventh (application) layers of the OSI model.
Application layer protocols are often associated with particular client–server applications, and common services have well-known port numbers reserved by the Internet Assigned Numbers Authority (IANA). For example, the HyperText Transfer Protocol uses server port 80 and Telnet uses server port 23. Clients connecting to a service usually use ephemeral ports, i.e., port numbers assigned only for the duration of the transaction at random or from a specific range configured in the application.
At the application layer, the TCP/IP model distinguishes between user protocols and support protocols. Support protocols provide services to a system of network infrastructure. User protocols are used for actual user applications. For example, FTP is a user protocol and DNS is a support protocol.
Although the applications are usually aware of key qualities of the transport layer connection such as the endpoint IP addresses and port numbers, application layer protocols generally treat the transport layer (and lower) protocols as black boxes which provide a stable network connection across which to communicate. The transport layer and lower-level layers are unconcerned with the specifics of application layer protocols. Routers and switches do not typically examine the encapsulated traffic, rather they just provide a conduit for it. However, some firewall and bandwidth throttling applications use deep packet inspection to interpret application data. An example is the Resource Reservation Protocol (RSVP). It is also sometimes necessary for Applications affected by NAT to consider the application payload.
The Internet protocol suite evolved through research and development funded over a period of time. In this process, the specifics of protocol components and their layering changed. In addition, parallel research and commercial interests from industry associations competed with design features. In particular, efforts in the International Organization for Standardization led to a similar goal, but with a wider scope of networking in general. Efforts to consolidate the two principal schools of layering, which were superficially similar, but diverged sharply in detail, led independent textbook authors to formulate abridging teaching tools.
The following table shows various such networking models. The number of layers varies between three and seven.
Arpanet Reference Model (RFC 871) | Internet Standard (RFC 1122) | Internet model (Cisco Academy[44]) | TCP/IP 5-layer reference model (Kozierok,[45] Comer[46]) | TCP/IP 5-layer reference model (Tanenbaum[47]) | TCP/IP protocol suite or Five-layer Internet model (Forouzan,[48] Kurose[49]) | TCP/IP model (Stallings[50]) | OSI model (ISO/IEC 7498-1:1994[51]) | |
---|---|---|---|---|---|---|---|---|
Three layers | Four layers | Four layers | Four+one layers | Five layers | Five layers | Five layers | Seven layers | |
Application/ Process | Application | Application | Application | Application | Application | Application | Application | |
Presentation | ||||||||
Session | ||||||||
Host-to-host | Transport | Transport | Transport | Transport | Transport | Host-to-host or transport | Transport | |
Internet | Internetwork | Internet | Internet | Network | Internet | Network | ||
Network interface | Link | Network interface | Data link (Network interface) | Data link | Data link | Network access | Data link | |
(Hardware) | Physical | Physical | Physical | Physical |
Some of the networking models are from textbooks, which are secondary sources that may conflict with the intent of RFC 1122 and other IETF primary sources.
The three top layers in the OSI model, i.e. the application layer, the presentation layer and the session layer, are not distinguished separately in the TCP/IP model which only has an application layer above the transport layer. While some pure OSI protocol applications, such as X.400, also combined them, there is no requirement that a TCP/IP protocol stack must impose monolithic architecture above the transport layer. For example, the NFS application protocol runs over the External Data Representation (XDR) presentation protocol, which, in turn, runs over a protocol called Remote Procedure Call (RPC). RPC provides reliable record transmission, so it can safely use the best-effort UDP transport.
Different authors have interpreted the TCP/IP model differently, and disagree whether the link layer, or any aspect of the TCP/IP model, covers OSI layer 1 (physical layer) issues, or whether TCP/IP assumes a hardware layer exists below the link layer. Several authors have attempted to incorporate the OSI model's layers 1 and 2 into the TCP/IP model since these are commonly referred to in modern standards (for example, by IEEE and ITU). This often results in a model with five layers, where the link layer or network access layer is split into the OSI model's layers 1 and 2.
The IETF protocol development effort is not concerned with strict layering. Some of its protocols may not fit cleanly into the OSI model, although RFCs sometimes refer to it and often use the old OSI layer numbers. The IETF has repeatedly stated that Internet Protocol and architecture development is not intended to be OSI-compliant. RFC 3439, referring to the internet architecture, contains a section entitled: "Layering Considered Harmful".
For example, the session and presentation layers of the OSI suite are considered to be included in the application layer of the TCP/IP suite. The functionality of the session layer can be found in protocols like HTTP and SMTP and is more evident in protocols like Telnet and the Session Initiation Protocol (SIP). Session-layer functionality is also realized with the port numbering of the TCP and UDP protocols, which are included in the transport layer of the TCP/IP suite. Functions of the presentation layer are realized in the TCP/IP applications with the MIME standard in data exchange.
Another difference is in the treatment of routing protocols. The OSI routing protocol IS-IS belongs to the network layer, and does not depend on CLNS for delivering packets from one router to another, but defines its own layer-3 encapsulation. In contrast, OSPF, RIP, BGP and other routing protocols defined by the IETF are transported over IP, and, for the purpose of sending and receiving routing protocol packets, routers act as hosts. As a consequence, routing protocols are included in the application layer. Some authors, such as Tanenbaum in Computer Networks, describe routing protocols in the same layer as IP, reasoning that routing protocols inform decisions made by the forwarding process of routers.
IETF protocols can be encapsulated recursively, as demonstrated by tunnelling protocols such as Generic Routing Encapsulation (GRE). GRE uses the same mechanism that OSI uses for tunnelling at the network layer.
The Internet protocol suite does not presume any specific hardware or software environment. It only requires that hardware and a software layer exists that is capable of sending and receiving packets on a computer network. As a result, the suite has been implemented on essentially every computing platform. A minimal implementation of TCP/IP includes the following: Internet Protocol (IP), Address Resolution Protocol (ARP), Internet Control Message Protocol (ICMP), Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Internet Group Management Protocol (IGMP). In addition to IP, ICMP, TCP, UDP, Internet Protocol version 6 requires Neighbor Discovery Protocol (NDP), ICMPv6, and Multicast Listener Discovery (MLD) and is often accompanied by an integrated IPSec security layer.