computology.org

Networks

This page gives some explanation of the internet and the networks that preceded it. This is made more difficult by the ambiguities in word usage throughout the industry: data-message, data-packet, data-frame and datagram are all used in slightly different ways at different levels of network communication. This mess partly comes from the very different origins of the words. I shall try to clarify as we go along!

Another confusion is the use of the word "Host" for a computer connected to the network. In the old days, when terminals were connected to a central computer, the term "host" made some sense as the terminal users were in effect visitors. In today's world it makes little sense to call a computer that uses a web-browser to look at pages served by another computer a "host". In effect the server computer is the host while the browsing computer is the guest. But, through some lack of understanding of the language, all the computers get called hosts! I shall stick with calling them computers!

You might also come across the word "Octet" instead of "byte". That is because a very long time ago there was some question as to whether a "byte" should be 8 binary bits. Given that the matter was largely resolved eons ago I will stick with the word "byte".

Communication Networks

Electrical connection

With two wires there are plenty of ways to send a message over long distances. The simplest is a bulb, a battery and a Morse key to switch the bulb on and off. Morse uses long and short pulses to send a message. You could make a new system and use a long pulse to mean a 1 and a short pulse to mean a 0, or a high tone to mean a 1 and a low tone to mean a 0. In practice there are many ways to signal 0s and 1s on wires or wireless systems. When multiple devices are connected to the same physical wires or wireless channel (they share the same transmission medium), the shared medium is called a "network segment".
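The long-pulse and short-pulse idea can be sketched in a few lines of Python. This is only an illustration: the letters "L" and "S" standing for long and short pulses, and the function names, are my own invention and not any real signalling standard.

```python
# Hypothetical scheme: a long pulse ("L") means 1, a short pulse ("S") means 0.

def bytes_to_pulses(data: bytes) -> str:
    """Turn each byte into eight pulses, most significant bit first."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return bits.replace("1", "L").replace("0", "S")


def pulses_to_bytes(pulses: str) -> bytes:
    """Reverse the encoding: pulses back into bits, then bits into bytes."""
    bits = pulses.replace("L", "1").replace("S", "0")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


signal = bytes_to_pulses(b"Hi")
print(signal)                   # SLSSLSSSSLLSLSSL
print(pulses_to_bytes(signal))  # b'Hi'
```

A real system also has to agree how long each pulse lasts and how the gaps between pulses are recognised, which this sketch ignores.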

Circuit-switched networks

In the early days of the telegraph a pair of wires made a physical connection, an electrical circuit, between two devices (a Morse key and a bulb, for example). Once sound could be sent over the circuit, a general telephone service became possible. Because people want to connect with different people at different times, picking up the telephone first connected the caller to the "operator" at the "telephone exchange", and the caller told the operator who they wanted to be connected to. The operator then put the correct plug into the correct socket, making the connection (circuit) for that conversation, and removed the plug to disconnect the circuit when the callers had finished.

Automatic systems for making the connections were introduced with rotary number dials to dial telephone numbers. The dial sent pulses down the wires that drove mechanical switches (called stepping switches) that made the circuit between the users' telephones. These formed the first automatic telephone exchanges.

Networks like the old telephone network are called circuit-switched networks.

Teleprinters and teletypes also communicated via the telephone network, but using digital signals that can be thought of as 1s and 0s. Because telephone networks were designed for sound, a "modem" (MODulator-DEModulator) converted between the digits and corresponding tones. Dialing was done in the same way as for any voice call and then the modem took over. Some lines were dedicated to the use of teletypes and so didn't need modems.

Message-switched networks

Making end-to-end connections over long distances is not the most efficient way to send digital messages, as the connection is occupied for as long as the users maintain it, whether or not anything is being sent. With a message-switched network, connections are only used during the actual transmission of the message, and upon completion of the transmission the connection is immediately freed again.

Message-switched networks send digital messages with their destination addresses attached. Switches that used to establish long connections can be replaced with digital devices (still called switches!) that can store digital messages and then forward them when the wire is free. Each message is treated as a separate entity. Each message contains address information of its destination, and at each switch this information is read and the transfer path to the next switch is decided.
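The store-and-forward behaviour described above might be sketched like this. It is a toy model under my own assumptions: the class name `MessageSwitch`, the dictionary used as a message, and the idea that a message is addressed to a named switch are all invented for the illustration.

```python
from collections import deque


class MessageSwitch:
    """A store-and-forward switch: it holds whole messages until it can pass them on."""

    def __init__(self, name, next_hop):
        self.name = name
        self.next_hop = next_hop   # destination address -> name of the next switch
        self.stored = deque()      # messages waiting for the outgoing wire

    def receive(self, message):
        self.stored.append(message)

    def forward_one(self, network, delivered):
        """Read the address of the oldest stored message and move it one step on."""
        if not self.stored:
            return
        message = self.stored.popleft()
        destination = message["to"]
        if destination == self.name:
            delivered.append(message)
        else:
            network[self.next_hop[destination]].receive(message)


# Three switches in a line: A - B - C.
network = {
    "A": MessageSwitch("A", {"C": "B"}),
    "B": MessageSwitch("B", {"C": "C"}),
    "C": MessageSwitch("C", {}),
}
delivered = []
network["A"].receive({"to": "C", "text": "HELLO"})
for _ in range(3):
    for switch in network.values():
        switch.forward_one(network, delivered)
print(delivered)
```

Notice that no end-to-end circuit ever exists here. Each wire is only busy for the moment a message crosses it.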

Before the 1980s storing digital data was expensive and so circuit-switched networks were still common. However the Plan 55-A System used paper tape for message storage back in the 1950s and was a message-switched network!

Packet-switched networks

If a message is large it may not be possible to send it over some networks that can only handle messages up to a certain size. To overcome this it may be necessary to divide a message into a number of shorter messages. These shorter messages have been given the name "packets".

It is only a short step from a message-switched network to a packet-switched network. In a packet-switched network the switches can split messages into smaller messages called "packets". As well as the destination address, a data-packet will also have a message number and a packet number so that the receiver can re-assemble the whole message from the packets when they have all been received.
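Splitting and re-assembly can be sketched as follows. The field names are made up for the illustration, and the "of" field (the total number of packets) is my own addition so that the receiver can tell when everything has arrived. Real protocols signal this in other ways.

```python
import random


def split_message(message_number: int, destination: str, data: bytes, max_payload: int):
    """Cut data into packets that each carry the address and numbering fields."""
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)]
    return [
        {"to": destination, "message": message_number,
         "packet": number, "of": len(chunks), "payload": chunk}
        for number, chunk in enumerate(chunks)
    ]


def reassemble(packets):
    """Rebuild the message once every packet has arrived; return None until then."""
    if not packets or len(packets) < packets[0]["of"]:
        return None
    ordered = sorted(packets, key=lambda p: p["packet"])
    return b"".join(p["payload"] for p in ordered)


packets = split_message(7, "C", b"The quick brown fox", max_payload=5)
random.shuffle(packets)        # packets may arrive in any order
print(reassemble(packets))     # b'The quick brown fox'
```

The packet number is what lets the receiver put the pieces back in order, however they arrive.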

Virtual circuit-switched networks

Packet-switched networks are efficient, but things like telephone calls need what appears to be an end-to-end connection, which is what circuit-switched networks provide. Now that computing power is easily available, it makes practical sense to simulate circuit-switched networks on packet-switched networks. This is exactly what happens when we have audio and video conversations over the internet.

Reality

The world has had many different communication networks in operation, using many different protocols. Wikipedia: Packet-switched networks, Wikipedia: Protocol Wars, Wikipedia: X.25.

Computer networks

Computer networking, like communication networking, is also the problem of sending data down wires between devices.

Ethernet

A computer network system in common use today is called "ethernet" and is a data-message system for sending data between network nodes identified by MAC addresses. The format of an ethernet data-message follows. It is not here for you to learn off by heart but rather to appreciate the general principle:

Here it is clear that the word "Frame" is used to refer to what follows the 7 byte preamble and 1 byte start frame delimiter, up to and including the Frame Check Sequence (32-bit CRC). The word "Packet" is used to describe the whole thing, which strictly is a data-message. Thus for clarity it is best to say "ethernet packet" and "ethernet frame" when talking about ethernet.
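To make the frame and packet distinction concrete, here is a rough sketch that builds both. The function names are my own. I am assuming the usual layout (destination MAC, source MAC, 2 byte EtherType, payload padded to at least 46 bytes, 4 byte CRC) and that the CRC matches the one computed by Python's `zlib.crc32`, sent least significant byte first. Treat it as an illustration rather than a reference implementation.

```python
import struct
import zlib

PREAMBLE = bytes([0x55] * 7)   # 7 bytes of alternating 1s and 0s
SFD = bytes([0xD5])            # start frame delimiter


def build_frame(dst: bytes, src: bytes, ethertype: int, payload: bytes) -> bytes:
    """Ethernet frame: addresses, EtherType, padded payload, then the 32-bit CRC."""
    payload = payload.ljust(46, b"\x00")          # short payloads are padded
    body = dst + src + struct.pack("!H", ethertype) + payload
    fcs = struct.pack("<I", zlib.crc32(body))     # assumed byte order, see above
    return body + fcs


def build_packet(frame: bytes) -> bytes:
    """Ethernet packet: the frame with the preamble and delimiter in front."""
    return PREAMBLE + SFD + frame


frame = build_frame(b"\xff" * 6, bytes.fromhex("001a2b3c4d5e"), 0x0800, b"hello")
packet = build_packet(frame)
print(len(frame), len(packet))   # 64 72
```

The 64 bytes printed for the frame is the smallest frame ethernet allows, which is why the 5 byte payload gets padded.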

"Switches", already described above, have multiple physical connections (network segments) through their physical ports to other devices. When a device sends an ethernet packet to another device the switch looks at the source MAC address and remembers which port that MAC address is connected to. In future when it receives an ethernet packet destined for that MAC address it will send the packet out on just that port and none other.

When a switch receives an ethernet packet but does not know which port the destination MAC address is on, it sends the ethernet packet out on all ports ("broadcasts") except the port the ethernet packet came in on. Of course when a response is sent back, the switch then discovers and remembers which port it should have sent to.

Switches normally only remember for 300 seconds (5 minutes), so when the network connections are changed the switches adapt.
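The learning, flooding and forgetting behaviour of a switch can be sketched as below. The class and method names are hypothetical, MAC addresses are shortened to two letters for readability, and time is passed in by hand so that the example is predictable.

```python
AGEING_SECONDS = 300   # how long a learned entry is trusted


class LearningSwitch:
    def __init__(self, port_count: int):
        self.ports = range(port_count)
        self.table = {}   # MAC address -> (port, time last seen)

    def handle(self, in_port: int, src: str, dst: str, now: float) -> list:
        """Return the list of ports the ethernet packet is sent out on."""
        self.table[src] = (in_port, now)      # learn which port the sender is on
        entry = self.table.get(dst)
        if entry is not None and now - entry[1] <= AGEING_SECONDS:
            port = entry[0]
            return [] if port == in_port else [port]
        self.table.pop(dst, None)             # forget an entry that is too old
        return [p for p in self.ports if p != in_port]   # unknown: send everywhere else


switch = LearningSwitch(4)
print(switch.handle(0, "aa", "bb", now=0))     # [1, 2, 3]  bb not yet known
print(switch.handle(2, "bb", "aa", now=1))     # [0]        aa was learned on port 0
print(switch.handle(0, "aa", "bb", now=2))     # [2]        bb was learned on port 2
print(switch.handle(0, "aa", "bb", now=400))   # [1, 2, 3]  entry for bb has expired
```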

Hubs (simple repeaters), unlike switches, don't have the ability to remember which ports MAC addresses are on and so simply send out on all ports anyway.

Recall that "Switches" are called switches because in the first message-switched networks they replaced the role performed by the actual electrical switches used to form the end-to-end connections of circuit-switched networks. Youtube: How a Switch Forwards and Builds the MAC Address Table

A switch is often referred to as a "bridge" when it connects only two network segments.

Wikipedia: Ethernet, Wikipedia: Ethernet frame

Ethernet works well on a LAN where "broadcasting" ethernet packets is not a problem but on a very large network like the internet, broadcasting in order to find MAC addresses would lead to millions or even billions of ethernet packets being sent out.

Network Reality

The world has had many different computer network systems, with many different protocols, in operation during the time in which the internet was developed, and the internet's designers had to take them into account. Wikipedia: Computer network history, Wikipedia: Token Ring, Wikipedia: ARPANET.

Internet Protocol (IP)

Born out of chaos, the Internet Protocol (IP) has become the most popular standard for world communication and works on top of other communication and computer networks. The idea is simple. An Internet Protocol (IP) packet is after all just a data packet and so can be sent as the data (payload) of any other system. There is nothing particularly special about the IP packet format but there is something special about the IP addresses.

MAC addresses are 6 bytes: 3 indicate the manufacturer and 3 the serial number of the physical device. A MAC address can end up anywhere in the world, so the only way to send an ethernet packet to it would be to broadcast the packet across the internet, which would lead to millions or even billions of ethernet packets being sent out.
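A small sketch shows how little a MAC address tells you. The function name is my own and the address is made up.

```python
def split_mac(mac: str):
    """Split a MAC address into its manufacturer half and its serial-number half."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError("a MAC address is 6 bytes")
    return raw[:3].hex(":"), raw[3:].hex(":")


manufacturer, serial = split_mac("00:1A:2B:3C:4D:5E")
print(manufacturer)   # 00:1a:2b
print(serial)         # 3c:4d:5e
```

Neither half says anything about where in the world the device is plugged in.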

IP addresses, on the other hand, are numbered to give an indication of where they are in the world network, thus making the world wide routing problem much easier. Just looking at the number tells you where to "route" an IP data-packet. A special IP switch generally called a "router" does the job of forwarding IP data-packets. Some ethernet switches are also smart and look to see if the payload of an ethernet packet is an IP packet, in which case they use the IP address information to be a bit smarter about which port they output the packet to.
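How a router might decide where to send a packet "just by looking at the number" can be sketched with Python's standard `ipaddress` module. The routing table and port names here are invented. The rule used, picking the most specific address block that contains the destination, is the usual one.

```python
import ipaddress

# Hypothetical routing table: block of addresses -> outgoing port.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "uplink",      # everything else
    ipaddress.ip_network("10.0.0.0/8"): "port1",
    ipaddress.ip_network("10.20.0.0/16"): "port2",
}


def route(destination: str) -> str:
    """Pick the most specific block that contains the destination address."""
    address = ipaddress.ip_address(destination)
    matches = [network for network in ROUTES if address in network]
    best = max(matches, key=lambda network: network.prefixlen)
    return ROUTES[best]


print(route("10.20.3.4"))   # port2
print(route("10.9.9.9"))    # port1
print(route("8.8.8.8"))     # uplink
```

The router never needs a list of every computer in the world, only a list of blocks, which is exactly what MAC addresses cannot offer.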

IP packets are "real" data-packets because they can be parts of a larger data-message which has been split up. This usually happens when the data-message is too big for some part of the network to handle. Standard ethernet packets have a maximum payload size of 1500 bytes, for example. The IP packet header is shown below, for interest's sake, and consists of at least 20 bytes (up to 60 with options) made up of various bit fields as follows. The IP packet's payload follows this header. Note that the "16 bits - Total Length" field tells the receiver how big the whole packet is, header and payload together.
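For the curious, the fixed 20 byte part of an IPv4 header can be unpacked with Python's `struct` module. The function name and the sample addresses are my own, and the sketch ignores any options. Note that the payload size is worked out by subtracting the header length from the Total Length field, which counts the whole packet.

```python
import socket
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Unpack the fixed 20 bytes at the start of an IPv4 packet."""
    (first, _tos, total_length, _ident, _flags_fragment,
     ttl, protocol, _checksum, source, destination) = struct.unpack(
        "!BBHHHBBH4s4s", packet[:20])
    header_length = (first & 0x0F) * 4      # low 4 bits count 32-bit words
    return {
        "version": first >> 4,
        "header_length": header_length,
        "total_length": total_length,
        "payload_length": total_length - header_length,
        "ttl": ttl,
        "protocol": protocol,               # 6 means the payload is TCP
        "source": socket.inet_ntoa(source),
        "destination": socket.inet_ntoa(destination),
    }


# A made-up header: version 4, 20 byte header, 40 bytes in total, carrying TCP.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("192.0.2.1"),
                     socket.inet_aton("198.51.100.7"))
print(parse_ipv4_header(sample))
```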

All the computers in an IPv4 network are assigned unique IP addresses. When a computer wants to send some data to another computer on the network, it needs the physical (MAC) address of the destination computer. If it does not already have the MAC address, the computer broadcasts an Address Resolution Protocol (ARP) message and asks for the MAC address from whoever is the owner of the IP address. All the computers on that network receive the ARP packet, but only the computer having the matching IP address replies with its MAC address. Once the sender receives the MAC address of the receiving computer, the data is sent.
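The question-and-answer exchange of ARP can be modelled in a few lines. This is a toy simulation with names of my own choosing. No real network traffic is involved, and the "broadcast" is just a loop over the computers on the segment.

```python
class Computer:
    def __init__(self, ip: str, mac: str):
        self.ip = ip
        self.mac = mac
        self.arp_cache = {}   # IP address -> MAC address already learned

    def answer_arp(self, wanted_ip: str):
        """Only the owner of the IP address replies, with its MAC address."""
        return self.mac if wanted_ip == self.ip else None

    def resolve(self, wanted_ip: str, segment: list):
        """Look in the cache first; otherwise ask every computer on the segment."""
        if wanted_ip not in self.arp_cache:
            for other in segment:
                if other is self:
                    continue
                mac = other.answer_arp(wanted_ip)
                if mac is not None:
                    self.arp_cache[wanted_ip] = mac
        return self.arp_cache.get(wanted_ip)


a = Computer("192.168.1.10", "00:00:00:00:00:0a")
b = Computer("192.168.1.11", "00:00:00:00:00:0b")
c = Computer("192.168.1.12", "00:00:00:00:00:0c")
segment = [a, b, c]
print(a.resolve("192.168.1.12", segment))   # 00:00:00:00:00:0c
print(a.resolve("192.168.1.99", segment))   # None, nobody owns that address
```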

If this is happening on an ethernet then of course the IP packets and ARP messages are all sent as the payload of an ethernet packet. Or, should I say, "wrapped" in an ethernet packet.

An Address Resolution Protocol (ARP) message has the following format, but is wrapped inside an ethernet packet with the EtherType value of 00001000 00000110 (0x0806), indicating that the payload is an ARP message.
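As an illustration of the wrapping, this sketch builds an ARP request for IPv4 over ethernet and puts an ethernet header with EtherType 0x0806 in front of it. The function name and addresses are made up, and the preamble, padding and Frame Check Sequence are left out to keep it short.

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Ethernet header (14 bytes) followed by an ARP request (28 bytes)."""
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                             # hardware type: ethernet
        0x0800,                        # protocol type: IPv4
        6, 4,                          # MAC address length, IP address length
        1,                             # operation: 1 = request, 2 = reply
        sender_mac, socket.inet_aton(sender_ip),
        b"\x00" * 6,                   # target MAC: unknown, that is the question
        socket.inet_aton(target_ip),
    )
    broadcast = b"\xff" * 6            # sent to every computer on the segment
    ethernet_header = broadcast + sender_mac + struct.pack("!H", 0x0806)
    return ethernet_header + arp


message = build_arp_request(bytes.fromhex("001a2b3c4d5e"), "192.168.1.10", "192.168.1.12")
print(len(message))            # 42
print(message[12:14].hex())    # 0806
```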

More about the Address Resolution Protocol can be found on Wikipedia.

TutorialsPoint: IPv4 - Example

Transmission Control Protocol (TCP)

The Transmission Control Protocol provides functions for programmers writing programs that communicate over the internet. The functions deal with the issues of packetising, unpacketising and error handling for the programmer.

As a programmer I don't want to send IP packets, I want to send whole messages backwards and forwards with another computer. I may want to establish a connection with another computer, communicate with it for a period of time and then drop the connection. I want the connection to be reliable. TCP gives me the functions I need, that I can call in the program I am writing, to do all this for me.
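In Python these functions are reached through the standard `socket` module. The following is a minimal sketch, not a production server: both ends run on the same computer (address 127.0.0.1), the function names are my own, and the "server" simply sends back whatever it receives.

```python
import socket
import threading


def run_echo_server(server: socket.socket) -> None:
    """Accept one connection and send back whatever arrives until the peer finishes."""
    connection, _address = server.accept()
    with connection:
        while True:
            data = connection.recv(1024)
            if not data:
                break
            connection.sendall(data)


def echo_once(message: bytes) -> bytes:
    """Establish a connection, send a message, collect the reply, drop the connection."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))          # port 0: let the system pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=run_echo_server, args=(server,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)    # tell the server we have finished sending
            received = b""
            while chunk := client.recv(1024):
                received += chunk
        worker.join()
    return received


print(echo_once(b"hello over TCP"))
```

Nowhere does the program mention packets. Splitting, numbering, re-sending and re-assembly are all done behind these few calls.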

Because there could be many processes (running programs) running concurrently on a computer, all of which might need to use the internet, TCP gives each process's connection a unique 16-bit number called a "port". This name has no correspondence to any real port, like the physical ports on a switch for example. The computer is likely to have only one physical network connection. These TCP "ports" are just numbers that allow data-packets to identify the process they came from and the process they should go to.

As well as the port numbers, a TCP packet has a lot of extra data to do with the communication itself. This TCP packet becomes the payload of an IP packet, which may then become the payload of an ethernet packet. The TCP header can be up to 60 bytes depending on the options chosen.

Again the format is here for interest's sake only.

This header looks bigger than it is because of all the flag details given here but it is only a maximum of 60 bytes.
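The fixed 20 byte part of the TCP header can be picked apart in the same spirit. The function name and the sample values are my own, and options are ignored. The sample represents the first packet of a new connection from port 49152 to port 80.

```python
import struct


def parse_tcp_header(segment: bytes) -> dict:
    """Unpack the fixed 20 bytes at the start of a TCP packet."""
    (source_port, destination_port, sequence, acknowledgement,
     offset_and_flags, window, _checksum, _urgent) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    return {
        "source_port": source_port,
        "destination_port": destination_port,
        "sequence": sequence,
        "acknowledgement": acknowledgement,
        "header_length": (offset_and_flags >> 12) * 4,   # top 4 bits count 32-bit words
        "fin": bool(offset_and_flags & 0x01),
        "syn": bool(offset_and_flags & 0x02),
        "ack": bool(offset_and_flags & 0x10),
        "window": window,
    }


# A made-up opening packet: SYN flag set, 20 byte header, no options.
sample_tcp = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(sample_tcp))
```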

Full details of the packet structure can be found on Wikipedia: Transmission Control Protocol.

Layers

Often the idea that network protocols can be thought of as layered is presented as if it were an explanation of how networks such as the internet work. It clearly is not, and doesn't give the fundamentals needed to understand networks. It can simply be regarded as an observation and an attempt to categorise the various parts that contribute to internet communications.

There are many standards, programs and libraries written by those eager to make a contribution to networking. There is some confusion and disagreement about what fits in what layer. All that we can rely on is perhaps the layer names! Here is a categorisation copied from the Wikipedia: Open Systems Interconnection model page. It seems to miss out a few important things like Ethernet on layer 2, though it does mention MAC, but it does give an idea of the amount of stuff people have created.

The TCP/IP system does not agree with the OSI divisions into these layers and so removes the Session and Presentation layers.

It also redefines the Network layer, ignoring things like the fundamental Ethernet MAC address based networking with its switches, choosing to replace this with the idea of IP based addressing (IPv4 and IPv6) with its routers. Routers are still nodes on the network but determine where to send data-packets based on IP addresses. This of course often involves wrapping IP packets up as Ethernet packets and dealing with Ethernet's ability to deliver data-packets across network segments.

So in reality the whole layers idea is a bit of a mess and should be "taken with a pinch of salt".

As you may be aware, IPv4, whose addresses are 4 bytes, has address size issues (see Wikipedia: IPv4 address exhaustion), which is why IPv6, whose addresses are 16 bytes, was set up to replace it. Of course, given that MAC addresses are already 6 bytes, bigger than an IPv4 address, it does rather raise the question of why ethernet MAC addresses haven't replaced IP addresses for general use.

Remember, the reason is that MAC addresses are 6 bytes, of which 3 indicate the manufacturer and 3 the serial number of the physical device, and a MAC address can end up anywhere in the world, so the only way to send a packet to it would be to exhaustively search the internet, which could take years. IP addresses, on the other hand, give an indication of where they are in the network, thus making world wide routing possible.