We were evaluating the different AWS load balancers we needed to put in front of our infrastructure. Given the options AWS provides, it was not obvious how these load balancers differ from one another or what their internals look like. One of the most troubling parts of going through those internals was that I had very little knowledge of the network layers at which these load balancers operate.
With that in mind, let me walk you through these network layers and what they do. This is crucial to understanding the concept of reverse proxying at different layers.
TCP / IP Model Layers
- Different Layers in the Networking Stack
- Physical Layer / Network Interface Layer
- Network Layer / Internet Layer
- Transport Layer
- Application Layer
Let’s now understand the different layers in the TCP/IP model with the help of a very simple example: a web server listening on port 8080 and a simple GET request sent to that port.
On the Client Side
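Here is a minimal sketch of what such a client could look like, assuming a plain Python socket client that sends a hard-coded GET request (the address 127.0.0.1, port 8080, and path /index.html are placeholder values):

import socket

# Transport layer: create a TCP socket and connect to the server.
# This is where the 3-way handshake happens under the hood.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", 8080))

# Application layer: write the raw HTTP request bytes to the socket.
request = b"GET /index.html HTTP/1.1\r\nHost: 127.0.0.1:8080\r\nConnection: close\r\n\r\n"
sock.sendall(request)

# Read whatever the server sends back and print it.
response = sock.recv(4096)
print(response.decode())

sock.close()

Running this against the server sketch further down should print the server's response.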
- The first thing that happens on the client side when we initiate a simple GET request is the creation of a socket connection between the client and the server.
- This socket creation is handled by the transport layer. By "handled" I mean that the transport layer exposes a simple function for creating a socket connection from this client to a particular server. The socket creation process itself is quite involved: it sets up a TCP connection via a 3-way handshake.
- Once the socket connection has been established, the HTTP client (in our case the curl client) sends HTTP data over this socket connection. Sending the HTTP data essentially means that the client writes the HTTP request bytes to this socket.
- When the application layer writes this HTTP data to the socket, it reaches the transport layer. The transport layer knows the source port and the destination port for this connection, adds them to the packet, and passes it down to the network layer.
- The data packet then reaches the network layer. The network layer adds the source IP and the destination IP to the packet and finally passes it down to the physical layer.
- The physical layer finally receives the packet with everything the layers above have added. Each header wraps the data coming from the layer above it, so the packet looks roughly like this (see the sketch after this list):
- Source IP, Destination IP (added by the network layer)
- Source Port, Destination Port (added by the transport layer)
- HTTP Data (added by the application layer)
- The physical layer sends the data as streams of 0s and 1s (electrical or radio signals) over a particular interface. The interface could be eth0, wlan0, and so on, and the corresponding driver is responsible for sending those streams of 0s and 1s to other systems.
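To make the layering concrete, here is a small illustrative sketch using the scapy library; scapy, the IP addresses, and the port numbers are assumptions made purely for illustration and are not part of the original setup. It builds the same kind of packet by stacking an IP header, a TCP header, and the HTTP payload:

# Illustrative only: scapy and these addresses/ports are assumptions,
# used here just to make the layer nesting visible.
from scapy.all import IP, TCP, Raw

http_payload = b"GET /index.html HTTP/1.1\r\nHost: 192.168.1.20:8080\r\n\r\n"

packet = (
    IP(src="192.168.1.10", dst="192.168.1.20")   # network layer: source and destination IP
    / TCP(sport=54321, dport=8080)               # transport layer: source and destination port
    / Raw(load=http_payload)                     # application layer: the HTTP data
)

packet.show()  # prints each layer of the packet along with its fields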
On the Server Side
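Here is a minimal sketch of the server side, assuming a plain Python socket server that handles a single connection on port 8080 (the variable names, including conn, and the canned response are illustrative):

import socket

# Transport layer: create a listening TCP socket bound to port 8080.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 8080))
server.listen(5)

# accept() completes the 3-way handshake and returns a new connection
# socket (backed by its own file descriptor) for this client.
conn, addr = server.accept()

# Application layer: read the HTTP request bytes from the connection.
data = conn.recv(1024)

# Send back a minimal HTTP response and close the connection.
conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK")
conn.close()
server.close()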
- The physical layer on the receiving end gets the streams of 0s and 1s. After the full packet has been received, it hands the packet up to the network layer.
- The network layer, after receiving the packet, looks at the destination IP and sees that it matches the machine's own IP address. It then passes the packet up to the relevant transport layer protocol, i.e. TCP or UDP, depending on the protocol field in the header, stripping the source IP and destination IP fields before handing the packet to the transport layer.
- The transport layer, after receiving the packet, checks the destination port and forwards the data to the process listening on that port. By "forwards" I mean it writes the data to the relevant file descriptor so that the process can read from that file descriptor at its own pace.
data = conn.recv(1024) # Receive client data
- When the application layer reads from the file descriptor, it figures out from the format of the data that it has received an HTTP call, and that it is a GET request for one of its resources (roughly as sketched below).
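A rough sketch of that last step, assuming data holds the raw request bytes read above (real HTTP parsing is of course more involved than this):

# Parse the request line, e.g. b"GET /index.html HTTP/1.1".
request_line = data.split(b"\r\n", 1)[0].decode()
method, path, version = request_line.split(" ")

if method == "GET":
    print(f"GET request for resource {path} using {version}")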
More Questions to be followed up in coming blog posts
There are still some questions that we have not answered in this blog post, because doing so would bloat the post and reduce its simplicity. I will write another blog post to tackle those questions.
- A server listening on a particular port creates a new TCP connection for each incoming client connection request. However, the data for all of these clients arrives on the same port the server is listening on, so how does the kernel identify which TCP connection a packet needs to be written to?
- How does the network layer come to know the destination IP of the packet if the TCP connection holds all the <Source IP, Destination IP> information and the TCP connection lives at the transport layer?