Networking - ARP - About ARP
Two machines in a network can only communicate with each other if they know each other’s physical address.
Although computer programs use IP addresses to send and receive messages, the actual underlying communication always happens over the physical address.
Let’s first understand how communication happens over the wire.
Let’s ping Google's public DNS server from a machine, capture the network packets, and look at their source and destination addresses.
ping 8.8.8.8
result:
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=58 time=15.4 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=58 time=15.3 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=58 time=15.2 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=58 time=15.2 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=58 time=15.2 ms
While the ping above is still running, let's capture network packets from another shell session on the same server.
tcpdump is used in this example for capturing network packets, but any alternative program can be used instead.
tcpdump -n host 8.8.8.8
- The filter host 8.8.8.8 captures only packets whose source or destination is 8.8.8.8, and the -n option makes tcpdump show IP addresses in the output rather than DNS names.
result:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
21:39:41.531390 IP 192.168.1.2 > 8.8.8.8: ICMP echo request, id 16331, seq 1, length 64
21:39:41.540342 IP 8.8.8.8 > 192.168.1.2: ICMP echo reply, id 16331, seq 1, length 64
21:39:42.531815 IP 192.168.1.2 > 8.8.8.8: ICMP echo request, id 16331, seq 2, length 64
21:39:42.540840 IP 8.8.8.8 > 192.168.1.2: ICMP echo reply, id 16331, seq 2, length 64
The output of the tcpdump command is pretty straightforward.
It shows a continuous series of ICMP echo requests going out from our server (192.168.1.2), each followed by a reply coming back from Google (8.8.8.8).
As 8.8.8.8 is not in the same network, the local server cannot reach it directly without a gateway.
So the ping requests to 8.8.8.8 should flow via your gateway.
The gateway address can be found using the command route -n.
route -n
result:
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    0      0        0 eth0
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
NOTE: The gateway here is 192.168.1.1.
This is indicated by the first entry in the above output.
To reach any destination that has no more specific route (indicated by 0.0.0.0), packets should flow via the gateway address of 192.168.1.1.
So even if we need to reach 8.8.8.8, we need to go via 192.168.1.1 (as it is the gateway).
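The routing decision described above can be sketched in a few lines of Python. This is only an illustration of longest-prefix matching against the two entries from the route -n output, not how the kernel implements it; the ROUTES list and the next_hop helper are hypothetical names introduced for this example.

```python
import ipaddress

# Routing table copied from the `route -n` output: (destination network, gateway).
# A gateway of 0.0.0.0 means the network is directly attached (no gateway needed).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("192.168.1.1")),
    (ipaddress.ip_network("192.168.1.0/24"), ipaddress.ip_address("0.0.0.0")),
]


def next_hop(destination):
    """Return the IP address the packet is handed to next (hypothetical helper)."""
    dest = ipaddress.ip_address(destination)
    # Longest-prefix match: among the routes containing dest, the most specific wins.
    matches = [(net, gw) for net, gw in ROUTES if dest in net]
    _, gateway = max(matches, key=lambda route: route[0].prefixlen)
    # Directly attached network: deliver straight to the destination itself.
    if gateway.is_unspecified:
        return str(dest)
    return str(gateway)


print(next_hop("8.8.8.8"))       # 192.168.1.1
print(next_hop("192.168.1.40"))  # 192.168.1.40
```

A destination outside 192.168.1.0/24 falls through to the default route, so the next hop is the gateway; a destination inside it is reached directly.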
But why is tcpdump output not showing any trace of 192.168.1.1 (gateway)?
Tcpdump is showing that the source address is 192.168.1.2 and destination is 8.8.8.8.
As 8.8.8.8 is not part of our local network, we will have to go via our gateway address of 192.168.1.1.
So somewhere the destination address should be 192.168.1.1, right? Otherwise, how would our packets reach the gateway?
Our ping is working perfectly, so it's surely using the gateway to reach 8.8.8.8 (as there is no other way out).
But where the hell is the gateway address in the packet?
The packet is showing the destination address of 8.8.8.8.
But then how is it reaching the gateway?
This is exactly where physical addresses (MAC addresses) step in.
While the ping to 8.8.8.8 is running, let's execute tcpdump in another session once again, this time with the additional option -e, which prints the link-level header of each packet.
tcpdump -e -n host 8.8.8.8
result:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
21:47:56.820194 12:6e:eb:de:b3:ed > 12:6f:56:c0:c4:c1, ethertype IPv4 (0x0800), length 98: 192.168.1.2 > 8.8.8.8: ICMP echo request, id 16347, seq 1, length 64
21:47:56.829102 12:6f:56:c0:c4:c1 > 12:6e:eb:de:b3:ed, ethertype IPv4 (0x0800), length 98: 8.8.8.8 > 192.168.1.2: ICMP echo reply, id 16347, seq 1, length 64
21:47:57.821516 12:6e:eb:de:b3:ed > 12:6f:56:c0:c4:c1, ethertype IPv4 (0x0800), length 98: 192.168.1.2 > 8.8.8.8: ICMP echo request, id 16347, seq 2, length 64
21:47:57.830386 12:6f:56:c0:c4:c1 > 12:6e:eb:de:b3:ed, ethertype IPv4 (0x0800), length 98: 8.8.8.8 > 192.168.1.2: ICMP echo reply, id 16347, seq 2, length 64
This time, along with the IP addresses, the output also shows physical addresses (MAC addresses), indicated by 12:6e:eb:de:b3:ed > 12:6f:56:c0:c4:c1 and 12:6f:56:c0:c4:c1 > 12:6e:eb:de:b3:ed.
To find out which of these belongs to our server, let's look at the eth0 interface.
ifconfig eth0
result:
eth0      Link encap:Ethernet  HWaddr 12:6e:eb:de:b3:ed
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::106e:ebff:fede:b3ed/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:1200693 errors:0 dropped:0 overruns:0 frame:0
          TX packets:945763 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2452613050 (2.4 GB)  TX bytes:447161879 (447.1 MB)
From the above ifconfig output, we can confirm that 12:6e:eb:de:b3:ed is our server's MAC address (indicated by HWaddr 12:6e:eb:de:b3:ed).
But what is 12:6f:56:c0:c4:c1?
We can use the command arp -n -a to find out what 12:6f:56:c0:c4:c1 is.
ARP stands for Address Resolution Protocol. It does the job of translating IP addresses to MAC addresses.
So arp -n -a will show all the IP addresses our server is aware of, along with their corresponding MAC addresses.
arp -n -a
result:
? (192.168.1.40) at 12:f7:fd:48:aa:79 [ether] on eth0
? (172.17.0.2) at 02:42:ac:11:00:02 [ether] on docker0
? (192.168.1.43) at 12:48:08:aa:a5:bb [ether] on eth0
? (192.168.1.8) at 12:ab:ed:67:34:79 [ether] on eth0
? (192.168.1.94) at 12:47:87:c2:60:8d [ether] on eth0
? (192.168.1.1) at 12:6f:56:c0:c4:c1 [ether] on eth0
12:6f:56:c0:c4:c1 is the MAC address of the gateway (192.168.1.1).
So even though the destination IP address is 8.8.8.8, the destination MAC address is that of the gateway. The same holds for every destination outside the local network.
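The lookup we just did by eye can also be scripted. The sketch below parses text in the format printed by arp -n -a above and maps each IP address to its MAC address; the sample text is a shortened copy of that output, and parse_arp_table is a hypothetical helper. It assumes the Linux net-tools output format shown here, which may differ on other systems.

```python
import re

# Shortened copy of the `arp -n -a` output shown above.
ARP_OUTPUT = """\
? (192.168.1.40) at 12:f7:fd:48:aa:79 [ether] on eth0
? (172.17.0.2) at 02:42:ac:11:00:02 [ether] on docker0
? (192.168.1.1) at 12:6f:56:c0:c4:c1 [ether] on eth0
"""

# One entry looks like: ? (IP) at MAC [ether] on INTERFACE
ENTRY = re.compile(
    r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9a-f:]{17}) \[ether\] on (?P<iface>\S+)"
)


def parse_arp_table(text):
    """Map each IP address to a (MAC address, interface) pair."""
    table = {}
    for line in text.splitlines():
        match = ENTRY.search(line)
        if match:
            table[match["ip"]] = (match["mac"], match["iface"])
    return table


table = parse_arp_table(ARP_OUTPUT)
print(table["192.168.1.1"])  # ('12:6f:56:c0:c4:c1', 'eth0')
```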
MAC addresses (physical addresses) are part of layer 2 (the data link layer).
IP addresses are part of layer 3 (the network layer).
The content of layer 3 is encapsulated inside layer 2.
The layer 2 header carries the source MAC address of our server and the destination MAC address of the gateway. This is how the packet reaches the gateway.
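To make the encapsulation concrete, here is a minimal sketch that builds the 14-byte Ethernet header (destination MAC, source MAC, EtherType) and places a layer 3 stand-in behind it. The stand-in holds only the two IP addresses and is not a full IPv4 header; build_frame and the other helper names are made up for this example.

```python
import socket
import struct

SERVER_MAC = "12:6e:eb:de:b3:ed"   # our server (from ifconfig)
GATEWAY_MAC = "12:6f:56:c0:c4:c1"  # the gateway (from arp -n -a)


def mac_to_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))


def bytes_to_mac(raw):
    return ":".join(f"{byte:02x}" for byte in raw)


def build_frame(dst_mac, src_mac, src_ip, dst_ip):
    # Layer 2 header: destination MAC, source MAC, EtherType 0x0800 (IPv4).
    layer2 = struct.pack("!6s6sH", mac_to_bytes(dst_mac), mac_to_bytes(src_mac), 0x0800)
    # Simplified layer 3 stand-in: just the source and destination IP addresses.
    layer3 = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
    return layer2 + layer3


frame = build_frame(GATEWAY_MAC, SERVER_MAC, "192.168.1.2", "8.8.8.8")
dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])

# The frame is addressed to the gateway, while the packet inside is addressed to 8.8.8.8.
print(bytes_to_mac(dst_mac))            # 12:6f:56:c0:c4:c1
print(socket.inet_ntoa(frame[18:22]))   # 8.8.8.8
```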
The gateway will peel off the layer 2 header, and as soon as it finds that the destination is 8.8.8.8, it will forward the packet to its own gateway inside a new layer 2 header (i.e. our gateway forwards the packet to the next gateway, depending on its routes).
This is how the packet travels and reaches its final destination of 8.8.8.8. The second-to-last device on the path (the last router before 8.8.8.8) will find the MAC address of 8.8.8.8 using ARP.
The bottom line is: if you want to reach a particular destination IP address, the system translates that IP address (or, for a destination outside the local network, the gateway's IP address) to the equivalent MAC address, because the real communication happens using physical addresses. ARP (Address Resolution Protocol) is used to find the physical address associated with an IP address.
The diagram above explains how a computer finds out the MAC address associated with an IP address using the Address Resolution Protocol. The very first request shown in the diagram depicts an “ARP request from 192.168.1.2” to find out the MAC address of 192.168.1.1.
This ARP request is a broadcast request, which is why the destination MAC address of the frame is set to ff:ff:ff:ff:ff:ff (the broadcast MAC address). The target MAC address field inside the ARP request itself is left as 00:00:00:00:00:00, because it is not known yet. When the network device to which all the computers in this network are connected receives a frame with the destination address ff:ff:ff:ff:ff:ff, it forwards that frame to all the computers in the network (that is what broadcast means: send it to everybody connected).
Although every computer in the network receives that request, only the computer whose IP address matches the target IP address in the ARP request (here 192.168.1.1) will respond. Everybody else in the network discards the request after checking the target IP address.
In its response, that computer sends its own MAC address. This way 192.168.1.2 finds out the MAC address associated with 192.168.1.1.
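As a rough illustration of the exchange, the sketch below builds the bytes of an ARP request following the field layout of RFC 826, sent to the standard Ethernet broadcast address ff:ff:ff:ff:ff:ff. It only constructs and inspects the bytes and does not put anything on the wire; build_arp_request and should_reply are hypothetical helper names.

```python
import socket
import struct

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def mac_to_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(sender_mac, sender_ip, target_ip):
    # Ethernet header: broadcast destination, our MAC as source, EtherType 0x0806 (ARP).
    ethernet = struct.pack(
        "!6s6sH", mac_to_bytes(BROADCAST_MAC), mac_to_bytes(sender_mac), 0x0806
    )
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length in bytes
        4,                            # protocol address length in bytes
        1,                            # operation: 1 = request (2 = reply)
        mac_to_bytes(sender_mac),     # sender MAC address
        socket.inet_aton(sender_ip),  # sender IP address
        bytes(6),                     # target MAC address: unknown, all zeros
        socket.inet_aton(target_ip),  # target IP address: the one being resolved
    )
    return ethernet + arp


def should_reply(frame, my_ip):
    """Every host receives the broadcast; only the owner of the target IP answers."""
    target_ip = socket.inet_ntoa(frame[38:42])  # last 4 bytes of the ARP payload
    return target_ip == my_ip


frame = build_arp_request("12:6e:eb:de:b3:ed", "192.168.1.2", "192.168.1.1")
print(len(frame))                           # 42
print(should_reply(frame, "192.168.1.1"))   # True
print(should_reply(frame, "192.168.1.40"))  # False
```

The 42 bytes are the 14-byte Ethernet header plus the 28-byte ARP payload.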
The following ARP terms are worth noting:
- ARP Cache: After finding the MAC address associated with an IP address, the computer stores it in a table for future reference. All subsequent communication with that IP address can use the MAC address from this table. This table is also called the ARP cache.
- ARP Cache Timeout: The entries added to the ARP table are valid only for a specified amount of time. The ARP cache timeout is that amount of time.
- ARP Request: We already saw this above. It's the broadcast request sent by a computer to find out the MAC address associated with an IP address.
- ARP Response: As shown in the above diagram, this is the response from the destination host, containing both its IP and MAC addresses.
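The cache and its timeout can be modelled with a small dictionary-based class. This is a toy sketch, not how any operating system implements its ARP table: the ArpCache class is hypothetical, and the 60-second timeout is an arbitrary value chosen for the example (real defaults vary by system).

```python
import time


class ArpCache:
    """Toy ARP cache: remembers IP-to-MAC mappings for a limited time."""

    def __init__(self, timeout_seconds=60.0, clock=time.monotonic):
        self._timeout = timeout_seconds  # assumed value, not a real OS default
        self._clock = clock              # injectable so the example can fake time
        self._entries = {}               # ip -> (mac, time the entry was stored)

    def store(self, ip, mac):
        self._entries[ip] = (mac, self._clock())

    def lookup(self, ip):
        entry = self._entries.get(ip)
        if entry is None:
            return None  # never resolved: an ARP request would be needed
        mac, stored_at = entry
        if self._clock() - stored_at > self._timeout:
            del self._entries[ip]  # expired: must be resolved again
            return None
        return mac


# Use a fake clock so the timeout can be shown without waiting.
now = [0.0]
cache = ArpCache(timeout_seconds=60.0, clock=lambda: now[0])
cache.store("192.168.1.1", "12:6f:56:c0:c4:c1")
print(cache.lookup("192.168.1.1"))  # 12:6f:56:c0:c4:c1
now[0] = 61.0
print(cache.lookup("192.168.1.1"))  # None
```

A lookup that returns None corresponds to the case where the computer has to send a fresh ARP request.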