How to use the OSI Model to Troubleshoot Networks at Layer 2
Our previous article discussed how to use the OSI model to troubleshoot network problems at layer 1. We covered how to outline issues with your WiFi networks and how to troubleshoot them using the OSI model.
Just to refresh your memory, the OSI model helps break down an issue and isolate the root of the problem. Ideally, it’s best to take a bottom-up approach, layer by layer, as most WiFi problems happen in the first two layers of the OSI model. If the problem is not in layer 1 or 2, it is not a WiFi problem. Period!
In this article, we continue our way up in the OSI model with the data link layer.
OSI Model to Troubleshoot Networks at Layer 2
Data link is the second layer of the OSI model. It relates to how the systems using a physical link cooperate with one another.
It helps to transfer data between two devices on the same network. At this layer, the data is broken down into frames (often loosely called data packets). The data link layer’s job is to define unique sequences that mark the beginning and the end of each frame. It is also directly responsible for flow and error control in intra-network communications.
The data link layer has two sublayers. The Logical Link Control (LLC) sublayer sits on top and interfaces with the network layer: it identifies which network-layer protocol a frame carries and can provide flow and error control. Turning electricity, light, or radio signals into 1s and 0s is the job of layer 1, not the LLC. The other sublayer is the Media Access Control (MAC) sublayer, accountable for moving frames from one Network Interface Card (NIC) to another across a shared channel. MAC protocols coordinate access to that channel so that signals sent from different stations are less likely to collide.
WiFi radios talk via 802.11 frame exchanges at the MAC sub-layer of the data link layer. Therefore, the next layer to look into when troubleshooting networks is layer 2 of the OSI model.
The most common problem in layer 2 is retransmissions, which happen at the MAC sublayer. Everything starts when a transmitter device sends a unicast frame to a receiver device. The receiver uses a cyclic redundancy check (CRC) to confirm the integrity of the frame it received. If the CRC passes, the frame was not corrupted during transmission.
The receiver device then sends an 802.11 acknowledgment (ACK) frame back to the transmitter device to confirm delivery. If a collision happens during transmission or part of the unicast frame is corrupted, the CRC fails, and the receiver device does not send an ACK frame.
In turn, the transmitter device sends the frame again, which is a retransmission. Retransmissions have a high impact on WiFi networks because they create extra MAC layer overhead and consume additional airtime in the half-duplex medium.
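The CRC, ACK, and retransmission loop described above can be sketched in a few lines of Python. This is a simplified illustration, not real 802.11 framing: `zlib.crc32` stands in for the frame check, the function names are hypothetical, and the retry limit of 7 is an assumed value.

```python
import zlib

MAX_RETRIES = 7  # assumed retry limit for this sketch


def build_frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def crc_passes(frame: bytes) -> bool:
    """Receiver side: recompute the CRC and compare it with the one received."""
    payload, received_crc = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "little") == received_crc


def send_unicast(payload: bytes, channel, max_retries: int = MAX_RETRIES) -> int:
    """Send until the receiver would ACK; return the number of retransmissions.

    Returns -1 if the frame is still unacknowledged after max_retries retries.
    """
    for attempt in range(max_retries + 1):
        received = channel(build_frame(payload))
        if crc_passes(received):  # CRC passed, so the receiver sends an ACK
            return attempt
        # No ACK arrives, so the loop transmits the frame again
    return -1


def make_lossy_channel(corrupt_first_n: int):
    """Hypothetical channel that flips one bit in the first N frames it carries."""
    sent = 0

    def channel(frame: bytes) -> bytes:
        nonlocal sent
        sent += 1
        if sent <= corrupt_first_n:
            return bytes([frame[0] ^ 0x01]) + frame[1:]
        return frame

    return channel


print(send_unicast(b"voice sample", make_lossy_channel(2)))  # prints 2
```

Each failed attempt occupies the channel for a full frame, which is why retransmissions eat airtime that other stations could have used.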
Layer 2 retransmissions reduce throughput and increase latency and jitter, which hits voice and video the hardest. An increase in latency results in echo problems, and high jitter results in disjointed audio. As a rule of thumb, for WiFi calls, the retransmission rate should stay below 2% to avoid affecting the service.
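To make the 2% rule of thumb concrete, here is a minimal sketch that computes a retransmission rate from two frame counters. The function names and the counters are hypothetical; real values would come from whatever statistics your access point or capture tool exposes.

```python
def retry_rate_percent(total_frames: int, retried_frames: int) -> float:
    """Share of transmitted frames that were retransmissions, in percent."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    return 100.0 * retried_frames / total_frames


def voice_ready(total_frames: int, retried_frames: int,
                threshold_percent: float = 2.0) -> bool:
    """True when the retry rate is below the rule-of-thumb threshold."""
    return retry_rate_percent(total_frames, retried_frames) < threshold_percent


print(retry_rate_percent(1000, 15))  # prints 1.5
print(voice_ready(1000, 15))         # prints True
print(voice_ready(1000, 20))         # prints False
```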
There can be quite a few reasons for layer 2 retransmissions. For example, radio frequency (RF) interference and a low signal-to-noise ratio (SNR) due to a lousy WiFi design, both of which originate at layer 1. Adjacent cell interference and hidden nodes can also cause higher percentages of layer 2 retries.
Let’s break the reasons down:
SNR (Signal-to-noise ratio)
SNR is the difference between the received signal power and the noise power, expressed in decibels. Layer 2 retransmissions increase when the background noise is close to the received signal power or when the signal itself is too weak. Stats to live by for WLANs: a good SNR should be at least 20 to 25 dB. Anything below that range is considered low signal quality.
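As a quick worked example, SNR can be computed by subtracting the noise floor from the received signal level, both in dBm. The sketch below uses hypothetical function names and takes the lower end of the range above, 20 dB, as the assumed cut-off for "good".

```python
def snr_db(signal_dbm: float, noise_dbm: float) -> float:
    """SNR in dB is the received signal level minus the noise floor (both in dBm)."""
    return signal_dbm - noise_dbm


def classify_snr(snr: float, good_threshold_db: float = 20.0) -> str:
    """Label an SNR value using an assumed 20 dB threshold."""
    return "good" if snr >= good_threshold_db else "low"


print(snr_db(-65, -90), classify_snr(snr_db(-65, -90)))  # prints 25 good
print(snr_db(-80, -90), classify_snr(snr_db(-80, -90)))  # prints 10 low
```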
RF interference
RF interference plays a significant role in layer 2 retransmissions. Excessive retransmissions happen when frames are corrupted by RF interference, and throughput drops significantly as a result. If these retransmissions occur frequently, it’s essential to identify the source so that the interfering device can be removed.
Adjacent cell interference
Let’s go back to basics. When designing the 2.4GHz WLAN channel allocation plan, make sure to use the available 2.4GHz channels properly. When coverage cells overlap and also occupy overlapping frequency space, the chances of corrupted data and layer 2 retries are remarkably high. Remember to set up a reuse pattern for 2.4GHz channels 1, 6, and 11 (US), or 1, 5, and 9, sometimes with 13 as well, in European deployments. In this way, you prevent adjacent cell interference in your WLANs.
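The reuse patterns above can be checked with a little arithmetic. In the 2.4GHz band, channel n (1 to 13) is centered at 2407 + 5n MHz, so neighboring channel numbers are only 5 MHz apart. The sketch below assumes 20 MHz wide channels by default; the function names are hypothetical.

```python
from itertools import combinations


def center_mhz(channel: int) -> int:
    """Center frequency of a 2.4GHz channel (valid for channels 1 to 13)."""
    if not 1 <= channel <= 13:
        raise ValueError("expected a 2.4GHz channel between 1 and 13")
    return 2407 + 5 * channel


def channels_overlap(a: int, b: int, width_mhz: int = 20) -> bool:
    """Two channels overlap when their centers are closer than one channel width."""
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz


def plan_is_clean(channels, width_mhz: int = 20) -> bool:
    """True when no pair of channels in the reuse plan overlaps."""
    return not any(channels_overlap(a, b, width_mhz)
                   for a, b in combinations(channels, 2))


print(plan_is_clean([1, 6, 11]))     # prints True
print(plan_is_clean([1, 5, 9, 13]))  # prints True
print(plan_is_clean([1, 3, 6]))      # prints False
```

With the 22 MHz width of older 802.11b transmissions, `plan_is_clean([1, 5, 9, 13], width_mhz=22)` returns False, which is one reason the 1, 6, 11 plan is the more conservative choice.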
Hidden node
In wireless networking, a ‘hidden node’ is a node that can ‘talk’ to a WiFi access point but can’t hear the other nodes already having a ‘conversation’ with that access point. This should ring alarm bells, because it leads to problems in the MAC sublayer: nodes that cannot hear each other send data packets to the access point at the same time, creating interference at the AP and resulting in data packet loss.
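A toy model helps show when two clients are hidden from each other. The sketch below assumes every radio has the same fixed circular range, which is a strong simplification of real RF propagation; the function name, coordinates, and range are all hypothetical.

```python
import math
from itertools import combinations


def hidden_pairs(ap, clients, radio_range):
    """Return pairs of clients that both reach the AP but cannot hear each other.

    ap is an (x, y) position, clients maps a name to an (x, y) position, and
    radio_range is the assumed reach of every radio in the same units.
    """
    reachable = {name: pos for name, pos in clients.items()
                 if math.dist(ap, pos) <= radio_range}
    return [(a, b) for a, b in combinations(sorted(reachable), 2)
            if math.dist(reachable[a], reachable[b]) > radio_range]


clients = {"A": (-40, 0), "B": (40, 0), "C": (0, 10)}
print(hidden_pairs((0, 0), clients, radio_range=50))  # prints [('A', 'B')]
```

Clients A and B are each 40 units from the AP but 80 units from each other, so neither can sense the other's transmissions before sending.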
When packet loss is frequent, and retransmissions therefore occur often, it is crucial to keep an eye on the percentage of packet loss and retransmissions. Tanaza has an embedded ping tool in the cloud management platform that allows you to track data packet loss and network performance to identify connection issues proactively. Our ping tool measures and records the packet round trip time, which tells you the level of latency between devices. It also measures any losses along the way while performing the ping test.
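The two figures a ping test reports, packet loss and round trip time, are simple to derive from raw samples. The sketch below is a generic illustration and not the Tanaza tool's implementation; the function name and the input format (one round trip time in milliseconds per probe, with None for a lost probe) are assumptions.

```python
from statistics import mean
from typing import Optional, Sequence


def summarize_ping(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize ping samples, where None marks a probe that got no reply."""
    if not rtts_ms:
        raise ValueError("need at least one sample")
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    lost = len(rtts_ms) - len(replies)
    return {
        "loss_percent": 100.0 * lost / len(rtts_ms),
        # Average round trip time over the probes that were answered
        "avg_rtt_ms": mean(replies) if replies else None,
        "max_rtt_ms": max(replies) if replies else None,
    }


print(summarize_ping([10.0, 12.0, None, 14.0]))
# prints {'loss_percent': 25.0, 'avg_rtt_ms': 12.0, 'max_rtt_ms': 14.0}
```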
Another common problem in layer 2 is roaming. Sometimes roaming problems occur due to driver issues on the client device side, or due to sticky devices caused by bad WiFi design. Usually, roaming improves for client devices that support the 802.11k protocol.
Furthermore, roaming is tied to WLAN security. When client devices roam from one AP to another, they always need to go through an authentication process with the new AP. When APs act independently, a full authentication takes place every time the client device roams.
For instance, an end user’s smartphone is connected to the airport’s WiFi, where dozens of APs coexist in the same network. If the end user is on the move and the network does not support the 802.11r/k standards, the smartphone disconnects from the existing AP before establishing a connection with the new one.
As a result, the end user experiences WiFi disconnection and latency while reconnecting to a new access point. It translates into dropped WiFi-based calls, websites loading slowly, difficulties uploading images to social networks, and other performance problems.
The Tanaza WiFi cloud platform supports the current IEEE 802.11 fast roaming protocols. The fast roaming standards are leveraged when a client device is connected to a password-secured or captive SSID in a wireless network. The standards allow the client device to roam from one access point to another quickly and seamlessly, because it does not need to re-authenticate to the RADIUS server every time it switches access points.
By installing the TanazaOS operating system on access points that do not have roaming in their stock firmware, you can add roaming features that follow the IEEE 802.11r/k/v standards to those devices. Consequently, the Tanaza Operating System enables fast roaming across multi-vendor networks made up of the variety of WiFi access points it is compatible with.