Tuesday, January 2, 2018

Adding Customer Value with Systems Engineers

This blog is written based on my experiences in working for a Wi-Fi access point vendor.  However, the lessons presented here are general and readily apply to any type of business whose technology-based products are complex enough that customers may not know or understand how to deploy them optimally.

Technology vendors are typically focused on making a "product", i.e. a collection of hardware and software to perform specific tasks.  They generally are not focused on how the product gets deployed as part of a larger system.  To add value to customers, however, it is important for the technology vendor to understand the customer's perspective and speak the customer's language.

The Importance of System Engineers (SEs)

Source: http://images.wisegeek.com/guy-working-on-network-servers.jpg

The function of System Engineers is to work with customers on scoping out new projects (pre-sales) and assisting customers with existing deployments (post-sales).  Most technology enterprise equipment vendors call these SEs (System Engineers or Sales Engineers), though some vendors also use the term FAEs (Field Application Engineers). 

Such SEs need to understand how technology (in my case, Wi-Fi) gets deployed and how customers use the vendor’s products. While the SEs obviously need to have an expert understanding of the product portfolio, they don’t need to know how to design or write code for the product itself, but rather how to configure the product and how customers in the market implement it.

This is fundamentally different from Product Engineers, who need to understand the intricacies of hardware and firmware code and how to weave that into a functioning mass-producible product.

These are very different skill sets, and require different types of engineers.  Many technology vendors struggle because they don’t understand this distinction, and thus are entirely product-focused, lacking a good understanding of how customers actually deploy the products.  Thus, the technology vendors will push product on customers based on a product view of the market and a very limited understanding of what customers actually want and need.  Sometimes the vendors get lucky, and the product is successful.  More often, the product misses the mark in one or more ways, and is a flop in the marketplace.

While product engineers and system engineers need to work closely together, the technical flow needs to be from the customer to the system engineers to the product engineers, so that the product engineers can ultimately create products that will resonate with the customers’ needs. 

Thus, the role of SEs is to build direct and stable personal relationships between themselves and the customers, ideally all of them but at the very least those large "top-tier" customers.  The customers need to know they have a knowledgeable resource to call on when they run into issues.

Simultaneously, the SEs also need to make the customers as self-sufficient as possible, both for pre-sales designs and for post-sales maintenance, so they’re only reaching out to their designated system engineer when they really have a problem. 

Online training has been very popular with technology vendors over the last few years.  It is relatively cheap and easy to implement, but only goes so far.  On-site training, at least partially customized to the customer’s specific needs, helps both build that personal relationship and enhance the customer’s knowledge and self-sufficiency.  Whenever an SE can get face time with a customer, it is incredibly valuable.

Each customer is generally only going to need to focus on a small subset of the products that the vendor offers.  The vendor must therefore be extremely wary of having too large a selection of products, especially if their differences are subtle to those who are not themselves engineers (i.e. most customers).

New products and features should ONLY be introduced if they are going to enhance value for customers over existing offerings, or enable the vendor to move “upmarket” to either related companion products (e.g. offering PoE network switches with wireless access points) or products and features that enable the vendor to be attractive in adjacent markets (e.g. higher end products for larger enterprise deployments). 

It is important that the Product Engineers share this vision, and are responsive to customer needs for the introduction of bug fixes, new features, and ultimately new products.  System Engineers can only be of limited effectiveness without full support of Product Engineering.

The Importance of Customer Support


Additionally, a strong customer support team is essential in fielding the more routine pre-sales and post-sales questions from smaller customers.  Not only do technology companies need to cultivate a consistent reputation of excellence in service, but at least some of these smaller customers will have the potential to become big customers.  Projects from smaller customers tend to be (but are not necessarily) simpler.  While the customer support team are not necessarily engineers, they need to be appropriately trained in the product and the customer applications of the product, so that they can answer relatively straightforward questions and understand when they need to escalate to the engineering team.  The customer support staff is also the breeding / training ground for future system engineers.

While tools like customer forums and online knowledge databases have their place, they should not be solely relied upon, as some vendors have advocated.  Our customers are human, so when customers have a problem, they want to get a human being, at least on the phone, to help them fix it as quickly as possible.  By the time the customer is calling into Customer Support, they have likely already searched online for a solution and couldn’t find what they needed.  Hence, they’re already aggravated from the original problem and frustrated that they couldn’t find an easy answer, so they are not starting from an “ideal state of mind”.  Alas, most technology vendors have poorly staffed and poorly trained customer support agents, and sometimes even finding the phone number can be an adventure in itself.  Accordingly, exceptional customer support is an area where a vendor can readily distinguish themselves from the competition.

In summary, to be truly successful, a technology vendor needs to understand the needs of their customers and structure their support and product offerings around those customer needs.

Friday, November 24, 2017

Antennas: Why They Matter and When Do You Want to Use APs with Internal vs. External Antennas

Here is a practical guide on when you want to use APs with internal antennas vs. APs with external antennas.   For the purpose of this blog, I'll be using two indoor APs, the EnGenius EAP1300 (internal antenna, ceiling mount) and the EnGenius EAP1300EXT (external antenna, wall or ceiling mount) for demonstrative purposes.  The content, however, applies to any vendor that has APs of comparable specifications with both internal and external antennas, for both indoor and outdoor applications.

What are Antennas

Antennas serve to shape and focus the radio signal in particular directions.   Antennas, therefore, act like a lens for RF energy.   Every radio system (Wi-Fi, cellular, cordless, walkie-talkie, etc.) requires antennas on both the transmitter and receiver to shape and focus the signal.   Antennas are passive devices and work in both directions - i.e. an antenna equally increases the radio's ability to talk (transmit) and to hear (receive).

The signal gain (i.e. strength) of an antenna is measured in "dBi", or decibels relative to an isotropic radiator.  An isotropic radiator is defined as a point-source of RF signal where the energy radiates spherically equally in all directions.  Such an antenna cannot physically be built, but it serves as a useful mathematical reference, as such an antenna has no gain, or a gain of 0 dBi.   

An antenna's gain, therefore, is based on its deviation from a perfect sphere. A typical "rubber duck" dipole omni-directional antenna has a doughnut-shaped pattern with the antenna sticking through the hole of the doughnut.  As this shape is "roughly spherical", these antennas typically have fairly low gain (i.e. 2 - 3 dBi).  The gain of such an antenna can be increased by lengthening it.  When doing this, you are increasing the energy propagated horizontally by stealing it from the energy propagated vertically.   One can also make directional antennas, which serve to focus the bulk of the RF energy in one particular direction.  Such antennas have very high gains as they deviate dramatically from a perfect sphere.

Examples of increasing antenna gain.   Top:  Low gain dipole.  Middle:  High gain dipole.  Bottom:  High gain directional.  
A couple of notes on directional antennas: 
  1. Just as antenna designers cannot build a perfect sphere, they cannot build a perfect cone.   As a result, there is some amount of RF energy that is projected and received in the other directions.  These are known as backlobes and sidelobes, as seen in the figure above.   If two neighboring antennas are placed very close together, they can interfere with each other, thus a certain amount of separation distance (at least a few feet) is generally recommended when placing directional antennas next to each other.  
  2. The beamwidth is defined by where the energy of the antenna drops by 3 dB (i.e. to half) from the peak.   Thus, while the gain of the antenna is lower beyond this beamwidth, it is generally not zero, which needs to be accounted for in Wi-Fi design.
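
Because decibels are logarithmic, it can help to convert the gain figures above into plain power ratios.  The short Python sketch below does that conversion; the function names are illustrative and not taken from any vendor tool.

```python
import math


def db_to_ratio(db: float) -> float:
    """Convert a decibel value into a linear power ratio."""
    return 10 ** (db / 10)


def ratio_to_db(ratio: float) -> float:
    """Convert a linear power ratio into decibels."""
    return 10 * math.log10(ratio)


# A 3 dB drop leaves roughly half of the peak power (the beamwidth edge).
half_power = db_to_ratio(-3)

# A 5 dBi antenna puts roughly 3.2x the power of an isotropic radiator
# into its favored direction.
five_dbi = db_to_ratio(5)

print(f"-3 dB  -> {half_power:.2f}x")
print(f"+5 dBi -> {five_dbi:.2f}x")
```

The same conversion explains why every additional 3 dB of antenna gain roughly doubles the power delivered in the favored direction.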

There are various types of antennas that can be constructed, as shown below.

Examples of different antenna types, with their typical potential for beamwidth and gain.
Like lenses, antennas are tuned to work at particular frequencies, as the length of the element is a function of the operational wavelength.  In Wi-Fi, this means that you will often see separate 2.4 GHz and 5 GHz antennas; these may look identical on the outside (i.e. the plastic radome that covers the antenna), but are actually different on the inside.  Some antenna manufacturers are able to make "dual-band" antennas that work at both 2.4 GHz and 5 GHz frequencies, though such dual-band antennas generally require a compromise on the gain of the antenna for each frequency.

APs with Internal Antennas

EnGenius EAP1300
(indoor 802.11ac wave 2, 2x2:2, internal 5 dBi / 5 dBi omni-directional antennas)

An indoor ceiling mount AP with internal antennas is generally designed to be mounted on a ceiling, with most of the antenna energy being projected outwards (horizontally) and downwards (vertically).  Such a device can naturally be mounted on the wall, but then the area of coverage changes to project most of the energy both up and down (vertically) and outwards primarily in one direction (horizontally).   Since the internal antennas are fixed, the area of coverage is also fixed based on how the AP is physically mounted.

Differences in coverage area when mounting an AP on a ceiling vs. on a wall.
(Figure source:  Ruckus Wireless™ ZoneFlex™ Indoor Access Point Release 9.5 User Guide)

For most indoor Wi-Fi deployments in environments such as schools, apartment buildings, hotels, offices, etc., the built-in internal antenna of a ceiling mount AP, like the EnGenius EAP1300 depicted here, provides sufficient coverage for the desired area. It also satisfies aesthetics constraints, as people generally do not like seeing external antennas, especially in most indoor environments. Additionally, for 802.11n/ac features like MIMO to work properly to achieve faster speeds, the relative alignment of the antennas to each other is critical.  For an AP with internal antennas, this alignment is fixed at the factory and cannot be altered.

Naturally, however, when you select an AP with an internal antenna, you lose the ability to change that antenna in your design, such as providing larger areas of bi-directional coverage in particular directions.   

APs with External Antennas

EnGenius EAP1300EXT

(indoor 802.11ac wave 2, 2x2:2, external 5 dBi / 5 dBi omni-directional antennas)

When an AP has external antennas, the antennas can naturally be adjusted to fit the coverage area.   A vendor will typically supply omni-directional dipole antennas.   

In this example, both the EAP1300 and EAP1300EXT include 5 dBi antennas.   Accordingly, the gain is the same, and the effective coverage area will be "approximately equivalent" for both models when configured with the same transmit power settings.  I say "approximately" because there will be some subtle differences in the antenna pattern between the internal and external antennas.  

As mentioned above, aesthetics are often a reason to not go with external antennas.   More importantly, however, the MIMO capabilities of 802.11n/ac take advantage of phase offsets between the antennas.  This necessitates that the multiple antennas are at a fixed separation distance from each other so as to be out of phase with each other.   With internal antennas, this phase offset is fixed at the factory and cannot be changed.  For external antennas, the relative alignment of the antennas can be easily altered, either during install or during the AP's normal lifecycle, which will corrupt the MIMO performance and therefore the ultimate performance of the AP.   

The primary advantage of an AP with external antennas, such as the EnGenius EAP1300EXT, is that you can replace the included antennas and use either higher gain omni-directional or directional antennas.  The antenna that you select will be highly dependent on the particular application.  Typically, external antennas are useful for applications where additional range is critical in a specific direction, such as when covering warehouse aisles or an outdoor area with limited AP / antenna mounting options.  In such applications, external sector or patch antennas are usually suitable. For MIMO APs, sector and patch antennas also have their individual antenna elements pre-aligned and fixed by design internally, thus meeting both aesthetic and MIMO constraints.

Takeaway Message

I generally only recommend using an AP with external antenna ports in cases where the dipole antennas packaged with the AP are NOT actually going to be used, but instead are going to be replaced with an external directional antenna, such as a sector or patch. If omni-directional coverage is the requirement, APs with internal antennas are typically the most suitable.

Monday, October 16, 2017

A Simple Explanation of the KRACK WPA2 Security Vulnerability

What Has Happened

Security researchers have discovered a weakness in the Wi-Fi Protected Access 2 (WPA2) protocol that is used in all modern Wi-Fi networks.  A malicious attacker in range of a potential unpatched victim can exploit this weakness to read information that was previously assumed to be safely encrypted.  The vulnerability is within the Wi-Fi IEEE 802.11 standard itself, and is therefore not unique to any particular access point or client device vendor.  It is generally assumed that any Wi-Fi enabled device is potentially vulnerable to this particular issue.

A Summary of How WPA2 Security Works

WPA2-AES security consists of both authorization and encryption.   The authorization step is used to determine whether a particular client is allowed to access the wireless network, and comes in two flavors, Personal and Enterprise.   In WPA2-AES Personal, a pre-shared key or passphrase is used to provide the essential identifying credential.  In WPA2-AES Enterprise, the Extensible Authentication Protocol (EAP) is used to validate the client credentials against an external RADIUS or Active Directory server.  In either the WPA2-AES Personal or WPA2-AES Enterprise scenario, once the client’s authorization credentials are validated, a unique set of encryption keys are established between that particular access point and that particular client device, so as to encrypt the traffic between them.  This encryption process is done via a four-way handshake, where particular temporal (i.e. temporary) keys are passed back and forth between the access point and the client device so that each can derive the appropriate unique encryption key pair used for that connection.

A Summary of the Vulnerability

The security researchers discovered that they can manipulate and replay the third message in the four-way handshake to perform a key reinstallation attack (KRACK).  Strictly speaking, each temporal key that is passed in the four-way handshake should only be used once and never re-used.  However, in a key reinstallation attack, the attacker pretends to be a valid access point and tricks the client device into reinstalling a temporal key that is already in use, serving to reset the transmit and receive packet numbers.  For WPA2-AES, this reuse of packet numbers with the same key lets the attacker decrypt upstream traffic from the client device to the access point, without ever needing to learn the key itself.  For the older (and less secure) WPA-TKIP, the attacker can go even further, and potentially forge and inject new packets into the data stream.
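
To see why resetting the packet numbers is dangerous, consider the toy Python sketch below.  It is NOT the real WPA2 cipher: it assumes a made-up stream cipher whose keystream depends only on the key and the packet number, and the key and messages are invented for illustration.  The point it shows is general, though: if the same key and packet number are used twice, an attacker who knows one plaintext can recover the other without knowing the key.

```python
import hashlib


def toy_keystream(key: bytes, packet_number: int, length: int) -> bytes:
    """Toy keystream derived from (key, packet number). Not real WPA2/CCMP."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(
            key + packet_number.to_bytes(8, "big") + counter.to_bytes(4, "big")
        ).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings (truncates to the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, packet_number: int, plaintext: bytes) -> bytes:
    """Encrypt by XOR-ing the plaintext with the toy keystream."""
    return xor_bytes(plaintext, toy_keystream(key, packet_number, len(plaintext)))


key = b"hypothetical-temporal-key"       # never revealed to the attacker
known = b"GET /index.html HTTP/1.1"      # traffic the attacker can guess
secret = b"password=correct-horse!!"     # traffic the attacker wants to read

c1 = encrypt(key, 1, known)
# Key reinstallation resets the packet number, so number 1 is used again.
c2 = encrypt(key, 1, secret)

# The identical keystream cancels out: c1 XOR c2 == known XOR secret.
recovered = xor_bytes(xor_bytes(c1, c2), known)
print(recovered)
```

With unique packet numbers the two keystreams differ and the trick fails, which is why each packet number must only ever be used once per key.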

For an attack to be carried out to take advantage of this vulnerability, it must be done by a malicious actor conducting a man-in-the-middle attack (i.e. pretending to be an AP on your network and serving to be a relay between the client device and the legitimate wireless network).

How this Vulnerability Impacts Access Point Products and Networks

As the issue occurs on client devices, the first step for any network operator is to check with your client device manufacturers for security patches and updates and apply these updates as soon as they are available.

This particular vulnerability has no direct impact on any APs operating in “access point” mode.  However, access points that are being used as client devices (i.e. APs operating in “client bridge” mode) or any access points that are being used for point-to-multipoint communications (i.e. APs operating in “WDS bridge” or “WDS station” mode) are potentially impacted by this vulnerability in the IEEE 802.11 protocol.  Furthermore, some advanced applications and features, such as mesh networking and fast roaming (i.e. 802.11r), may also be potentially vulnerable to this issue.

Access point vendors are currently actively investigating the impact of this vulnerability across all of the products in their product portfolios, and will be issuing firmware releases in the coming days and weeks to address this issue.  In the interim, continue to use WPA2-AES Personal or WPA2-AES Enterprise for network security.  Do not use WEP and do not use WPA-TKIP, as the vulnerabilities of those deprecated security protocols are significantly more serious and easier to execute by a malicious attacker.

For More Information

The website https://www.krackattacks.com/ provides a detailed summary of the issue along with links to the research paper and tools detailing the vulnerability.

Friday, September 8, 2017

Oversubscription Ratios and the Types of Bandwidth Throttling

When we build networks, we need to allocate the available bandwidth amongst the client device population in, hopefully, a reasonably fair and equitable manner such that all users are happy (or at least not complaining).    We use bandwidth throttling for this purpose.  

Without bandwidth throttling, one or two abusive users could use applications like BitTorrent and consume the overwhelming majority of the available Internet bandwidth, leaving very little bandwidth for all of the remaining users on your network.  

Bandwidth Throttling 

Generally, there are two types of bandwidth throttling available on network equipment:

  • Per-User Bandwidth Throttling:   This limits the maximum amount of Internet bandwidth that each client device can consume.
  • Per-Subnet/VLAN Bandwidth Throttling:  This limits the aggregate maximum amount of Internet bandwidth that all client devices on the subnet / VLAN can consume at one time.
By way of example, let’s assume we have a subnet / VLAN with 5 client devices connected.  If the bandwidth throttling is 10 Mbps / 10 Mbps per user, then each user could potentially consume 10 Mbps / 10 Mbps simultaneously, making the total potential consumption the sum, or 50 Mbps / 50 Mbps.   Alternatively, if the bandwidth throttling is 10 Mbps / 10 Mbps per subnet / VLAN, then all users on that subnet / VLAN have to share a 10 Mbps / 10 Mbps bandwidth allocation, meaning each user would get 2 Mbps / 2 Mbps on average, and this average would decrease as more users connect to that VLAN / subnet.
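
The arithmetic of the two throttling types can be written out as a small Python sketch.  The function names are illustrative, and the per-VLAN case assumes the cap is shared evenly, which real traffic rarely does exactly.

```python
def per_user_total_mbps(per_user_cap_mbps: float, users: int) -> float:
    """Worst-case aggregate demand when every user is capped individually."""
    return per_user_cap_mbps * users


def per_vlan_average_mbps(vlan_cap_mbps: float, users: int) -> float:
    """Average share per user when one cap is shared by the whole VLAN."""
    return vlan_cap_mbps / users


users = 5
print(per_user_total_mbps(10, users))    # total potential demand in Mbps
print(per_vlan_average_mbps(10, users))  # average share per user in Mbps
```

Doubling the user count doubles the first figure and halves the second, which is the trade-off between the two approaches.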

In general, per-user bandwidth throttling is what you want in most practical circumstances.   Obviously, if there are too many users and/or the allocated bandwidth per user is set too high, you eventually run out of Internet bandwidth.

So how do you decide what limits are appropriate?   It ultimately depends on the type of network you are operating (i.e. its requirements) and the total amount of Internet bandwidth you have available (i.e. its constraints).  However, this can be treated quantitatively by using an oversubscription ratio.

Oversubscription Ratio

Oversubscription is a concept that dates back to very early telephony.  Statistically, not all connected users will actually consume their maximum available bandwidth at any particular instant of time.  For example, if we have 5 users and each of them has a per-user bandwidth cap of 10 Mbps / 10 Mbps, it is statistically unlikely that any of them, let alone all of them, will actually be consuming 10 Mbps / 10 Mbps simultaneously.   Most network applications are bursty in nature, meaning that your actual consumption is constantly fluctuating and rarely hitting the maximum allocation.  (Video streaming is, naturally, an important exception to this, as that consumes bandwidth at a fairly constant rate for an extended period of time.  That said, even today only a fraction of devices that are connected to your network are likely to be streaming video at a given instant.)   

Thus, as a service provider, I do not need to supply the additive sum in terms of bandwidth (i.e. # users * promised bandwidth per user), but rather some fraction thereof.  That fraction defines the oversubscription ratio.   

Unfortunately, setting the oversubscription ratio is an empirical exercise, and over time, the oversubscription ratio tends to decrease as more devices, each consuming more bandwidth, are connecting to your networks.  In the pre-smartphone days, a 30:1 or even 40:1 ratio was common for most wired / wireless networks.

The common oversubscription ratio I use for regular network usage (e.g. hotel, apartment building, etc.) is 20:1.  This would mean that if I promised 200 users each a 10 Mbps / 10 Mbps data rate (and throttled them each to that rate), I could get away with only providing a 100 Mbps Internet bandwidth connection.  The math is as follows:  200 users * 10 Mbps/user * 1/20 = 100 Mbps.  At any instant in time, the average consumption would be 10 Mbps/user / 20 = 500 kbps per user. In reality, some are obviously consuming more, while others will be consuming less (even 0).  

For student housing, which has fairly heavy network utilization, I typically use a 10:1 oversubscription ratio.   For larger high-density environments (e.g. conference centers, event spaces, etc.), you will have a few devices that are doing video streaming, but most attendees will be connected but are not likely to be heavily utilizing their devices.  I therefore typically use a 15:1 oversubscription ratio.
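
The sizing math above can be captured in a few lines of Python.  This is only a sketch: the dictionary keys and function name are made up for illustration, and the ratios are simply the rules of thumb quoted in this post.

```python
# Rule-of-thumb oversubscription ratios quoted in the text (N:1).
OVERSUBSCRIPTION = {
    "general": 20,          # hotels, apartment buildings, etc.
    "high_density": 15,     # conference centers, event spaces
    "student_housing": 10,  # heavy utilization
}


def required_circuit_mbps(users: int, per_user_mbps: float, ratio: float) -> float:
    """Internet bandwidth needed to honor a per-user rate at a given ratio."""
    return users * per_user_mbps / ratio


general = required_circuit_mbps(200, 10, OVERSUBSCRIPTION["general"])
students = required_circuit_mbps(200, 10, OVERSUBSCRIPTION["student_housing"])
print(general, students)
```

The same 200 users at 10 Mbps each need twice the Internet circuit in student housing as in a hotel, purely because of the lower ratio.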

Determining Appropriate Bandwidth Throttling Values

In reality, one is generally constrained by the total amount of bandwidth available, as that is the most expensive part of your network.  Thus, the real calculation is to determine the appropriate bandwidth throttling per user that should be used.  To determine this, one needs to know the peak number of expected users and the bandwidth available.   

As an example, let's assume an event space where we are expecting 500 users and have a 300 Mbps / 300 Mbps Internet circuit available.   Using the 15:1 oversubscription ratio, for 500 users this comes out to a sustainable average service level of 9 Mbps (i.e. 300 Mbps / 500 users * 15:1 oversubscription = 9 Mbps / user).   
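
Running the calculation in this direction (from a fixed circuit to a per-user throttle) looks like the following Python sketch.  The function name is illustrative only.

```python
def per_user_throttle_mbps(circuit_mbps: float, peak_users: int, ratio: float) -> float:
    """Per-user cap that a circuit can sustain at a given oversubscription ratio."""
    return circuit_mbps * ratio / peak_users


# Event space from the text: 300 Mbps circuit, 500 peak users, 15:1 ratio.
throttle = per_user_throttle_mbps(300, 500, 15)
print(f"{throttle:.1f} Mbps per user")
```

If the expected crowd doubles to 1000 users on the same circuit, the sustainable per-user throttle halves to 4.5 Mbps.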

 Of course, in reality complex networks are an Animal Farm (i.e. while all client devices are equal, some client devices are “more equal” than others).  Thus, different classes of users will require different levels of service.


In most commercial environments, it is vitally important that the operations staff and systems have sufficient bandwidth available, though they usually represent a small fraction of the total number of clients. This is one very good use of having multiple VLANs / subnets, as you can put your different classes of users on to different VLANs, and then allocate bandwidth both per VLAN and per user accordingly.  Where operations activity is critical, we need to provide this small but more important operations segment of the client device population a higher per-user bandwidth allocation, and give the (proletariat) visitors a lower per-user bandwidth allocation.

It is also useful to have two layers of bandwidth throttling.  The first layer is bandwidth throttling per VLAN / subnet.  For example, limit the guest network to 80% of the total bandwidth, ensuring that the staff / operations network(s) will always have access to at least 20% of the Internet bandwidth, no matter how crowded the guest network becomes.  The second layer is bandwidth throttling per user, to ensure that no abusive user on any VLAN / subnet can take up all of the bandwidth allocated to that VLAN / subnet.
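
As a rough illustration of how the two layers interact, the Python sketch below computes the rate each user could sustain if every active user on a VLAN pulled data at full speed at once.  It assumes the VLAN cap is shared evenly among active users, and the 300 Mbps circuit and 9 Mbps per-user cap are example numbers, not recommendations.

```python
def sustained_rate_mbps(
    total_mbps: float,
    vlan_fraction: float,
    per_user_cap_mbps: float,
    active_users: int,
) -> float:
    """Per-user rate under both a per-VLAN cap and a per-user cap."""
    vlan_cap = total_mbps * vlan_fraction
    if active_users == 0:
        return per_user_cap_mbps
    # Whichever limit is hit first determines what each user actually gets.
    return min(per_user_cap_mbps, vlan_cap / active_users)


TOTAL = 300  # Mbps Internet circuit (example value)

# Guest VLAN limited to 80% of the circuit, 9 Mbps per user.
print(sustained_rate_mbps(TOTAL, 0.8, 9, 20))  # per-user cap is the limit
print(sustained_rate_mbps(TOTAL, 0.8, 9, 60))  # VLAN cap is the limit
```

With a handful of heavy guests the per-user cap governs; once the guest VLAN gets crowded, the per-VLAN cap takes over, while the remaining 20% stays reserved for staff / operations.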

Tuesday, August 15, 2017

The Emergence of Tri-Band APs

In a former blog post, I discussed the limitations of MU-MIMO and hinted at the pending emergence of a competing technology for high-density deployments, called "tri-band".   In this post, I'll be again comparing the technologies and encouraging the use of tri-band APs for high-density deployments.

What is Tri-Band?

Strictly speaking, this AP technology should be called "tri-radio", as two of the radios in the access point are on the 5 GHz band.  Essentially, a tri-band access point is just two co-located 5 GHz 802.11ac wave 1 APs in one box.   The tri-band AP has one 2x2:2 stream 2.4 GHz radio (IEEE 802.11b/g/n) and two 2x2:2 stream 5 GHz radios (IEEE 802.11a/n/ac), with one 5 GHz radio locked on the low portion of the band (channels 36-64) and the other 5 GHz radio locked on the high end of the band (channels 100-165).   

Tri-band is technically not part of the IEEE 802.11 standard.  Broadcom developed the original chipset for this in mid 2015, and Qualcomm Atheros recently introduced their own version.  Tri-band APs are intended for high density environments (e.g. lecture halls, conference centers, auditoriums, concert halls, etc.) and thus compete directly with IEEE 802.11ac wave 2 with MU-MIMO.  The tri-band approach, however, has several advantages over MU-MIMO.   Nonetheless, there are still very few vendors who have introduced tri-band access points to the market, despite their obvious advantages.

Why is Tri-Band Better than MU-MIMO?

MU-MIMO requires the use of beam forming, which is a technique used to create particular zones of constructive and destructive interference at particular locations.  By maximizing the signal for each client device at that client device's location (and minimizing the signal for the other clients at each client’s location), a MU-MIMO AP can talk downstream to multiple client devices simultaneously.

Multi-User Multiple-Input Multiple-Output (MU-MIMO)

The MU-MIMO technique requires the AP to know the position of the client devices (relative to itself).  The AP gathers that information by periodically transmitting “sounding frames”, essentially tones off of each AP antenna.  Compatible client devices will respond by sending a matrix indicating how well the client device heard the tone from each antenna.  Based on that matrix, the AP can calculate the relative position of the client device.    

MU-MIMO has the following limitations, which do not exist with the tri-band approach:

  1. Increased overhead:  The sounding frames and their responses consume airtime.  While this is less than the presumptive gains of talking to multiple client devices simultaneously, it does represent a loss.  Most MU-MIMO access points only get a 1.7x - 2.2x increase in speed when talking downstream to three compatible client devices.
  2. Client device compatibility:  The client devices need to be compatible with MU-MIMO in order to understand the sounding frames and to send the appropriate response.  As of August 2017, there are still surprisingly few MU-MIMO compatible client devices on the market.  There are some USB dongles available for PCs.  The flagship mobile client device for MU-MIMO had been the Samsung Galaxy Note 7, which failed in the market for unrelated incendiary reasons.  The Apple iPhone 7, while originally rumored to support it before its launch, quietly did not support MU-MIMO.  Given Apple's notorious secrecy, we still don't know whether or not the upcoming Apple iPhone 8 will support MU-MIMO.
  3. Client separation:  MU-MIMO requires that the client devices it talks to simultaneously be physically separated from each other.  If the client devices are too close together, the beam forming won’t be able to successfully maximize the signal at one client and minimize the signal of the other (neighboring) clients.
  4. Downstream only:  MU-MIMO only works for downstream traffic, from the AP to the client device(s).  Upstream traffic from each client device to the AP must still happen one at a time, otherwise the AP will hear multiple client devices at once and won’t be able to distinguish between them.
In comparison, all 5 GHz Wi-Fi clients can communicate with the tri-band AP as they would with any other conventional access point, so it is backwards compatible with all current and future Wi-Fi client devices.  There is also no additional overhead on the channel, as sounding frames are not required, and the positions of the two 5 GHz client devices, both relative to the AP and to each other, don't matter.  Additionally, since the 5 GHz clients and the channels are independent, the traffic to each client can occur simultaneously in both directions.  The AP itself uses an internal mechanism called “client steering” to encourage 5 GHz clients to connect to one or the other 5 GHz radio, so as to balance the load across the two 5 GHz radios.
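
The load-balancing idea behind client steering can be sketched in Python as follows.  This is a hypothetical "fewest clients wins" policy written for illustration; actual steering logic is vendor-specific, and a real AP can only encourage (not force) a client onto a particular radio.  The class and function names are invented.

```python
from dataclasses import dataclass, field

LOW_BAND = range(36, 65)     # channels 36-64, served by the first 5 GHz radio
HIGH_BAND = range(100, 166)  # channels 100-165, served by the second 5 GHz radio


@dataclass
class Radio:
    name: str
    channel: int
    clients: list = field(default_factory=list)


def band_for_channel(channel: int) -> str:
    """Report which half of the 5 GHz band a channel number falls in."""
    if channel in LOW_BAND:
        return "low"
    if channel in HIGH_BAND:
        return "high"
    raise ValueError(f"channel {channel} is outside both 5 GHz ranges")


def steer(client_id: str, low: Radio, high: Radio) -> Radio:
    """Place a new client on the 5 GHz radio with fewer clients (ties go low)."""
    target = low if len(low.clients) <= len(high.clients) else high
    target.clients.append(client_id)
    return target


low_radio = Radio("5 GHz low", channel=36)
high_radio = Radio("5 GHz high", channel=149)
for n in range(6):
    steer(f"client-{n}", low_radio, high_radio)
print(len(low_radio.clients), len(high_radio.clients))
```

A production implementation would more likely weigh airtime utilization and signal strength rather than a simple client count.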

The Takeaway Message

A current four-stream MU-MIMO access point can talk simultaneously to 2-3 compatible client devices on the 5 GHz band downstream, sometimes.   A two-stream tri-band access point, by comparison, can talk simultaneously to any two client devices on the 5 GHz band both downstream and upstream, all the time.

Monday, July 31, 2017

Appropriate RSSI for WDS Bridge Links

A customer recently approached me with a question on what the minimum RSSI should be for a WDS bridge link between two locations in order to maximize data throughput.   (If you don't know what WDS bridging is, see my blog entry on Wireless Backhaul Best Practices.)

The short answer:  -40 dBm to -50 dBm for optimal performance.   This RSSI is readily achievable over distances of up to approximately 2500 ft when using high-gain directional antennas on each end.

This RSSI target is to ensure that the signal to noise ratio (SNR) can be safely above 37 dB.  This means that the highest MCS rate (MCS9) can be achieved and maintained with an 80 MHz channel, and the throughput of the link maximized.  (See Andrew von Nagy's Wi-Fi SNR to MCS Chart for further reference.)
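
The relationship between RSSI and SNR can be checked with a quick Python sketch.  The 37 dB threshold comes from the text above; the -90 dBm noise floor is an assumption chosen for illustration, and the real value depends on channel width, the receiver and local interference.

```python
ASSUMED_NOISE_FLOOR_DBM = -90.0  # assumption; measure this on the actual link
MCS9_80MHZ_MIN_SNR_DB = 37.0     # SNR target quoted in the text


def snr_db(rssi_dbm: float, noise_floor_dbm: float = ASSUMED_NOISE_FLOOR_DBM) -> float:
    """SNR is simply the received signal level minus the noise floor."""
    return rssi_dbm - noise_floor_dbm


def supports_top_mcs(rssi_dbm: float, noise_floor_dbm: float = ASSUMED_NOISE_FLOOR_DBM) -> bool:
    """True when the SNR clears the MCS9 / 80 MHz target."""
    return snr_db(rssi_dbm, noise_floor_dbm) > MCS9_80MHZ_MIN_SNR_DB


for rssi in (-40, -50, -60, -70):
    print(rssi, snr_db(rssi), supports_top_mcs(rssi))
```

Under this assumed noise floor, both ends of the -40 dBm to -50 dBm target range clear 37 dB, while a -60 dBm link does not.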

For 802.11ac 2x2:2 APs, measured data throughputs of about 350-400 Mbps can be achieved with 80 MHz channels, when the RSSI of each link is consistently in the -40 dBm to -50 dBm range.  

If the RSSI is too strong (i.e. > -35 dBm), the electronic amplifiers in the AP start to get saturated and data throughput will actually decrease.  This generally only occurs in very short distance shots (< 50 feet), and in those instances we recommend turning down transmit power to minimum and, if still necessary, purposely misaligning the antennas.

If the RSSI is too weak (i.e. < -70 dBm) the link speed will be very slow (low SNR leading to low MCS rates and slow speeds).  In the presence of any interference, the wireless backhaul link itself can often become unstable.
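
The RSSI ranges above can be captured in a small helper. This is only a sketch: the "usable" label for readings that fall between the named ranges, and the -90 dBm default noise floor used to estimate SNR, are assumptions for illustration and not measured values.

```python
def classify_rssi(rssi_dbm: float) -> str:
    """Map an RSSI reading onto the ranges described in this post."""
    if rssi_dbm > -35:
        return "too strong"   # receiver amplifiers may start to saturate
    if -50 <= rssi_dbm <= -40:
        return "optimal"      # target range for a WDS bridge link
    if rssi_dbm < -70:
        return "too weak"     # low SNR, low MCS rates, unstable link
    return "usable"           # assumption: label for the in-between ranges


def snr_db(rssi_dbm: float, noise_floor_dbm: float = -90.0) -> float:
    """Estimate SNR in dB; the default noise floor is an assumed value."""
    return rssi_dbm - noise_floor_dbm


for reading in (-30, -45, -60, -75):
    print(reading, classify_rssi(reading), snr_db(reading))
```

With the assumed -90 dBm noise floor, a -50 dBm reading corresponds to a 40 dB SNR, which is above the 37 dB target mentioned above.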

A Simplified Explanation of the Physics

A signal leaving a transmitter experiences free space path loss (FSPL), which is a geometric spread of the RF energy as it travels away from a transmitter; the further away the receiver is from the transmitter, the more the energy from the transmitter has spread out, so the less energy is seen at the receiver. The FSPL increases as the square of the distance between the transmitter and receiver. Increasing the gain of the antennas at either (or both) ends of the wireless link helps to focus the transmission energy and/or the receive sensitivity in a particular direction, allowing the distance to increase between the transmitter and receiver.
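
As a quick worked illustration of the square-law behavior, here is the standard free space path loss expression in decibels, with d the link distance, f the operating frequency, and c the speed of light:

$$\mathrm{FSPL}(d) = 20\log_{10}\left(\frac{4\pi d f}{c}\right)\ \text{[dB]}$$

$$\mathrm{FSPL}(2d) - \mathrm{FSPL}(d) = 20\log_{10}(2) \approx 6\ \text{dB}$$

So each doubling of the link distance costs roughly 6 dB of signal and, conversely, each additional 6 dB of combined antenna gain roughly doubles the distance achievable at a given RSSI.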

Hence, for point-to-point wireless backhaul applications, you always want to use high gain directional antennas on both ends of the link (such as the integrated 19 dBi antenna on an EnGenius EnStationAC) to maximize the practical link distance and speed.

A Simplified Explanation of the Math

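If you want to estimate the RSSI of a link, you can use the following formula:

$$\mathrm{RSSI} = P_{Tx} + G_{Tx} + G_{Rx} - 20\log_{10}\left(\frac{4\pi d f}{c}\right)$$

or rearranged to compute link distance:

$$d = \frac{c}{4\pi f}\cdot 10^{\left(P_{Tx} + G_{Tx} + G_{Rx} - \mathrm{RSSI}\right)/20}$$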

  • RSSI = Received signal strength indicator [dBm]
  • d = distance of wireless link [m]
  • f =  operating frequency of wireless link [Hz]
  • c = speed of light (i.e. approximately 300 million m/s) [m/s]
  • PTx = transmit power of transmit radio [dBm]
  • GTx = gain of transmit antenna, less cable losses [dBi]
  • GRx = gain of receive antenna, less cable losses [dBi]
Based on these formulas, a point-to-point link utilizing EnGenius EnStationAC access points could achieve a -50 dBm RSSI on each end for a WDS bridge wireless link length of approximately 2500 feet.
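
For readers who want to try the numbers themselves, here is a minimal Python sketch of the calculation. It assumes the standard free-space link budget (RSSI = PTx + GTx + GRx minus the free space path loss). The 5.5 GHz operating frequency and the 17 dBm transmit power in the example are illustrative assumptions, not published EnStationAC settings; only the 19 dBi antenna gain comes from the text above.

```python
import math

C = 299_792_458.0          # speed of light [m/s]
FEET_PER_METER = 3.28084


def fspl_db(d_m: float, f_hz: float) -> float:
    """Free space path loss in dB for distance d_m [m] at frequency f_hz [Hz]."""
    return 20 * math.log10(4 * math.pi * d_m * f_hz / C)


def rssi_dbm(d_m, f_hz, p_tx_dbm, g_tx_dbi, g_rx_dbi):
    """Estimated RSSI [dBm] at the receiver."""
    return p_tx_dbm + g_tx_dbi + g_rx_dbi - fspl_db(d_m, f_hz)


def link_distance_m(rssi, f_hz, p_tx_dbm, g_tx_dbi, g_rx_dbi):
    """Distance [m] at which the link is expected to deliver the given RSSI."""
    budget_db = p_tx_dbm + g_tx_dbi + g_rx_dbi - rssi
    return C / (4 * math.pi * f_hz) * 10 ** (budget_db / 20)


# Assumed example values: 5.5 GHz channel, 17 dBm transmit power, 19 dBi antennas.
d_m = link_distance_m(rssi=-50, f_hz=5.5e9, p_tx_dbm=17, g_tx_dbi=19, g_rx_dbi=19)
d_ft = d_m * FEET_PER_METER
print(f"{d_m:.0f} m, {d_ft:.0f} ft")
```

Under these assumptions the distance for a -50 dBm RSSI comes out at roughly 2500 feet. Real links will see additional losses (cabling, alignment, obstructions), so treat the result as an upper bound rather than a guarantee.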

Monday, April 10, 2017

Wireless Backhaul Best Practices

This blog post provides guidelines on best practices for configuring and deploying wireless backhaul on Wi-Fi networks, and goes through the differences between and appropriate scenarios for client bridges, repeaters, WDS Bridge links, and mesh networks.

The Options for Wi-Fi Backhaul

In a conventional wireless network, each access point (AP) requires a wired Ethernet connection to provide backhaul to the wired network infrastructure and ultimately the Internet.  In some environments, however, it is either impossible or prohibitively expensive to run an Ethernet cable to each AP.  In such cases, Wi-Fi itself can be used to provide wireless backhaul from the AP (or other network appliance, such as a remote IP camera) to the wired network.  Each Wi-Fi backhaul link is referred to as a hop, and it is possible to have a chain of multiple hops between the remote wireless AP and the root wireless AP that has a wired connection to the network.

There are multiple options for providing Wi-Fi backhaul to the remote APs.  Naturally, each option has both benefits and limitations.  Most critically, each wireless hop introduces latency, which adds in a linear fashion with the number of hops.  Repeaters and mesh also inherently lower throughput and user capacity, with the loss often compounding geometrically with the number of hops.

It is critical to understand your technical requirements and constraints, as well as the benefits and limitations of each wireless backhaul option, when designing a Wi-Fi network and selecting a particular Wi-Fi backhaul approach.

Option 1:  Client Bridge

An access point operating in Client Bridge mode provides Wi-Fi connectivity for a wired client device.  A Client Bridge is intended to connect an individual wired client device to a Wi-Fi network.   This is depicted in Figure 1.

Figure 1:  Example of a network utilizing a client bridge.

When multiple wired client devices are connected through a single Client Bridge, they share the same MAC address on the network, namely the WLAN MAC address of the Client Bridge itself.  The multiple wired client devices can still be configured with different Layer 3 static IP addresses, and each wired device may or may not be able to obtain an independent Layer 3 DHCP address, depending on the DHCP server.

Best Practice:  When using an AP in Client Bridge mode, only connect one wired client device.

For typical applications, Client Bridge mode is only utilized on single-band APs. For dual-band access points, one radio (typically 5 GHz) will be configured to operate in client bridge mode, while the other radio (typically 2.4 GHz) will be used for providing Wi-Fi connectivity on an independent SSID to wireless client devices.  Client Bridge mode is generally only available on standalone APs, meaning that each AP must be configured individually and cannot be managed or monitored from a centralized controller.  Client Bridge mode is available on all EnGenius® single-band Electron™ and EnStation™ access points, as well as dual-band APs in the Electron™ ECB series.

Option 2:  Repeaters

An access point operating in Repeater mode provides both Wi-Fi connectivity to client devices and a wireless backhaul connection to one or more wired APs.  This is depicted in Figure 2.  Repeaters are intended for very small networks (e.g. home environments), where individual repeater APs are used to fill in particular coverage gaps.  Individual client MAC addresses are preserved, though the VLAN (if any) is defined by the main access point’s SSID that is being repeated.

Figure 2:  Example of a network utilizing a wireless repeater.

For dual-band access points, one radio (typically 5 GHz) will be configured to operate in repeater mode, while the other radio (typically 2.4 GHz) will be used exclusively for providing Wi-Fi connectivity to client devices.  Note that both Wi-Fi bands depend upon the repeater radio for backhaul.  Since the repeater radio must spend half its time providing Wi-Fi connectivity to client devices and half its time providing wireless backhaul, the data capacity of a repeater radio for both backhaul and for Wi-Fi client connectivity is reduced by 50%.  When there are multiple hops, the data capacity is reduced by 50% at each hop.  Thus, for two hops, the total data capacity is only 1/4, for three hops it is 1/8, for four hops it is 1/16, and so forth.

Repeater mode is generally only available on standalone APs, meaning that each AP must be configured individually and cannot be managed or monitored from a centralized controller.  Repeater mode is available on all EnGenius® Electron™ ECB series access points.
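
The per-hop halving of repeater capacity described above is easy to tabulate. The short sketch below simply evaluates 1/2^n for n hops; the function name is made up for illustration.

```python
from fractions import Fraction


def repeater_capacity_fraction(hops: int) -> Fraction:
    """Fraction of the original data capacity left after the given number of hops."""
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return Fraction(1, 2 ** hops)


for hops in range(1, 5):
    print(hops, "hop(s):", repeater_capacity_fraction(hops))
```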

Option 3:  Point-to-(multi)point WDS Bridge Links 

A dedicated pair of APs, usually with integrated directional antennas (such as the EnGenius® EnStationAC), are configured to operate in WDS Bridge mode to create a point-to-point link to provide wireless backhaul.  The WDS Bridge link on the remote end is connected to the remote AP via its wired Ethernet interface. From the perspective of the rest of the network, this wireless connection looks like a wired connection; in WDS Bridge mode, the wired Ethernet frame is encapsulated and encrypted in a Wi-Fi packet on one end, transmitted across the wireless link, and then de-encapsulated and decrypted on the other end.  Thus, all wired Layer 2 information (i.e. client MAC addresses, VLANs, etc.) is preserved across the WDS Bridge link. Point-to-multipoint WDS Bridge links are also readily possible, though be aware the remote links collectively share the total available airtime bandwidth of the link.  This is depicted in Figure 3.

Figure 3:  Examples of point-to-point and point-to-multipoint networks utilizing WDS Bridge links.

The WDS Bridge links are statically established, so that each WDS Bridge AP only accepts connections from pre-defined radios.  WDS Bridge usually requires dedicated hardware at each remote location operating on independent channels, though some APs allow for one radio (typically the 5 GHz) to be in WDS bridge mode and the other radio (typically the 2.4 GHz) to be in AP mode to provide Wi-Fi service to client devices.

Best Practice:  WDS Bridge with dedicated 5 GHz only access points is generally recommended for most networks requiring both wireless backhaul and high bandwidth and/or high user capacity Wi-Fi.   While each hop adds latency, there is no throughput or user capacity degradation, since the point-to-(multi)point backhaul link is solely dedicated to wireless backhaul, with Wi-Fi access for client devices being handled by separate access points.

For large networks consisting of multiple remote nodes, a WDS Bridge backhaul network requires its own design effort to ensure appropriate bandwidth capacity and channel utilization.  WDS Bridge mode is generally only available on standalone APs, meaning that each AP must be configured individually and cannot be managed or monitored from a centralized controller.  WDS Bridge mode is available on all EnGenius® Electron™ and EnStation™ access points.

Point-to-Multipoint WDS Bridge Network Example

Figure 4 shows an example of an outdoor Wi-Fi network at an RV park utilizing point-to-(multi)point links to provide wireless backhaul to APs mounted on light poles.  The colored lines indicate the point-to-(multi)point WDS Bridge links implemented with EnGenius® EnStationAC access points. 

Figure 4:  Example of a wireless network utilizing point-to-(multi)point links for backhaul to outdoor wireless APs.

Red markers indicate the location of outdoor dual-band APs, and yellow markers indicate the location of additional light poles that were available at the property.  To maximize wireless backhaul capacity, all of the WDS Bridge links utilized 80 MHz channels in the UNII-2 and UNII-2e bands (i.e. DFS channels 52-64, 100-112, and 116-128).  The 5 GHz radios on the dual-band APs were set to use 40 MHz channels on the UNII-1 and UNII-3 bands (i.e. channels 36-40, 44-48, 149-153, and 157-161), so as to avoid co-channel interference with the point-to-multipoint backhaul network.
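
A channel plan like this can be sanity-checked with a few lines of Python. The sketch below expands each bonded block from the plan above into its constituent 20 MHz channel numbers (5 GHz channel numbers step by 4) and confirms that the backhaul and access channels do not overlap. The helper names are made up for illustration.

```python
def channels_in_block(first: int, last: int) -> set:
    """20 MHz channel numbers spanned by a bonded block, e.g. 52-64 -> {52, 56, 60, 64}."""
    return set(range(first, last + 1, 4))


def flatten(blocks) -> set:
    """Union of all 20 MHz channels used by a list of (first, last) blocks."""
    used = set()
    for first, last in blocks:
        used |= channels_in_block(first, last)
    return used


# Channel plan taken from the example above.
BACKHAUL_80MHZ = [(52, 64), (100, 112), (116, 128)]          # UNII-2 / UNII-2e (DFS)
ACCESS_40MHZ = [(36, 40), (44, 48), (149, 153), (157, 161)]  # UNII-1 / UNII-3

backhaul = flatten(BACKHAUL_80MHZ)
access = flatten(ACCESS_40MHZ)
overlap = backhaul & access
print("overlapping channels:", sorted(overlap))
```

An empty overlap list confirms that the client-facing 5 GHz radios and the WDS Bridge links never share a 20 MHz channel.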

Option 4:  Mesh Networks

In a mesh network, the AP uses its own radio to provide a wireless backhaul to other APs on the network, eventually reaching an AP with a wired Ethernet connection to the wired backhaul infrastructure and the network.  In this sense, a mesh network is a network of repeaters, though mesh is designed to operate automatically and more intelligently on a large scale.  A mesh network creates a set of “dynamic WDS Bridge” links, using routing algorithms to automatically calculate the optimal wireless path through the network back to a wired root node.  This makes mesh networks relatively robust to the failure of an individual AP; in a process referred to as “self-healing”, the routing algorithms will automatically calculate the “next best” path through the network if an AP in the path goes offline.  Since the routing functions are done automatically within the mesh software, mesh networks are actually fairly straightforward to set up and are thus scalable to cover large geographic areas.  All wired Layer 2 information (i.e. client MAC addresses, VLANs, etc.) is preserved across the mesh link.  Examples of mesh networks are shown in Figure 5 (for home / SOHO environments) and Figure 6 (for larger campus-wide environments).

Figure 5:  An example of a home / SOHO mesh network, utilizing EnGenius® EMR3000 mesh routers.

Figure 6:  An example of a large campus mesh network, utilizing EnGenius® EWS1025CAM mesh cameras.

The mesh network control architecture can either be centralized or distributed.  With a centralized control architecture, an AP controller is required to calculate and coordinate the mesh parameters for each AP.  This architecture, however, limits the scalability of the mesh network to the capacity of the AP controller.   In a distributed control architecture, such as the EnGenius® Neutron™ series and EMR3000 product, each AP operationally acts like a router, continuously sharing information about its connection status to its neighbors, and each AP uses this information to compute its own optimal mesh path. In a distributed architecture, an AP controller can be optional, though is generally extremely useful in providing centralized real-time monitoring of the mesh network, as well as establishing the core initial mesh network parameters, such as mesh ID, encryption, etc.

Unfortunately, mesh networks have significant limitations, most notably in the loss of throughput and user capacity, which scales geometrically as the number of wireless hops increases, as well as the increase in latency, which scales linearly as the number of wireless hops increases.

Accordingly, mesh networks are not suitable for high bandwidth or latency-sensitive applications.  Because of these performance limitations, it is generally recommended that mesh networks be avoided unless no other viable backhaul options are available. Mesh networks should only be used in environments where providing Ethernet data wiring to access points or cameras is impossible or cost-prohibitive. 

Mesh networks were originally trendy in the mid-2000s, as a way of both providing metropolitan Wi-Fi coverage as well as coverage for large outdoor properties where wiring was prohibitively expensive, such as RV parks, garden-style apartment complexes, marinas, etc.  While many mesh networks were successfully deployed, most of these efforts ultimately failed, especially in metropolitan Wi-Fi.  Early mesh networks relied upon single-radio APs on 2.4 GHz using 802.11g.  When dual-band APs were introduced, only 802.11a was available on the 5 GHz band, which still led to very low throughputs as the number of hops increased.  

With the wide adoption of dual-band access points with 802.11ac, there has been renewed interest in mesh for both Wi-Fi access and surveillance applications.  Accordingly, several startup companies, as well as established vendors like EnGenius®, have introduced mesh Wi-Fi products utilizing 802.11ac.  While the data rates of 802.11ac are approximately 25 times larger than the 802.11a data rates of a decade ago, the number of client devices and their bandwidth demands have also grown exponentially during that time.  The fundamental limitations of mesh networks are therefore still the same, and thus mesh may ultimately again prove to be a passing fad.

Nonetheless, mesh networks are the only viable option in many cases.  The sections below highlight how to best design and deploy mesh networks, so as to maximize their performance and mitigate their inherent limitations.

Mesh Network Terminology and Best Practices

The access points in a mesh network are categorized as either root nodes or remote nodes:
  • Root Node (a.k.a. Gateway Node):  This is an access point with a wired connection to the wired switch infrastructure.  The remote nodes establish wireless backhaul connections to the root node.  Note that the wired connection utilized by a root node can either be (1) a direct Ethernet or fiber-optic connection to the wired switch infrastructure or (2) a wired connection to a separate WDS Bridge wireless point-to-(multi)point link on an independent channel.
  • Remote Node:  This is an access point without a wired Ethernet connection.  Backhaul to the network is established via a wireless connection to a root node or to other remote nodes.  Note that the remote AP still requires electrical power, so an Ethernet connection to a PoE injector is common, though the “network” end of the PoE injector may not be connected at all or may only be connected to a wired client device, such as an IP camera.

The path from a particular remote node back to a particular root node can require connections via multiple intermediate remote nodes, and each wireless link in this chain is referred to as a hop.  The mesh routing algorithm selects the optimal route through the network.  The optimization function used by the mesh APs is generally proprietary to each AP vendor, but typically attempts to balance several, often conflicting, parameters, such as the following:
  1. Minimize the number of hops, so as to minimize the total wireless latency and throughput penalty of the network
  2. Maximize the signal strength of each hop, so as to maximize the achievable Wi-Fi data rates between the mesh radios on each hop.    For maximum data rates in 802.11ac, the received signal strength indicator (RSSI) would ideally be in the -40 dBm to -50 dBm range, though this is usually unachievable in practice since omni-directional antennas are typically used to create the widest field of view to neighboring APs.  The RSSI should be above -65 dBm for decent data rate performance between hops.
  3. Balance the load on each AP, so as to account for the number of associated client devices and the total throughput consumption on each AP.  The throughput load stacks as the number of hops increases, so intermediate remote nodes that are heavily utilized with client traffic will not give as many resources to downstream remote nodes.

Because of the competing tradeoffs in this optimization process, mesh networks can often result in counter-intuitive and/or sub-optimal topologies.
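
To make the tradeoff concrete, here is a purely hypothetical cost function of the kind a mesh routing algorithm might use. Real vendor algorithms are proprietary, so the structure, the weights, and the -50 dBm reference level below are all assumptions chosen for illustration only.

```python
def path_cost(hop_rssi_dbm, hop_load, w_hops=10.0, w_signal=1.0, w_load=20.0):
    """Hypothetical path cost: lower is better.

    hop_rssi_dbm: RSSI of each hop along the path [dBm]
    hop_load:     utilization (0.0 to 1.0) of the upstream AP on each hop
    """
    hops = len(hop_rssi_dbm)
    # Penalize each dB that a hop falls below an assumed -50 dBm reference.
    signal_penalty = sum(max(0.0, -50.0 - rssi) for rssi in hop_rssi_dbm)
    load_penalty = sum(hop_load)
    return w_hops * hops + w_signal * signal_penalty + w_load * load_penalty


# Two made-up candidate paths from one remote node back to a root node.
candidates = {
    "direct, weak link": ([-72.0], [0.2]),
    "two hops, strong links": ([-55.0, -58.0], [0.2, 0.3]),
}

best = min(candidates, key=lambda name: path_cost(*candidates[name]))
print("default weights:", best)

best_low_hop_weight = min(
    candidates, key=lambda name: path_cost(*candidates[name], w_hops=2.0)
)
print("hop weight lowered to 2:", best_low_hop_weight)
```

With the default weights the single weak hop wins. Lowering the hop weight flips the choice to the two-hop path, which shows how small changes in weighting can produce topologies that look counter-intuitive on a map.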

Best Practice:  The network design should cluster the APs into groups consisting of up to four remote nodes that are only one hop away from a root node.  Thus, at least 20% of your APs, distributed roughly evenly throughout the property, should be root nodes.  Each remote node is therefore nominally only one hop away from a root node.  In the event of a failure of a root node, the nearby remote nodes will then only be 2-3 hops away from another root node.  This approach generally requires creating additional root nodes, which can be done either by running Ethernet or fiber-optic cable to the particular remote locations, or by establishing dedicated point-to-(multi)point WDS Bridge links to create “wireless wires” from the root AP back to the wired network.
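
The 20% figure follows directly from the cluster size: one root node plus up to four remote nodes makes a group of five. A small sketch of that arithmetic (the function name is made up for illustration):

```python
import math


def min_root_nodes(total_aps: int, max_remotes_per_root: int = 4) -> int:
    """Minimum number of root nodes so each root serves at most the given number of remotes."""
    if total_aps <= 0:
        return 0
    group_size = max_remotes_per_root + 1  # one root plus its remote nodes
    return math.ceil(total_aps / group_size)


for total in (5, 6, 20, 21):
    print(total, "APs ->", min_root_nodes(total), "root node(s)")
```

For example, a 20-AP network needs at least 4 root nodes, while adding a 21st AP pushes the minimum to 5.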

Best Practice:  Each root node should be set on a static independent channel, and each remote node should be set to “auto channel”.  This is done to maximize the airtime capacity of the overall network, so that multiple neighboring root nodes do not create self-interference.  The remote nodes are set to auto-channel so that they can fail over to a different root node in the event of the failure of their primary root node.  When utilizing point-to-(multi)point WDS Bridge links to establish root nodes, these must also be on static independent channels, and thus must be accounted for in the overall channelization plan.

Both root nodes and remote nodes can generally operate in one of two modes:
  • Mesh AP Mode: In this mode, the wireless radio acts like a repeater, providing both Wi-Fi connectivity to client devices and a backhaul connection to one or more remote APs.  For single-band mesh access points, this is the only operational mode available.  For dual-band access points, one of the bands (typically 5 GHz) will be configured to operate in this mode.  The other band (typically 2.4 GHz) will be used exclusively for providing Wi-Fi connectivity to client devices.  Note that both Wi-Fi bands depend upon the mesh radio for backhaul.  Since the mesh radio must spend half its time providing connectivity to client devices and half its time providing backhaul, the data capacity of the mesh radio for both backhaul and for Wi-Fi client connectivity is reduced by 50%.   When there are multiple hops, the data capacity is reduced by 50% per hop.  Thus, for two hops, the total data capacity is only 1/4, for three hops it is 1/8, for four hops it is 1/16, and so forth.
  • Mesh Point Mode:  In this mode, available only in dual-band APs, the wireless mesh radio (typically 5 GHz) only provides wireless backhaul, and the other radio (typically 2.4 GHz) only provides Wi-Fi connectivity to client devices. Operationally, the mesh radio operates like a dynamic WDS bridge link, so while each hop still introduces latency which adds linearly, there is no 50% throughput penalty per hop, since the mesh radio is not also servicing client devices on the same radio and can be devoted exclusively to backhaul.  Since Wi-Fi access to client devices is restricted to only one radio (typically 2.4 GHz), the overall client capacity of the AP is that of a single-band AP.  Furthermore, even dual-band 802.11ac client devices will only be able to connect at 802.11n data rates on the 2.4 GHz radio.
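
The difference between the two modes can be compared side by side with a simplified model. The sketch below assumes an idealized link where Mesh AP mode halves capacity at every hop and Mesh Point mode carries no per-hop throughput penalty, as described above. The 400 Mbps link rate and the 2 ms per-hop latency are illustrative assumptions, not measurements.

```python
def mesh_capacity_mbps(link_rate_mbps: float, hops: int, mode: str) -> float:
    """Idealized backhaul capacity after the given number of hops."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    if mode == "mesh_ap":
        return link_rate_mbps / (2 ** hops)   # capacity halves at each hop
    if mode == "mesh_point":
        return float(link_rate_mbps)          # radio is dedicated to backhaul
    raise ValueError(f"unknown mode: {mode!r}")


def added_latency_ms(hops: int, per_hop_latency_ms: float) -> float:
    """Latency adds linearly with hop count in both modes."""
    return hops * per_hop_latency_ms


LINK_RATE_MBPS = 400.0   # assumed single-hop throughput
PER_HOP_MS = 2.0         # assumed per-hop latency

for hops in (1, 2, 3):
    print(
        hops,
        mesh_capacity_mbps(LINK_RATE_MBPS, hops, "mesh_ap"),
        mesh_capacity_mbps(LINK_RATE_MBPS, hops, "mesh_point"),
        added_latency_ms(hops, PER_HOP_MS),
    )
```

Note that this model only covers the backhaul. In Mesh Point mode, client devices are limited to the 2.4 GHz radio, which the model does not attempt to capture.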

Best Practice:  Mesh APs should generally be configured to operate in Mesh Point mode.  The loss of bandwidth capacity from lacking 5 GHz client connectivity is minor compared to the loss of bandwidth capacity from losing 50% of bandwidth per hop.  This also allows for the transmit power of the mesh radios to be set at their maximum value, so as to provide the maximum signal strength between nodes without being imbalanced with the low transmit power capability of most 5 GHz client devices.

In both operational modes, the overall data capacity of a mesh AP is reduced as compared to the same AP operating in a conventional configuration with a wired Ethernet connection to a wired switch infrastructure.  Accordingly, a mesh Wi-Fi network will never have the same level of throughput and client capacity as a conventional Wi-Fi network.

Mesh Network Example

Figure 7 shows an example mesh network deployed using the Best Practices highlighted above.  This is an RV park with 437 spaces spread across a roughly 2000’ x 1000’ area.  The main distribution frame (MDF) is in the southwest corner of the property, and trees in parts of the property preclude direct line-of-sight to many locations.

Figure 7:  Example of a mesh network, utilizing point-to-multipoint links to create additional root nodes.

The red links and bubbles indicate WDS Bridge links from the MDF to each of the root APs.  In some cases, multiple WDS Bridge links in series need to be established.  The point-to-point links are designated by Master or Slave with a letter and number index.  (For example, the WDS Bridge link going between the MDF and G8-R is designated link D, with [Master D] connected to [Slave D1]).

The other colors and bubbles represent the root and remote APs in Mesh Point mode, and the nominal mesh links between the remote APs and the root APs.  In the figure, each group is designated with a group number and an index to indicate that it is a root node or remote node.  (For example, on the right, the root node is designated [G8-R] and the nominal remote nodes are designated [G8-1] to [G8-4].)

The point-to-(multi)point WDS Bridge links utilize 80 MHz channels on the UNII-2 and UNII-2e bands (i.e. channels 52-64, 100-112, 116-128).  Each root AP is set to a static 40 MHz channel on the 5 GHz band in the UNII-1 and UNII-3 bands (i.e. channels 36-40, 44-48, 149-153, and 157-161).