Monday, April 10, 2017

Wireless Backhaul Best Practices

This blog post provides guidelines on best practices for configuring and deploying wireless backhaul on Wi-Fi networks, and goes through the differences between and appropriate scenarios for client bridges, repeaters, WDS Bridge links, and mesh networks.

The Options for Wi-Fi Backhaul

In a conventional wireless network, each access point (AP) requires a wired Ethernet connection to provide backhaul to the wired network infrastructure and ultimately the Internet.  In some environments, however, it is either impossible or prohibitively expensive to run an Ethernet cable to each AP.  In such cases, Wi-Fi itself can be used to provide wireless backhaul from the AP (or other network appliance, such as a remote IP camera) to the wired network.  Each Wi-Fi backhaul link is referred to as a hop, and it is possible to have a chain of multiple hops between the remote wireless AP and the root wireless AP that has a wired connection to the network.

There are multiple options for providing Wi-Fi backhaul to the remote APs.  Naturally, each option has both benefits and limitations.  Most critically, each wireless hop introduces latency, which adds linearly with the number of hops.  Repeaters and mesh also inherently lower throughput and user capacity, typically by half with each additional hop.

It is critical to understand your technical requirements and constraints, as well as the benefits and limitations of each wireless backhaul option, when designing a Wi-Fi network and selecting a particular Wi-Fi backhaul approach.

Option 1:  Client Bridge

An access point operating in Client Bridge mode provides Wi-Fi connectivity for a wired client device.  A Client Bridge is intended to connect an individual wired client device to a Wi-Fi network.   This is depicted in Figure 1.

Figure 1:  Example of a network utilizing a client bridge.

When multiple wired client devices are connected through a single Client Bridge, they share the same MAC address on the network, namely the WLAN MAC address of the Client Bridge itself.  The multiple wired client devices can still be configured with different Layer 3 static IP addresses, and each wired device may or may not be able to obtain an independent Layer 3 DHCP address, depending on the DHCP server.

Best Practice:  When using an AP in Client Bridge mode, only connect one wired client device.
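
To illustrate why a single wired device per Client Bridge is the safe choice, here is a toy Python model of what the upstream network observes. It is a sketch, not any vendor's implementation: the MAC addresses, the IP range, and the `ToyDhcpServer` class are all hypothetical, and the server is assumed to key its leases on the source MAC address alone, which is only one of several possible DHCP server behaviors.

```python
# Toy model: every frame from a wired device behind a Client Bridge
# leaves the bridge carrying the bridge's own WLAN MAC address.
BRIDGE_WLAN_MAC = "02:00:00:00:00:01"  # hypothetical address


def mac_seen_upstream(wired_device_mac: str) -> str:
    """Return the source MAC that the rest of the network sees."""
    # The wired device's own MAC is hidden behind the bridge.
    return BRIDGE_WLAN_MAC


class ToyDhcpServer:
    """Hypothetical DHCP server that keys leases on source MAC only."""

    def __init__(self, first_host: int = 100) -> None:
        self.leases: dict[str, str] = {}
        self.next_host = first_host

    def request(self, source_mac: str) -> str:
        # A new lease is created only for a MAC that has not been seen before.
        if source_mac not in self.leases:
            self.leases[source_mac] = f"192.168.1.{self.next_host}"
            self.next_host += 1
        return self.leases[source_mac]


server = ToyDhcpServer()
camera_ip = server.request(mac_seen_upstream("02:00:00:00:00:aa"))
printer_ip = server.request(mac_seen_upstream("02:00:00:00:00:bb"))
print(camera_ip, printer_ip)  # both devices are offered the same lease
```

A server that distinguishes clients by some other identifier could hand out separate addresses, which is why the outcome depends on the DHCP server.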

For typical applications, Client Bridge mode is only utilized on single-band APs. For dual-band access points, one radio (typically 5 GHz) will be configured to operate in client bridge mode, while the other radio (typically 2.4 GHz) will be used for providing Wi-Fi connectivity on an independent SSID to wireless client devices.  Client Bridge mode is generally only available on standalone APs, meaning that each AP must be configured individually and cannot be managed or monitored from a centralized controller.  Client Bridge mode is available on all EnGenius® single-band Electron™ and EnStation™ access points, as well as dual-band APs in the Electron™ ECB series.

Option 2:  Repeaters

An access point operating in Repeater mode provides both Wi-Fi connectivity to client devices and a wireless backhaul connection to one or more wired APs.  This is depicted in Figure 2.  Repeaters are intended for very small networks (e.g. home environments), where individual repeater APs are used to fill in particular coverage gaps.  Individual client MAC addresses are preserved, though the VLAN (if any) is defined by the main access point’s SSID that is being repeated.

Figure 2:  Example of a network utilizing a wireless repeater.

For dual-band access points, one radio (typically 5 GHz) will be configured to operate in repeater mode, while the other radio (typically 2.4 GHz) will be used exclusively for providing Wi-Fi connectivity to client devices.  Note that both Wi-Fi bands depend upon the repeater radio for backhaul.  Since the repeater radio must spend half its time providing Wi-Fi connectivity to client devices and half its time providing wireless backhaul, the data capacity of a repeater radio for both backhaul and for Wi-Fi client connectivity is reduced by 50%.  When there are multiple hops, the data capacity is reduced by 50% at each hop.  Thus, for two hops, the total data capacity is only 1/4, for three hops it is 1/8, for four hops it is 1/16, and so forth.

Repeater mode is generally only available on standalone APs, meaning that each AP must be configured individually and cannot be managed or monitored from a centralized controller.  Repeater mode is available on all EnGenius® Electron™ ECB series access points.
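
The per-hop halving of repeater capacity described above can be written as a one-line calculation. The sketch below uses a hypothetical helper name, `repeater_capacity_fraction`, and assumes the idealized 50% split per hop from the text; real-world losses are usually somewhat worse.

```python
from fractions import Fraction


def repeater_capacity_fraction(hops: int) -> Fraction:
    """Fraction of the original capacity left after `hops` repeater hops."""
    if hops < 0:
        raise ValueError("hops must be non-negative")
    # Each hop halves the capacity, so n hops leave (1/2)^n.
    return Fraction(1, 2) ** hops


for hops in range(1, 5):
    print(hops, repeater_capacity_fraction(hops))
```

For one through four hops this prints 1/2, 1/4, 1/8, and 1/16, matching the figures quoted above.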

Option 3:  Point-to-(multi)point WDS Bridge Links 

A dedicated pair of APs, usually with integrated directional antennas (such as the EnGenius® EnStationAC), are configured to operate in WDS Bridge mode to create a point-to-point link to provide wireless backhaul.  The WDS Bridge link on the remote end is connected to the remote AP via its wired Ethernet interface. From the perspective of the rest of the network, this wireless connection looks like a wired connection; in WDS Bridge mode, the wired Ethernet frame is encapsulated and encrypted in a Wi-Fi packet on one end, transmitted across the wireless link, and then de-encapsulated and decrypted on the other end.  Thus, all wired Layer 2 information (i.e. client MAC addresses, VLANs, etc.) is preserved across the WDS Bridge link. Point-to-multipoint WDS Bridge links are also readily possible, though be aware that the remote links collectively share the total available airtime bandwidth of the link.  This is depicted in Figure 3.

Figure 3:  Examples of point-to-point and point-to-multipoint networks utilizing WDS Bridge links.

The WDS Bridge links are statically established, so that each WDS Bridge AP only accepts connections from pre-defined radios.  WDS Bridge usually requires dedicated hardware at each remote location operating on independent channels, though some APs allow for one radio (typically the 5 GHz) to be in WDS Bridge mode and the other radio (typically the 2.4 GHz) to be in AP mode to provide Wi-Fi service to client devices.

Best Practice:  WDS Bridge with dedicated 5 GHz only access points is generally recommended for most networks requiring both wireless backhaul and high bandwidth and/or high user capacity Wi-Fi.   While each hop adds latency, there is no throughput or user capacity degradation, since the point-to-(multi)point backhaul link is solely dedicated to wireless backhaul, with Wi-Fi access for client devices being handled by separate access points.

For large networks consisting of multiple remote nodes, a WDS Bridge backhaul network requires its own design effort to ensure appropriate bandwidth capacity and channel utilization.  WDS Bridge mode is generally only available on standalone APs, meaning that each AP must be configured individually and cannot be managed or monitored from a centralized controller.  WDS Bridge mode is available on all EnGenius® Electron™ and EnStation™ access points.

Point-to-Multipoint WDS Bridge Network Example

Figure 4 shows an example of an outdoor Wi-Fi network at an RV park utilizing point-to-(multi)point links to provide wireless backhaul to APs mounted on light poles.  The colored lines indicate the point-to-(multi)point WDS Bridge links implemented with EnGenius® EnStationAC access points. 

Figure 4:  Example of a wireless network utilizing point-to-(multi)point links for backhaul to outdoor wireless APs.

Red markers indicate the location of outdoor dual-band APs, and yellow markers indicate the location of additional light poles that were available at the property.  To maximize wireless backhaul capacity, all of the WDS Bridge links utilized 80 MHz channels in the UNII-2 and UNII-2e bands (i.e. DFS channels 52-64, 100-112, and 116-128).  The 5 GHz radios on the dual-band APs were set to use 40 MHz channels on the UNII-1 and UNII-3 bands (i.e. channels 36-40, 44-48, 149-153, and 157-161), so as to avoid co-channel interference with the point-to-multipoint backhaul network.
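
As a sanity check on a channel plan like the one above, the backhaul and access channel blocks can be converted to frequency ranges and tested for overlap. This is a minimal sketch with hypothetical helper names; it relies only on the standard 5 GHz numbering, in which channel n is centered at 5000 + 5n MHz and is 20 MHz wide.

```python
def block_edges_mhz(first_channel: int, last_channel: int) -> tuple[int, int]:
    """Frequency edges of a bonded block of 20 MHz channels in the 5 GHz band."""
    # Channel n is centered at 5000 + 5*n MHz and extends 10 MHz to each side.
    return 5000 + 5 * first_channel - 10, 5000 + 5 * last_channel + 10


def overlaps(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if two ranges share spectrum (edges that merely touch do not count)."""
    return a[0] < b[1] and b[0] < a[1]


backhaul_blocks = [(52, 64), (100, 112), (116, 128)]          # 80 MHz WDS Bridge links
access_blocks = [(36, 40), (44, 48), (149, 153), (157, 161)]  # 40 MHz AP radios

conflicts = [
    (bh, ap)
    for bh in backhaul_blocks
    for ap in access_blocks
    if overlaps(block_edges_mhz(*bh), block_edges_mhz(*ap))
]
print(conflicts)  # an empty list means the two plans do not share spectrum
```

Note that channels 44-48 and 52-64 are adjacent (they meet at 5250 MHz) but do not overlap, so the check treats them as compatible.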

Option 4:  Mesh Networks

In a mesh network, the AP uses its own radio to provide a wireless backhaul to other APs on the network, eventually reaching an AP with a wired Ethernet connection to the wired backhaul infrastructure and the network.  In this sense, a mesh network is a network of repeaters, though mesh is designed to operate automatically and more intelligently on a large scale.  A mesh network creates a set of “dynamic WDS Bridge” links, using routing algorithms to automatically calculate the optimal wireless path through the network back to a wired root node.  This makes mesh networks relatively robust to the failure of an individual AP; in a process referred to as “self-healing”, the routing algorithms will automatically calculate the “next best” path through the network if an AP in the path goes offline.  Since the routing functions are done automatically within the mesh software, mesh networks are actually fairly straightforward to set up and are thus scalable to cover large geographic areas.  All wired Layer 2 information (i.e. client MAC addresses, VLANs, etc.) is preserved across the mesh link.  Examples of mesh networks are shown in Figure 5 (for home / SOHO environments) and Figure 6 (for larger campus-wide environments).

Figure 5:  An example of a home / SOHO mesh network, utilizing EnGenius® EMR3000 mesh routers.

Figure 6:  An example of a large campus mesh network, utilizing EnGenius® EWS1025CAM mesh cameras.

The mesh network control architecture can either be centralized or distributed.  With a centralized control architecture, an AP controller is required to calculate and coordinate the mesh parameters for each AP.  This architecture, however, limits the scalability of the mesh network to the capacity of the AP controller.   In a distributed control architecture, such as the EnGenius® Neutron™ series and EMR3000 product, each AP operationally acts like a router, continuously sharing information about its connection status to its neighbors, and each AP uses this information to compute its own optimal mesh path. In a distributed architecture, an AP controller can be optional, though is generally extremely useful in providing centralized real-time monitoring of the mesh network, as well as establishing the core initial mesh network parameters, such as mesh ID, encryption, etc.

Unfortunately, mesh networks have significant limitations, most notably in the loss of throughput and user capacity, which scales geometrically as the number of wireless hops increases, as well as the increase in latency, which scales linearly as the number of wireless hops increases.

Accordingly, mesh networks are not suitable for high bandwidth or latency-sensitive applications.  Because of these performance limitations, it is generally recommended that mesh networks be avoided unless no other viable backhaul options are available. Mesh networks should only be used in environments where providing Ethernet data wiring to access points or cameras is impossible or cost-prohibitive. 

Mesh networks were originally trendy in the mid-2000s, as a way of both providing metropolitan Wi-Fi coverage as well as coverage for large outdoor properties where wiring was prohibitively expensive, such as RV parks, garden-style apartment complexes, marinas, etc.  While many mesh networks were successfully deployed, most of these efforts ultimately failed, especially in metropolitan Wi-Fi.  Early mesh networks relied upon single-radio APs on 2.4 GHz using 802.11g.  When dual-band APs were introduced, only 802.11a was available on the 5 GHz band, which still led to very low throughputs as the number of hops increased.  

With the wide adoption of dual-band access points with 802.11ac, there has been renewed interest in mesh for both Wi-Fi access and surveillance applications.  Accordingly, several startup companies, as well as established vendors like EnGenius®, have introduced mesh Wi-Fi products utilizing 802.11ac.  While the data rates of 802.11ac are approximately 25 times larger than the 802.11a data rates of a decade ago, the number of client devices and their bandwidth demands have also grown exponentially during that time.  The fundamental limitations of mesh networks are therefore still the same, and thus mesh may ultimately again prove to be a passing fad.

Nonetheless, mesh networks are the only viable option in many cases.  The sections below highlight how to best design and deploy mesh networks, so as to maximize their performance and mitigate their inherent limitations.

Mesh Network Terminology and Best Practices

The access points in a mesh network are categorized as either root nodes or remote nodes:
  • Root Node (a.k.a. Gateway Node):  This is an access point with a wired connection to the wired switch infrastructure.  The remote nodes establish wireless backhaul connections to the root node.  Note that the wired connection utilized by a root node can either be (1) a direct Ethernet or fiber-optic connection to the wired switch infrastructure or (2) a wired connection to a separate WDS Bridge wireless point-to-(multi)point link on an independent channel.
  • Remote Node:  This is an access point without a wired Ethernet connection.  Backhaul to the network is established via a wireless connection to a root node or to other remote nodes.  Note that the remote AP still requires electrical power, so an Ethernet connection to a PoE injector is common, though the “network” end of the PoE injector may not be connected at all or may only be connected to a wired client device, such as an IP camera.

The path from a particular remote node back to a particular root node can require connections via multiple intermediate remote nodes, and each wireless link in this chain is referred to as a hop.  The mesh routing algorithm selects the optimal route through the network.  The optimization function used by the mesh APs is generally proprietary to each AP vendor, but typically attempts to balance several, often conflicting, parameters, such as the following:
  1. Minimize the number of hops, so as to minimize the total wireless latency and throughput penalty of the network
  2. Maximize the signal strength of each hop, so as to maximize the achievable Wi-Fi data rates between the mesh radios on each hop.  For maximum data rates in 802.11ac, the received signal strength indicator (RSSI) would ideally be in the -40 dBm to -50 dBm range, though this is usually unachievable in practice since omni-directional antennas are typically used to create the widest field of view to neighboring APs.  The RSSI should be above -65 dBm for decent data rate performance between hops.
  3. Balance the load on each AP, so as to account for the number of associated client devices and the total throughput consumption on each AP.  The throughput load stacks as the number of hops increase, so intermediate remote nodes that are heavily utilized with client traffic will not give as many resources to downstream remote nodes. 

Because of the competing tradeoffs in this optimization process, mesh networks can often result in counter-intuitive and/or sub-optimal topologies.
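
Since the actual optimization functions are proprietary, the following Python sketch is purely illustrative: a made-up cost function that weighs the three factors listed above (hop count, per-hop signal strength, and AP load). The `Hop` class, the `path_cost` function, and all of the weights are assumptions chosen for the example, not values from any vendor.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    rssi_dbm: float  # received signal strength on this hop
    load: float      # 0.0 (idle) to 1.0 (saturated) on the upstream AP


def path_cost(hops: list[Hop],
              hop_weight: float = 10.0,
              rssi_weight: float = 1.0,
              load_weight: float = 5.0,
              rssi_floor_dbm: float = -65.0) -> float:
    """Lower is better. The weights are illustrative, not vendor values."""
    cost = hop_weight * len(hops)
    for hop in hops:
        # Penalize each dB by which the hop falls below the -65 dBm guideline.
        cost += rssi_weight * max(0.0, rssi_floor_dbm - hop.rssi_dbm)
        # Penalize routing through heavily loaded upstream APs.
        cost += load_weight * hop.load
    return cost


direct_but_weak = [Hop(rssi_dbm=-80.0, load=0.2)]
two_strong_hops = [Hop(rssi_dbm=-55.0, load=0.2), Hop(rssi_dbm=-58.0, load=0.1)]

print(path_cost(direct_but_weak), path_cost(two_strong_hops))
```

With these particular weights, the two-hop path scores lower (better) than the weak direct link, which is one way a mesh can settle on a topology that looks counter-intuitive on a site map.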

Best Practice:  The network design should cluster the APs into groups consisting of up to four remote nodes that are only one hop away from a root node.  Thus, at least 20% of your APs, distributed roughly evenly throughout the property, should be root nodes.  Each remote node is therefore nominally only one hop away from a root node.  In the event of a failure of a root node, the nearby remote nodes will then only be 2-3 hops away from another root node.  This approach generally requires creating additional root nodes, which can be done either by running Ethernet or fiber-optic cable to the particular remote locations, or by establishing dedicated point-to-(multi)point WDS Bridge links to create “wireless wires” from the root AP back to the wired network.
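
The 20% figure follows directly from the group size: one root plus up to four remotes is a group of five. A small sketch of that arithmetic, using a hypothetical helper `min_root_nodes`:

```python
import math


def min_root_nodes(total_aps: int, max_remotes_per_root: int = 4) -> int:
    """Fewest root nodes so that no root serves more than `max_remotes_per_root`."""
    if total_aps <= 0:
        return 0
    group_size = max_remotes_per_root + 1  # one root plus its remote nodes
    return math.ceil(total_aps / group_size)


for total in (5, 12, 40):
    roots = min_root_nodes(total)
    print(total, roots, f"{roots / total:.0%}")
```

For AP counts that are not a multiple of five, the share of root nodes rises above 20% (for example, 3 roots out of 12 APs is 25%). This count is only a lower bound, since the roots must also be distributed evenly across the property.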

Best Practice:  Each root node should be set on a static independent channel, and each remote node should be set to “auto channel”.  This is done to maximize the airtime capacity of the overall network, so that multiple neighboring root nodes do not create self-interference.  The remote nodes are set to auto-channel so that they can fail over to a different root node in the event of the failure of their primary root node.  When utilizing point-to-(multi)point WDS Bridge links to establish root nodes, these must also be on static independent channels, and thus must be accounted for in the overall channelization plan.

Both root nodes and remote nodes can generally operate in one of two modes:
  • Mesh AP Mode: In this mode, the wireless radio acts like a repeater, providing both Wi-Fi connectivity to client devices as well as providing a backhaul connection to one or more remote APs.  For single-band mesh access points, this is the only operational mode available.  For dual-band access points, one of the bands (typically 5 GHz) will be configured to operate in this mode.  The other band (typically 2.4 GHz) will be exclusively for providing Wi-Fi connectivity to client devices.  Note that both Wi-Fi bands depend upon the mesh radio for backhaul.  Since the mesh radio must spend half its time providing connectivity to client devices and half its time providing backhaul, the data capacity of the mesh radio for both backhaul and for Wi-Fi client connectivity is reduced by 50%.   When there are multiple hops, the data capacity is reduced by 50% per hop.  Thus, for two hops, the total data capacity is only 1/4, for three hops it is 1/8, for four hops it is 1/16, and so forth.  
  • Mesh Point Mode:  In this mode, available only in dual-band APs, the wireless mesh radio (typically 5 GHz) only provides wireless backhaul, and the other radio (typically 2.4 GHz) only provides Wi-Fi connectivity to client devices. Operationally, the mesh radio operates like a dynamic WDS bridge link, so while each hop still introduces latency which adds linearly, there is no 50% throughput penalty per hop, since the mesh radio is not also servicing client devices on the same radio and can be devoted exclusively to backhaul.  Since Wi-Fi access to client devices is restricted to only one radio (typically 2.4 GHz), the overall client capacity of the AP is that of a single-band AP.  Furthermore, even dual-band 802.11ac client devices will only be able to connect at 802.11n data rates on the 2.4 GHz radio.
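
The difference between the two modes can be put side by side in a small calculation. This is a simplified sketch: the function name, the 400 Mbps link rate, and the 2 ms per-hop latency are placeholder assumptions, and real links will vary with signal strength and load.

```python
def mesh_path_estimate(link_rate_mbps: float, hops: int, mode: str,
                       per_hop_latency_ms: float = 2.0) -> tuple[float, float]:
    """Return (backhaul throughput in Mbps, added latency in ms) for a mesh path."""
    if mode == "mesh_ap":
        # Radio is shared between clients and backhaul: capacity halves per hop.
        throughput = link_rate_mbps * 0.5 ** hops
    elif mode == "mesh_point":
        # Radio is dedicated to backhaul: no per-hop halving.
        throughput = link_rate_mbps
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Latency adds linearly with hop count in both modes.
    return throughput, per_hop_latency_ms * hops


for mode in ("mesh_ap", "mesh_point"):
    print(mode, mesh_path_estimate(400.0, hops=3, mode=mode))
```

Under these assumptions, three hops in Mesh AP mode leave 50 Mbps of the original 400 Mbps, while Mesh Point mode keeps the full backhaul rate; the added latency is the same in both cases.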

Best Practice:  Mesh APs should generally be configured to operate in Mesh Point mode.  The loss of bandwidth capacity from lacking 5 GHz client connectivity is minor compared to the loss of bandwidth capacity from losing 50% of bandwidth per hop.  This also allows for the transmit power of the mesh radios to be set at their maximum value, so as to provide the maximum signal strength between nodes without being imbalanced with the low transmit power capability of most 5 GHz client devices.

In both operational modes, the overall data capacity of a mesh AP is reduced as compared to the same AP operating in a conventional configuration with a wired Ethernet connection to a wired switch infrastructure.  Accordingly, a mesh Wi-Fi network will never have the same level of throughput and client capacity as a conventional Wi-Fi network.

Mesh Network Example

Figure 7 shows an example mesh network deployed using the Best Practices highlighted above.  This is an RV park with 437 spaces spread across a roughly 2000’ x 1000’ area.  The main distribution frame (MDF) is in the southwest corner of the property, and trees in parts of the property preclude direct line-of-sight to many locations.

Figure 7:  Example of a mesh network, utilizing point-to-multipoint links to create additional root nodes.

The red links and bubbles indicate WDS Bridge links from the MDF to each of the root APs.  In some cases, multiple WDS Bridge links in series need to be established.  The point to point links are designated by Master or Slave with a letter and number index.  (For example, the WDS Bridge link going between the MDF and G8-R is designated link D, with [Master D] connected to [Slave D1]).

The other colors and bubbles represent the root and remote APs in Mesh Point mode, and the nominal mesh links between the remote APs and the root APs.  In the figure, each group is designated with a group number and an index to indicate that it is a root node or remote node.  (For example, on the right, the root node is designated [G8-R] and the nominal remote nodes are designated [G8-1] to [G8-4].)

The point-to-(multi)point WDS Bridge links utilize 80 MHz channels on the UNII-2 and UNII-2e bands (i.e. channels 52-64, 100-112, and 116-128).  Each root AP is set to a static 40 MHz channel on the 5 GHz band in the UNII-1 and UNII-3 bands (i.e. channels 36-40, 44-48, 149-153, and 157-161).

Wednesday, March 22, 2017

When Do You Trust a Wi-Fi Predictive Model? The Battle of Accuracy vs. Precision

Most non-engineers tend to use the terms accuracy and precision interchangeably, but they are actually very distinct concepts.    Precision is based on the computational power of the software and the underlying mathematics.  Most mathematical models (such as Wi-Fi predictive modeling software using ray traces to compute absorption and reflections of walls and objects, as well as free space path loss over distance) will generally provide a high level of precision.  Accuracy is based on the level of complexity (and underlying assumptions) of the mathematical model, as well as the quality of the input parameters.

One of the tricks of performing any type of engineering modeling or simulation in Wi-Fi (or, for that matter, in any engineering discipline) is that you first need to estimate; i.e. you need to know what the answer should be BEFORE you actually start the model, at least to a rough order of magnitude.  The engineering model serves only to add both accuracy and precision to your estimation.  Without having a good estimate, however, the results of the predictive model can be extremely precise, while simultaneously being wholly inaccurate.

The true art and skill of wireless design, therefore, is in understanding the initial estimate. A Wi-Fi predictive model, like any mathematical model, is a very sophisticated idiot!  The model will precisely compute what you tell it to. However, if the inputs to the model are wrong, the model doesn’t know that, and so the model will compute a very precise, and very wrong, answer.  Another common way of saying this is "garbage in, garbage out".
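
A tiny numerical illustration of "precise but wrong", sketched in Python. The link budget below is deliberately oversimplified (transmit power minus free space path loss minus a flat per-wall loss, with antenna gains ignored), and every input value is hypothetical. The only standard ingredient is the free space path loss formula for distance in km and frequency in MHz.

```python
import math


def fspl_db(distance_m: float, freq_mhz: float) -> float:
    """Free space path loss in dB (distance in meters, frequency in MHz)."""
    return 20 * math.log10(distance_m / 1000) + 20 * math.log10(freq_mhz) + 32.44


def predicted_rssi_dbm(tx_dbm: float, distance_m: float, freq_mhz: float,
                       wall_loss_db: float, walls: int) -> float:
    """Very simplified prediction: transmit power minus path and wall losses."""
    return tx_dbm - fspl_db(distance_m, freq_mhz) - wall_loss_db * walls


# Same geometry, two different guesses for the per-wall loss (both hypothetical).
as_modeled = predicted_rssi_dbm(20, 15, 5500, wall_loss_db=3, walls=3)
as_built = predicted_rssi_dbm(20, 15, 5500, wall_loss_db=10, walls=3)

print(f"{as_modeled:.4f} dBm vs {as_built:.4f} dBm")
print(f"difference: {as_modeled - as_built:.1f} dB")
```

Both results are printed to four decimal places, yet they differ by 21 dB purely because of the wall loss that was typed in. The arithmetic is equally precise in both cases; only the input decides whether it is accurate.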

It should also be noted that there are also underlying assumptions in the model itself in the way it works that will impact and limit both precision and accuracy.  For Wi-Fi predictive modeling packages (e.g. Ekahau, Tamograph), the following assumptions and simplifications are typically used:
  • uniform walls with known dB loss values and known reflectivity percentages
  • the models only account for absorption and reflectivity, and generally not diffraction, scattering, or other effects (and when using the “attenuation zones”, the model does not even calculate reflectivity, but only attenuation as a dB/ft loss coefficient).  This is done to keep computation times reasonable
  • we generally ignore the effects of furniture, wall decorations, appliances, mirrors, people, etc
  • the antenna signal propagation patterns modeled are based on either measurements or design predictions of the antenna manufacturer, which will have its own underlying errors and assumptions
  • we are generally only positioning / placing the access points with an accuracy of several feet at best, and these placements may change slightly during installation, based on how cabling is run

Thus, there are always intrinsic errors and simplifying assumptions in any model.  For most purposes, however, this limited level of accuracy is quite sufficient, so long as these simplifying assumptions are “reasonable”. We generally don’t need an exact model that predicts the absolutely correct signal level at every spot that we will get in the environment. What we need is a model that is “close enough”, so as to tell us how many APs are required, along with their locations and critical settings, such as channel and transmit power.  We can easily be off by +/- 5 dB in any particular location, and still it would be sufficient for most purposes.

So How Does One Create an Initial Estimate?

Unfortunately, the art of good estimation only comes with a lot of experience, usually a lot of negative experience where a project has gone horribly, horribly wrong and you need to fix it, often by trial and error.  

That said, there are several guidelines that can be used to point an engineer in the right direction to develop a reasonable initial estimate.   For Wi-Fi design, these are the guidelines I employ:

  1. Understand Your Requirements:  How is the network going to be used?  Approximately how many, and what type, of client devices?  Where are they going to be located?  What areas do (or do not) require signal coverage?  How many years is this network going to be deployed, and how is usage likely to change over that time?   

    These requirements vary substantially by vertical market, and can vary even by individual project.  Most Wi-Fi engineers tend to specialize in one vertical market or a small set of closely related vertical markets, so trends on previous projects will typically be applicable to your current one.  If, like me, your specialty is "SMB" which spans multiple vertical markets, you need to develop a good understanding of how Wi-Fi gets used across many different environments.

  2. Understand Your Constraints:  What are the building materials?  Can APs be deployed in rooms or can they only go in the hallways?  What kind of budget is allowed on this project?  Are there other Wi-Fi networks or non-Wi-Fi sources of interference in the environment?   Are aesthetics a concern (nobody likes seeing external antennas)? 

    Your requirements define what the network has to do.  Your constraints are what the network has to work around.  The distinction is subtle but important.  Requirements are always independent of each other and are generally inviolate. Constraints can be highly coupled and, in many cases, self-contradictory, but are also potential areas of compromise and push-back.

  3. Understand Your Solutions:  Wi-Fi engineers generally have one, or at most a few, preferred Wi-Fi vendors that they use in most deployments, and a smaller subset of products from that vendor or vendors that they use.  Every product will have its own capabilities, limitations, and idiosyncrasies. Understanding how these products have behaved on previous projects (especially if challenges had to be overcome) will help in knowing how many will be needed on the next project.

    Most AP vendors will provide and promote a set of Best Practices, guidelines, case studies, etc. on how to best deploy their products in various scenarios.  It is important that the Wi-Fi engineer be able to separate the marketing hype from the technical capabilities of a product, and know how to best tune the configuration settings (i.e. "nerd knobs") for that product for their environment.

  4. Think Creatively; Don't be Constrained by Your Previous Solutions: This may seem contradictory with the prior point, but this is a key aspect of any successful Wi-Fi design.  My graduate thesis advisor, Professor Nam P. Suh (Mechanical Engineering Department Chair at MIT and later President of KAIST), was a controversial advocate of using a "clean sheet of paper" for every engineering design project, no matter the engineering discipline.   He promoted fully understanding your requirements and constraints up front, and then being unconstrained in terms of picking the right solution for the right problem.   Unfortunately, this approach is harder to achieve in practice than it sounds.

    AP vendors are specialized and generally target particular verticals - an AP or vendor that is very good for one environment is often very poorly suited for another.   This can work both ways, in terms of either lacking key features or having too many features and too much complexity, with a correspondingly high price tag. However, a Wi-Fi engineer (and the organization he or she works for) usually is heavily invested in particular vendors with education, experience, and infrastructure. Switching AP vendors can often be a logistical nightmare that most organizations won't employ unless forced to.  

    That said, there are often creative solutions that don't require a radical departure to a new vendor. The use of external directional antennas in some environments can provide some unique solutions that may be appropriate in some environments, especially in areas such as outdoor parks, parking lots, marinas, warehouses, etc.  Additionally, if you understand the "nerd knobs" (pursue your CWNA and beyond if interested), some settings can be tuned for particular environments to optimize performance.  

  5. Establish Good "Rules of Thumb": It can be useful to start with simple (and even "overly simple") assumptions to establish a starting point, so long as it is understood that this is ONLY a starting point and the estimate will need to be refined from there based on actual requirements and constraints.  

    A very good rule of thumb in Wi-Fi is to start by using a fixed transmit power level for 2.4 GHz and 5 GHz (I personally like using 14 dBm for 2.4 GHz and 20 dBm for 5 GHz in most environments), so that all of the APs have roughly the same coverage area on both bands and are not too much stronger than the transmit power of typical client devices such as smartphones and tablets.  This allows the APs to be spaced roughly evenly in the environment, making the development of a static channel plan simpler.

    For example, a common question I get is "how much area in square feet will an access point cover"?  The answer to this is not simple, as it depends upon building materials, building geometry, whether 2.4 GHz or 5 GHz is being discussed, expected client density, etc.   As one colleague of mine puts it, "Wi-Fi is not a can of paint!"   That said, a square footage estimate for a design primarily driven by coverage (vs. high capacity) can be a good starting point.  For an indoor environment (e.g. offices, private homes) with standard drywall, one omni-directional AP per 1500 - 2000 sq ft is not unreasonable.  Similarly, for open outdoor environments, an area of approximately 10,000 - 15,000 sq ft (i.e. a radius of approximately 50' - 75')  would be reasonable.
  6. The Estimate and the Model Must be Consistent: While one wants to make the initial estimate as good as possible, if we could estimate extremely accurately up front, there would be no need for predictive modeling.  You should expect that your estimate will be off by 20% - 25%.  The intent of the predictive model is to refine your estimate in terms of its accuracy and precision.  If the estimate and the predictive model are wildly divergent, then you have a problem somewhere and need to figure it out.

    For example, if you estimate that a project will require 10 APs, and the predictive model tells you that you need 8, or 12, then your estimate was pretty good.   If your model tells you that you only need 5 APs, or that you need 20 APs, then either there is a fundamental flaw in your estimate, or there is a fundamental flaw in your model or its inputs.   
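
The rules of thumb in guidelines 5 and 6 reduce to a few lines of arithmetic. The sketch below uses hypothetical helper names and a hypothetical 6,000 sq ft office; the 25% tolerance is taken from the upper end of the 20% - 25% range mentioned above, and the area figures are starting points only, not a substitute for a design.

```python
import math


def ap_count_estimate(area_sq_ft: float, sq_ft_per_ap: float) -> int:
    """Coverage-driven starting estimate: one AP per `sq_ft_per_ap`."""
    return math.ceil(area_sq_ft / sq_ft_per_ap)


def coverage_radius_ft(sq_ft_per_ap: float) -> float:
    """Radius of a circle with the given area."""
    return math.sqrt(sq_ft_per_ap / math.pi)


def is_consistent(estimate: int, model: int, tolerance: float = 0.25) -> bool:
    """True if the model's AP count is within `tolerance` of the estimate."""
    return abs(model - estimate) <= tolerance * estimate


# Hypothetical 6,000 sq ft drywall office at 2,000 and 1,500 sq ft per AP.
print(ap_count_estimate(6000, 2000), ap_count_estimate(6000, 1500))
# Outdoor rule of thumb: 10,000 - 15,000 sq ft per AP expressed as a radius.
print(round(coverage_radius_ft(10_000)), round(coverage_radius_ft(15_000)))
# The 10-AP example: models of 8 and 12 agree, models of 5 and 20 do not.
print([is_consistent(10, m) for m in (8, 12, 5, 20)])
```

The office works out to 3 or 4 APs, the outdoor areas correspond to radii of roughly 56' and 69' (inside the 50' - 75' range quoted above), and the consistency check flags the 5-AP and 20-AP model results as needing investigation.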

An Example of Predictive Modeling Gone Awry

This is an actual case that came up a few weeks ago to provide Wi-Fi coverage in a 300’ x 300’ RV park with approximately 30-35 RV trailers, no trees, and excellent line of sight everywhere.  

In this case, our sales agent estimated and already sold the property six outdoor omni-directional access points.  The customer then asked us where to place the access points.  The customer was planning on installing 20' poles and running fiber from the MDF at the clubhouse to the poles, but needed to know where the poles should be placed.  My initial reaction was that this estimate was reasonable (and perhaps even slightly high) for such a small space.  I didn’t need the model to tell me this, but rather I know that from applying several years of experience deploying these types of networks.  

However, when I handed this to a junior engineer to model, the engineer came back with a predictive model solution requiring 12 APs, and even then, the coverage was marginal, especially on 5 GHz.

2.4 GHz predicted signal coverage (original model)

5 GHz predicted signal coverage (original model)

From prior experience deploying RV parks, I took one look at this and knew the answer was wrong. Quantifying WHY it was wrong, however, took some detective work.  In the end, this came down to misunderstandings of requirements and constraints, as well as how things should be modeled:
  1. This model assumed 11 dBm power levels on both bands, which we typically only use for high density deployments when many APs need to be co-located in one space for handling user capacity.  With only 30 RVs, even with 3-4 devices per RV, high device density is clearly not a requirement.   

  2. This model assumed that the APs were going to be mounted between the RVs at a height of about 4', and not on poles approximately 12' - 15' in the air, several feet above the tops of the RVs.

  3. The RVs themselves were modeled as solid blocks of metal using the area attenuation zones.  This mode of modeling assumes just an attenuation loss of dB / ft, and not any reflectivity of signal.  While the RVs themselves have an exterior shell of metal or fiberglass, these are thin and have large windows through which the RF can penetrate.  It is my general assertion, therefore, that the walls of the RVs can be ignored from a modeling perspective.  Thus, we are essentially covering an empty field.  Even if you debate the wisdom of this assertion, the next simplifying assumption is to model the RVs as thin, hollow shells of metal that reflect 90% of the signal, not solid metal blocks that do not reflect at all.

Correcting for these misunderstandings and ignoring the outer shell of the RVs leads to only requiring four access points.

5 GHz predicted signal coverage (open field, 20 dBm transmit power)

One can argue that the outer shells of the RVs, even if thin, will cause appreciable signal attenuation, and that for appropriate talkback of client devices in the RVs to the APs, more APs would be required.  This can be accommodated by lowering the transmit power levels of the APs, say to 15 dBm, in which case six APs are needed.

5 GHz predicted signal coverage (open field, 15 dBm transmit power)

Personally, I would have used two outdoor APs with sector antennas on the main building to cover the entire RV park, with one indoor AP to provide coverage inside the main building, since it would be in the shadow of the sector antennas.  Such a solution would’ve been both cheaper and easier to install; not only are there fewer APs, but there is no need for any poles or fiber runs.   

5 GHz predicted signal coverage using sector antennas

Given the constraint that the customer already purchased the outdoor APs, and we felt we had already confused the customer enough, we presented the six AP solution with the appropriate configuration settings.

Aside from the 12 AP solution which had flawed requirements and assumptions, none of the other design options presented here are “wrong”.  There are several approaches to doing a design, and therefore several different solutions that will work, i.e. meet the true requirements and expectations of the customer, including adequate coverage, adequate capacity, minimal co-channel interference, etc. Different design solutions are thus “better” or “worse” in comparison to each other, usually based on parameters like cost, ease of installation, ease of maintainability, etc.  

There is the old adage “If all you have is a hammer, everything looks like a nail.”  The design approach you take and the solution you select will generally be constrained by the potential solutions you have to work with (i.e. the types and models of APs, antennas, etc.).   Nonetheless, understanding your requirements and constraints up front, along with the use of rules of thumb and experience, is essential to creating an initial estimate of a design solution, which then can be plugged into a predictive model for refinement.

Friday, February 10, 2017

MU-MIMO is Dead! MIMO and MU-MIMO Explained

The 802.11ac story is quite muddy, somewhat deliberately so by those of us in the business of selling people new hardware, so this blog will try to deconstruct it for general understanding.  MIMO (Multiple-Input Multiple-Output) and MU-MIMO (Multi-User MIMO) are actually very different technologies and are independent of each other.   

As will be shown, MU-MIMO requires quite a lot of things to go "just right" in order to work at all, and without a large client base of support, it is a technology very high on hype and very low on practical deployment performance.


MIMO capability was introduced with 802.11n.  MIMO allows for multiple parallel spatial streams to increase total bandwidth between a single transmitter and single receiver.  MIMO capabilities are specified as follows:

Number of transmit radios  x  Number of receive radios :  Number of spatial streams

There are variations on MIMO where these numbers can be different.  However, to maximize throughput (which is what we are all obsessed with as an industry), the number of transmit radios = the number of receive radios = the number of independent spatial streams.

MIMO communication works in both directions (i.e. from AP to client device and from client device to AP), but the maximum number of spatial streams between the access point and the client device is always limited by whichever device supports FEWER spatial streams.  Thus, if you have a three-stream AP (3x3:3) and a two stream client device (2x2:2), you are only ever going to get two spatial streams, and the last stream from the AP is unused.   Most client devices are only capable of single stream (1x1:1) or dual stream (2x2:2); more streams require more radios and more antennas, which consume more space and more battery, both of which are quite antithetical to the design of smartphones and tablets.   There are a small number of high-end laptops (e.g. MacBook Pros) that support three streams.   
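
The "fewer streams wins" rule can be expressed as a short Python sketch.  The class and function names here are made up for illustration; only the TxR:S notation and the min() rule come from the text above.

```python
import re
from typing import NamedTuple

class MimoConfig(NamedTuple):
    tx_radios: int
    rx_radios: int
    spatial_streams: int

def parse_mimo(spec: str) -> MimoConfig:
    """Parse the 'TxR:S' notation, e.g. '3x3:3'."""
    match = re.fullmatch(r"(\d+)x(\d+):(\d+)", spec.strip())
    if match is None:
        raise ValueError(f"not in TxR:S form: {spec!r}")
    return MimoConfig(*(int(group) for group in match.groups()))

def usable_streams(ap_spec: str, client_spec: str) -> int:
    """The link is limited by whichever side supports fewer spatial streams."""
    ap = parse_mimo(ap_spec)
    client = parse_mimo(client_spec)
    return min(ap.spatial_streams, client.spatial_streams)

print(usable_streams("3x3:3", "2x2:2"))  # 2: the AP's third stream goes unused
print(usable_streams("3x3:3", "1x1:1"))  # 1: typical single-stream smartphone
```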

The maximum theoretical half duplex data rate in 802.11ac for MIMO is as follows.   Note, this is the number that all vendors advertise, but these numbers are NOT data throughput.  Under very ideal conditions (single AP and single client device very close to each other, no interference, etc.), the maximum achievable throughput will be approximately 40% of the half duplex data rate.   This is a property of how the IEEE 802.11 protocol works.   Keep in mind that most real-world conditions are also far from ideal, so your actual throughput numbers are usually going to be much lower still.

  • 1x1:1 – 433 Mbps half-duplex data rate (MCS 9)
  • 2x2:2 – 867 Mbps half-duplex data rate (MCS 9)
  • 3x3:3 – 1300 Mbps half-duplex data rate (MCS 9)
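
For readers who want to see where these numbers come from, the per-stream rate can be reproduced from the 802.11ac PHY parameters for an 80 MHz channel at MCS 9 with a short guard interval (234 data subcarriers, 256-QAM, rate 5/6 coding, 3.6 microsecond symbols).  The sketch below is illustrative only; the 40% factor is the rough best-case approximation quoted above, not a measured value, and the function names are made up.

```python
DATA_SUBCARRIERS_80MHZ = 234   # data subcarriers in an 80 MHz 802.11ac channel
BITS_PER_SUBCARRIER = 8        # 256-QAM carries 8 bits per subcarrier
CODING_RATE = 5 / 6            # MCS 9 coding rate
SYMBOL_TIME_US = 3.6           # 3.2 us symbol + 0.4 us short guard interval
THROUGHPUT_FRACTION = 0.40     # rough best-case fraction quoted in the text

def phy_rate_mbps(streams):
    """Half-duplex data rate; bits per microsecond is the same as Mbps."""
    bits_per_symbol = DATA_SUBCARRIERS_80MHZ * BITS_PER_SUBCARRIER * CODING_RATE
    return streams * bits_per_symbol / SYMBOL_TIME_US

def best_case_throughput_mbps(streams):
    """Approximate ideal-conditions throughput using the 40% rule of thumb."""
    return THROUGHPUT_FRACTION * phy_rate_mbps(streams)

for streams in (1, 2, 3):
    rate = phy_rate_mbps(streams)
    tput = best_case_throughput_mbps(streams)
    print(f"{streams}x{streams}:{streams}  rate {rate:.1f} Mbps, "
          f"best-case throughput about {tput:.0f} Mbps")
```

In other words, even an ideal single-stream link advertised at 433 Mbps would deliver on the order of 170 Mbps of actual throughput under this rule of thumb.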


MU-MIMO was introduced in 802.11ac wave 2, and is intended to better handle very high density client device environments (e.g. college lecture halls, convention centers, auditoriums, concert venues, etc.).   It is a technique that does not increase raw half-duplex data rates, but rather allows the AP to literally talk to multiple client devices simultaneously.  

It utilizes a technique called transmit beamforming to directionalize an omni-directional signal by inducing deliberate phase offsets of the same signal transmitted across multiple antennas.   This effect creates zones of constructive interference (where the signals are in phase and thus add linearly) and destructive interference (where the signals are out of phase and nullify each other).   
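
The effect of deliberate phase offsets can be illustrated with an idealized model.  The sketch below assumes a uniform linear array of identical omni-directional antennas in free space with half-wavelength spacing; it is a textbook simplification, not how any particular AP chipset computes its beamforming weights, and the function name is made up.

```python
import cmath
import math

def array_response(num_antennas, spacing_wavelengths, steer_deg, look_deg):
    """Magnitude of the combined signal seen in direction `look_deg` when the
    per-antenna phase offsets are chosen to steer toward `steer_deg`.
    Angles are measured from broadside (perpendicular to the antenna row)."""
    steer = math.radians(steer_deg)
    look = math.radians(look_deg)
    total = 0j
    for n in range(num_antennas):
        # Phase shift caused by the extra path length to antenna n.
        path_phase = 2 * math.pi * spacing_wavelengths * n * math.sin(look)
        # Deliberate offset applied by the transmitter to cancel it at `steer`.
        applied_phase = -2 * math.pi * spacing_wavelengths * n * math.sin(steer)
        total += cmath.exp(1j * (path_phase + applied_phase))
    return abs(total)

# Four antennas, half-wavelength spacing, steered straight ahead (0 degrees).
for angle in (0, 15, 30, 60):
    print(f"{angle:>2} deg: {array_response(4, 0.5, 0, angle):.3f}")
```

In this model, the four signals add fully in phase in the steered direction (a combined amplitude of 4), while at 30 degrees off axis they cancel to essentially zero, which is the kind of null used to keep one client's transmission away from another client.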

This technique only works uni-directionally downstream from the access point to the client device(s).  In order to determine the correct phase shifts, the access point must also know the position of each client device, relative to itself.  Thus, the client device must support a feedback mechanism – the AP periodically sends out “sounding frames” from each antenna, and each MU-MIMO capable client will respond with a matrix indicating how well it hears the signal from each antenna, and those values are then used by the AP to compute relative position of the client devices.  

There are three key elements for MU-MIMO to work properly:

  • The clients in the same MU-MIMO group must all support the position feedback mechanism. This means the client devices must utilize 802.11ac wave 2 chipsets.  Older chipsets are not upgradable, as the core function is built into the chipset hardware itself.
  • The clients in the same MU-MIMO group need to be physically far apart from each other, at least angularly, relative to the AP.  This is so that the signal for each client can be maximized at the correct client’s position and nullified at the other client positions.   It won’t work for client devices that are only a very small distance (i.e. less than 10-20 feet) away from the access point, or are too close to each other in radial direction from the perspective of the AP.
  • The time required to transmit to each client should be (approximately) the same.  This means each client should have a similar amount of data and be able to use a similar data rate.  They don’t need to precisely match, but the efficiency of this technique is vastly reduced if one client in the MU-MIMO group takes significantly longer to talk to than the other client devices.
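
The third point can be made concrete with a deliberately simplified model: a MU-MIMO transmission occupies the air for as long as its slowest client needs, whereas sending to the same clients one after another takes the sum of the individual times.  The sketch below ignores sounding overhead, acknowledgements, and padding, and the payload sizes and data rates are hypothetical.

```python
def airtime_us(payload_bytes, data_rate_mbps):
    """Time on air in microseconds (Mbps is the same as bits per microsecond)."""
    return payload_bytes * 8 / data_rate_mbps

def mu_mimo_gain(clients):
    """Airtime saved by grouping, as a ratio of sequential time to grouped time.
    `clients` is a list of (payload_bytes, data_rate_mbps) tuples."""
    times = [airtime_us(payload, rate) for payload, rate in clients]
    return sum(times) / max(times)

balanced = [(1500, 433), (1500, 433), (1500, 433)]
lopsided = [(1500, 433), (1500, 433), (15000, 433)]

print(f"balanced group gain: {mu_mimo_gain(balanced):.2f}x")  # close to 3x
print(f"lopsided group gain: {mu_mimo_gain(lopsided):.2f}x")  # close to 1.2x
```

When one client has ten times as much data as the others, nearly all of the benefit of grouping three clients disappears, even before any protocol overhead is counted.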

Thus, there are several things that have to go “just right” in order for MU-MIMO to even do anything.

The 802.11ac wave 2 MU-MIMO access points all currently have four antennas, which enables them to talk to up to three separate clients.  (The direction of the last spatial stream cannot be controlled independently).   What few MU-MIMO capable clients exist are typically only single-stream or dual-stream.   A USB adapter such as the Edimax EW-7822ULC is a MU-MIMO capable client device, in that it supports the required position feedback, but it is only a single spatial stream device with a single antenna, hence its half-duplex data rate of 433 Mbps.  

The reality is that very few client devices actually support MU-MIMO.  The only smartphone that supported it to date was the Samsung Galaxy Note 7, which had unrelated incendiary problems and is no longer available.  The iPhone 7 had been rumored to support it, but when that product was finally released by Apple, the capability was quietly omitted.   In fact, the only client devices I know about that support MU-MIMO are USB dongles like the Edimax EW-7822ULC.  Ironically, this is not terribly useful, since the technique of MU-MIMO is intended for very dense client environments (e.g. auditoriums, college lecture halls, stadiums, convention centers, etc.), which by definition are dominated by smartphone and tablet users.  Thus, while we and every other vendor actively sell and promote 802.11ac wave 2 access points (and client device dongles), the truth is that MU-MIMO is both unnecessary and unusable for most SMB applications.

Preview: Tri-Band

Many vendors, including EnGenius, are in the process of launching a new AP product known as “tri-band” that incorporates a 2.4 GHz 802.11n radio and two 5 GHz 802.11ac radios, and can therefore communicate simultaneously (and bi-directionally) with two 5 GHz client devices, without the need for any additional support by the client devices.  Thus, the product is automatically backwards-compatible with all 5 GHz 802.11n and 802.11ac client devices.  We anticipate this product to be much better suited to high-density client environments, since it can talk to two 5 GHz client devices, bi-directionally, all the time, vs. MU-MIMO which can talk to up to three 5 GHz client devices, downstream only, sometimes.  Watch this space...

Monday, December 12, 2016

Why “Shiny and New” is Bad for SMB Wi-Fi

In our society of the 30-second sound bite, we market our Wi-Fi, as we market most technological innovations, on being shiny and new.   For years, everything in Wi-Fi has been about size and speed – the fastest access point, the most antennas, the largest number of simultaneous clients, the latest technology, etc. 

In Wi-Fi, “shiny” is essentially all about speed.   We are an industry obsessed with faster and faster data rates, pushing more and more data through the unbound medium of radio frequency:  
  • In 802.11n, we introduced MIMO, which allows multiple data streams to be transmitted simultaneously from different antennas.  These different streams are heard by all of the antennas on the receiver, and then mathematically separated to reconstruct the data streams.  While this technique works, it requires a lot more processing capabilities on both the transmitter and the receiver, and it is the limit of what we can do to increase raw speeds.  
  • Next came the need for a throughput fix, and the industry focus on improving airtime efficiency to achieve faster total data rates.   In 802.11ac wave 2, we use even more complex mathematical techniques to do multi-user MIMO (MU-MIMO), where an access point literally transmits independent messages to multiple receiver clients simultaneously.   
  • The standard currently in development by the IEEE, 802.11ax, doubles down on Wi-Fi complexity to provide even more airtime efficiencies with bi-directional MU-MIMO and overlapping channel utilization.

Clearly these innovations come at a price: complexity. And when we increase the complexity of a system, we fundamentally make it more fragile and less robust.  When a system violates the KISS rule (keep it simple, stupid), it is more prone to errors due to normal environmental variations and less likely to be able to handle and recover from problems elegantly.   

When a system is more complex, there are more things that can (and do) go wrong, and the more the system relies upon everything working correctly to even function.  The “overhead” in Wi-Fi is not present by chance – it is architected into the 802.11 protocol to ensure that information is transmitted and received reliably.  When you attempt to simultaneously increase complexity and reduce overhead, you increase the number of problems that can occur and decrease your ability to handle and recover from potential problems that may develop.  Intuitively, therefore, a more complex system is much easier to break and is less reliable than a simple system.

SMBs, and Their Providers, Pay the Price

While this need for new and shiny has driven amazing technological innovation, it has long since surpassed the actual performance needs of most Wi-Fi networks, especially in the small-to-medium business (SMB) sector.  While the SMB sector covers a lot of different vertical markets, the fundamental requirements for a Wi-Fi network in this space are as follows:
  • Ubiquitous and stable coverage
  • Reliable performance
  • Moderate numbers of simultaneous users and bandwidth
  • Low failure rates (high MTBF) & fast repair times (low MTTR)

Note the use of the terms “reliable,” “stable,” and “moderate,” and the requirements for high MTBF and low MTTR.   Fundamentally, SMB Wi-Fi needs to “just work.”  It should not be variable, it should not fail very often, and if/when it does fail, it should be really easy and quick to get back online again.  
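
The interplay between MTBF and MTTR can be put into numbers with the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR).  The figures below are hypothetical and chosen only to show why repair time matters as much as failure rate; the function names are made up.

```python
HOURS_PER_YEAR = 8760

def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the network is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def downtime_hours_per_year(mtbf_hours, mttr_hours):
    """Expected hours of outage per year for the given MTBF and MTTR."""
    return (1 - availability(mtbf_hours, mttr_hours)) * HOURS_PER_YEAR

# Hypothetical network that fails about twice a year (MTBF of 4380 hours).
for label, mttr in (("remote fix, 4 hour repair", 4), ("truck roll, 48 hour repair", 48)):
    print(f"{label}: {availability(4380, mttr):.4%} available, "
          f"about {downtime_hours_per_year(4380, mttr):.0f} hours down per year")
```

With the same failure rate, the slower repair path turns roughly a working day of downtime per year into roughly four days.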

These most critical SMB requirements are not only inconsistent with, but diametrically opposed to, new and shiny Wi-Fi.   SMB owners need to know they can rely upon their Wi-Fi networks, and the people who install and service them. Period.

For the managed service providers (MSPs) who deliver IT installation and maintenance services, reliability and ease of service for Wi-Fi networks are also fundamental requirements.   An MSP requires three things to be successful with Wi-Fi:
  • A robust and reliable network that essentially takes care of itself
  • Repeatability across multiple customers
  • Recurring monthly service monitoring fees  

MSPs make money by being able to install a network quickly and not receive excessive maintenance calls once it’s installed. When problems do occur, they need to be able to resolve them quickly, preferably without a costly truck roll to the customer site.

Recognition of these dynamics is making its way to the core of managed Wi-Fi services. New providers such as Uplevel Systems have designed managed services platforms specifically for the needs of IT consulting firms needing to automate configuration and remote management as much as possible. By combining feature-rich hardware and cloud-based monitoring and troubleshooting, Uplevel equips consultants with varying degrees of familiarity with Wi-Fi to provision secure and robust services with a few choice shiny and new “nice to haves” like easily managed guest access. 

With storage, backup, security and VPNs available on the same platform, the basic value proposition for MSPs is “Simplify, upsell, repeat.” The appeal of keeping it simple, while still delivering better quality than consumer-grade technology, proves ideal for SMBs where all that glitters is more red tape than gold. Or green. 

Monday, November 28, 2016

Proper Wi-Fi Design and Deployment: Required Even In the Home

As more and more wireless appliances and streaming media applications permeate the home, the reliability of the Wi-Fi network as the infrastructure for this traffic becomes critical to performance.  This is especially challenging in a multi-vendor environment. While every vendor  technically follows the IEEE 802 networking standards (which are supposed to provide for interoperability), there is the “standard” and then there is the reality.  

Most importantly, the more intense networking demands of a home network REQUIRE a properly designed and integrated system, which is quite antithetical to the conventional consumer “plug and play” approach. 

Architecturally, a home LAN can be broken into the following functions.  Each of these functions is independent of the others, and thus can generally and safely be satisfied by different equipment from different vendors:

  1. Modem:  Converting the Internet connection from the street (coax cable, fiber, DSL, satellite, etc.) into Ethernet
  2. Router:  Defining the local area network (LAN) and providing routing and NAT functionality between the WAN and the LAN
  3. Switch(es):  Providing interconnectivity between wired network devices on the LAN, including infrastructure devices (e.g. routers, APs, other switches) and wired client devices.
  4. Access Points:  Providing multiple wireless client devices access to the wired LAN

Several companies provide physically-integrated equipment that provides multiple functions (e.g. a cable modem Wi-Fi router which provides all four functions in one box), but these should be treated as separate functions.  Within a particular function, it is generally a bad idea to mix and match vendors.  This is especially true in the “access point” function, because there is quite a lot happening “under the hood”, and while all of the vendors are following the IEEE 802.11 specs, they are all doing it somewhat differently, which makes co-existence problematic and, at the very least, difficult to do well.   Thus, if you are installing a managed Wi-Fi system, you want to remove or disable any other third party APs installed by the customer or other service providers to the extent possible.   Co-located Wi-Fi systems will interfere with each other and reduce Wi-Fi capacity, even if they are not in “active use”.

For the modem, the general recommendation is to have the ISP put their appliance in “bridge mode”, which shuts off all but the modem function.  Some ISPs have a harder time of doing this than others in practice.   At a minimum, the built-in access point in the cable modem should be disabled.  The built-in AP is invariably only consumer grade and has very few knobs to turn in terms of performance tuning, making it inappropriate for the performance challenges of larger home Wi-Fi networks, especially those that require multiple APs for both coverage and capacity. 

The built-in router of a cable modem can be used reasonably safely if VLANs are not required.  If VLANs are required, you are definitely best off putting the cable modem into bridge mode and using a wired-only SMB Enterprise router (and thus avoiding a double NAT scenario which lowers performance for some appliances like gaming consoles).  EnGenius currently does not make this type of product, though it is on our long term roadmap.  The SonicWALL SOHO is a good example: the standalone version (i.e. without the content filtering or AV licenses) is fairly inexpensive and quite capable, though it is certainly difficult to program.  There are several other vendor products on the market in this category.  The key features you want to look for are VLAN support, DHCP, and dynamic DNS.  Firewalls are also useful in many home and SMB applications.

For the switches and APs, I would recommend that you look into EnGenius’s Neutron product line.  The APs are all centrally managed, which makes it easier for you as a service provider to provision and maintain, and can be managed either from the local EWS switch or from a hosted cloud-based server called ezMaster.

Within the AP for the home, there are three key factors to keep in mind:
  1. Placement:   APs should be placed as close as possible to the client devices, with as little physical structure (e.g. walls) as possible.  In a multi-AP deployment, APs should be placed as far apart from each other as possible, with as much physical structure in between as possible.  This is true three-dimensionally, so stacking in hallways from floor to floor should absolutely be avoided.
  2. Transmit Power:   Most smartphones and tablets have very weak transmitters, on the order of 100x less powerful than access points.  If the APs are set to their maximum power settings, they can create a false sense of coverage.  The client devices at the far end of the coverage area will hear the APs (since the AP is “shouting”) but the AP can have a lot of difficulty hearing the client devices (since the clients are “whispering”).   Coverage should generally be more balanced, and thus turning down the transmit power of the APs will help to ensure more balanced bi-directional communication.  Additionally, 5 GHz does not travel as far as 2.4 GHz and suffers more attenuation when passing through walls and objects, thus the 2.4 GHz power generally needs to be 6 dB lower than the 5 GHz power to have roughly the same area of coverage.  I usually start at 14 dBm for 2.4 GHz and 20 dBm for 5 GHz, tuning from there based on the environment.  I also generally do not recommend “auto power”, as this changes the coverage area over time and can create either gaps in coverage or co-channel interference with neighboring APs.
  3. Channel:  Neighboring APs will overlap and will interfere with each other, unless the neighboring APs are put on static / non-overlapping channels.  Auto-channeling is a very hard optimization problem, and I have yet to see any vendor do it well, despite marketing claims.  APs should always be put on static / non-overlapping channels with the channel scheme staggered as much as possible.
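
The power figures in the Transmit Power item are easier to reason about with a couple of conversions.  The sketch below converts dBm to milliwatts and uses the standard free-space path loss formula to show why roughly 6 dB of offset between the bands is a sensible starting point.  The channel centre frequencies (2437 MHz for channel 6, 5500 MHz for channel 100) and the 10 m distance are illustrative choices, and free space ignores walls, which attenuate 5 GHz further.

```python
import math

def dbm_to_mw(dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)

def ratio_to_db(power_ratio):
    """Express a power ratio (e.g. 100x) in decibels."""
    return 10 * math.log10(power_ratio)

def fspl_db(distance_m, freq_mhz):
    """Free-space path loss in dB for a distance in metres and frequency in MHz."""
    return 20 * math.log10(distance_m / 1000) + 20 * math.log10(freq_mhz) + 32.44

print(f"20 dBm = {dbm_to_mw(20):.1f} mW")
print(f"14 dBm = {dbm_to_mw(14):.1f} mW")
print(f"a 100x weaker client transmitter is {ratio_to_db(100):.0f} dB down")

# Extra loss at 5 GHz relative to 2.4 GHz over the same free-space path.
band_gap_db = fspl_db(10, 5500) - fspl_db(10, 2437)
print(f"5 GHz loses about {band_gap_db:.1f} dB more than 2.4 GHz in free space")
```

The free-space gap works out to about 7 dB regardless of distance, which is in line with the starting points of 14 dBm on 2.4 GHz and 20 dBm on 5 GHz.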

Ironically, there are several new startups on the market offering home mesh products (e.g. Eero).  These are marketed as a “plug and play” approach for consumers to provide larger areas of coverage by placing more APs and then not requiring Ethernet cabling to interconnect them, instead using the Wi-Fi itself as a backhaul.  These APs also inevitably rely upon auto-channel and auto-power in an attempt to keep things simple for the uneducated user.  These products, however, follow the old mantra of providing “coverage”, when in reality a home network needs to provide adequate “capacity” with room for growth.  The problem with the fundamental approach is that mesh reduces throughput by 50% per hop (spending ½ the time servicing customers, ½ the time for backhaul to a connected AP).  Such APs fundamentally cannot meet the performance (bandwidth and channel utilization) demands of a high-performance network, and generally should be avoided in networks where high performance and large client capacity are driving requirements.
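
The 50% per hop penalty compounds quickly.  A minimal sketch, assuming a single shared radio per mesh AP and a hypothetical 100 Mbps of usable throughput at the wired root AP:

```python
def mesh_throughput_mbps(root_throughput_mbps, hops, per_hop_factor=0.5):
    """Usable throughput after `hops` wireless hops, halving at each hop."""
    return root_throughput_mbps * per_hop_factor ** hops

for hops in range(4):
    print(f"{hops} hop(s): {mesh_throughput_mbps(100, hops):.1f} Mbps")
```

Under these assumptions, clients three hops from the wired AP are left with one eighth of the root throughput, before any interference or contention is considered.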

The core problem in home networking lies in the fact that ISP modem / routers are consumer products, and not very capable for more complex or more high-demanding networks.  I’ve run into the issue myself where you cannot practically put the modems into bridge mode because the ISP service people cannot handle it over time – even if you get them to configure it in bridge mode, a firmware upgrade or service issue generally breaks that and the ISP returns their modem to their default configuration. Additionally, if the customer is using their VoIP product, you are stuck with their modem.    

Unfortunately, you are usually stuck with whatever cable modem hardware the ISP provides, and every ISP is going to pick something different.  Most ISPs are fairly large behemoths who don’t care about interoperability with any other vendor’s equipment, and have no real incentive to provide compatibility for services that they undoubtedly view as (current or future) competition.   That is a business reality that is not going to change anytime soon. 

The best you can do, therefore, is to work around it.   I’ve done installations where I’ve literally sealed cable modems in steel boxes to block the Wi-Fi signal from it.  More commonly, I’ve also done installations where you just set the cable modem to some random SSID, fix the channel on 2.4 GHz and 5 GHz, and don’t use it for connecting any client devices.  You set up your real Wi-Fi network with the proper equipment and avoid conflicts on the fixed 2.4 GHz and 5 GHz channel of the cable modem, and basically treat the modem as a 3rd party rogue AP. 
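
Treating the modem as a fixed rogue AP fits naturally into a static channel plan.  The sketch below is a simple greedy heuristic for the 2.4 GHz band, not any vendor's algorithm; the AP names, the neighbor map, and the modem's channel are all hypothetical.

```python
NON_OVERLAPPING_24GHZ = (1, 6, 11)

def assign_channels(neighbors, reserved=None, channels=NON_OVERLAPPING_24GHZ):
    """Greedy static channel plan.

    neighbors: dict mapping each AP name to the set of APs it can hear
    reserved:  dict of AP name -> channel that is already fixed and must
               not be changed (e.g. the ISP modem's built-in AP)
    """
    plan = dict(reserved or {})
    # Handle the APs with the most neighbors first; they are hardest to place.
    for ap in sorted(neighbors, key=lambda name: len(neighbors[name]), reverse=True):
        if ap in plan:
            continue
        in_use = [plan[n] for n in neighbors[ap] if n in plan]
        # Choose the channel used by the fewest already-assigned neighbors.
        plan[ap] = min(channels, key=in_use.count)
    return plan

neighbors = {
    "modem": {"ap1", "ap2"},
    "ap1": {"modem", "ap2"},
    "ap2": {"modem", "ap1", "ap3"},
    "ap3": {"ap2"},
}
plan = assign_channels(neighbors, reserved={"modem": 6})
for name in sorted(plan):
    print(name, plan[name])
```

In this small example every AP ends up on a different channel from each neighbor it can hear, and the modem stays on its fixed channel.  With only three non-overlapping channels on 2.4 GHz, larger or denser layouts will eventually force channel reuse, which is where AP placement and transmit power come back into play.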

The entire Wi-Fi industry has spent nearly 20 years telling consumers how “easy” it is to install Wi-Fi, yet simultaneously made the protocol more complicated (and thus more sensitive and fragile) in order to cheat the RF physics to squeeze throughput performance.   Performance relies upon establishing and maintaining better control over equipment choice, locations, channel, and transmit power settings.  I concur that the situation in home network environments is fairly untenable, and the introduction of more Wi-Fi appliances and infrastructure products in the consumer space only continues to make the situation worse. 

The reality is that a network needs to be designed, controlled, and properly maintained if you want high performance out of it, and that means a careful mix and match (and control) of vendor products for the modem, router, switch, and APs with proper configuration.  

There is no magic bullet.