Tuesday, April 28, 2020

Observations on Why You Should Avoid Auto-Channel and Auto-Power

Many AP vendors tout features meant to make Wi-Fi design and deployment easier for you by incorporating algorithms that are supposed to automatically and dynamically adjust both the transmit power and the channel on each band. These are generally referred to as auto-power and auto-channel, respectively. Cisco packages them as "Radio Resource Management (RRM)", and other vendors use different names. Whatever the branding, these algorithms are all designed to do the same thing, namely optimize the AP power and AP channel settings on each band.

I will avoid doing any "vendor bashing" in this post. I have worked, or currently work, with most of the major AP vendors, and many of them devote armies of engineers who are legitimately working hard to come up with good solutions. Naturally, some vendors do auto-channel and auto-power better or worse than others. All of that said, having worked for over 12 years in Wi-Fi designing and deploying well over 1000 separate networks with equipment from numerous AP vendors, I have yet to see any one of them actually get it "correct".

There are no IEEE 802.11 standards on how to perform auto-channel and auto-power, so every vendor has their own "secret sauce" for doing this. In this context, "secret sauce" should be defined, and the best definition of Wi-Fi "secret sauce" I've encountered comes from a tweet from Daniel Johnson (@TheRFWrangler) on March 27, 2020 at 8:54 am CDT. It was in the context of a conversation on roaming, but it is also applicable here and on several other Wi-Fi topics. The relevant portion of the tweet, and the definition I use for "secret sauce", is as follows: "[An AP vendor's] secret sauce [is] their special blend of bugs and bastardized standards"

Auto-channel is actually a very hard nonlinear optimization problem, nearly impossible to solve algorithmically. Each AP only knows about the immediate neighbors it can “hear” via RF, so it does not have the perspective of the whole network. Thus, when one AP makes a change, it ripples to its neighbors, which ripple to their neighbors, and before you know it, changing the channel on one AP triggers channel changes on all APs across the network.

Many vendors have therefore focused their efforts in recent years on having the controller collect the RF information from all of the APs on the network. Once collected, the controller can then attempt to optimize the channels for each AP across the whole network. This is an approach touted prominently in the marketing literature of several vendors.  Most vendors have only met with “limited success” in practice. Furthermore, such algorithms generally require the computing power of a dedicated hardware controller or cloud controller, so this is generally beyond the ability of "controller-less networks" where one AP on the network acts as a local controller for management and telemetry, though this may vary by vendor.

While this centralized channel optimization approach makes sense conceptually, it is actually a much harder problem than it sounds. The fundamental problem is that the central controller doesn’t know anything about walls or other structures in the environment, or how the APs are actually placed in the environment relative to one another. Key information is simply missing. Without such knowledge, the full set of data collected from the APs tends to be misleading if not outright contradictory, which makes the optimization very hard in practice.

Transmit power levels are also coupled to channelization. If you change the transmit power on any AP, the channelization solution you have is likely no longer optimized. I’ve seen more than one vendor with an auto-power scheme that fails epically even on small networks. In such cases, one AP is driven to its maximum allowed power while all of its neighbors are driven to their minimum allowed power. Unless there are tight thresholds set (which is not always a feature available from certain vendors), this can create a large imbalance in transmit power levels, leading to roaming issues in central overlapping areas as well as dead spots on the edge of your network. Large networks are more likely to disguise this effect, but a 3-4 AP network demonstrates this failure quite vividly. Other vendors will just operate their APs at maximum transmit power, even with “auto power” enabled.

As with channels, each AP only knows about its immediate neighbors and can only control itself. Many auto-power algorithms have an implicit assumption of symmetry: an AP assumes that neighboring APs hear it as well as it can hear its neighbors. This is very much not true in practice, even for simple networks. In such cases, if an AP hears its neighbor over a certain RSSI threshold, it lowers its own volume. The neighbor, however, is still just as loud, so the AP lowers its own volume further, until the transmit power of the AP reaches its minimum threshold. Whichever AP happens to engage its auto-power algorithm “last” is not aware of a problem, since all of its neighbors are at sufficiently low (perhaps even TOO low) power levels, and thus the last AP continues to merrily blast away at maximum power. A better approach would be for an AP hearing its neighbor too loudly to tell its neighbor to "shut up" (i.e. lower its volume), with the neighbor actually listening to and acting upon that request, telling the first AP to also "shut up" if necessary. If the APs are reasonably evenly spaced out in a regularly laid out environment (e.g. hotel, MDU, office building, etc.), such an approach would serve to drop all APs to a reasonably uniform and moderate transmit power level. Theoretically, a controller-based algorithm collecting this data from all APs on a network could potentially figure this out. However, again the data collected from large numbers of APs could be misleading or contradictory, depending on the actual environment and the actual (non-)uniformity of the placement of the APs.
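To make the failure mode concrete, here is a toy simulation of the two behaviors described above. It is only a sketch under simplifying assumptions that are mine and not any vendor's actual algorithm: the APs sit in a straight line, each AP hears only the AP on either side, the path loss between adjacent APs is a uniform 70 dB, and the power limits, step size, and RSSI threshold are made-up values.

```python
# Toy model only: all constants below are hypothetical illustration values.
MAX_DBM, MIN_DBM, STEP_DB = 20, 2, 3
THRESHOLD_DBM = -62   # "neighbor is too loud" RSSI threshold (assumed)
PATH_LOSS_DB = 70     # loss between adjacent APs (assumed uniform)


def too_loud_neighbors(tx_power, ap):
    """Return the adjacent APs that `ap` hears above the RSSI threshold."""
    loud = []
    for other in (ap - 1, ap + 1):
        if 0 <= other < len(tx_power):
            if tx_power[other] - PATH_LOSS_DB > THRESHOLD_DBM:
                loud.append(other)
    return loud


def self_adjusting(num_aps):
    """Each AP lowers ITS OWN power while it hears a loud neighbor."""
    tx = [MAX_DBM] * num_aps
    for ap in range(num_aps):  # APs engage the algorithm one after another
        while too_loud_neighbors(tx, ap) and tx[ap] > MIN_DBM:
            tx[ap] = max(MIN_DBM, tx[ap] - STEP_DB)
    return tx


def cooperative(num_aps):
    """Each AP asks a loud NEIGHBOR to lower its power, and the neighbor obeys."""
    tx = [MAX_DBM] * num_aps
    changed = True
    while changed:
        changed = False
        for ap in range(num_aps):
            for neighbor in too_loud_neighbors(tx, ap):
                if tx[neighbor] > MIN_DBM:
                    tx[neighbor] = max(MIN_DBM, tx[neighbor] - STEP_DB)
                    changed = True
    return tx


print("self-adjusting:", self_adjusting(4))
print("cooperative:   ", cooperative(4))
```

In this toy model, the self-adjusting scheme leaves the first three APs at minimum power and the last AP at maximum, while the cooperative scheme settles every AP at the same moderate level. Real environments are not this uniform, which is exactly why the real algorithms have a harder time.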

Ironically, as a Wi-Fi network designer with a full view of the network, setting up a static channel and static transmit power plan is actually straightforward. First, you set all of your APs to static transmit power levels, then you alternate your channels on 2.4 GHz and 5 GHz, both horizontally and vertically. Granted, in some environments it is not quite that simple, and occasionally transmit power levels need to be tweaked to boost (or reduce) coverage areas, necessitating another iteration of the channels. Nonetheless, it turns out to be much easier for a properly-trained human to do it with full perspective of the network and the environment than for a computer algorithm lacking both intuition and key pieces of the puzzle.

Philosophically, there are four "degrees of freedom" (i.e. design knobs) that are available in a Wi-Fi design: (1) selection of the AP make and model, including its embedded internal antenna or the type of external antenna if the make/model allows, (2) location of the APs in the environment, (3) channel per band, and (4) transmit power per band. Selection of appropriate channel and transmit power is therefore really part of the design process, and thus part of the control we have as Wi-Fi engineers over the quality of that design. When electing to use auto-channel and auto-power, a designer cedes control of those two design knobs to an algorithm that itself varies by AP vendor, model, and firmware version.

My personal recommendation, based on years of painful experience, is that one should always use static channel and static transmit power for any predictive model on all projects. In Ekahau, the APs are all based on static power settings of 8 dBm and 14 dBm when you first select them, no matter the AP vendor or model, and you have to go and explicitly change each one if you want to use different values. Channels default to 1 and 36 for everything in Ekahau, but are easily adjusted. I know Ekahau has a channelization feature, though I’ve never used it and admittedly I’m too “old school” to ever trust it. That said, I’m “open to being convinced” that such an algorithm could provide a reasonable starting point.

Still, one does not actually need Ekahau or any other software model to create a good static channel plan. I learned how to set static channels appropriately on both 2.4 GHz and 5 GHz years before I was ever exposed to predictive modeling software like TamoGraph or Ekahau. Spacing out the APs evenly, setting a consistent and moderate transmit power across all APs, and alternating channels horizontally and vertically on each band is usually sufficient, especially for properties with “regular layouts”. Predictive modeling software is admittedly really useful for fine-tuning your static channel and transmit power plan so as to minimize or eliminate self-interference, but it is not strictly necessary for many environments.

It is also true that channelization takes a fair amount of time and effort, especially on large projects with hundreds or thousands of APs. For “regular layouts”, there are methods of doing the vertical channel staggering in a fixed pattern: figure out the best pattern on one floor and then come up with a floor-by-floor alternating scheme, thus letting a spreadsheet compute the channels for the remaining floors. This can save a fair amount of time and effort, but the result is also something you always want to look at and check for sanity. Additionally, one does not want all of that work being handed to a competitor in a bidding environment, so it makes sense to go through the channelization exercise only AFTER winning the deal.
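The spreadsheet approach can be sketched in a few lines of Python. This is a minimal illustration, not a tool I am claiming anyone uses: the function names are hypothetical, it assumes the APs are stacked in the same positions on every floor, and it rotates the one-floor pattern by one channel per floor. The sanity check only looks at the AP immediately beside and immediately above each AP, so diagonal neighbors still need a human eye.

```python
def stagger_floors(base_floor, channels, num_floors, shift=1):
    """Rotate each AP's channel by `shift` positions in `channels` per floor."""
    index = {ch: i for i, ch in enumerate(channels)}
    plan = []
    for floor in range(num_floors):
        plan.append([
            channels[(index[ch] + floor * shift) % len(channels)]
            for ch in base_floor
        ])
    return plan


def sanity_check(plan):
    """List (floor, position, direction) wherever adjacent APs share a channel."""
    conflicts = []
    for f, row in enumerate(plan):
        for p, ch in enumerate(row):
            if p + 1 < len(row) and row[p + 1] == ch:
                conflicts.append((f, p, "horizontal"))
            if f + 1 < len(plan) and plan[f + 1][p] == ch:
                conflicts.append((f, p, "vertical"))
    return conflicts


# Hand-designed 2.4 GHz pattern for one floor, then computed for three floors.
plan = stagger_floors([1, 6, 11, 1, 6], [1, 6, 11], num_floors=3)
for floor_number, row in enumerate(plan, start=1):
    print("floor", floor_number, row)
print("conflicts:", sanity_check(plan))
```

The same function works for 5 GHz by passing a longer channel list, which leaves more room to keep diagonal neighbors apart as well.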