Friday, February 10, 2017

MU-MIMO is Dead! MIMO and MU-MIMO Explained

The 802.11ac story is quite muddy, somewhat deliberately so by those of us in the business of selling people new hardware, so this blog will try to deconstruct it for general understanding.  MIMO (Multiple-Input Multiple-Output) and MU-MIMO (Multi-User MIMO) are actually very different technologies and are independent of each other.

As will be shown, MU-MIMO requires quite a lot of things to go "just right" in order to work at all, and without a large client base of support, it is a technology very high on hype and very low on practical deployment performance.

MIMO

MIMO capability was introduced with 802.11n.  MIMO allows for multiple parallel spatial streams to increase total bandwidth between a single transmitter and single receiver.  MIMO capabilities are specified as follows:

Number of transmit radios  x  Number of receive radios :  Number of spatial streams


There are variations on MIMO where these numbers can be different.  However, to maximize throughput (which is what we are all obsessed with as an industry), the number of transmit radios = the number of receive radios = the number of independent spatial streams.

MIMO communication works in both directions (i.e. from AP to client device and from client device to AP), but the maximum number of spatial streams between the access point and the client device is always limited by whichever device supports FEWER spatial streams.  Thus, if you have a three-stream AP (3x3:3) and a two stream client device (2x2:2), you are only ever going to get two spatial streams, and the last stream from the AP is unused.   Most client devices are only capable of single stream (1x1:1) or dual stream (2x2:2); more streams require more radios and more antennas, which consume more space and more battery, both of which are quite antithetical to the design of smartphones and tablets.   There are a small number of high-end laptops (e.g. MacBook Pros) that support three streams.   

The maximum theoretical half-duplex data rates in 802.11ac for MIMO are as follows; a short sketch after the list shows how these numbers are derived.  Note, these are the numbers that all vendors advertise, but they are NOT data throughput.  Under very ideal conditions (a single AP and a single client device very close to each other, no interference, etc.), the maximum achievable throughput will be approximately 40% of the half-duplex data rate.   This is a property of how the IEEE 802.11 protocol works.   Keep in mind that most real-world conditions are also far from ideal, so your actual throughput numbers are usually going to be much lower still.

  • 1x1:1 – 433 Mbps half-duplex data rate (MCS09)
  • 2x2:2 – 867 Mbps half-duplex data rate (MCS09)
  • 3x3:3 – 1299 Mbps half-duplex data rate (MCS09)
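
For those who like to check the math, here is a minimal Python sketch of where these rates come from, using the standard VHT 80 MHz / MCS9 parameters (234 data subcarriers, 256-QAM at 5/6 coding, and 3.6 µs symbols with the short guard interval):

```python
# Back-of-the-envelope check of the 802.11ac MCS9 rates quoted above.
# Assumes an 80 MHz channel (234 data subcarriers), 256-QAM (8 bits per
# subcarrier), 5/6 coding rate, and a 3.6 us symbol (short guard interval).

def vht_data_rate_mbps(spatial_streams,
                       data_subcarriers=234,
                       bits_per_subcarrier=8,   # 256-QAM
                       coding_rate=5 / 6,
                       symbol_time_us=3.6):     # short guard interval
    bits_per_symbol = data_subcarriers * bits_per_subcarrier * coding_rate
    return spatial_streams * bits_per_symbol / symbol_time_us  # Mbps

for n in (1, 2, 3):
    print(f"{n}x{n}:{n} -> {vht_data_rate_mbps(n):.0f} Mbps")
# 1 -> 433, 2 -> 867, 3 -> 1300 (the 3-stream rate is rounded to 1299/1300)
```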

MU-MIMO

MU-MIMO was introduced in 802.11ac wave 2 and is intended to better handle very high density client device environments (e.g. college lecture halls, convention centers, auditoriums, concert venues, etc.).   It is a technique that does not increase raw half-duplex data rates, but rather allows the AP to literally talk to multiple client devices simultaneously.

It utilizes a technique called transmit beamforming to directionalize an omni-directional signal by inducing deliberate phase offsets of the same signal transmitted across multiple antennas.   This effect creates zones of constructive interference (where the signals are in phase and thus add linearly) and destructive interference (where the signals are out of phase and nullify each other).   
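
A toy model makes the idea concrete.  This is not the actual 802.11ac steering math (which operates on full channel matrices), just two idealized antennas and a deliberate phase offset:

```python
import cmath

# Toy model of transmit beamforming: two antennas sending the same
# carrier, with a deliberate phase offset applied to the second antenna.
# The received amplitude at a given point depends on how the two copies
# combine: in phase they add, 180 degrees out of phase they cancel.

def received_amplitude(phase_offset_rad):
    a1 = cmath.exp(1j * 0.0)               # signal from antenna 1
    a2 = cmath.exp(1j * phase_offset_rad)  # phase-shifted copy from antenna 2
    return abs(a1 + a2)

print(round(received_amplitude(0.0), 6))       # 2.0 -> constructive (beam peak)
print(round(received_amplitude(cmath.pi), 6))  # 0.0 -> destructive (null)
```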



This technique only works uni-directionally, downstream from the access point to the client device(s).  In order to determine the correct phase shifts, the access point must also know the position of each client device relative to itself.  Thus, the client device must support a feedback mechanism – the AP periodically sends out “sounding frames” from each antenna, and each MU-MIMO capable client responds with a matrix indicating how well it hears the signal from each antenna; those values are then used by the AP to compute the relative positions of the client devices.



There are three key elements for MU-MIMO to work properly:

  • The clients in the same MU-MIMO group must all support the position feedback mechanism. This means the client devices must utilize 802.11ac wave 2 chipsets.  Older chipsets are not upgradable, as the core function is built into the chipset hardware itself.
  • The clients in the same MU-MIMO group need to be physically far apart from each other, at least angularly, relative to the AP.  This is so that the signal for each client can be maximized at the correct client’s position and nullified at the other client positions.   It won’t work for client devices that are only a very small distance (i.e. less than 10-20 feet) away from the access point, or are too close to each other in radial direction from the perspective of the AP.
  • The time required to transmit to each client should be (approximately) the same.  This means each client should have a similar amount of data and be able to use a similar data rate.  They don’t need to match precisely, but the efficiency of this technique is vastly reduced if one client in the MU-MIMO group takes significantly longer to talk to than the other client devices (the toy model after this list illustrates the effect).
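
To see why the third requirement matters, here is a back-of-the-envelope model (my own illustration, not anything from the 802.11 spec) of how much airtime a MU-MIMO group wastes when one member is slow:

```python
# Illustrative model: a downstream MU-MIMO transmission lasts as long as
# its slowest member, so airtime is wasted when one client needs much
# longer than the others.

def group_efficiency(airtimes_ms):
    longest = max(airtimes_ms)
    used = sum(airtimes_ms)
    return used / (len(airtimes_ms) * longest)

print(round(group_efficiency([4.0, 4.0, 4.0]), 2))   # 1.0  -> well-matched group
print(round(group_efficiency([4.0, 4.0, 20.0]), 2))  # 0.47 -> one slow client wastes airtime
```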




Thus, there are several things that have to go “just right” in order for MU-MIMO to even do anything.

The 802.11ac wave 2 MU-MIMO access points all currently have four antennas, which enables them to talk to up to three separate clients.  (The direction of the last spatial stream cannot be controlled independently.)   The few MU-MIMO capable clients that exist are typically only single-stream or dual-stream.   The Edimax EW-7822ULC USB adapter, for example, is a MU-MIMO capable client device, in that it supports the required position feedback, but it is only a single spatial stream device with a single antenna, hence the half-duplex data rate of 433 Mbps.

The reality is that very few client devices actually support MU-MIMO.  To date, the only smartphone to support it was the Samsung Galaxy Note 7, which had unrelated incendiary problems and is no longer available.  The iPhone 7 had been rumored to support it, but when that product was finally released by Apple, the capability was quietly omitted.   In fact, the only client devices I know about that support MU-MIMO are USB dongles like the Edimax EW-7822ULC.  Ironically, this is not terribly useful, since the technique of MU-MIMO is intended for very dense client environments (e.g. auditoriums, college lecture halls, stadiums, convention centers, etc.), which by definition are dominated by smartphone and tablet users.  Thus, while we and every other vendor actively sell and promote 802.11ac wave 2 access points (and client device dongles), the truth is that MU-MIMO is both unnecessary and unusable for most SMB applications.

Preview: Tri-Band

Many vendors, including EnGenius, are in the process of launching a new AP product known as “tri-band” that incorporates a 2.4 GHz 802.11n radio and two 5 GHz 802.11ac radios, and can therefore communicate with two 5 GHz client devices simultaneously (and bi-directionally), without the need for any additional support from the client devices.  Thus, the product is automatically backwards-compatible with all 5 GHz 802.11n and 802.11ac client devices.  We anticipate this product will be much better suited to high-density client environments, since it can talk to two 5 GHz client devices, bi-directionally, all the time, vs. MU-MIMO, which can talk to up to three 5 GHz client devices, downstream only, sometimes.  Watch this space...

Monday, December 12, 2016

Why “Shiny and New” is Bad for SMB Wi-Fi

In our society of the 30-second sound bite, we market our Wi-Fi, as we market most technological innovations, on being shiny and new.   For years, everything in Wi-Fi has been about size and speed – the fastest access point, the most antennas, the largest number of simultaneous clients, the latest technology, etc. 

In Wi-Fi, “shiny” is essentially all about speed.   We are an industry obsessed with faster and faster data rates, pushing more and more data through the unbound medium of radio frequency:  
  • In 802.11n, we introduced MIMO, which allows multiple data streams to be transmitted simultaneously from different antennas.  These different streams are heard by all of the antennas on the receiver, and then mathematically separated to reconstruct the data streams.  While this technique works, it requires a lot more processing capabilities on both the transmitter and the receiver, and it is the limit of what we can do to increase raw speeds.  
  • Next came the need for a throughput fix, and an industry focus on improving airtime efficiency to achieve faster total data rates.   In 802.11ac wave 2, we use even more complex mathematical techniques to do multi-user MIMO (MU-MIMO), where an access point literally transmits independent messages to multiple receiver clients simultaneously.
  • The standard currently in development by the IEEE, 802.11ax, doubles down on Wi-Fi complexity to provide even more airtime efficiencies with bi-directional MU-MIMO and overlapping channel utilization.

Clearly these innovations come at a price: complexity. And when we increase the complexity of a system, we fundamentally make it more fragile and less robust.  When a system violates the KISS rule (keep it simple, stupid), it is more prone to errors due to normal environmental variations and less likely to be able to handle and recover from problems elegantly.   

When a system is more complex, there are more things that can (and do) go wrong, and the system relies more heavily upon everything working correctly just to function.  The “overhead” in Wi-Fi is not present by chance – it is architected into the 802.11 protocol to ensure that information is transmitted and received reliably.  When you attempt to simultaneously increase complexity and reduce overhead, you increase the number of problems that can occur and decrease your ability to handle and recover from the problems that do develop.  Intuitively, therefore, a more complex system is much easier to break and is less reliable than a simple system.

SMBs, and Their Providers, Pay the Price

While this need for new and shiny has driven amazing technological innovation, it has long since surpassed the actual performance needs of most Wi-Fi networks, especially in the small-to-medium business (SMB) sector.  While the SMB sector covers a lot of different vertical markets, the fundamental requirements for a Wi-Fi network in this space are as follows:
  • Ubiquitous and stable coverage
  • Reliable performance
  • Moderate numbers of simultaneous users and bandwidth
  • Low failure rates (high MTBF) & fast repair times (low MTTR)

Note the use of the terms “reliable,” “stable,” and “moderate,” and the requirements for high MTBF and low MTTR.   Fundamentally, SMB Wi-Fi needs to “just work.”  It should not be variable, it should not fail very often, and if/when it does fail, it should be really easy and quick to get back online again.  

These most critical SMB requirements are not only inconsistent with, but diametrically opposed to, new and shiny Wi-Fi.   SMB owners need to know they can rely upon their Wi-Fi networks, and the people who install and service them. Period.

For the managed service providers (MSPs) who deliver IT installation and maintenance services, reliability and ease of service for Wi-Fi networks are also fundamental requirements.   An MSP requires three things to be successful with Wi-Fi:
  • A robust and reliable network that essentially takes care of itself
  • Repeatability across multiple customers
  • Recurring monthly service monitoring fees  

MSPs make money by being able to install a network quickly and not receive excessive maintenance calls once it’s installed. When problems do occur, they need to be able to resolve them quickly, preferably without a costly truck roll to the customer site.

Recognition of these dynamics is making its way to the core of managed Wi-Fi services. New providers such as Uplevel Systems have designed managed services platforms specifically for IT consulting firms that need to automate configuration and remote management as much as possible. By combining feature-rich hardware with cloud-based monitoring and troubleshooting, Uplevel equips consultants with varying degrees of Wi-Fi familiarity to provision secure and robust services, with a few choice shiny and new “nice to haves” like easily managed guest access.

With storage, backup, security and VPNs available on the same platform, the basic value proposition for MSPs is “Simplify, upsell, repeat.” The appeal of keeping it simple, while still delivering better quality than consumer-grade technology, proves ideal for SMBs, where all that glitters is more red tape than gold. Or green.

Monday, November 28, 2016

Proper Wi-Fi Design and Deployment: Required Even In the Home

As more and more wireless appliances and streaming media applications permeate the home, the reliability of the Wi-Fi network as the infrastructure for this traffic becomes critical to performance.  This is especially challenging in a multi-vendor environment. While every vendor  technically follows the IEEE 802 networking standards (which are supposed to provide for interoperability), there is the “standard” and then there is the reality.  

Most importantly, the more intense networking demands of a home network REQUIRE a properly designed and integrated system, which is quite antithetical to the conventional consumer “plug and play” approach. 

Architecturally, a home LAN can be broken into the following functions.  Each of these functions is independent of the others, and thus each can generally and safely be satisfied by equipment from a different vendor:

  1. Modem:  Converting the Internet connection from the street (coax cable, fiber, DSL, satellite, etc.) into Ethernet
  2. Router:  Defining the local area network (LAN) and providing routing and NAT functionality between the WAN and the LAN
  3. Switch(es):  Providing interconnectivity between wired network devices on the LAN, including infrastructure devices (e.g. routers, APs, other switches) and wired client devices.
  4. Access Points:  Providing multiple wireless client devices access to the wired LAN


Several companies provide physically-integrated equipment that combines multiple functions (e.g. a cable modem Wi-Fi router, which provides all four functions in one box), but these should still be treated as separate functions.  Within a particular function, it is generally a bad idea to mix and match vendors.  This is especially true in the “access point” function, because there is quite a lot happening “under the hood”, and while all of the vendors are following the IEEE 802.11 specs, they are all doing it somewhat differently, which makes co-existence problematic and, at the very least, difficult to do well.   Thus, if you are installing a managed Wi-Fi system, you want to remove or disable any other third party APs installed by the customer or other service providers, to the extent possible.   Co-located Wi-Fi systems will interfere with each other and reduce Wi-Fi capacity, even if they are not in “active use”.

For the modem, the general recommendation is to have the ISP put their appliance in “bridge mode”, which shuts off all but the modem function.  Some ISPs have a harder time doing this than others in practice.   At a minimum, the built-in access point in the cable modem should be disabled.  The built-in AP is invariably only consumer grade and has very few knobs to turn in terms of performance tuning, making it inappropriate for the performance challenges of larger home Wi-Fi networks, especially those that require multiple APs for both coverage and capacity.

The built-in router of a cable modem can be used reasonably safely if VLANs are not required.  If VLANs are required, you are definitely best off putting the cable modem into bridge mode and using a wired-only SMB enterprise router (thus avoiding a double NAT scenario, which lowers performance for some appliances like gaming consoles).  EnGenius currently does not make this type of product, though it is on our long term roadmap.  The SonicWALL SOHO is a good example, since if you get the standalone version (i.e. without the content filtering or AV licenses) it is fairly inexpensive and quite capable, though it is certainly difficult to program.  There are several other vendor products on the market in this category.  The key features you want to look for are VLAN support, DHCP, and dynamic DNS.  Firewalls are also useful in many home and SMB applications.

For the switches and APs, I would recommend that you look into EnGenius’s Neutron product line.  The APs are all centrally managed, which makes it easier for you as a service provider to provision and maintain, and can be managed either from the local EWS switch or from a hosted cloud-based server called ezMaster.

Within the AP for the home, there are three key factors to keep in mind:
  1. Placement:   APs should be placed as close as possible to the client devices, with as little physical structure (e.g. walls) as possible.  In a multi-AP deployment, APs should be placed as far apart from each other as possible, with as much physical structure in between as possible.  This is true three-dimensionally, so stacking in hallways from floor to floor should absolutely be avoided.
  2. Transmit Power:   Most smartphones and tablets have very weak transmitters, on the order of 100x less powerful than access points.  If the APs are set to their maximum power settings, they can create a false sense of coverage.  The client devices at the far end of the coverage area will hear the APs (since the AP is “shouting”), but the AP can have a lot of difficulty hearing the client devices (since the clients are “whispering”).   Coverage should generally be more balanced, and thus turning down the transmit power of the APs will help to ensure more balanced bi-directional communication.  Additionally, 5 GHz does not travel as far as 2.4 GHz and suffers more attenuation when passing through walls and objects, thus the 2.4 GHz power generally needs to be 6 dB lower than the 5 GHz power to have roughly the same area of coverage.  I usually start at 14 dBm for 2.4 GHz and 20 dBm for 5 GHz, tuning from there based on the environment (the conversion sketch after this list shows just how lopsided the power asymmetry is).  I also generally do not recommend “auto power”, as this changes the coverage area over time and can create either gaps in coverage or co-channel interference with neighboring APs.
  3. Channel:  Neighboring APs will overlap and will interfere with each other, unless the neighboring APs are put on static / non-overlapping channels.  Auto-channeling is a very hard optimization problem, and I have yet to see any vendor do it well, despite marketing claims.  APs should always be put on static / non-overlapping channels with the channel scheme staggered as much as possible.
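
Since dBm values can be unintuitive, here is a quick conversion sketch; the 0 dBm client figure is a rough illustrative value, not a measurement of any particular device:

```python
# dBm is a log scale: every 10 dB is a 10x change in power.

def dbm_to_mw(dbm):
    return 10 ** (dbm / 10)

for dbm in (20, 14, 0):
    print(dbm, "dBm ->", round(dbm_to_mw(dbm), 1), "mW")
# 20 dBm -> 100 mW (the 5 GHz AP starting point above)
# 14 dBm -> ~25 mW (the 2.4 GHz AP starting point, 6 dB lower)
#  0 dBm -> 1 mW   (a weak client would be ~100x below a 20 dBm AP)
```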
Ironically, there are several new startups on the market offering home mesh products (e.g. Eero).  These are targeted at consumers as a “plug and play” way to cover larger areas by placing more APs without requiring Ethernet cabling to interconnect them, instead using the Wi-Fi itself as a backhaul.  These APs also inevitably rely upon auto-channel and auto-power in an attempt to keep things simple for the uneducated user.  These products, however, follow the old mantra of providing “coverage”, when in reality a home network needs to provide adequate “capacity” with room for growth.  The fundamental problem with this approach is that mesh reduces throughput by roughly 50% per hop (spending ½ the time servicing clients and ½ the time on backhaul to a connected AP), as the sketch below illustrates.  Such APs fundamentally cannot meet the performance (bandwidth and channel utilization) demands of a high-performance network, and generally should be avoided in networks where high performance and large client capacity are driving requirements.
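
Here is the arithmetic behind that claim, assuming an idealized halving per hop (real mesh products vary, but the trend is the point):

```python
# Rough model of the halving-per-hop claim above: each wireless backhaul
# hop spends half its airtime relaying, so usable throughput falls by
# roughly 2x per hop.

def mesh_throughput_mbps(root_throughput_mbps, hops):
    return root_throughput_mbps / (2 ** hops)

for hops in range(4):
    print(hops, "hops ->", mesh_throughput_mbps(200.0, hops), "Mbps")
# 0 hops: 200, 1 hop: 100, 2 hops: 50, 3 hops: 25 Mbps
```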

The core problem in home networking lies in the fact that ISP modems / routers are consumer products, and are not very capable for more complex or more demanding networks.  I’ve run into the issue myself where you cannot practically put the modems into bridge mode because the ISP service people cannot handle it over time – even if you get them to configure it in bridge mode, a firmware upgrade or service issue generally breaks that and the ISP returns their modem to its default configuration. Additionally, if the customer is using their VoIP product, you are stuck with their modem.

Unfortunately, you are usually stuck with whatever cable modem hardware the ISP provides, and every ISP is going to pick something different.  Most ISPs are fairly large behemoths who don’t care about interoperability with any other vendor’s equipment, and have no real incentive to provide compatibility for services that they undoubtedly view as (current or future) competition.   That is a business reality that is not going to change anytime soon. 

The best you can do, therefore, is to work around it.   I’ve done installations where I’ve literally sealed cable modems in steel boxes to block the Wi-Fi signal from them.  More commonly, I’ve done installations where you just set the cable modem to some random SSID, fix the channels on 2.4 GHz and 5 GHz, and don’t use it for connecting any client devices.  You set up your real Wi-Fi network with the proper equipment, avoid conflicts with the fixed 2.4 GHz and 5 GHz channels of the cable modem, and basically treat the modem as a 3rd party rogue AP.

The entire Wi-Fi industry has spent nearly 20 years telling consumers how “easy” it is to install Wi-Fi, yet simultaneously made the protocol more complicated (and thus more sensitive and fragile) in order to cheat the RF physics to squeeze throughput performance.   Performance relies upon establishing and maintaining better control over equipment choice, locations, channel, and transmit power settings.  I concur that the situation in home network environments is fairly untenable, and the introduction of more Wi-Fi appliances and infrastructure products in the consumer space only continues to make the situation worse. 


The reality is that a network needs to be designed, controlled, and properly maintained if you want high performance out of it, and that means a careful mix and match (and control) of vendor products for the modem, router, switch, and APs with proper configuration.  

There is no magic bullet.   

Tuesday, November 15, 2016

The Misconceptions of Managed IT Services


Small businesses running on shoestring budgets often consider IT services as overhead.  While most rely upon Internet access, Wi-Fi, and shared access to data, they don’t view these services as part of their core business. Having any or all of these fail for an extended period of time could easily drive a small company out of business, yet many shy away from options such as managed services that might add stability and often reduce cost.

Instead, most still follow the antiquated "break / fix" model of IT management, where an IT consultant who was originally called in to set up a system gets called back in to fix things whenever there's a problem.  In most cases, the hesitation to engage for the long haul has to do with misconceptions about the managed services model.  A managed service provider keeps a continuous watch over a network infrastructure to ensure it is always operating smoothly and securely in return for a monthly fee.

Clarifying these misconceptions can help customers overcome misplaced concerns, and create new “win-win” scenarios for consultants and clients alike.

Misconception #1: Managed Services are More Expensive 

In truth, managed services tend to actually be less expensive over time. Clearly, there’s value in being able to plan for a predictable and affordable monthly monitoring charge, but that alone won’t often convince clients to take the plunge. More compelling is the idea that managed service providers can catch most problems while they are still emergent, when they generally can be resolved more quickly and easily.  This produces major savings, and any client that’s suffered a debilitating outage can easily grasp that value.

Contrast this approach to the case where an IT consultant is brought in to fix a problem only after a crisis occurs, or a small problem festers over time and ultimately grows out of control.  The IT consultant can rack up many, many hours to fix a big problem and present a correspondingly large bill for services.  Such emergencies are unpredictable, and can lead to an accusatory and adversarial relationship between you and your IT consultant.

Misconception #2:  “If it Ain't Broke…” 

Most of us have learned at some point that it costs less to keep up with our every-3000-mile oil changes than to wait for the “check engine” light to come on. The same is true for the network infrastructure.

Network equipment vendors constantly release bug fixes and security updates that should be installed and tested. And the cybersecurity landscape never stops changing.  Small businesses may mistakenly feel they’re too small to be targeted, only to find themselves turned into an automated “bot” unwittingly doing the bidding of sophisticated cybercriminals, or even to have their hard drives encrypted and held for ransom.

Again, this is an easily demonstrated value-add of subscribing to managed services and having a seasoned IT professional proactively managing the network.

Misconception #3:  I Don't Have to Worry about Downtime, Because All My Data is "in the Cloud" 

The proliferation of cloud services has made it much easier for small businesses to take advantage of state-of-the-art networking technology without having to invest heavily in owning servers and services.  However, relying on cloud services presents its own set of challenges:

  1. A company likely needs more than one cloud service. Maybe they use email for sending documents to employees outside the office, and Dropbox for sending documents to third parties.  These are disparate services that, like their onsite hardware counterparts, may not work together seamlessly or securely.
  2. The cloud model doesn’t always scale cost-effectively for small businesses because, in reality, there is no one cloud.  For each service, data is stored on a server accessible over the public Internet vs. your private network.  Subscribing to multiple cloud services from multiple vendors may not deliver the same economies of scale as using one managed service provider with a single point of contact when something goes wrong.
  3. Nobody ever really tests the backups until something goes awry, and suddenly data recovery is needed to keep the business running.  That is not the time you want to discover that something was misconfigured, or that the service doesn’t work the way you thought it would.  
Small businesses also need to understand that each disparate provider adds a layer of risk.  Most cloud services, even popular ones, are only small businesses themselves.  What happens to your data if their service gets hacked, or the company suddenly goes out of business?

Misconception #4:  My IT Guy Can Be Here in Ten Minutes 

Here, the misconception is that you can either have a trusted local IT guy, or contract with a local managed service provider, but not both at the same time.  In reality, more and more IT consultants are transitioning to become managed service providers, enjoying higher growth and recurring revenues by monitoring and troubleshooting customers remotely.

This trend is being enabled in part by multiple evolving technologies, including the cloud.  One platform, from start-up Uplevel Systems, is designed from the ground up for IT consultants serving small businesses. The Uplevel model brings together the key elements of small business IT (access, Wi-Fi, security, storage, and management) on one remotely managed platform.  Customers enjoy better-than-consumer-grade features, functionality, and security at a predictable and affordable cost, with the added benefit of their local consultant’s watchful eye.

Using one integrated infrastructure follows industry best practices, simplifying configuration, maintenance, and troubleshooting, and reducing security risks.  Data is still shared and saved to the cloud, but now it’s the trusted IT professional evaluating data and making decisions on the company’s behalf.

Managed services optimized for small businesses offer a win-win by aligning the business goals of IT consultants with the unique needs of the clients they serve.  No more “bundling up” a bunch of IT issues until they become large enough to warrant the cost of a visit by the “IT guy.”   For the IT consultant, managed services mean more predictable revenue streams, faster growth, and opportunity to engage customers at a more strategic level. 

Done right, managed services help both parties avert the major problems that can shut down a small business and create emergencies that ultimately benefit no one. The hurdle to adoption – the misconceptions stated above – can be easily overcome with basic ROI models and a “try it you’ll like it” approach.

Wi-Fi, my frequent focus, is a great place to start.

Wednesday, September 21, 2016

Best Practices for Managed Network Switches

This blog post provides guidelines on best practices for selecting the right quantity and size of managed network switches for particular applications.

Application Guidelines:  Frequently Asked Questions

How do network switches work?

In the early days of networking, wired Ethernet network devices were interconnected through a device called a “hub”, where a wired frame would come in on one port and be broadcast out to all other ports.  While relatively simple, this technology caused a lot of noise and consumed a lot of excess capacity on wired networks, as all messages were broadcast to all devices connected to a hub, regardless of their intended destination.

With network switches, each switch port creates a point-to-point link with the device it is connected to.  The network switch maintains a database of MAC addresses indicating which devices are connected on individual ports.   When a frame enters the switch on a particular port, the switch examines the source MAC address and destination MAC address.   The source MAC address is used to update the database to indicate that the client is accessible on that port.  If the destination MAC address is already in the database as being connected to a different switch port, the frame is only forwarded out along the indicated port.  If the destination MAC address is not in the database, or if it is a broadcast message (e.g. a DHCP request), the frame is sent out all other ports, as is done in a hub.
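
For the programmatically inclined, the learning and forwarding logic above can be sketched in a few lines of Python (real switches do this in dedicated hardware, and the toy MAC addresses are obviously abbreviated):

```python
# Minimal sketch of the learning/forwarding behavior described above.

BROADCAST = "ff:ff:ff:ff:ff:ff"

class LearningSwitch:
    def __init__(self, num_ports):
        self.ports = range(1, num_ports + 1)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def handle_frame(self, src_mac, dst_mac, in_port):
        self.mac_table[src_mac] = in_port  # learn/refresh source location
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]            # known unicast: one port
        return [p for p in self.ports if p != in_port]  # flood everything else

sw = LearningSwitch(8)
print(sw.handle_frame("aa:aa", "bb:bb", in_port=1))  # unknown dst: floods ports 2-8
sw.handle_frame("bb:bb", "aa:aa", in_port=2)         # switch learns bb:bb is on port 2
print(sw.handle_frame("aa:aa", "bb:bb", in_port=1))  # now forwards only to [2]
```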

What are the differences between unmanaged, smart, and managed network switches?

An unmanaged switch maintains its database but is inaccessible via any interface (e.g. web, CLI, SNMP, etc.).    It is simply present on the network to forward frames to the appropriate ports.  Unmanaged switches are also incapable of handling any type of advanced Layer 2 features, including VLANs.

A managed switch has a full set of OSI Layer 2 features used for managing the wired traffic on a network.  It is addressable via an IP address and can generally be accessed via both a web interface (e.g. http or https) and a CLI (e.g. telnet or SSH).  Managed switches are capable of supporting a long list of industry-standard OSI Layer 2 features, including but not limited to the following:


  • VLANs
  • Viewable dynamic MAC address table (i.e. the switch port database)
  • Link Aggregation with Link Aggregation Control Protocol (LACP)
  • Spanning Tree Protocol (STP)
  • Access Control Lists (ACLs)
  • SNMP
  • Logging (local and remote)
  • Port mirroring
  • Cable and other diagnostics

A smart switch is a limited managed switch: it is typically less expensive, but also typically supports only a subset of the features found on a fully managed switch.   Smart switches will typically only have a web interface and support a limited set of VLANs.  However, unlike managed switches, there is no industry standard for the term “smart switch”, and what constitutes one can vary widely, both between vendors and between different switch models from the same vendor.

It is best practice to use managed switches on the LAN side of a network.  This ensures that the full set of OSI Layer 2 features is available, and facilitates troubleshooting by enabling network devices to be monitored and managed remotely.  Unmanaged switches should generally be avoided.

What are the differences between non-PoE, PoE, and PoE+?

A non-PoE switch is a switch that provides network connectivity only, and does not supply DC power to connected devices.   These switches are suitable when there are a large number of non-powered network devices on the network, such as PCs and laptops.  Such switches are commonly deployed in offices, as well as in hotels, student housing, assisted living, and other multi-dwelling unit (MDU) environments where there is a wired Ethernet wall jack in each unit.  

Power-over-Ethernet (PoE) switches provide both DC power and data connectivity over a single Ethernet wire.  These are extremely useful for connecting powered network devices to a network, as only one cable needs to be run to the device, as opposed to separate cables for data and for power.  Per the IEEE standards, switches are able to detect whether a connected device is powered or not, and will therefore only provide power to devices that are not being powered by an alternate power connection.   When using managed Power-over-Ethernet switches, the connected device can also be rebooted remotely by turning off and on the power on the Ethernet port, which is very useful when doing network troubleshooting.

A PoE switch conforms to the IEEE 802.3af standard, which provides 48V up to 15.4 W per port.  PoE (802.3af) is sufficient for powering older generation access points (i.e. pre-802.11ac) and for most other powered network devices, such as IP cameras, VoIP phones, access control locks, etc.

A PoE+ switch conforms to the IEEE 802.3at standard, which provides 48V up to 30 W per port.  PoE+ (802.3at) is required for 802.11ac access points because of the large number of radio chains required for MIMO and MU-MIMO.  

It is best practice to not fully load a PoE (802.3af) or PoE+ (802.3at) switch, to ensure that the total power budget of the switch is not exceeded.   I generally recommend a “3/4 rule”, meaning that a network design should plan on using only ¾ of the ports for powered network devices, as follows, with the remaining ports reserved for non-powered network devices, backhaul to other infrastructure (e.g. other switches or routers), or spares (a budget-check sketch follows the list):

  • 8 port PoE/PoE+:   Only 6 ports should be used for powered network devices
  • 16 port PoE/PoE+:  Only 12 ports should be used for powered network devices
  • 24 port PoE/PoE+:   Only 18 ports should be used for powered network devices
  • 48 port PoE/PoE+:   Only 36 ports should be used for powered network devices
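
Here is a small sketch of that planning check, combining the 3/4 rule with a worst-case power budget test.  The per-port maxima come from 802.3af/at; the 370 W total budget in the example is a hypothetical figure, since actual budgets vary by switch model:

```python
# Sketch of the 3/4 rule plus a total-budget sanity check. Per-port
# maxima come from 802.3af/at; the switch's total PoE budget varies by
# model, so it is a parameter here rather than a looked-up value.

PORT_MAX_W = {"af": 15.4, "at": 30.0}

def poe_plan_ok(total_ports, powered_devices, standard, total_budget_w):
    within_rule = powered_devices <= (total_ports * 3) // 4
    worst_case_draw = powered_devices * PORT_MAX_W[standard]
    return within_rule and worst_case_draw <= total_budget_w

# e.g. a hypothetical 24-port PoE+ switch with a 370 W budget:
print(poe_plan_ok(24, 18, "at", 370.0))  # False: 18 * 30 W = 540 W worst case
print(poe_plan_ok(24, 12, "at", 370.0))  # True: 12 * 30 W = 360 W fits
```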

Most PoE and PoE+ switch models come with some non-PoE ports for backhaul, consisting of Ethernet ports and/or SFP ports (for mini-GBIC fiber modules).  For a detailed explanation of SFP modules, please read the following blog:


How do you determine the number and size (port count) of switches needed for a project?

The telecom wiring in a property is divided into “verticals” and “horizontals”.   The distinction is actually not about the orientation of the cable run, but rather distinguishes backhaul cabling used to interconnect switches in different telecom closets (verticals) from cabling that connects a telecom closet to endpoints on the network, such as wall jacks in units, access points, cameras, etc. (horizontals).   This terminology is based on a high-rise building, where the telecom closets on each floor are vertically stacked floor-to-floor, with the endpoints on each floor connected horizontally to the telecom closet on that floor.

Every property must have one telecom closet which contains the Internet bandwidth circuit from the provider and the router for the network.  This closet is referred to as the main distribution frame (MDF).   For smaller properties, there may only be the one telecom closet, and all endpoint devices are “home run” from this closet to their desired locations.  (In such a scenario, there are no “verticals” but there will be “horizontals”).   As Ethernet wiring has a distance limitation of 100 meters / 328 feet, it is usually necessary and convenient for larger properties to establish additional telecom closets, and have the endpoints connected to these intermediate locations.  These additional telecom closets are referred to as intermediate distribution frames (IDFs).  In a high-rise building, IDFs are commonly stacked in the same location on each floor and are located either on every floor or every third floor.  Larger facilities will even have multiple IDFs on the same floor.  In multi-building environments, each building usually has a telecom closet, so each building is an IDF.  The IDF(s) require some type of backhaul connection to the MDF, typically using either Ethernet, fiber, or wireless point-to-(multi)point links.

Accordingly, the number and size of the network switches required ultimately depend on the number of MDFs and IDFs, as well as on the number of “horizontals”, i.e. the number of powered and unpowered network devices connected into each telecom closet.


What is link aggregation?

Link aggregation is a feature available in managed switches that allows multiple physical ports to act as a single virtual port with the aggregated capacity of all of the physical ports.   It is commonly deployed for backhaul between the MDF and IDF(s) in networks requiring very high local data capacity, such as when using storage area networks (SANs), or in networks consisting of several surveillance IP cameras streaming data to a network video recorder (NVR) in the MDF.   An example application is shown in Figure 1.  An aggregated link can also serve to provide redundancy (at reduced capacity) in case one of the connections should be broken.

Figure 1:  Example of Link Aggregation to connect a switch to a storage area network (SAN).

As an example, on EnGenius switches, a link aggregation group (LAG) can be established under L2 Features --> Link Aggregation --> Port Trunking, as shown in Figure 2.  Up to eight link aggregation groups can be defined on a particular switch, and are referred to as “trunk groups” with port numbers t1 – t8.   A physical port can be a member of only one trunk group.  The ports that make up a group need not be sequential, though it is often convenient to use sequential ports from a wiring perspective.  There is also no limit as to how many physical ports can be aggregated into a single group, until one physically runs out of ports on the switch.

There are two modes defined for establishing a trunk group.  In “static” mode, the ports are always considered part of the trunk group, and the switch will always load balance outbound traffic on the trunk port across all of the physical ports.  In “LACP” mode, the switch uses Link Aggregation Control Protocol (LACP) to periodically verify that each physical link is established end-to-end, so LACP must be running on both sides of the link (i.e. both switches connected via an aggregated link).  It is best practice to use LACP mode to establish an aggregated link between two switches.
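
Conceptually, the load balancing works by hashing frame header fields to pick a member port, so that any one conversation stays on a single physical link.  The sketch below is a simplified stand-in; real switches hash various field combinations that differ by vendor and configuration:

```python
import zlib

# Simplified picture of how a LAG spreads outbound traffic: the switch
# hashes frame header fields and uses the result to pick one physical
# member port, so a given conversation stays on a single link.
# CRC32 over the MAC pair is a stand-in for the vendor's actual hash.

def lag_member_port(src_mac, dst_mac, member_ports):
    key = f"{src_mac}-{dst_mac}".encode()
    return member_ports[zlib.crc32(key) % len(member_ports)]

trunk = [1, 2, 3, 4]  # four physical ports aggregated into one trunk group
print(lag_member_port("aa:aa", "bb:bb", trunk))  # same MAC pair -> always same port
print(lag_member_port("aa:aa", "cc:cc", trunk))  # a different pair may land elsewhere
```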


Figure 2:  Setting up a link aggregation group on an EnGenius managed switch.

What is spanning tree protocol (STP)?

As mentioned above, a switch maintains a database of MAC addresses and ports.  When there is a wiring loop in the network, there are multiple physical paths between switches and endpoint devices, meaning that a switch will see the same MAC address on multiple ports.    This generally causes a broadcast storm, where the same message for a device will loop through the network repeatedly, eventually filling up the RAM on the switch and causing the switch to slow down significantly or simply crash.

Physical wiring loops in the network can often occur accidentally during a service operation.  Wiring loops are also often desirable from a redundancy standpoint, to ensure that the loss of a single cable or switch in the network does not take down everything downstream in the network.

Spanning tree protocol (STP) is a feature in managed switches that is designed to detect network loops and block redundant paths.  The simple explanation of the protocol is that it calculates a cost function for each path through the network and then only allows the least-cost path to operate, discarding all incoming traffic on higher cost paths.  Should the least-cost path fail (e.g. a physical cable gets disconnected), the algorithm immediately falls back to the next least-cost path.   This is shown in Figure 3.   

The STP algorithm allows a priority number to be set for each switch in the network (and even each port on a switch), where a smaller priority number indicates a lower cost for that switch so that a desired path can be established.  For the algorithm to work, one switch must have the lowest cost, and this switch is designated the “root bridge.”   In most network topologies, the root bridge should be the switch in the MDF connected directly to the LAN port of the router.   If the priorities are not specified (i.e. all switches are left on their default priority values of 32768), the STP algorithm will automatically designate the switch on the network with the smallest numerical MAC address as the root bridge.  
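
The election itself is easy to picture: every switch is compared by its (priority, MAC address) pair, and the numerically smallest pair wins.  A minimal sketch, with made-up MAC addresses:

```python
# Sketch of the root bridge election described above: each switch is
# identified by (priority, MAC), and the numerically smallest pair wins.
# With all priorities left at the default 32768, the lowest MAC decides.

def elect_root_bridge(switches):
    # switches: list of (priority, mac_address) tuples
    return min(switches, key=lambda s: (s[0], s[1]))

lan = [
    (32768, "00:50:aa:00:00:03"),  # default priority
    (32768, "00:50:aa:00:00:01"),  # default priority, lowest MAC
    (4096,  "00:50:aa:00:00:09"),  # core switch with priority lowered
]
print(elect_root_bridge(lan))  # (4096, '00:50:aa:00:00:09') becomes the root
```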



Figure 3:  Generic image of spanning tree blocking a loop in the network.

It is best practice to have spanning tree enabled on all networks.  Most managed switches have rapid spanning tree protocol enabled by default.   The priority of each switch can be set manually under L2 Features --> STP --> CIST Instance Settings, as shown in Figure 4.   It is generally only necessary to change the default on the core switch in the MDF.   In networks consisting of several switches in a complex tree topology, it can be desirable to lower the priorities on some of the intermediate core switches.


Figure 4:  Setting up spanning tree protocol (STP) on an EnGenius managed switch.

Thursday, June 2, 2016

WiFi Fast Roaming (IEEE 802.11r), Simplified

This blog post originally appeared in Network Computing on 5/23/2016.

A primer on the 802.11 wireless protocol and when to implement it.

For some reason, all descriptions of 802.11r spectacularly fail to provide a simple explanation of the WiFi fast roaming protocol. Most of them are too high level and are effectively useless, or they tend to quickly get lost in the technical weeds. This blog post attempts to bridge that gap and provide a reasonably simple but thorough explanation.

What is fast roaming?

Fast roaming, also known as IEEE 802.11r or Fast BSS Transition (FT), allows a client device to roam quickly in environments implementing WPA2 Enterprise security, by ensuring that the client device does not need to re-authenticate to the RADIUS server every time it roams from one access point to another. This is accomplished by actually altering the standard authentication, association, and four-way handshake processes used when a device roams (i.e., re-associates) to a new WiFi access point.
The simplest explanation is that, after a client connects to the first AP on the network, the client is "vouched for." When a client device roams to a new AP, information from the original association is passed to the new AP to provide the client with credentials. The new AP therefore knows that this client has already been approved by the authentication server, and thus need not repeat the whole 802.1X/EAP exchange again.
Fast roaming also introduces efficiencies into the process of establishing the new encryption key between the new AP and the client device, which benefits both WPA2 Personal (a.k.a. pre-shared key or passphrase) and WPA2 Enterprise (a.k.a. 802.1X or EAP). Support for 802.11r is advertised in the AP beacon and probe response frames.

The regular association process

Here are the normal (pre-802.11r) steps followed by a client device as it connects to an access point or roams from one access point to another.
1.  Authentication (client)
2.  Authentication Response (AP)
3.  (Re)Association Request (client)
4.  (Re)Association Response (AP)
4.5  WPA2 Enterprise 802.1X/EAP exchange (client, AP, and authentication server); skipped in WPA2 Personal
5.  Four-way handshake #1 – AP nonce passed to client (AP)
6.  Four-way handshake #2 – Supplicant nonce passed to AP (client)
6.5  Derivation of encryption key (AP & Client independently)
7.  Four-way handshake #3 – verification of derived encryption key and communication of group transient key (AP)
8.  Four-way handshake #4 – acknowledgement of successful decryption (client)
Note, a nonce is a pseudo-random number generated for the purpose of seeding the encryption algorithm. Both the AP (anonce) and the client supplicant device (snonce) generate their own nonces as part of the negotiation.
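To make the role of the nonces concrete, here is a deliberately simplified sketch of the key derivation. The real 802.11 PRF, key lengths, and key hierarchy differ; HMAC-SHA256 is used here purely for illustration:

```python
import hashlib, hmac, os

# Simplified sketch of why the nonces matter: both sides mix the shared
# PMK with both nonces and both MAC addresses to derive the same fresh
# session key, without the key itself ever crossing the air. The real
# 802.11 PRF and key layout differ; HMAC-SHA256 here is illustrative.

def derive_session_key(pmk, anonce, snonce, ap_mac, client_mac):
    material = (min(ap_mac, client_mac) + max(ap_mac, client_mac)
                + min(anonce, snonce) + max(anonce, snonce))
    return hmac.new(pmk, material, hashlib.sha256).digest()

pmk = hashlib.sha256(b"shared-secret-from-auth").digest()
anonce, snonce = os.urandom(32), os.urandom(32)  # fresh randomness per association

ap_key = derive_session_key(pmk, anonce, snonce, b"ap-mac", b"client-mac")
client_key = derive_session_key(pmk, anonce, snonce, b"ap-mac", b"client-mac")
print(ap_key == client_key)  # True: both ends derive the same key independently
```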

Fast roaming re-association process

The following lists the revised (802.11r) steps followed by a client device as it uses Fast BSS Transition (FT) to move from one access point to another.
1.  FT authentication; includes PMK seed information from original association and supplicant nonce (client)
2.  FT authentication response – includes PMK seed information and AP nonce (AP)
2.5  Derivation of encryption key (AP & Client independently)
3.  FT re-association request – verification of derived encryption key (client)
4.  FT re-association response – acknowledgement of successful decryption and Group Transient Key (AP)
This process works for both WPA2 Enterprise and WPA2 Personal re-associations. In both cases, the eight messages passed between an AP and a client device for authentication, association, and the four-way handshake are reduced to four messages.
Note that there is an alternative method called over-the-DS fast BSS transition, where the credentials are passed from one AP to the others on the network via FT action management frames over the wired Ethernet network that interconnects them. This is usually one of those details that muddies the waters of the 802.11r story. The essential point remains the same: The first AP "vouches" for the client device to the other APs, so that the remaining APs need not re-verify that the client device is allowed to connect to the network.

When should you use fast roaming?

The human brain generally cannot perceive an event that occurs faster than about 100 milliseconds. An interruption in voice or video service during a roam that occurs faster than this will therefore not be observed by the user. The typical target roam time for a client is half of this value, or 50 ms, and in most well-designed Wi-Fi networks, the eight messages that make up the authentication, association, and four-way handshake collectively will take on the order of 40 ms to 50 ms. Thus, in a network using WPA2 Personal security, shrinking the number of messages from eight to four is naturally helpful for efficient airtime utilization, but is really unimportant to the roaming process from a perceived service-quality perspective.
The real benefit of 802.11r comes from not having to do the 802.1X/EAP exchange when using WPA2 Enterprise security.  Even with a local RADIUS server, this exchange can easily take several hundred milliseconds, and far longer if your RADIUS server is not on your LAN, but requires access over the Internet. Thus, fast roaming should ALWAYS be enabled when you are using WPA2 Enterprise security.
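A rough time budget makes the point. All of the numbers below are illustrative assumptions drawn from the estimates above, not measurements:

```python
# Toy roam-time budget based on the figures above: FT mainly pays off
# when it removes the 802.1X/EAP exchange from the roam.

MSG_MS = 5.5   # ~5.5 ms per message: the 8-message exchange is ~44 ms
EAP_MS = 300.0 # assumed 802.1X/EAP exchange time with a local RADIUS server

roams_ms = {
    "WPA2 Personal, no FT":   8 * MSG_MS,           # ~44 ms
    "WPA2 Personal, FT":      4 * MSG_MS,           # ~22 ms
    "WPA2 Enterprise, no FT": 8 * MSG_MS + EAP_MS,  # ~344 ms, user-visible
    "WPA2 Enterprise, FT":    4 * MSG_MS,           # ~22 ms, EAP is skipped
}
for name, ms in roams_ms.items():
    flag = " <- exceeds the ~100 ms perception threshold" if ms > 100 else ""
    print(f"{name}: {ms:.0f} ms{flag}")
```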
One of the issues with 802.11r is that many older client devices don’t have drivers that support it, and in fact some even have trouble properly detecting and associating to networks with 802.11r enabled. While adding new information elements to beacon frames has been a scalable part of the 802.11 protocol since the early days of WiFi -- and is an essential element in the backwards compatibility of new APs with older client devices -- many older client drivers cannot read and interpret the new FT information element in the beacon frames properly, so they see the beacons as corrupted frames. Therefore, to ensure maximum client compatibility, the common recommendation is to disable fast roaming when using WPA2 Personal, and only use it for WPA2 Enterprise networks.

Thursday, April 28, 2016

Commercial Grade Wi-Fi Snobbery

There is a snobbery in the Wi-Fi industry, where the large enterprise vendors paint smaller competitors with a broad brush of being unsuitable for anything beyond consumer applications.  I encounter it frequently within the Wi-Fi community at conferences and on Twitter, as well as from current and potential customers.   It is an ugly attitude and a disparaging and discriminatory marketing practice.



For those of you who don't already know, I currently work as the Manager of the Field Application Engineering group at EnGenius Technologies, Inc.  One of our installer customers came to us today indicating that he is thinking of going with a more expensive competitor for a mid-market chain hotel location, because the hotel is demanding that they use "commercial grade" Wi-Fi equipment.  Unfortunately, this hotel customer has bought into marketing hype and Wi-Fi snobbery, and will probably pay more money for a needlessly complex solution, rather than understanding the actual requirements and constraints and picking the right solution at the best price.

The term commercial grade is meant to distinguish access points designed for the consumer home market from equipment designed for the enterprise market.  Alas, there is no standard definition of commercial grade, and thus the term has devolved into a widely abused marketing mantra.

Equipment designed for the enterprise generally has significantly more features that are tuned to the needs and requirements of enterprise deployments, such as the following:
  • VLANs
  • Multiple SSIDs
  • Band steering
  • Visibility into connected clients and usage statistics
  • Detection of external (i.e. rogue) access points
  • WPA2 Enterprise security
  • Use of 5 GHz DFS channels (channels 52-64 and 100-140)
  • Centralized management

Because my company manufactures and sells equipment for both the consumer market (i.e. ESR series) and the enterprise market (i.e. Neutron and Electron series) under the same brand name, there is unfortunately confusion in the marketplace as to the positioning of our brand. Our competitors, naturally, pounce to take advantage of and further enhance this confusion in their marketing campaigns, in a brazen attempt to convince prospective customers that our equipment is not suitable for their applications.  The fact, however, is that EnGenius Electron and Neutron equipment is designed specifically for the needs of the small-to-medium business (SMB) market, and is deployed successfully in thousands of such enterprise locations, including many hotel locations within the same chain as the one in question here.

AP vendors tend to each gravitate to the particular market niches for which they are most suitable.  This is a natural process of market differentiation, and is true in most industries. EnGenius is focused on the SMB enterprise market, and thus has a product line that is targeted to be reliable and simple, as well as comparatively inexpensive.  We are not necessarily well suited to applications in other niche markets, such as Wi-Fi in stadiums and arenas. Accordingly, there are, and will always be, certain product features that some of our competitors offer that we do not.

Thus, the selection of the AP vendor should come down to your requirements:   If your project needs particular features that a competitor offers and we do not, then by all means you should go with that competitor!  However, if you don’t need such features, then selecting our competitor means you are going to pay a premium price for unused features, additional complexity, and marketing hype.

This blog post is intended as a rant against the term commercial grade and not as a marketing piece.  However, if you've read this far and are interested to see if EnGenius is a suitable and less expensive alternative for your application, based on your requirements and constraints, feel free to browse the website and/or contact me directly.