How to troubleshoot 802.1X client connectivity issues?
Summary
Troubleshooting an enterprise WLAN that uses 802.1X port-based network access control can be challenging. It mainly involves understanding the authentication flow and identifying where it breaks. This guide helps you troubleshoot common 802.1X authentication issues on the various Ruckus solutions that host 802.1X wireless networks.
Question
How to troubleshoot 802.1X client connectivity issues?
Customer Environment
Virtual SmartZone (vSZ), SmartZone-144 (SZ-144), SmartZone-100 (SZ-100), SmartZone-300 (SZ-300), ZoneDirector-1200 (ZD-1200), Ruckus One (R1), Ruckus Zone Director, Ruckus Unleashed
Symptoms
- Newly configured 802.1X/ Radius WLAN does not work.
- Clients unable to connect to newly configured 802.1X WLAN
- AAA test does not work
- Clients connect but do not have internet access after they pass AAA authentication
- 802.1x WLAN that was working suddenly stopped working
- With changes made
- Without any significant changes made
- 802.1x WLAN performance issues
- Certain types of clients can connect, and certain ones do not
- Clients drop connections randomly
- Throughput issues on 802.1x/Radius WLAN
- Clients show an "insecure network" or certificate warning when connecting to an 802.1X WLAN
Troubleshooting Steps
This article starts with the introduction and explanation of the protocol and the various network nodes that participate in 802.1x.
It also explains how the feature works on different Ruckus Wireless solutions and how we can troubleshoot common issues.
This includes Ruckus Smart Zone, Ruckus One, Ruckus Unleashed / Zone Director.
Introduction to 802.1x and sample network topology and endpoints
Before we start troubleshooting, we need to identify the nodes / network devices involved. The following is a generalized network diagram for a typical 802.1X WLAN implementation.
The three main network components of an 802.1X WLAN are:
• Supplicant (the wireless client, also referred to as the UE)
• Authenticator (Mostly the Wireless Access Points/Switches or Wireless Controller)
• Authentication Server (Ex: Windows NPS, Cloudpath, FreeRadius, and other radius servers)
The wireless client is called the Supplicant.
- The UE uses the EAP protocol to communicate with the AP over wireless
- This involves a series of steps where the client and the server validate each other's identity.
- The different flavors of EAP are:
- EAP-PEAP / EAP-MSCHAP / EAP-MSCHAPv2
- EAP-TLS
- EAP-TTLS
- EAP-SIM
- EAP-AKA
- The client and the server negotiate the type/flavor of EAP. The Ruckus Controller/AP acts as the mediator in most scenarios.
The Access point or the wireless controller (based on the deployment type) is the Authenticator.
- In RUCKUS Zone Director Architecture the Zone Director is the Authenticator
- In RUCKUS Smart Zone Architecture, the Smart Zone controller is the authenticator when Radius Mode is set to Proxy. When Radius Mode is set to Non-Proxy, the Access Point is the authenticator
- RUCKUS Cloud follows the same rules as RUCKUS Smart Zone
- In RUCKUS Unleashed, the master access point is the authenticator.
- The Authenticator speaks EAP with the UE and the Radius protocol with the Authentication Server
- The authenticator terminates EAP on the wireless side and relays the EAP messages to the server encapsulated in Radius
- An authenticator is also called a radius client
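The relay role described above can be illustrated with a minimal Python sketch. It assumes the standard RADIUS encoding from RFC 3579, where EAP travels in EAP-Message attributes (type 79) of at most 253 value bytes each. The function names are hypothetical and this is not the Ruckus implementation.

```python
import struct

EAP_MESSAGE = 79      # RADIUS attribute type that carries EAP (RFC 3579)
MAX_VALUE_LEN = 253   # largest value a single RADIUS attribute can hold


def eap_to_radius_attrs(eap_packet: bytes) -> bytes:
    """Wrap one EAP packet in one or more EAP-Message attributes."""
    attrs = b""
    for start in range(0, len(eap_packet), MAX_VALUE_LEN):
        chunk = eap_packet[start:start + MAX_VALUE_LEN]
        # Attribute layout: type (1 byte), length (1 byte, header included), value
        attrs += struct.pack("!BB", EAP_MESSAGE, len(chunk) + 2) + chunk
    return attrs


def radius_attrs_to_eap(attrs: bytes) -> bytes:
    """Reassemble the EAP packet by concatenating EAP-Message values in order."""
    eap, pos = b"", 0
    while pos + 2 <= len(attrs):
        attr_type, attr_len = attrs[pos], attrs[pos + 1]
        if attr_len < 2:
            raise ValueError("malformed RADIUS attribute")
        if attr_type == EAP_MESSAGE:
            eap += attrs[pos + 2:pos + attr_len]
        pos += attr_len
    return eap
```

Large EAP packets, such as the certificate exchanges in EAP-TLS, span several attributes. A truncated or reordered capture can therefore look like a malformed EAP exchange.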
The 802.1x server (Network Policy Server / Network Policy Manager)
- RUCKUS Cloudpath can act as an Authentication Server and can integrate with the entire RUCKUS wireless and wired product line
- OpenRadius based Radius Servers
- Microsoft Network Policy Server (NPS) + AD
- FortiNAC, Cisco ISE and Aruba ClearPass are examples of Network Policy Managers
- The authentication server communicates with the AP/Controller using the radius protocol
The Radius Server
Troubleshooting an 802.1X WLAN starts at the radius server.
- Note down the MAC and IP addresses of the following devices.
- The MAC address of the wireless UE/Client attempting the authentication.
- Rough timestamp of the failures
- Replicate the failure multiple times if this is a test device.
- MAC and IP address of the authenticator
- If it’s the AP that terminates EAP, note down the AP’s MAC and IP address
- If it’s the controller, note down the controller’s MAC and IP address
- Each authentication server has logging mechanisms to debug a failure scenario. For example, if it's a Windows NPS server, look in the Event Viewer for the client MAC in question (Calling-Station-Id)
- References for commonly used servers are listed in the references section
- If no failure logs are available, it is either because the Radius Client Secret configured on both sides is mismatched or because the Radius Client (the Authenticator) is not able to reach the Radius server on the radius UDP port.
- Perform a Wireshark / tcpdump capture on the Radius Server using the AP’s or Controller’s IP address as a filter.
- If an incoming radius packet (default UDP 1812) is seen, this confirms that Radius traffic from the authenticator is reaching the server
- If no packet is seen, please investigate the Radius configuration on the Ruckus Controller / AP
- Validate UDP network connectivity between the Authenticator and the Authentication Server
- If no Radius Requests are received at the Radius Server, or if the server is sending Radius Accept messages to the Ruckus Authenticator and clients are still unable to connect, then proceed to troubleshoot the Authenticator (the Ruckus wireless solution)
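To exercise the UDP path and shared secret checks above without a wireless client, a test host can send a hand-built Access-Request. The following is a minimal Python sketch under several assumptions: the test host is registered on the server as a radius client, the server address, user and secret are placeholders you supply, and PAP is used only because it is the simplest method to encode. The function names are hypothetical. A server with PAP disabled will answer with Access-Reject, which still shows that the request arrived and was processed.

```python
import hashlib
import hmac
import os
import socket
import struct

CODE_NAMES = {2: "Access-Accept", 3: "Access-Reject", 11: "Access-Challenge"}


def attr(attr_type: int, value: bytes) -> bytes:
    """Encode one RADIUS attribute (type, length, value)."""
    return struct.pack("!BB", attr_type, len(value) + 2) + value


def hide_password(password: str, secret: bytes, authenticator: bytes) -> bytes:
    """PAP User-Password hiding as defined in RFC 2865, section 5.2."""
    data = password.encode()
    size = max(16, (len(data) + 15) // 16 * 16)
    data = data.ljust(size, b"\x00")
    hidden, previous = b"", authenticator
    for i in range(0, size, 16):
        key = hashlib.md5(secret + previous).digest()
        previous = bytes(a ^ b for a, b in zip(data[i:i + 16], key))
        hidden += previous
    return hidden


def build_access_request(user, password, secret, nas_ip,
                         identifier=1, authenticator=None):
    """Build an Access-Request with a Message-Authenticator (RFC 3579)."""
    authenticator = authenticator or os.urandom(16)
    attrs = (attr(1, user.encode())                       # User-Name
             + attr(2, hide_password(password, secret, authenticator))
             + attr(4, socket.inet_aton(nas_ip))          # NAS-IP-Address
             + attr(80, b"\x00" * 16))                    # placeholder, filled below
    packet = struct.pack("!BBH", 1, identifier, 20 + len(attrs)) + authenticator + attrs
    mac = hmac.new(secret, packet, hashlib.md5).digest()
    return packet[:-16] + mac


def probe(server, packet, port=1812, timeout=3.0):
    """Send the packet; return the reply type, or None if nothing came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, port))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # includes timeouts
            return None
    return CODE_NAMES.get(reply[0], f"code {reply[0]}")
```

A call such as probe("192.0.2.10", build_access_request("probe", "pw", b"secret", "192.0.2.20")) returns the reply type. The addresses here are documentation placeholders. No reply at all usually points to a firewall, an unregistered radius client, or a secret mismatch, since servers typically discard such requests silently.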
Troubleshooting tips and scenarios for various Ruckus Wireless Solutions:
Ruckus Smart Zone
- Ruckus Smart Zone controller can work in two modes when it comes to Radius Authentication
- Proxy
- The Ruckus Smart Zone controller runs a service called the RadiusProxy which is based on OpenRadius
- All radius requests are sent from the controller’s management IP address
- Access points encapsulate authentication requests inside the AP – Controller SSH tunnel.
- Non-Proxy
- The access points terminate EAP and send Radius requests directly to the Authentication server
Reference for 802.1X configuration of Smart Zone: SZ 6.1.2- Wlan Management Guide
- Based on the mode selected you can debug the problem on the respective node
- RadiusProxy Logs if Proxy and AP Support Logs if Non-Proxy
- If the mode is Radius Proxy, you can enable debugging on the module in question via the Application Logs (Monitor > Troubleshooting and Diagnostics > Application Logs)
- Click on Warning, which should show a drop-down option, and change this to Debug
- Replicate the failure using the wireless client a couple of times and then download the “radiusd.log” file by clicking on the number right next to the Debug Status
- This file is compressed and can be extracted using software such as 7-Zip. Once extracted, the file can be opened using a text editor like Notepad++
Reference for downloading application logs: Working with Application Logs
- Search the file using the client’s MAC address, the AP’s MAC address or the server IP address, and validate whether the Radius Requests are being sent out from the controller
- A sample of a successful authentication looks like the following:
[Tue Jul 09 2024 14:23:13:394][CP][RADIUS][DBG][TID=1645209344][src/main/process.c:698]
Radius Packet Header:
Code: 1
Id: 44
Length: 358
Authenticator: 0x91dec57d946c95bc7035d3dcb6ad5c2c
User-Name = "test@byod.cloudpath.net"
NAS-IP-Address = 192.168.29.229
NAS-Identifier = "2C-AB-46-A3-46-A2"
Called-Station-Id = "2C-AB-46-A3-46-A2:Secure"
NAS-Port-Type = Wireless-802.11
Service-Type = Framed-User
NAS-Port = 1
Calling-Station-Id = "42-6C-F7-F8-30-9F"
Connect-Info = "CONNECT 802.11"
Acct-Session-Id = "668D47CF-2346A000"
Acct-Multi-Session-Id = "0711D9805AF08709"
WLAN-Pairwise-Cipher = 1027076
WLAN-Group-Cipher = 1027076
WLAN-AKM-Suite = 1027073
Ruckus-SSID = "Secure"
Ruckus-BSSID = 0x2cab46a346a2
Ruckus-Wlan-Id = 688
Ruckus-Sta-Vlan-Id = 1
Ruckus-SCG-CBlade-IP = 134.242.136.121
Ruckus-Domain-Name = "Cecil"
Ruckus-Zone-Name = "home"
Ruckus-Wlan-Name = "Secure"
EAP-Message = 0x0254001c01746573744062796f642e636c6f7564706174682e6e6574
Chargeable-User-Identity = 0x00
[Tue Jul 09 2024 14:23:40:245][CP][RADIUS][DBG][TID=1300178688][src/main/process.c:612]
Received Access-Accept Id 42 from 72.18.151.76:12474 to 10.9.182.15:47178 length 208
[Tue Jul 09 2024 14:23:40:246][CP][RADIUS][DBG][FID=1,ueMac=42:6C:F7:F8:30:9F,TID=1300178688][wsg_rad_utils.c:1282]
User-Name ([email protected]) received in access-accept, overide in all subscriber context
- The above strings are parameters that can be checked against a packet capture taken on the server or against the server logs.
- Sample log snippets for a radius request that fails because of a bad username / password / auth type look like the following:
No '@' in User-Name = "test", looking up realm NULL
[Tue Jul 09 2024 14:41:22:224][CP][RADIUS][DBG][TID=1291785984][src/main/process.c:612]
Received Access-Reject Id 138 from 72.18.151.76:12474 to 10.9.182.15:47178 length 48
[Tue Jul 09 2024 14:41:22:224][CP][RADIUS][ERR][FID=1,ueMac=98:B3:79:3A:6F:0E,TID=1291785984][wsg_rad.c:1968]
Recvd Access-Reject from AAA Name:[training.cloudpath.net_auth] for UE MAC:[98-B3-79-3A-6F-0E]
[Tue Jul 09 2024 14:41:22:224][CP][RADIUS][DBG][TID=1291785984][src/main/process.c:739]
SRC-IP: 72.18.151.76
SRC-PORT: 12474
DST-IP: 10.9.182.15
DST-PORT: 47178
Radius Packet Header:
Code: 3
Id: 138
Length: 48
Authenticator: 0x2d070fa04d47bc146ddadbbb15385d16
EAP-Message = 0x04080004
Message-Authenticator = 0xf931944e0018e455905be128a2df9671
Proxy-State = 0x3831
- The above logs are also present in the Snapshot logs bundle, which can be shared with Ruckus Support for further insight
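Searching radiusd.log by hand gets tedious on a busy controller. The following Python sketch assumes the log lines look like the excerpts above. It lists the Access-Accept / Access-Reject results, pulls the lines that mention a given client MAC (in colon or dash notation), and decodes the header of an EAP-Message value. The function names are hypothetical and the patterns are derived only from these samples, so other releases may need adjustments.

```python
import re

EAP_CODES = {1: "Request", 2: "Response", 3: "Success", 4: "Failure"}
RESULT_RE = re.compile(r"Received (Access-\w+) Id (\d+) from ([\d.]+):\d+")


def find_results(log_text):
    """Return (result, radius id, server ip) for every received reply."""
    return [(kind, int(pkt_id), server)
            for kind, pkt_id, server in RESULT_RE.findall(log_text)]


def lines_for_client(log_text, mac):
    """Return the log lines that mention the MAC, whatever separator is used."""
    wanted = re.sub(r"[:-]", "", mac).upper()
    # Separators are stripped from each line too, so 98:B3:... and 98-B3-... both match
    return [line for line in log_text.splitlines()
            if wanted in re.sub(r"[:-]", "", line).upper()]


def decode_eap_message(hex_value):
    """Decode the EAP header (and the identity, if present) from a hex string."""
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    code, ident, length = raw[0], raw[1], int.from_bytes(raw[2:4], "big")
    info = {"code": EAP_CODES.get(code, str(code)), "id": ident, "length": length}
    if code in (1, 2) and length > 4:
        info["type"] = raw[4]
        if raw[4] == 1:  # EAP type 1 is Identity
            info["identity"] = raw[5:length].decode("utf-8", "replace")
    return info
```

Applied to the samples above, the EAP-Message in the Access-Request decodes to an EAP Response/Identity, and the value 0x04080004 in the Access-Reject decodes to an EAP Failure. Both are consistent with a rejected credential rather than a transport problem.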
The Ruckus SmartZone controller includes an integrated visual troubleshooting tool that can graphically display the aforementioned logs.
- This can be enabled and viewed by navigating to Monitor > Troubleshooting And Diagnostics > Troubleshooting, choosing “Client Connection” as the type and entering the test client MAC address
Reference for using the troubleshooting tool for client connections: Troubleshooting Client Connections
Troubleshooting tips:
- Most operating systems randomize their MAC addresses for privacy reasons. Always disable this as a troubleshooting step to make the process easier and the logging more productive.
- Apple devices call this feature Private Wi-Fi Address
- Android devices call this MAC Randomization
- Windows devices call this feature Random Hardware Addresses
- Try to choose a smaller AP list in the Select APs tab to improve the responsiveness of the tool.
- Disable LTE/4G/5G / Cellular Connections / Mobile Data on devices when troubleshooting to prevent the device from using an alternate uplink to the internet. This can greatly reduce the time taken to connect to the WLAN
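A quick way to tell whether a MAC address seen in the logs is likely randomized is to test the locally administered bit of its first octet. The Python sketch below does only that. A set bit suggests, but does not prove, randomization, and the function name is hypothetical.

```python
def is_locally_administered(mac: str) -> bool:
    """True when the locally administered bit (0x02 of the first octet) is set."""
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    return bool(first_octet & 0x02)
```

For the sample logs in this article, 42:6C:F7:F8:30:9F has the bit set and is probably a randomized address, while 98:B3:79:3A:6F:0E does not and looks like a hardware address.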
Ruckus Analytics / Ruckus AI
Ruckus Analytics is another tool that can provide insights / stats on Radius client performance
- Using either the client MAC address or the access point MAC address, we can trace events and find out where the radius workflow breaks
- https://ruckus.cloud/ai > Clients > Locate the client using its MAC address
- In the below example, the client was attempting to authenticate against an 802.1X WLAN and was using the wrong credentials
- Similar outputs are also seen when clients use the wrong auth type
Ruckus Smart Zone APs with non-proxy mode authentication
- The AP Support log collected immediately after an authentication failure can contain information about the radius transaction.
- Sample logs for a working client:
Jul 17 14:50:53 Cecil-LTE_AP daemon.info hostapd: @@206,clientAuthorization,"apMac"="2c:ab:46:23:46:a0","clientMac"="44:03:2c:bd:33:32","ssid"="Secure","bssid"="2c:ab:46:a3:46:a2","userId"="","wlanId"="688","iface"="wlan34","tenantUUID"="839f87c6-d116-497e-afce-aa8157abd30c","apName"="Cecil-LTE_AP","apGps"="8.88757,76.60933","userName"="[email protected]","vlanId"="1","radio"="a/n/ac","encryption"="WPA2-AES","band"="5g"
Jul 17 14:50:53 Cecil-LTE_AP daemon.info hostapd: wlan34: STA 44:03:2c:bd:33:32 IEEE 802.1X: wlan34: IEEE 802.1X: authenticated - EAP type: 13 (unknown)
- The smart zone troubleshooting tool can also provide helpful information as explained above for the smart zone example
- The same applies for RuckusAI
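The hostapd lines in the AP support log can be parsed with a short script. The Python sketch below assumes the line formats shown in the sample above and uses hypothetical function names. It also maps the numeric EAP type to its IANA-registered name, because the AP log prints type 13 as "(unknown)" although 13 is EAP-TLS.

```python
import re

# IANA-registered EAP method numbers for the flavors listed in this article
EAP_TYPES = {13: "EAP-TLS", 18: "EAP-SIM", 21: "EAP-TTLS",
             23: "EAP-AKA", 25: "PEAP", 26: "EAP-MSCHAPv2"}
KV_RE = re.compile(r'"(\w+)"="([^"]*)"')
AUTH_RE = re.compile(
    r"STA ([0-9a-f:]{17}) IEEE 802\.1X: .*authenticated - EAP type: (\d+)",
    re.IGNORECASE)


def parse_client_authorization(line):
    """Return the key/value pairs of a clientAuthorization event, else None."""
    if "clientAuthorization" not in line:
        return None
    return dict(KV_RE.findall(line))


def parse_authenticated(line):
    """Return (client MAC, EAP method name) for an 'authenticated' line, else None."""
    match = AUTH_RE.search(line)
    if not match:
        return None
    number = int(match.group(2))
    return match.group(1), EAP_TYPES.get(number, f"type {number}")
```

If a client never produces an "authenticated" line, the exchange stopped before EAP success. The radius server logs for the same timestamp are then the next place to look.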
Ruckus Cloud / RuckusOne
- Just like Ruckus Smart Zone, RuckusOne also supports both proxy and non-proxy modes of authentication for 802.1X
- The modes are selected on the AAA Settings of the WLAN as shown in the below example: Proxy Service
- When using proxy mode, the radius requests are generated from device.ruckus.cloud / the cloud controller’s public interface
- The radius server must therefore be routable on the internet / have a public IP address, with the selected radius port NAT’d to it
- Non-Proxy mode is used when the radius server is only reachable via the LAN. In this mode the radius requests are generated from the IP address of the access points
- Troubleshooting is done mainly using the AP support logs and Ruckus Analytics built into RuckusOne
- Radius failures are auto detected and highlighted to the admin by RuckusOne
- https://ruckus.cloud/ > AI Assurance > Incidents
- RuckusAI built into RuckusOne can assist with most known scenarios involving radius
- To invoke troubleshooting features, head over to Clients > Client List (X) – Use the search box labelled “Search for connected and historical clients”
- If a radius reject message is seen please start troubleshooting the radius server
- The scope for live debugging directly on the access point / cloud controller is limited for a RuckusOne admin
- Access to AP ssh credentials is limited to the support team
- Access to the radius module is also limited to the support team
- The admin can still review the AP support log files from the AP
- Please expect a worst-case delay of 30 to 60 seconds before client data shows up on Ruckus AI
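As a rough aid for the proxy versus non-proxy decision described above, the Python sketch below checks whether a radius server address is globally routable. The function name is hypothetical. If the server sits behind NAT, pass the public address that the cloud controller would actually use.

```python
import ipaddress


def suggested_radius_mode(server_ip: str) -> str:
    """Suggest which RuckusOne radius modes can reach the given server address."""
    address = ipaddress.ip_address(server_ip)
    # Only globally routable addresses can be reached from the cloud controller
    return "proxy or non-proxy" if address.is_global else "non-proxy"
```

A private address such as 10.9.182.15 yields "non-proxy", because only the access points on the LAN can reach it.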
Ruckus Unleashed/Zone director:
- Ruckus Unleashed works very much like the Zone Director, so this section applies to both
- The Ruckus Unleashed Master AP (or the Zone Director) interacts with the Radius server
- The configuration guide can be located here: 802.1 Unleashed Configuration
- Ruckus Unleashed has an inbuilt debug tool similar to the ones in Smart Zone and Ruckus One
- This can be invoked at Admin & Services > Administration > Diagnostics > Client Troubleshooting. Locate the Client Connection Logs section.
- This gives a rough idea of the reason for the disconnect
- The logs can be exported and shared with Ruckus TAC or imported later for review
- Common issues:
- Radius Rejects
- Check the radius server logs for the client transaction
- Check client credentials / certificates / EAP type
- No response or Radius time out
- Validate UDP connectivity for the Radius Port in question between the master AP and the Radius server
- Check the UDP port configured on the Radius Server
- Validate Radius Shared Secret and reconfigure if necessary
- Take packet captures on the AP uplink or on the radius server side
- Look for packets sourced from the Unleashed Master AP
Resolution
Common issues and resolution for those scenarios
- Newly configured 802.1X/ Radius WLAN does not work.
- Clients unable to connect to newly configured 802.1X WLAN
- Make sure that the radius server has a valid certificate signed by a well-known CA such as GoDaddy, Comodo or VeriSign
- Make sure the EAP method / flavor of 802.1X configured on the radius server matches the one configured / selected on the client
- Newer Android / iOS clients have privacy features that send a bogus / anonymous user name. Please check with the device manufacturer for instructions on disabling this
- AAA test does not work
- The inbuilt AAA test feature is based on OpenRadius PAP / CHAP
- Most modern servers have PAP/CHAP disabled, so the test can fail even when EAP-based 802.1X authentication works
- Clients connect but do not have internet access after they pass AAA authentication
- Check client VLAN assigned post authentication / pre authentication.
- Check client IP options like DNS / Subnet
- 802.1x WLAN that was working suddenly stopped working
- With changes made
- Try to revert changes made and test if system is working
- Without any significant changes made
- Check the status of the AAA radius server certificate, as it could have expired
- Check whether connectivity between the nodes that participate in the authentication workflow has broken
- Check for changes made outside of both the Ruckus solution and the Radius server
- 802.1x WLAN performance issues
- Certain types of clients can connect, and certain ones do not
- Each device vendor has its quirks and features – make sure all advanced features are disabled
- Clients drop connections randomly
- Troubleshoot RF issues
- Throughput issues on 802.1x/Radius WLAN
- Validate client loads on each AP
- Validate if a AAA policy is pushed via the Radius server to limit bandwidth
- Validate and troubleshoot RF
- Clients show an "insecure network" or certificate warning when connecting to an 802.1X WLAN
- Check AAA server certificate
- Make sure it is signed by a known CA
In each of the above scenarios, we need to capture data on each node to narrow down where the problem is.
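One of the checks above, radius server certificate expiry, is easy to script. The Python sketch below assumes you already have the certificate's notAfter date as text, for example from `openssl x509 -noout -enddate -in server.pem`, where server.pem is a placeholder file name. It returns the number of days left. The function name is hypothetical.

```python
import ssl
from datetime import datetime, timezone
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[datetime] = None) -> float:
    """Days remaining before the certificate expires (negative if already expired)."""
    # Accept both "notAfter=Jan  5 09:34:43 2018 GMT" and the bare date string
    date_text = not_after.split("=", 1)[-1].strip()
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(date_text),
                                     tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).total_seconds() / 86400
```

A negative or very small result matches the "worked yesterday, fails for everyone today" pattern described under the scenario without significant changes.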
Logs and data to collect when opening a support case for an 802.1X failure issue
- When opening a TAC case, please collect the AP support logs shortly after the issue is replicated
- Please also collect the client MAC address (with MAC randomization disabled) and a rough timestamp of when the issue occurred
- Smart Zone Snapshot logs with the radius service in debug mode
- Enable Support access to RuckusOne / RuckusAI via the Administration tab
Article Number:
000014389
Updated:
October 09, 2024 01:25 PM
Tags:
Performance, Configuration, Troubleshooting, ZoneDirector, SmartCell Gateway, SmartCell AP, Cloud Services