A lot of companies use RADIUS or TACACS authentication on a Netscaler with Access Gateway Enterprise Edition (AGEE), which is a pretty secure setup. Sometimes you might have users who complain that they can’t log in through the Access Gateway. There are a few things you can do to troubleshoot authentication issues. I’m going to run through some screenshots from an NS 9.3 device because that’s what I have in front of me at the moment, but the same troubleshooting methods apply to NS 10.x.
Capturing AAA authentication traffic in real-time
1. SSH into the Netscaler, login with your admin credentials, then enter shell by typing:
shell
2. Now you want to capture the authentication in real time and see exactly what the error looks like. This is handled by the AAA (Authentication, Authorization, and Auditing) daemon on the Netscaler. Type the following, then leave it running while the user tries to log in:
cat /tmp/aaad.debug
A successful authentication against the Access Gateway would look like this. The user is named User1:
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/naaad.c[614]:
process_kernel_socket call to authenticate
user :User1, vsid :414
Tue Feb 5 09:53:53 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[138]:
start_radius_auth attempting to auth User1 @ xxx.xxx.xxx.xxx
Tue Feb 5 09:53:53 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[741]:
process_radius radius accepts : User1
Tue Feb 5 09:53:53 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[743]:
process_radius extracted group string :(null)
Tue Feb 5 09:53:53 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/naaad.c[1466]:
send_accept sending accept to kernel for : User1
An unsuccessful authentication would look like this. The user is named User2:
Tue Feb 5 09:50:11 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/naaad.c[614]:
process_kernel_socket call to authenticate
user :User2, vsid :414
Tue Feb 5 09:50:11 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[138]:
start_radius_auth attempting to auth User2 @ xxx.xxx.xxx.xxx
Tue Feb 5 09:50:11 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[788]:
process_radius radius rejects : User2
Tue Feb 5 09:50:11 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[138]:
start_radius_auth attempting to auth User2 @ xxx.xxx.xxx.xxx
Tue Feb 5 09:50:14 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[731]:
process_radius retransmit radius packet
Tue Feb 5 09:50:17 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[731]:
process_radius retransmit radius packet
Tue Feb 5 09:50:20 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[778]:
process_radius rad_continue_send_request:No valid RADIUS responses received
Tue Feb 5 09:50:20 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[785]:
process_radius unknown return value from rad_continue_send_request :-1
Tue Feb 5 09:50:20 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/radius_drv.c[788]:
process_radius radius rejects : User2
Tue Feb 5 09:50:20 2013
/usr/home/build/rs_93/usr.src/usr.bin/nsaaad/../../netscaler/aaad/naaad.c[1562]:
send_reject sending reject to kernel for : User2
Tue Feb 5 09:50:23 2013
lwagent.c[1107]: main EV_DEBUG: handle time out
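You can see very clearly that authentication failed on the RADIUS side: the first attempt is rejected, the retry gets no valid RADIUS response after two retransmits, and the Netscaler then sends a reject to the kernel. This points to an issue on the authentication server (or possibly on the path to it) rather than on the Netscaler. To stop the capture, hit Ctrl + C. Ctrl + Z only suspends the command and leaves it behind as a stopped job.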
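If you save a long aaad.debug capture to a file, a small script can summarize it per user. The following is only a rough sketch in Python. The function name `summarize` is my own, and the patterns are based solely on the NS 9.3 lines quoted above, so other builds may word these messages differently. Retransmit and "no response" lines do not name a user, so the sketch attributes them to the most recent `start_radius_auth` user, which can be wrong when several users log in at the same time.

```python
import re

# Patterns taken only from the aaad.debug lines quoted in this post (NS 9.3).
AUTH_START = re.compile(r"start_radius_auth attempting to auth (\S+) @")
ACCEPT = re.compile(r"send_accept sending accept to kernel for : (\S+)")
REJECT = re.compile(r"send_reject sending reject to kernel for : (\S+)")


def summarize(lines):
    """Return {user: {"result", "retransmits", "no_response"}} for a capture."""
    summary = {}
    current = None  # most recent user seen in a start_radius_auth line

    def entry(user):
        return summary.setdefault(
            user, {"result": "unknown", "retransmits": 0, "no_response": False}
        )

    for line in lines:
        start = AUTH_START.search(line)
        accept = ACCEPT.search(line)
        reject = REJECT.search(line)
        if start:
            current = start.group(1)
            entry(current)
        elif accept:
            entry(accept.group(1))["result"] = "accepted"
        elif reject:
            entry(reject.group(1))["result"] = "rejected"
        elif current and "retransmit radius packet" in line:
            # These lines carry no user name; assume they belong to `current`.
            entry(current)["retransmits"] += 1
        elif current and "No valid RADIUS responses received" in line:
            entry(current)["no_response"] = True
    return summary


# Message lines copied from the failed User2 login above (timestamps omitted).
SAMPLE = """\
start_radius_auth attempting to auth User2 @ xxx.xxx.xxx.xxx
process_radius radius rejects : User2
start_radius_auth attempting to auth User2 @ xxx.xxx.xxx.xxx
process_radius retransmit radius packet
process_radius retransmit radius packet
process_radius rad_continue_send_request:No valid RADIUS responses received
process_radius radius rejects : User2
send_reject sending reject to kernel for : User2
"""

if __name__ == "__main__":
    for user, info in summarize(SAMPLE.splitlines()).items():
        print(user, info)
```

A user who shows up as rejected with `no_response` set and a non-zero retransmit count is more likely hitting a timeout than a bad password, which is worth knowing before you go digging on the RADIUS server.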
Viewing the ns.log using Syslog Viewer
1. You can look at all the failed logins in the ns.log. In the Netscaler admin console/GUI, go to System > Auditing and click the “Syslog messages” button:
2. Select the AAA module and then double-click each ns.log file. You will immediately see all the “LOGIN_FAILED” event types as you go through each ns.log (these are the logs stored in /var/log on the Netscaler). Pay attention to the message, because it tells you why the authentication attempt failed. As you can see in this example, the authentication server is the problem and is denying access:
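If you would rather pull the ns.log files off the box and search them offline, the same filtering can be roughly sketched in Python. This is only an illustration: the function names are mine, the only thing it relies on from the log is the “LOGIN_FAILED” event type mentioned above, and the gzip handling is there on the assumption that your rotated logs are compressed (files ending in .gz).

```python
import gzip
from pathlib import Path


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", errors="replace")
    return open(path, "rt", errors="replace")


def failed_logins(lines):
    """Keep only the lines that mention the LOGIN_FAILED event type."""
    return [line.rstrip("\n") for line in lines if "LOGIN_FAILED" in line]


def scan(paths):
    """Return {path: [matching lines]} for each downloaded log file."""
    hits = {}
    for path in paths:
        with open_log(path) as handle:
            hits[str(path)] = failed_logins(handle)
    return hits
```

Calling `scan()` with the paths of the copies you downloaded from /var/log gives you the failed logins per file, and the failure message is right there in each matching line.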
Running a Trace
1. Lastly, you can run a network capture/trace while users are experiencing the issue. You can do this easily via the console/GUI by going to System > Diagnostics and clicking on “Start new trace”:
2. Set the packet size to 0 so that full packets are captured, and hit Start:
3. Stop the trace once you feel you have enough traffic captured:
4. Download the nstrace .cap file and open it in Wireshark for further analysis:
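When you look at the RADIUS packets in the trace (the `radius` display filter in Wireshark narrows things down), the first byte of each packet tells you what kind of message it is. Here is a minimal sketch in Python of how that header is laid out, using the packet codes from RFC 2865. The function name is my own, and it assumes you already have the raw UDP payload of a RADIUS packet as bytes; it does not read the .cap file itself and it does not cover TACACS.

```python
import struct

# Authentication packet codes defined in RFC 2865.
RADIUS_CODES = {
    1: "Access-Request",
    2: "Access-Accept",
    3: "Access-Reject",
    11: "Access-Challenge",
}


def parse_radius_header(payload):
    """Decode the fixed 20-byte RADIUS header from a raw UDP payload."""
    if len(payload) < 20:
        raise ValueError("a RADIUS packet is at least 20 bytes long")
    # Code (1 byte), Identifier (1 byte), Length (2 bytes, network order).
    code, identifier, length = struct.unpack("!BBH", payload[:4])
    return {
        "code": code,
        "name": RADIUS_CODES.get(code, "other"),
        "identifier": identifier,
        "length": length,
        "authenticator": payload[4:20].hex(),  # 16-byte authenticator field
    }
```

In practice you match each Access-Request to its reply by the identifier. A request followed by an Access-Reject means the server said no, while a request that is repeated with no reply at all lines up with the retransmit messages in the aaad.debug output.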
Hope this helps some of you. Let me know if you have any other methods of troubleshooting authentication issues you would like to share. LDAP troubleshooting is easier since the Netscaler can give you a lot more detail about what is failing. RADIUS and TACACS are a little trickier since you have something in the middle to troubleshoot, but the steps above should give you enough to tell whether the problem resides on the Netscaler or on the authentication server.
Daniel Ruiz
February 6, 2013 at 10:34 AM
Thanks for sharing… excellent stuff and very useful.
Jason Samuel
February 8, 2013 at 3:27 PM
@Daniel Ruiz
You’re very welcome!