In a worst-case scenario where all your web servers have failed, what do you do? You could have a standby group of servers or a CDN, on or off premises, to pick up the load or at least display a maintenance page, but this is the worst-case scenario: a catastrophic failure where ALL your servers are down due to a code issue, server configuration issue, database issue, virtual infrastructure failure, SAN failure, maintenance being performed on all servers at once (I hope not on purpose), virus outbreak, or whatever other horrible scenario you can think of. Traffic gets all the way to the Netscaler appliance, but since your vserver is down, the user’s browser will time out as if your company fell off the face of the earth. This is very unprofessional for any organization. Users timing out or seeing a “page could not be displayed” error is unacceptable.
So the solution is to have the Netscaler display a maintenance page that is hosted on the appliance itself. I tried several different methods, including content filtering and responder policies using HTML. Originally I even thought I could leverage integrated caching to serve up cached pages and static content like images. I initially settled on using only a responder policy, which worked. Citrix even has a very nice knowledge center article (CTX117337: How to Configure a Maintenance Web Page by using the Responder Feature of the NetScaler Appliance) which is located here:
http://support.citrix.com/article/CTX117337
In a nutshell, the author of the article reached more or less the same conclusion I did. I just did it via the GUI, and that is what I will show you below. But I was not happy with the result. Keep reading and you will see why. FYI, I did all the screenshots below on an NS 9.1 appliance, but the procedure is the same on NS 9.2 and should be similar on other versions.
1. I am going to assume you have servers, services/service groups, and a vserver already that is UP and running. I will call them the following in this example:
vserver – lb_vsver_mywebsite
service group – svcgrp_myservicegroup
server – svr_mywebserver
Excuse the redactions in my screenshots please, I had some other configurations on this test appliance and I don’t want to confuse you with it:
2. Now create a backup vserver for your existing live vserver. In this example, I have called it “lb_vsvr_bkup_mywebsite”. But instead of giving it an IP, just uncheck “Directly Addressable”. This will cause the IP area to become greyed out:
When you click Create, it will show up as running on the IP 0.0.0.0 like below:
3. Now you need to create a service that is always UP and bind it to this backup vserver so that it will always remain UP. Just go under Load Balancing > Services, and click Add. Then create a service called “svc_maintpage” but for the Server, type in the localhost IP of 127.0.0.1, add a ping monitor, and press create.
4. Now go back to your backup vserver and bind this new service to it. Immediately after clicking OK, the backup vserver should go into an UP state. You might need to refresh your window if it doesn’t.
5. Now double click on your live vserver and under the Advanced tab, choose “lb_vsvr_bkup_mywebsite” for the Backup Virtual Server option and press OK:
6. Now under Responder > Action, click Add to create a new action. This is where you get to put some HTML and CSS. It must be very basic: all quotation marks have to be removed when using CSS in the HTML body or it will give you an error, and, as far as I could tell at the time, the whole expression must be under 255 characters total (the comments below show the limit actually applies to each quoted string). I will name mine “action_mywebsite_maint_page” and here is an example of the expression I will use with it:
"HTTP/1.0 200 OK" + "\r\n\r\n" + "<html> <style type=text/css> <!-- .mywebsitefont { font-size: 24px; } --> </style> <body class=mywebsitefont>Sorry, our website is currently not available. Please try again later.</body></html>" + "\r\n"
7. Now under Responder > Policy, click Add to create a new policy that will call on the action you just created. In this example, all we need is for the HTTP request to be valid and we will display the maintenance page. I will name it “resp_policy_mywebsite_down” in this example. Choose the action you just made in the Action drop down and for the expression, just put:
HTTP.REQ.IS_VALID
8. Now go back to the Load Balancing folder and double click your backup vserver and bind the responder policy to it like below:
9. Now to test. Open up your website in a browser and it should display as normal right now. Now log in to your web servers and turn off your websites. Immediately your live vserver should say DOWN for the State but the Effective State should remain UP. This is because all traffic is being forwarded to the backup vserver you specified earlier, which is set to always be up:
Refresh your browser and you should now see the maintenance page you created like below:
As you can see, a simple HTML page like above is not very professional. We need more HTML/CSS than 255 characters to work with and we need images working to make it look professional. At least it is better than a page timeout though!
Now with a content filtering policy, you don’t have to worry about a character limit. You can get away with putting HTML/CSS in a content filter policy. But again, where do the images come from?
I decided to call Citrix and see if they had run into a request like this. They had not. Right off the bat, both techs I spoke to said what I was trying to do is not supported by Citrix. A Netscaler is not designed to do this. But luckily the second tech, Brian at Citrix Support, was just as enthusiastic about getting something to work as I was and wasn’t going to give up easily, so we went over a few scenarios. The Netscaler does have an Apache web server on board; that is how the admin GUI is displayed to you. It is also how the Access Gateway portal is displayed to the end user. We needed to figure out a way to leverage the Apache web server on board the Netscaler to host our images, HTML, CSS, etc. The initial thought was to overwrite the Access Gateway portal and create a responder policy that would do a redirect to an Access Gateway vserver you create. The negatives here are that you are limited to SSL traffic only, you have to worry about having a valid cert, you can’t bind all the policies you might need to it like you can with a load balanced VIP, etc. I didn’t feel that comfortable destroying functionality to gain other functionality either.
In the end, the solution was easy and did not require overwriting the Access Gateway portal. We can host our HTML, CSS, and images on the Netscaler itself and point Apache at it. Brian did a quick proof of concept in his lab. Then I improved on it a bit. Here is the end result which I am sure a lot of you will find pretty handy in your organizations. Steps 1 through 5 are the same as above. Then from there, begin these steps:
1. First we need to get our HTML, CSS, and images on the Netscaler. WinSCP into your Netscaler and go to “/netscaler/ns_gui”. The folders you see called admin_ui, vpn, etc. are what host the Netscaler Admin GUI and Access Gateway respectively. So you have the option of putting something in the root of this folder or even create a separate folder here if you want. In my case, I decided to put a “maintenance.htm” in the root and also create a folder called “static” that will host most static content like CSS and images.
2. Now under Responder > Action, click Add to create a new action. Very important: make sure to change the type from Response to Redirect. The action should be the following (with quotation marks included):
"http://www.mywebsite.com/maintenance.htm"
3. Now under Responder > Policy, click Add to create a new policy that will call on the action you just created. The policy must not redirect requests for the maintenance page itself, or for the CSS, .gif, and .jpg files it uses; otherwise those requests would be redirected again instead of being served. So the policy I will use is:
!HTTP.REQ.URL.CONTAINS("maintenance.htm") && !HTTP.REQ.URL.CONTAINS(".gif") && !HTTP.REQ.URL.CONTAINS(".jpg") && !HTTP.REQ.URL.CONTAINS(".css")
4. Now go back to the Load Balancing folder and double click your backup vserver and bind this new responder policy to it like I did below:
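To make the logic of the step 3 expression easier to see, here is a minimal Python sketch of it. This is only a model for reasoning about which requests get redirected, not anything that runs on the Netscaler; the function name and the sample URLs are made up for illustration. The expression is a chain of negated CONTAINS checks joined with &&, so it is true only when the URL contains none of the listed fragments.

```python
# Hypothetical model of the step 3 responder expression (not NetScaler code).
# The expression is true, and the redirect fires, only when the URL contains
# NONE of these fragments.
EXEMPT_FRAGMENTS = ("maintenance.htm", ".gif", ".jpg", ".css")


def should_redirect(url: str) -> bool:
    """Mirror of: !CONTAINS(a) && !CONTAINS(b) && !CONTAINS(c) && !CONTAINS(d)."""
    return all(fragment not in url for fragment in EXEMPT_FRAGMENTS)


# Illustrative request URLs (assumed, not taken from a real site).
for url in ("/", "/products/list.aspx", "/maintenance.htm",
            "/static/logo.gif", "/static/site.css"):
    outcome = "redirect to maintenance.htm" if should_redirect(url) else "not redirected"
    print(f"{url:22} -> {outcome}")
```

Requests that are not redirected fall through to the service bound to the backup vserver, which is how the maintenance page and its static files get served. If the maintenance page itself matched the expression, every request for it would be redirected to itself again.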
Now if you disable your service groups and check your maintenance page again, you can see how the website displays the full page with nice HTML, CSS, and images. In this example, I borrowed the Sears.com maintenance page. Notice how showing your company logo keeps your branding intact even on a maintenance page which is the correct way to handle a website issue. Tell your users you are aware of the problem and offer alternatives in the meantime (static links along the bottom to other servers that are up and offering content in this example). You don’t have to go that far but it’s always nice to let your user base know you haven’t disappeared and your infrastructure is solid. This is very professional and above all, automated! 🙂
The only problem here is that when your website is back up, users will still be refreshing this maintenance.htm page. Since that page does not exist on your real servers, they will get a 404 error. So you have four options. I usually prefer number 4 personally, but it all depends on your needs:
1. Change your maintenance.htm page to say index.htm or whatever page is the default page of the root of your website so when they refresh once the vserver is back up, they will get the live page. You will need to WinSCP into your Netscaler again and change the maintenance.htm file name as well as change it in your Responder Action. The issue here is if let’s say you are using .NET, you can’t call it index.aspx because Apache on the Netscaler can’t parse it.
2. Just create a link on the page that says “Click Here to Try Again” which is pointed at the correct index page. This assumes the end user will actually click the link instead of hitting refresh. You can’t be 100% sure they will do this.
3. Create a maintenance.htm page on your servers and then set IIS, Apache, or whatever web server you use to do a 301 redirect to your live index page. You can leverage the Netscaler to do the redirect too of course.
4. My preferred method. Create a new responder policy saying any maintenance.htm should automatically redirect to index.aspx and bind it only to your real vserver. That way anyone that requests that page when your servers are up will always be redirected to your index page. In this example, I will call my live site’s index page index.asp and call the action policy “action_mywebsite_index_redirect”. I will also make it redirect to SSL in this example because there is a login box on the index.asp page and I want to keep it secure using https:
I will call the responder policy “resp_policy_index_redirect” and for the expression, have it match any request for “/maintenance.htm”:
HTTP.REQ.URL.CONTAINS("/maintenance.htm")
Now bind this to your live vserver:
Now you can test it by disabling and enabling your servers or service groups. It should transition automatically between your maintenance page and the live index page. 🙂
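If you want to reason through that transition before touching the appliance, here is a small Python sketch of the two policies working together. It is a simplified model under a few assumptions: the redirect target for the live site is written as https://www.mywebsite.com/index.asp (the exact URL is only in my screenshot), the policy expressions are approximated with substring checks on the URL path, and all function names are made up.

```python
from urllib.parse import urlsplit

MAINT_URL = "http://www.mywebsite.com/maintenance.htm"
INDEX_URL = "https://www.mywebsite.com/index.asp"  # assumed redirect target
EXEMPT_FRAGMENTS = ("maintenance.htm", ".gif", ".jpg", ".css")


def respond(url: str, backends_up: bool) -> tuple[str, str]:
    """Return (who answers, URL) for one request in this simplified model."""
    path = urlsplit(url).path or "/"
    if backends_up:
        # Live vserver: resp_policy_index_redirect sends stale maintenance
        # page requests back to the index page.
        if "/maintenance.htm" in path:
            return ("redirect", INDEX_URL)
        return ("backend", url)
    # Backup vserver: redirect everything except the maintenance page
    # and its static content.
    if all(fragment not in path for fragment in EXEMPT_FRAGMENTS):
        return ("redirect", MAINT_URL)
    return ("appliance", url)


def browse(url: str, backends_up: bool, max_hops: int = 5) -> tuple[str, str]:
    """Follow redirects like a browser would; fail loudly on a redirect loop."""
    for _ in range(max_hops):
        kind, target = respond(url, backends_up)
        if kind != "redirect":
            return kind, target
        url = target
    raise RuntimeError("redirect loop detected")


print(browse("http://www.mywebsite.com/", backends_up=False))
print(browse(MAINT_URL, backends_up=True))
```

The first call models a user arriving while everything is down and ending up on the maintenance page served by the appliance. The second models the same user refreshing that page after the servers come back and landing on the live index page.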
One thing I would like to point out. On any of your Responder Policies or Actions, you can always view the hit counter to see if the policy or action is being invoked. This might help you when you are setting this up initially and something goes wrong and you want to see if the policy or action is being hit:
So there it is. Your Netscaler is now an emergency web server that automatically puts up a professional looking maintenance page in a worst-case scenario when every backend web server you have is down. A big thank you to Brian at Citrix for the help! If anyone can think of any improvements to this process or has any trouble with it, please reply; I would love to hear about your experience.
Ronan
March 3, 2011 at 5:51 AM
The 255 char limit is just for responder action STRINGS .. not for the whole responder action.
Try pasting this into your CLI, then open in the gui to see how it looks, and modify to include your desired maintenance content. 🙂
Jason Samuel
March 3, 2011 at 5:26 PM
Thank you very much Ronan! I just tested and you are absolutely correct. So the trick is to make sure the STRINGS are less than 255 char, but you can have as many of these STRINGs as you want by using a plus sign (+) after each one.
Here is the result of what Ronan has suggested for anyone that might be interested. I did the following via CLI first just for kicks:
Works perfectly as expected. Now delete the action that was just created. We will add it again but this time, increase that same action past 255 char and join them together with plus signs per Ronan’s recommendation:
And there you go, over 255 chars works perfectly! Here is a screenshot of it in the GUI after adding:
http://www.jasonsamuel.com/wp-content/uploads/2011/03/mtn_pg_act.gif
Thanks again for catching this Ronan.
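For anyone who would rather not split a long page by hand, here is a minimal Python sketch that chops HTML into quoted strings under the limit and joins them with plus signs. This is just a helper I am sketching under some assumptions: the function name and the 250 character chunk size are my own choices, and because the example in the article avoids quotation marks inside the HTML, the helper rejects double quotes and backslashes rather than guessing at escaping rules. Test the generated expression on a non-production appliance first.

```python
import re


def build_respondwith_expression(html: str, max_chunk: int = 250) -> str:
    """Build a responder expression from HTML, one quoted string per chunk.

    Hypothetical helper: each chunk stays under the 255 character limit
    discussed above by defaulting to 250 characters per string.
    """
    if '"' in html or "\\" in html:
        raise ValueError("remove double quotes and backslashes from the HTML first")
    # Collapse newlines and repeated whitespace so the page fits on one line.
    flat = re.sub(r"\s+", " ", html).strip()
    chunks = [flat[i:i + max_chunk] for i in range(0, len(flat), max_chunk)]
    literals = [f'"{chunk}"' for chunk in chunks]
    # Same status line and blank line separator as the article's example.
    return " + ".join(['"HTTP/1.0 200 OK"', r'"\r\n\r\n"', *literals])


sample_html = (
    "<html><body>"
    + "Sorry, our website is currently not available. " * 12
    + "</body></html>"
)
expression = build_respondwith_expression(sample_html)
print(expression)
```

The printed expression can then be pasted into the responder action in place of the short one from the article.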
Brad G
March 14, 2011 at 9:10 AM
This is a great example. Thanks for the tip Jason, and also to Ronan for the clarification on the responder action strings. I figured there had to be a way to accomplish this without an additional webserver.
If anyone did try adding files to the netscaler, here is a quick caution:
I’ve dropped files and modified files in /netscaler/ns_gui before (branding and customizing access gateway login pages) and my files didn’t survive an NS firmware upgrade.
Martin
April 1, 2011 at 3:13 AM
Hello, great write up! this is almost exactly what we are doing, but the volume of our sites are SSL … I’d be interested to hear if you got anywhere with Citrix about how to adapt this for SSL sites?
One of our main requirements is to keep the base URL intact rather than redirect to an external site and so the main hurdle we bump into is that the backup SSL vServer must have a certificate bound to it, this means we cannot use a generic vServer for all other vServers to fall back on (as we can with HTTP). The only option i can think of just now is to have a backup vServer for every SSL vServer with the same certificates bound … this however seems overly clunky, i was hoping to do this in one step. Any ideas?
Jason Samuel
April 1, 2011 at 3:50 PM
@Martin
Thanks Martin. For every SSL vserver, I have a backup SSL vserver with the appropriate SSL cert bound to it. It works fine for me and since it’s a one time setup, I don’t mind if I have several I have to setup. Just have a very good naming convention for your LB vservers like in my screenshot and all your “lb_vsver_bkup_xxxxx” vservers will always be grouped together in the GUI. This makes spotting them easy when you have several hundred LB vservers being displayed.
I can’t really think of any other way around it unless the SSL cert you use is a SAN cert with multiple domain names on it. Then you can use just one “generic” SSL vserver and point all your SSL sites to it without getting a cert error message.
Marco Schirrmeister
May 15, 2011 at 6:24 PM
Instead of doing the redirect to /maintenance.htm, I would directly serve all the content of that file via the “respond with” method like in your simple example.
All images you want to load can be relative links and would be loaded from the NetScaler’s Apache from the /netscaler/ns_gui/ directory via your dummy backup vserver.
That’s what I do, because I don’t want file names or sub directories in address bar.
I prefer it that everything is served without changing the address bar.
Bud
July 7, 2011 at 7:29 PM
I am a bit of a newbie when it comes to expression scripting so excuse me if this is a stupid question. But in the section where you try to define the expression in the responder policy:
I receive an error saying compound expression syntax error.
Also can you tell me what this expression is supposed to do? Because it looks like the page must contain all the extensions you specified before the policy will take place. Should it be like this?
Any help would be appreciated.
Dan Dycus
December 7, 2011 at 12:20 AM
I’m trying this on a site which sees high traffic during maintenance periods. Initially the page draws and all works fine, but over time, performance of the maintenance page degrades severely (over 2 min to render fully). This issue also slows the management GUI. Is there a session limiter somewhere that can be increased? Or perhaps I’m running into a different problem? Thanks for any suggestions.
Brad G
December 13, 2011 at 3:07 PM
Dan, Have you checked to make sure that your responder action doesn’t also match the condition of the responder policy?
I’ve seen where someone activated a policy like this and it caused a redirect loop which had a serious negative impact on the netscaler performance. If you look at the hit count for the responder policy, it will probably be obvious if this is what has happened to you.
Bob Wentworth
January 5, 2012 at 5:25 PM
Why not just specify a redirect URL?
set lb vserver <vserver_name> -redirectURL http://maintenancepage
http://support.citrix.com/article/CTX117337
Art M
March 19, 2013 at 10:30 AM
Has anyone been able to get this to work with HTTPS? If so could you post how?
Martin
March 19, 2013 at 10:43 AM
Hello Art M,
I do the same as Jason said in his earlier comment. For each vServer I create a second vServer, which is non addressable & has the appropriate SSL certs bound, this secondary vServer in my case only has a single responder policy which returns my “sorry page”. The primary vServer then has the sorry vServer defined as its backup virtual server.
You can use whatever naming convention works for you to denote the “sorry” vServer..
Hope this helps
ta,
martin.
Art M
March 19, 2013 at 10:47 AM
I wish it did. I have it working for my HTTP site but not for HTTPS. I am missing something here. Any instructions would be great as I am new to the netscaler.
Thanks
Brandon
March 19, 2013 at 12:35 PM
Looking at Marco’s comment from May 15th, 2011, he states being able to serve images up using relative links.
All images you want to load can be relative links and would be loaded from the NetScaler’s Apache from the /netscaler/ns_gui/ directory via your dummy backup vserver.
I am able to serve up the HTML. I can view the image via my browser via http://vpx.mydomain.com/static/image.png, but I am unable to view the image as part of the maintenance page.
I have also tried /netscaler/ns_gui/static/image.png with no luck.
Martin
March 20, 2013 at 3:45 AM
Ahh, so we went with concatenating the maintenance page HTML into the responder policy and simply used a backup vServer to serve only this responder (we additionally use the same responder policy for trapped 5xx HTTP errors).
Our requirement was the same as Marco’s – i.e. that the URL was not mauled or rewritten. This worked well for us without having to use the onboard dispatchers Apache.
John
July 3, 2013 at 10:10 AM
How can I adjust this policy to load a certain maintenance page (with its own image and message) based on what the URL it is they’re trying to request?
So if the URL a user’s trying to hit is http://www.domain.com/Site123, my maintenance page would be a site that has Site123.png or whatever on it.
And if the URL a user’s trying to hit is http://www.domain.com/Site456, the image in the html is Site456.png, etc…?
Brad G
July 3, 2013 at 1:42 PM
@John
John,
in your responder action, you can use HTTP.REQ.URL.PATH to append the original path to a new domain.
It would look something like this…
Ryan
August 22, 2013 at 4:33 PM
Even working within the original article’s assumed 255 character limit, we were able to easily send a fully formed HTML document containing a full-viewport IFRAME that displayed an off-site maintenance/outage page:
– – –
“HTTP/1.0 200 OK\r\n\r\n\r\n”
– – –
No mess. No polluting your appliance in unsupported ways with content. Keep the content somewhere your webby folks are free to play with it, CMS it, etc.
That syntax leaves about 45 characters for the IFRAME SRC URL.
We considered just doing a meta-refresh to redirect (even more compact) but did not want to show a different URL to the Customer.
UPDATE: Now I see the comments above this one about concatenating any number of up-to-255-character strings. Beautiful: lets us put more metadata in, and the full complement of currently-recommended no-cache tags. Still highly recommend putting the ACTUAL content on a separate server and merely expose it with the IFRAME.
Ryan
August 22, 2013 at 4:34 PM
Oops. Here’s the CODE block that got stripped out:
Harry
September 23, 2013 at 11:01 AM
Hello everyone,
I have a requirement to display images from the NS and am using above logic to get it done.
Have copied the html page and the images in /netscaler/ns_gui/
If I access the http://NSIP/page.html, I can see the page with images.
Works fine with MIP/SNIP as well. However, if I use http://lb-vip-ip/page.html, it fails to get a response.
Is there any additional configuration required to get this working with VIP.
Theoretically it is similar to http://www.mywebsite.com/maintenance.htm. (I assume http://www.mywebsite.com points to the LB VIP)
I am doing this on 10.0-74.4 code.
Thanks in advance.
chidex
August 20, 2015 at 6:45 AM
It looks like you cannot create service on 127.0.0.1 on Netscaler 10.5. Any Workaround for this?
Oleg
March 16, 2016 at 7:05 PM
Well, you can add any “has to be always up” server/service, e.g. a default gateway of NSIP and monitor it with ping only…