Monday, September 26, 2011

Setup Caching and Proxy with Nginx in Centos/Fedora P3


server {
    listen 80;
    server_name myforums.com www.myforums.com;
    access_log  logs/myforums.com_access.log  main;
    error_log  logs/myforums.com_error.log debug;
    location / {
        proxy_pass http://10.10.6.230;
        proxy_redirect     off;
        proxy_set_header   Host             $host;
        proxy_set_header   X-Real-IP        $remote_addr;
        proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
        proxy_cache             one;
        proxy_cache_key         backend$request_uri;
        proxy_cache_valid       200 301 302 20m;
        proxy_cache_valid       404 1m;
        proxy_cache_valid       any 15m;
        proxy_cache_use_stale   error timeout invalid_header updating;
    }
    location /admin {
        proxy_pass http://10.10.6.230;
        proxy_set_header   Host             $host;
        proxy_set_header   X-Real-IP        $remote_addr;
        proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
    }
}
The obvious change is to replace ‘myforums.com’ with whatever domain you are serving. You can list multiple names in the server_name directive by separating them with spaces, such as ‘server_name domain.com www.domain.com sub.domain.com;’. Now, let's take a look at some of the important options in the vhost configuration:
listen 80;
This is the port nginx listens on for this vhost. Unless you specify an IP address with it, nginx binds port 80 on all local IPs; you can limit this by setting the value to ‘listen 10.10.3.5:80;’.
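As a minimal sketch of the two forms, assuming 10.10.3.5 is one of the addresses configured on the nginx host:

```nginx
server {
    # Form 1: accept connections on port 80 of every local address.
    # listen 80;

    # Form 2: accept connections on port 80 of a single address only.
    listen 10.10.3.5:80;

    server_name myforums.com www.myforums.com;
}
```

Only one of the two listen lines should be active in a given vhost; the other is left commented out here for comparison.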
proxy_pass http://10.10.6.230;
Here we tell nginx where to find our content, i.e. the backend server. This should be an IP address. It is also important not to forget the ‘proxy_set_header Host’ option, so that the backend server knows which vhost to serve.
proxy_cache_valid
This allows us to define cache times based on the HTTP status code of the response. Most traffic will fall under the ‘200 301 302 20m’ value. If you are running a lot of dynamic content you may want to lower this from 20m to 10m or 5m; going any lower defeats the purpose of caching. The ‘404 1m’ value ensures that not-found pages are not stored for long, in case you are updating the site or have a temporary error, while still preventing 404s from choking the backend server. The ‘any 15m’ value then catches responses with any other status code and caches them for 15m; again, if you are running a very dynamic site you may want to lower this.
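For a more dynamic site, the lowered values might look like the sketch below. The 5m figures are illustrative assumptions, not tested recommendations, and the zone name ‘one’ is taken from the vhost above:

```nginx
location / {
    proxy_pass         http://10.10.6.230;
    proxy_set_header   Host  $host;
    proxy_cache        one;

    # Successful responses and redirects: shorter lifetime for fast-changing pages
    proxy_cache_valid  200 301 302 5m;
    # Not-found pages: keep them briefly so repeated misses do not hit the backend
    proxy_cache_valid  404 1m;
    # Every other status code
    proxy_cache_valid  any 5m;
}
```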
proxy_cache_use_stale
When the cache holds stale content, that is, content which has expired but has not yet been refreshed, nginx can still serve it if errors are encountered. Here we tell nginx to serve stale cache data if there is an error, timeout or invalid header when talking to the backend servers, or if another nginx worker process is busy updating the cache entry. This is really useful if your web server crashes, since clients will keep receiving data from the cache.
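The directive also accepts HTTP status conditions such as http_500, http_502, http_503 and http_504. A sketch, under the assumption that you also want stale pages served when the backend answers with a server error rather than failing outright:

```nginx
location / {
    proxy_pass   http://10.10.6.230;
    proxy_cache  one;

    # Serve an expired cache entry on connection problems, during updates,
    # and when the backend itself returns a 5xx response.
    proxy_cache_use_stale  error timeout invalid_header updating
                           http_500 http_502 http_503 http_504;
}
```

Whether 5xx responses should be masked this way depends on the site; hiding backend errors behind stale pages can delay noticing that something is broken.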
location /admin
With this location statement we tell nginx to take all requests under ‘http://myforums.com/admin’ and pass them directly to our backend server with no further interaction. Because the block sets no ‘proxy_cache’ option, nothing here is cached.
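If you prefer to make the exclusion explicit rather than rely on the missing option, ‘proxy_cache off;’ can be added to the block. A sketch of that variant:

```nginx
location /admin {
    proxy_pass         http://10.10.6.230;
    proxy_cache        off;   # explicitly disable caching for the admin area
    proxy_set_header   Host             $host;
    proxy_set_header   X-Real-IP        $remote_addr;
    proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
}
```

Note that ‘location /admin’ is a prefix match, so it applies to any URI beginning with ‘/admin’, not only the directory itself.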
That’s it! You can start nginx by running ‘/usr/local/nginx/sbin/nginx’; it should not generate any errors if you did everything right. To start nginx on boot, you can append the command to ‘/etc/rc.local’. All you have to do now is point the respective domain's DNS records to the IP of the server running nginx, and it will start proxy-caching for you. If you want to run nginx on the same host as your Apache server, you can set Apache to listen on port 8080 and then adjust the ‘proxy_pass’ options accordingly, as ‘proxy_pass http://127.0.0.1:8080;’.
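A sketch of the same-host variant, assuming Apache has already been moved to port 8080 on the loopback interface and that the cache zone ‘one’ is defined as before:

```nginx
server {
    listen 80;
    server_name myforums.com www.myforums.com;

    location / {
        # Apache is assumed to be listening on 127.0.0.1:8080 on this machine
        proxy_pass         http://127.0.0.1:8080;
        proxy_set_header   Host             $host;
        proxy_set_header   X-Real-IP        $remote_addr;
        proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
        proxy_cache        one;
        proxy_cache_valid  200 301 302 20m;
    }
}
```

With this layout nginx owns port 80 and Apache is only reachable through it, so the two never compete for the same port.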
Extended Usage:
If you want nginx to serve static content instead of Apache, since Apache is so horrible at it, you need to declare a new location block in your vhosts/*.conf file. There are two options here: either point nginx to a local path holding the static content, or have nginx cache the static content and retain it for longer periods of time. The latter is far simpler.
Serve static content from a local path:
        location ~* ^.+\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js)$ {
            root   /home/myuser/public_html;
            expires 1d;
        }
In the above, we tell nginx that our static content is located at ‘/home/myuser/public_html’; the request URI is appended to this root to build the file path. When a user requests ‘http://www.mydomain.com/img/flyingpigs.jpg’, nginx looks for it at ‘/home/myuser/public_html/img/flyingpigs.jpg’. The expires option accepts values in seconds, minutes, hours or days. If you have a lot of dynamic images on your site, consider a value like 2h or 30m; anything lower defeats the purpose. Using this method has a slight performance benefit over the cache option below.
Serve static content from cache:
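        location ~* ^.+\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js)$ {
            proxy_pass http://10.10.6.230;
            # Pass the requested hostname so the backend picks the right vhost
            proxy_set_header Host $host;
            proxy_cache one;
            # nginx keeps its cached copy for 120 minutes
            proxy_cache_valid 200 301 302 120m;
            # client browsers keep their copy for 2 days
            expires 2d;
        }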
With this setup we tell nginx to cache our static content just as we did with the parent site itself, except that we define a longer period for which the content is valid. There are two time values: nginx treats its cached copy as valid for 120m (2h) before refreshing it from the backend, and client browsers treat their copy as valid for 2 days before requesting it again. This method is simple and does not require copying static content to a dedicated nginx host.
We can also do load balancing very easily with nginx. This is done by defining a named group of servers and then using that name in place of an address in our ‘proxy_pass’ settings. In the ‘upstream’ block shown below, we list all of the web servers that load should be distributed across:
    upstream my_server_group {
        server 10.10.6.230:8000 weight=1;
        server 10.10.6.231:8000 weight=2 max_fails=3 fail_timeout=30s;
        server 10.10.6.15:8080 weight=2;
        server 10.10.6.17:8081;
    }
This must be placed in the ‘http { }’ section of the ‘conf/nginx.conf’ file; the server group can then be used in any vhost. To do this, replace ‘proxy_pass http://10.10.6.230;’ with ‘proxy_pass http://my_server_group;’. Requests will be distributed across the server group in round-robin fashion, respecting the weight values, if any. If a request to one of the servers fails, nginx tries the next server until it finds a working one. If no working server can be found, nginx falls back to stale cache data, and ultimately returns an error if that is not available.
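Put together, the placement might look like the following sketch. It is trimmed to two backends and leaves out the caching options for brevity; the rest of the ‘http { }’ section is assumed to be whatever your nginx.conf already contains:

```nginx
http {
    # Named group of backends; a weight of 2 receives roughly twice the requests
    upstream my_server_group {
        server 10.10.6.230:8000 weight=1;
        server 10.10.6.231:8000 weight=2 max_fails=3 fail_timeout=30s;
    }

    server {
        listen 80;
        server_name myforums.com www.myforums.com;

        location / {
            # The group name takes the place of a single backend address
            proxy_pass         http://my_server_group;
            proxy_set_header   Host             $host;
            proxy_set_header   X-Real-IP        $remote_addr;
            proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
        }
    }
}
```

Here ‘max_fails=3 fail_timeout=30s’ means that after three failed attempts within 30 seconds the server is skipped for the next 30 seconds.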

Conclusion:

This has turned into a longer post than I had planned but oh well, I hope it proves to be useful. If you need any help on the configuration options, please check out http://wiki.nginx.org, it covers just about everything one could need.
As I noted, this nginx setup is deployed on a Xen guest (CentOS 5.4, 2GB RAM and 2 CPU cores), but it proved to be so efficient that these specs were overkill for it. You could easily run nginx on a 1GB guest with a single core, on a recycled server, or locally on the Apache server. I should also mention that I took apart the MySQL replication cluster and am now running a single MySQL server without issue, down from 4.
