From a presentation by mdille3 on 2009-09-22. Take a look at his slides and a recording of the presentation.
- 3 layers (see diagram in slides)
- load balancers
- everything through here -- lb1 master lb2 backup
- if lb1 goes down, lb2 takes over its ips
- virtual ips of services get swapped around -- www1 www2 www-contrib1 www-contrib2
- www1 doesn't really correspond to a real machine -- virtual ip -- only certain ports forwarded to a real machine
- DNS A records for www point to the www1 and www2 ips
- do not run AFS clients
- frontends
- www-node-{1,2}
- www-contrib-node-{1,2} (these maybe don't exist anymore; tparenti 2016-04-21)
- possibly redirect to localhost instead of back to lbs if accessing lb tier from here?
- these are what actually run apache
- get sad if AFS gets sad
- serve static web pages (anything that's not CGI in AFS) -- www.club www.contrib ftp rsync
- do magic to figure out requests that are for CGI scripts and forward to contrib-cgi and club-cgi as appropriate
- cgi servers
- never hit directly
- club-cgi contrib-cgi my-contrib (configuration for contrib)
- no lb on my-contrib
- load balancers
- Linux Virtual Server (LVS)
- tech behind lb
- can be done via NAT (we don't do this)
- all traffic has to go through gateway
- all state lost on failover -- okay to drop connections on short-lived www connections
- can have lb forward requests, backends respond directly
- sort of ip tunneling -- put new header on the top
- machines must support encapsulation of packets
- get more fragmentation
- spoofing prevention can break this, since backend responds with virtual ip of lb
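- can do the same thing without ip encapsulation: if lb and backends are on the same ethernet segment, the lb re-sends the raw ethernet frame to the backend (LVS calls this direct routing)
- can mess up ARP tables -- backend servers must not answer ARP for the virtual ip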
- this is what we do
- (try to) run debian apache
- lies in /var/apache/etc or /etc/httpd
- main config includes vhost files that do a lot of magic
- description of /var/apache/etc/contrib.conf
- document root /afs/club/www/contrib
- magic lies in rewrite rules
- redirects ~foo /user/foo /usr/foo to symlinks via rewrite rules
- look at contrib.conf; the RewriteRules are well documented
- a lot of little silly fixes to give expected behavior
- scary symlink tree at /var/apache/andrew-contrib
- every andrew user has a symlink to their andrew www
- sync'd up with andrew passwd file
- diffs every day, deleting and adding symlinks
- same setup for orgs -- /afs/andrew/org/ directory of orgs that have requested homedirs
- can have www directory, served via similar rewrite rules and symlink trees
- use symlink trees since apache will try to list /afs/andrew/usr (and explode) if that is in the final path
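The daily symlink diff described above can be sketched roughly as follows. This is a minimal illustration, not the actual club script: the function names, the `target_for` callback, and the idea of passing in a plain list of usernames parsed from the andrew passwd file are all assumptions made for the example.

```python
import os


def plan_symlink_changes(passwd_users, existing_links):
    """Return (to_add, to_delete) as sorted lists of usernames."""
    wanted = set(passwd_users)
    present = set(existing_links)
    return sorted(wanted - present), sorted(present - wanted)


def sync_symlink_tree(tree_dir, passwd_users, target_for):
    """Make tree_dir contain exactly one symlink per user in passwd_users.

    target_for(name) returns the link target; the real layout of the
    andrew www directories is not given in these notes, so the caller
    supplies it (hypothetical helper).
    """
    to_add, to_delete = plan_symlink_changes(passwd_users, os.listdir(tree_dir))
    for name in to_delete:
        # users that vanished from the passwd file lose their symlink
        os.unlink(os.path.join(tree_dir, name))
    for name in to_add:
        # new users get a symlink; the target may not exist yet
        os.symlink(target_for(name), os.path.join(tree_dir, name))
    return to_add, to_delete
```

Because apache only ever sees paths under the symlink tree, it never has to list the huge parent directory in AFS.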
- magic /var/apache/iscgi.pl
- takes a path to a file, returns 1 if it is a CGI and 0 if it is not
- every path sent through this perl script
- does a reverse directory walk in order to handle contrib/foo.cgi/arg/arg old-style arguments (which break regexes)
- special-case index.cgi and index.php so you can run them automatically when you list the directory without exploding
- only serve executable files with names ending in .php or .cgi as CGI scripts
- if iscgi returns true (some component of the requested path is a CGI script), *proxy* the request to contrib-cgi
- thus all requests contrib-cgi sees appear to come from the frontend
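The checks that iscgi.pl is described as making can be modelled in a few lines. The following is a Python sketch of the idea only, not a port of the real perl script: the function names are made up, and the exact order of checks in iscgi.pl is an assumption.

```python
import os

CGI_SUFFIXES = (".cgi", ".php")
INDEX_NAMES = ("index.cgi", "index.php")


def _is_cgi_file(path):
    """True for an executable regular file named *.cgi or *.php."""
    return (path.endswith(CGI_SUFFIXES)
            and os.path.isfile(path)
            and os.access(path, os.X_OK))


def is_cgi(path):
    """Return 1 if the request path resolves to a CGI script, else 0."""
    path = path.rstrip("/") or "/"
    # Directory request: only CGI if it holds an executable index script.
    if os.path.isdir(path):
        return int(any(_is_cgi_file(os.path.join(path, name))
                       for name in INDEX_NAMES))
    # Reverse walk: drop trailing components until something exists, so
    # that foo.cgi/arg/arg is recognised as a request for foo.cgi.
    current = path
    while current and current != os.sep:
        if os.path.lexists(current):
            return int(_is_cgi_file(current))
        current = os.path.dirname(current)
    return 0
```

Walking backwards from the full path avoids guessing with a regex where the script name ends and the old-style arguments begin.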
- contrib-cgi
- each script is executed as a local user on contrib-cgi that is unique to the andrew user who owns it
- every andrew user has an account on contrib-cgi
- a script takes each andrew UID, adds a big offset, and writes the result to /etc/passwd.contrib, which the passwd update script then merges into /etc/passwd
- each org gets a cgi user too; generated in a scary way to get (probably) stable names and uids for orgs too
- generate contrib-org.conf with a bunch of RewriteRules to send /org/foo to the fake user
- none of this magic happens on club-cgi -- just club users from club passwd file
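As a rough illustration of the UID-offset scheme, the sketch below builds one /etc/passwd.contrib-style line per andrew user. The offset value, group id, home directory, and shell are all placeholders invented for the example; the notes only say that a "big offset" is added to the andrew UID.

```python
# Hypothetical offset; the real value is not given in these notes.
CONTRIB_UID_OFFSET = 1_000_000


def contrib_passwd_line(andrew_user, andrew_uid,
                        offset=CONTRIB_UID_OFFSET,
                        gid=65534,            # placeholder group
                        home="/nonexistent",  # placeholder home directory
                        shell="/bin/false"):  # placeholder shell
    """Build a standard 7-field passwd(5) line for a contrib CGI user."""
    uid = andrew_uid + offset
    gecos = "contrib cgi user for %s" % andrew_user
    return ":".join([andrew_user, "x", str(uid), str(gid), gecos, home, shell])
```

Deriving the local UID from the andrew UID keeps it stable across regenerations of the file, which matters because file ownership and suexec checks key off the numeric UID.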
- suexec
- used by apache to execute scripts with perms of user
- "run this script as this user"
- written in a paranoid way, sanity checks perms
- extensive club modifications
- remove sanity checks which don't work in AFS
- attempts to get krb tickets and afs tokens if you're set up for contrib key (see below)
- sets some rlimits (max processes, max memory) to prevent DoS -- cgi_limits.conf and generate cgi_limits.db
- redirects standard error to logfile -- does not catch forgetting Content-Type header
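The rlimit and stderr handling can be illustrated with Python's standard `resource` and `subprocess` modules. This is only an analogy for what the modified suexec (a C program) does: the function names and default limits are assumptions, and the real values come from cgi_limits.conf, whose format is not described here.

```python
import resource
import subprocess


def _apply_limits(max_procs, max_mem_bytes):
    """Runs in the child just before exec: cap processes and address space."""
    for which, wanted in ((resource.RLIMIT_NPROC, max_procs),
                          (resource.RLIMIT_AS, max_mem_bytes)):
        _soft, hard = resource.getrlimit(which)
        if hard != resource.RLIM_INFINITY:
            # an unprivileged process cannot raise its hard limit
            wanted = min(wanted, hard)
        resource.setrlimit(which, (wanted, wanted))


def run_limited(argv, log_path, max_procs=32, max_mem_bytes=512 * 1024 * 1024):
    """Run argv with rlimits applied and stderr appended to log_path."""
    with open(log_path, "ab") as log:
        return subprocess.run(
            argv,
            stdout=subprocess.PIPE,  # the script's response body
            stderr=log,              # errors go to the logfile, not the client
            preexec_fn=lambda: _apply_limits(max_procs, max_mem_bytes),
        )
```

Note that redirecting stderr only captures what the script itself writes; a script that forgets its Content-Type header produces no stderr output, which is why that failure is not caught.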
- contrib key
- contrib/andrewuser@club
- set up via my-contrib
- ssh to gallium with pubkey
- run setuid script with krb admin keytab to generate principal
- spits contrib keytab back out on stdout
- intercepted by my-contrib and written to /var/apache/cgikeys
- my-contrib
- ip vhost on contrib-cgi
- pubcookie
- operates as apache module
- have andrew ssl cert, use to generate symmetric key for pubcookie
- pretty stock setup
- how to use pubcookie documented elsewhere
- scripts in /usr/local/cwscript
- they work
- are reasonably sane
- are in a cronjob on every machine that needs it
- scripts to generate /etc/passwd.contrib, orgtracker mirror
- db keys / db pass -- deprecated
For SSL certificates, see Dealings with Andrew/SSL Certificates
Install on www-node-{1,2} in /etc/apache2/ssl.*