From a presentation by mdille3 on 2009-09-22. Take a look at [attachment:contrib_talk_summary.pdf his slides] and [attachment:contrib_recording.m4a a recording of the presentation].

  • 3 layers (see diagram in slides)
    • load balancers
      • everything goes through here -- lb1 is master, lb2 is backup
      • if lb1 goes down, lb2 takes over its IPs
      • virtual ips of services get swapped around -- www1 www2 www-contrib1 www-contrib2
      • www1 doesn't really correspond to a real machine -- virtual ip -- only certain ports forwarded to a real machine
      • DNS A records for www point to the www1 and www2 IPs
      • do not run AFS clients
    • frontends
      • www-node-{1,2} www-contrib-node-{1,2}
      • possibly redirect to localhost instead of back to lbs if accessing lb tier from here?
      • these are the machines that actually run apache
      • get sad if AFS gets sad
      • serve static web pages (anything that's not CGI in AFS) -- www.club www.contrib ftp rsync
      • do magic to figure out requests that are for CGI scripts and forward to contrib-cgi and club-cgi as appropriate
    • cgi servers
      • never hit directly
      • club-cgi contrib-cgi my-contrib (configuration for contrib)
      • no lb on my-contrib
  • linux virtual services
    • tech behind lb
    • can be done via NAT (we don't do this)
      • all traffic has to go through gateway
      • all state lost on failover -- okay to drop connections on short-lived www connections
    • can have lb forward requests, backends respond directly
      • sort of ip tunneling -- put new header on the top
      • machines must support encapsulation of packets
      • get more fragmentation
      • spoofing prevention can break this, since backend responds with virtual ip of lb
    • can do the same thing, but instead of encapsulating in IP, forward raw ethernet frames if the lb and backends are on the same ethernet segment
      • can mess up ARP tables -- service servers do not ARP
      • this is what we do
  • (try to) run debian apache
    • config lives in /var/apache/etc or /etc/httpd
    • main config includes vhost files that do a lot of magic
    • description of /var/apache/etc/contrib.conf
      • document root /afs/club/www/contrib
      • magic lies in rewrite rules
        • ~foo, /user/foo and /usr/foo are redirected to the symlink tree via rewrite rules
        • look at contrib.conf; the RewriteRules are well documented
        • a lot of little silly fixes to give expected behavior
      • scary symlink tree at /var/apache/andrew-contrib
        • every andrew user has a symlink to their andrew www
        • sync'd up with andrew passwd file
        • diffs every day, deleting and adding symlinks
      • same setup for orgs -- /afs/andrew/org/ directory of orgs that have requested homedirs
        • can have www directory, served via similar rewrite rules and symlink trees
      • use symlink trees since apache will try to list /afs/andrew/usr (and explode) if that is in the final path
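
The daily symlink-tree sync described above (diff against the andrew passwd file, then delete and add symlinks) could be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the actual club script: the function names are hypothetical, and the symlink target layout is not given in these notes, so it is passed in by the caller.

```python
import os


def passwd_usernames(passwd_text):
    """Extract login names (first colon-separated field) from passwd-format text."""
    names = set()
    for line in passwd_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(line.split(":", 1)[0])
    return names


def plan_symlink_sync(wanted, existing):
    """Return (to_add, to_remove): names missing from the tree, and stale names."""
    wanted, existing = set(wanted), set(existing)
    return sorted(wanted - existing), sorted(existing - wanted)


def apply_symlink_sync(tree_dir, to_add, to_remove, target_for):
    """Delete stale symlinks, then create the missing ones.

    target_for(name) returns the symlink target. The real target layout is
    an assumption left to the caller; it is not specified in these notes.
    """
    for name in to_remove:
        os.unlink(os.path.join(tree_dir, name))
    for name in to_add:
        os.symlink(target_for(name), os.path.join(tree_dir, name))
```

A daily run would call `plan_symlink_sync(passwd_usernames(text), os.listdir(tree_dir))` and hand the result to `apply_symlink_sync`. Computing the plan separately from applying it makes it easy to log or dry-run the diff first.
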
  • magic /var/apache/iscgi.pl
    • takes a path to a file, returns 0 if it is not a CGI script and 1 if it is
    • every path sent through this perl script
    • does a reverse directory walk in order to handle contrib/foo.cgi/arg/arg old-style arguments (which break regexes)
    • special-case index.cgi and index.php so you can run them automatically when you list the directory without exploding
    • only executable files whose names end in .php or .cgi are served as CGI scripts
    • if iscgi returns true (some component of the path is a CGI script), *proxy* the request to contrib-cgi
    • thus all requests contrib-cgi sees appear to come from the frontend
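
The checks iscgi.pl is described as making (reverse directory walk, index.cgi/index.php special case, executable .php/.cgi files only) might look roughly like this. This is a hedged Python sketch, not a translation of the real Perl script: the function names and the exact order of checks are assumptions.

```python
import os

CGI_SUFFIXES = (".cgi", ".php")
INDEX_NAMES = ("index.cgi", "index.php")


def _is_cgi_file(path):
    """True if path is an executable regular file ending in .cgi or .php."""
    return (path.endswith(CGI_SUFFIXES)
            and os.path.isfile(path)
            and os.access(path, os.X_OK))


def is_cgi(path):
    """Return 1 if a request for `path` should be treated as CGI, else 0."""
    path = path.rstrip("/") or "/"
    # Directory request: CGI only if it contains an executable index script.
    if os.path.isdir(path):
        return int(any(_is_cgi_file(os.path.join(path, name))
                       for name in INDEX_NAMES))
    # Walk backwards from the full path toward the root so that old-style
    # requests such as contrib/foo.cgi/arg/arg still find foo.cgi.
    candidate = path
    while candidate not in ("", "/"):
        if _is_cgi_file(candidate):
            return 1
        if os.path.exists(candidate):
            # Hit a real non-CGI file or directory: nothing above it applies.
            return 0
        candidate = os.path.dirname(candidate)
    return 0
```

Walking the path on disk rather than matching a regex against the URL is what lets trailing `/arg/arg` components be ignored without guessing where the script name ends.
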
  • contrib-cgi
    • scripts are executed as a user on contrib-cgi that is unique to each andrew user
    • every andrew user has an account on contrib-cgi
    • a script takes each andrew UID and adds a big offset; the result goes into /etc/passwd.contrib, which ends up in /etc/passwd via the passwd update script
    • each org gets a cgi user too; generated in a scary way to get (probably) stable names and uids for orgs too
    • generate contrib-org.conf with a bunch of RewriteRules to send /org/foo to the fake user
    • none of this magic happens on club-cgi -- just club users from club passwd file
    • suexec
      • used by apache to execute scripts with perms of user
      • "run this script as this user"
      • written in a paranoid way, sanity checks perms
      • extensive club modifications
        • remove sanity checks which don't work in AFS
        • attempts to get krb tickets and afs tokens if you're set up for contrib key (see below)
        • sets some rlimits (max processes, max memory) to prevent DoS -- configured in cgi_limits.conf, from which cgi_limits.db is generated
        • redirects standard error to a logfile -- does not catch a missing Content-Type header
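
The UID-offset step mentioned under contrib-cgi (take the andrew UID, add a big offset, write the entry to /etc/passwd.contrib) can be illustrated with a small Python sketch. The offset value and the function name are hypothetical; the notes do not say what offset is used or which other fields the real script rewrites, so this sketch changes only the UID field.

```python
UID_OFFSET = 1_000_000  # hypothetical value; the real offset is not given here


def contrib_passwd_line(andrew_line, offset=UID_OFFSET):
    """Rewrite one passwd-format line so its UID is shifted by `offset`.

    passwd format: name:passwd:uid:gid:gecos:home:shell
    Only the uid field is changed (assumption); everything else is copied.
    """
    fields = andrew_line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError("not a 7-field passwd line: %r" % andrew_line)
    fields[2] = str(int(fields[2]) + offset)
    return ":".join(fields)
```

Shifting every UID by one fixed offset keeps the mapping stable and collision-free as long as the offset is larger than any local UID on contrib-cgi.
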
  • contrib key
    • contrib/andrewuser@club
    • set up via my-contrib
      • ssh to gallium with pubkey
      • run setuid script with krb admin keytab to generate principal
      • spits contrib keytab back out on stdout
      • intercepted by my-contrib and written to /var/apache/cgikeys
  • my-contrib
    • ip vhost on contrib-cgi
  • pubcookie
    • operates as apache module
    • have an andrew ssl cert, used to generate the symmetric key for pubcookie
    • pretty stock setup
    • how to use pubcookie documented elsewhere
  • scripts in /usr/local/cwscript
    • they work
    • are reasonably sane
    • are in a cronjob on every machine that needs them
    • scripts to generate /etc/passwd.contrib, orgtracker mirror
  • db keys / db pass -- deprecated

Services/Contrib Overview (last edited 2016-04-22 01:24:39 by tparenti@CLUB.CC.CMU.EDU)