R.I. Pienaar | R.I.Pienaar

Designing a Single Sign On system – part 1

by R.I. Pienaar | Apr 13, 2008 | Code

I use a PHP development framework that I have been building on and improving for the last 4 or 5 years. Years ago I used a framework called Roadsend SiteManager, but when the number of backward-incompatible changes they were making got too much I started working on my own.

My framework is an MVC-based system that recently had a good refresh to make use of the improved OO abilities of PHP 5. It isn’t really generic; it’s tailored to the way I work and helps make the things I do often easier. As a result of using this framework for years I now have about 15 sites developed with it, some for my personal stuff (like a database of film rolls) and others for clients. Until now every site has had its own user database, which makes changing passwords a right pain in the butt, not to mention a huge job to adjust all the sites if/when I want to add additional authentication features.

So I started looking into Single Sign On systems but came up a bit short on complete ones that fit my bill. My needs are more or less:

Server and Client libraries in PHP but also client libraries in other languages.
Should store Real Name, Email Address, Time Zone and possibly other bits of information and share it on demand with other sites.
Work across domains, so a single SSO server should be able to serve all my sites and possibly those of 3rd parties should they wish to use it. This basically means the SSO server shouldn’t just go and set a domain cookie.
Have a good management system where users can manage their identities in a self-service manner.
Client sites should be registered with the server before they can use the SSO system. When a user’s identity is accessed he should be shown information about the requesting site (description, contact details etc.). It should be completely open to the user when his information gets shared and he should be able to say no.
The server should keep a log of all uses of the identity. In practice only the first request gets logged; after that the user can be kept logged in on the client, so no new requests are made to the server.
Authentication should be modular so you can plug any user database into the server. For instance I can share a single database between my imap, pop, smtp, ticketing system and any of my sites using the SSO system.
Legacy applications and third party applications that rely on standard HTTP Authentication should be able to use it, at least when run under Apache. I want to be able to log into Cacti, Nagios, RT and others with a single login.

This is quite a long list of requirements, though none of them are particularly difficult to be honest. The hardest part is the HTTP Authentication plugin, mostly because it relies on writing either a C Apache module or something in mod_perl.
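The modular authentication requirement can be sketched roughly as follows. This is only an illustration under stated assumptions, not the design of the system described in these posts: the names AuthBackend, ArrayAuthBackend and SsoAuthenticator are hypothetical, and password_hash()/password_verify() assume PHP 5.5 or later.

```php
<?php
// Hypothetical interface: every user database plugs in by implementing this.
interface AuthBackend
{
    // Return true when the username/password pair is valid.
    public function authenticate($username, $password);
}

// Illustrative backend that checks against an in-memory array of hashes.
// A real backend might query SQL, LDAP or an IMAP server instead.
class ArrayAuthBackend implements AuthBackend
{
    private $hashes;

    public function __construct(array $hashes)
    {
        $this->hashes = $hashes;
    }

    public function authenticate($username, $password)
    {
        return isset($this->hashes[$username])
            && password_verify($password, $this->hashes[$username]);
    }
}

// Hypothetical server-side class: tries each backend until one accepts.
class SsoAuthenticator
{
    private $backends = array();

    public function addBackend(AuthBackend $backend)
    {
        $this->backends[] = $backend;
    }

    public function login($username, $password)
    {
        foreach ($this->backends as $backend) {
            if ($backend->authenticate($username, $password)) {
                return true;
            }
        }
        return false;
    }
}

$auth = new SsoAuthenticator();
$auth->addBackend(new ArrayAuthBackend(array(
    'alice' => password_hash('s3cret', PASSWORD_DEFAULT),
)));

echo $auth->login('alice', 's3cret') ? "accepted\n" : "rejected\n";
echo $auth->login('alice', 'wrong') ? "accepted\n" : "rejected\n";
```

Because the server only depends on the interface, swapping the user database means writing one new class rather than touching every client site.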

I have written an SSO system that complies with these requirements and will, over the course of a number of blog posts, detail the design of such a system. My intention is to open source the main server library and the client library in PHP. I won’t, at least for the foreseeable future, release the management application, as releasing a truly end-user-ready web application of this size is quite a big deal.

I don’t know how many posts it will take to go through it all, and I also don’t know how long it will take me to do them all. I guess that in these posts I will retrofit a spec onto what I developed and hopefully pick up on bits that I missed along the way. I hope this might be of use/interest to someone out there.

UPDATE: Part 2 is here.

Easy transparent PHP input filtering

by R.I. Pienaar | Mar 25, 2008 | Code

I have been working on a site that will have potentially quite a few random third parties accessing it and inserting data into a MySQL database. I am thus quite keen on a good solid input filtering method for PHP to prevent things like XSS and SQL Injection.

There are several options out there. Of the ones I found, Inspekt is the closest match to my way of working: it essentially imports $_GET, $_POST etc. and wraps them in an object which you then use to access variables in a filtered manner. By default it then NULLs the original variables so you cannot access them anymore; if backward compatibility is desired it can leave the originals untouched. That is not optimal, as it gives an unsafe-by-default result if you want to maintain backwards compatibility.

Another problem with this approach is that it is a lot of work to change existing code, which you might think is just par for the course, but I was convinced I needed to find a way to do so more transparently.

I could, for example, just walk through the $_GET etc. arrays at program start and apply some filtering to them using addslashes() and such, but this is very restrictive. What if you need to get a value unfiltered, especially if you perform destructive filtering? How would you go about filtering some variables for phone numbers, some for email addresses etc.?

The answer lies in PHP’s new Standard PHP Library (SPL), specifically in the ArrayAccess interface, which is the way to go if you don’t care about older versions of PHP.

The basic advantage of this is that you can expose properties of your objects by using array notation rather than object notation:

$result = $foo->getBar();

compared to:

$result = $foo['bar'];

Both statements give access to the private variable $bar, just using different syntax. Using this technique we can write a transparent filter for input variables. The basic usage of the final library would be something along these lines:

$_GET = new ArrayArmor($_GET);
print ("Filtered Variable: $_GET[test]<br>\n");
print ("Unfiltered Variable: " . $_GET->getRaw("test"));

A possible output from this script can be seen below:

Filtered Variable: 1234\';delete from accounts;--
Unfiltered Variable: 1234';delete from accounts;--

You can see that the default behavior is to protect the input but even for destructive filtering methods the raw unfiltered data would be available if the programmer needed it. You can provide all sorts of extra methods to validate emails, post codes and such.

A quick and dirty example of a class that provides this kind of filtering can be seen below:

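<?php
// Wrapper that returns escaped values by default.
class ArrayArmor implements ArrayAccess {
    private $original;

    // Keep a private copy of the array being wrapped.
    function __construct (&$variable) {
        $this->original = $variable;
    }

    function offsetExists($offset) {
        return isset($this->original[$offset]);
    }

    // Array-style access returns the filtered value.
    function offsetGet($offset) {
        return addslashes($this->original[$offset]);
    }

    // Writes and unsets are deliberately ignored in this example.
    function offsetSet($offset, $value) {
    }

    function offsetUnset($offset) {
    }

    // Explicit access to the unfiltered value.
    function getRaw($offset) {
        return $this->original[$offset];
    }
}
?>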

So that’s it, a simple method that is very easy to put into existing code. This is clearly not a full example as addslashes() is hardly the be-all and end-all of input protection, but if you build on this you can get a very easy to use and flexible input filter that is safe by default.
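The idea of extra validation methods for emails and such could look something like the sketch below. It is a self-contained variant rather than the library itself: the class name ArrayArmorValidated and the getEmail() and getInt() methods are hypothetical, and filter_var() assumes the filter extension that ships with PHP 5.2 and later.

```php
<?php
// Sketch only: class and method names here are made up for illustration.
class ArrayArmorValidated implements ArrayAccess
{
    private $original;

    public function __construct(array $variable)
    {
        $this->original = $variable;
    }

    // The attribute lines below silence PHP 8.1+ return type notices;
    // older PHP versions treat them as ordinary "#" comments.
    #[\ReturnTypeWillChange]
    public function offsetExists($offset)
    {
        return isset($this->original[$offset]);
    }

    // Array access stays safe by default: values come back escaped.
    #[\ReturnTypeWillChange]
    public function offsetGet($offset)
    {
        return isset($this->original[$offset])
            ? addslashes((string) $this->original[$offset])
            : null;
    }

    // Writes and unsets are ignored, as in the example class.
    #[\ReturnTypeWillChange]
    public function offsetSet($offset, $value)
    {
    }

    #[\ReturnTypeWillChange]
    public function offsetUnset($offset)
    {
    }

    public function getRaw($offset)
    {
        return isset($this->original[$offset]) ? $this->original[$offset] : null;
    }

    // Returns the value only if it is a syntactically valid email, else null.
    public function getEmail($offset)
    {
        $value = filter_var($this->getRaw($offset), FILTER_VALIDATE_EMAIL);
        return $value === false ? null : $value;
    }

    // Returns the value as an integer, or null if it is not one.
    public function getInt($offset)
    {
        $value = filter_var($this->getRaw($offset), FILTER_VALIDATE_INT);
        return $value === false ? null : $value;
    }
}

$input = new ArrayArmorValidated(array(
    'name'  => "O'Brien",
    'email' => 'user@example.com',
    'age'   => '42',
));

echo $input['name'], "\n";            // O\'Brien
echo $input->getEmail('email'), "\n"; // user@example.com
echo $input->getInt('age'), "\n";     // 42
```

Each typed getter works from the raw value, so destructive filtering in one method never affects what another method sees.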

Nasty PHP Authentication Handling

by R.I. Pienaar | Mar 18, 2008 | Code, Front Page

Sometimes you come across things that just make you wonder what is going on in people’s minds.

For years everyone who wrote applications compatible with the standard HTTP Authentication method has used the REMOTE_USER server variable, as set by Apache, to check the username that was logged in by the webserver. This has worked well for everyone; CGIs and all would just grab it there and everyone would be happy.

Along comes PHP and they make a great big mess of it. PHP suggests that we use $_SERVER['PHP_AUTH_USER'] instead, and they give some good reasons for this too, except they have severely crippled this for all but Basic and Digest authentication, as the following code from main/main.c shows:

        if (auth && auth[0] != '\0' && strncmp(auth, "Basic ", 6) == 0) {
                char *pass;
                char *user;

                user = php_base64_decode(auth + 6, strlen(auth) - 6, NULL);
                if (user) {
                        pass = strchr(user, ':');
                        if (pass) {
                                *pass++ = '\0';
                                SG(request_info).auth_user = user;
                                SG(request_info).auth_password = estrdup(pass);
                                ret = 0;
                        } else {
                                efree(user);
                        }
                }
        }

As you can see above, they only import the user and pass from Apache if the AuthType is Basic, which makes no sense at all. Why not just check with Apache and, if it set the username, import it? Surely Apache knows if a user has authenticated? Ditto for the password. It is so broken in fact that PHP in CGI mode also doesn’t work, since those headers don’t get set for that either. Countless comments and nasty hacks can be found in the PHP user contributed notes about this, but it is all just silliness.

The reason this is annoying me is that I have written a Single Sign On system in PHP. You can host an identity server on any domain and hook any site in any other domain into the SSO system; it’s a bit like TypeKey.

Of course it’s nice to have an easy to use SSO system in PHP, but what is the point if you can’t make legacy apps like Nagios, Cacti, RT etc. play along with the SSO? So to solve this I extended Apache::AuthCookie with a new mod_perl module that plugs into Apache and does authentication using my SSO and a small bit of glue that you put on your RT/Cacti/Nagios box.

All’s great, I have SSO to Nagios, RT and countless other things working flawlessly, except of course Cacti. Because it’s written along the lines of the PHP manual it uses PHP_AUTH_USER instead of REMOTE_USER, and so my new fancy AuthType in Apache does not work with Cacti. As it turns out it’s a quick two-line fix in the Cacti code, but you would think PHP would be a bit more generic in this regard. As it stands now I think a lot of people who want to do SSO using hardware tokens and such have issues with PHP being silly.
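A fallback of this kind might look like the sketch below. It is not the actual Cacti patch: authenticated_user() is a hypothetical helper that simply prefers PHP_AUTH_USER and falls back to the REMOTE_USER value set by the web server.

```php
<?php
// Hypothetical helper, not taken from Cacti: return the logged-in username
// from whichever server variable is available, or null if neither is set.
function authenticated_user(array $server)
{
    foreach (array('PHP_AUTH_USER', 'REMOTE_USER') as $key) {
        if (isset($server[$key]) && $server[$key] !== '') {
            return $server[$key];
        }
    }
    return null;
}

// In a web request this would normally be called with $_SERVER.
$user = authenticated_user(array('REMOTE_USER' => 'rip'));
echo $user, "\n"; // rip
```

Taking the array as a parameter instead of reading $_SERVER directly keeps the helper easy to exercise outside a web server.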

Macs and MS Keyboards

by R.I. Pienaar | Mar 12, 2008 | Uncategorized

Previously I posted about the 17″ iMac that I got in January 2006. Well, I have now upgraded to a bigger Mac, this time a 24″ iMac Core 2 Duo Extreme with 2GB RAM.

I still have the 17″ and will keep it; it’s replacing the really old AMD Linux desktop on my desk. The 17″ has been getting a bit long in the tooth though, with Parallels, MS Office, and all sorts of other stuff that I have been doing on it now that I am working full time from home.

Previously I bought at the bottom of the spectrum and the machine lasted well, but I was hoping to keep it as my primary machine for at least 3 years. I guess my needs have increased though so this time I bought at the top end of the range and will upgrade it to 4GB RAM soon, just not from Apple as buying direct from Crucial will save me about 200 pounds.

What immediately annoyed me, to the point of cramps in my hands and general unhappiness, was the amazingly crap thinline keyboard that comes with the machines. I soon started looking at other options and found no 3rd party Mac keyboards, but did notice that Microsoft keyboards have a utility to configure the various additional keys etc., so I took the plunge and got a MS Natural Ergonomic 400 keyboard to replace my very old MS Office keyboard.

I am extremely pleased with this keyboard; everything works as it should. The configuration utility lets you configure every key on the keyboard and everything is mapped correctly as expected. Even the function keys like ‘new’ work by sending ‘apple key-n’ etc. right out of the box. This is the case with all the MS keyboards on the market today, so I can happily recommend any MS keyboard to Mac users.

The iMac itself is lovely and I am really happy with it. Speed wise the Core 2 Duo Extreme chip has made a huge improvement: with Parallels running Windows the machine idles at about 2% while I have Firefox, NetNewsWire, iTerm, several Terminal.app windows, Adium, Skype and all sorts of background stuff going. I really could not have asked for more from a desktop machine.

Detailed Apache Stats

by R.I. Pienaar | Mar 5, 2008 | Code, Front Page

Apache has its native mod_status status page that many people use to pull stats into tools such as Cacti and other RRDTool based stats packages. This works well but does not always provide enough detail. Questions such as these remain unanswered:

How many of my requests are GET and how many are POST?
How many 404 errors and 5xx errors do I get on my site as a whole and for script.php specifically?
What is the average response time for the whole server, and for script.php?
How many Closed, Keep Alive and Aborted connections do I have?

To answer these I wrote a script that keeps a running track of your Apache process. It has many fine grained controls that let you tune exactly what to keep stats on. I got the initial idea from an old ONLamp article titled Profiling LAMP Applications with Apache’s Blackbox Logs.

The article proposes a custom log format that provides the equivalent of an airplane’s black box, a flight recorder that records more detail per request than the usual common log formats do. I suggest you read the article for background information. The article stops short of a full data parser though, so I wrote one for a client who kindly agreed that I can open source it.

Using this and some glue in my Cacti I now have graphs showing a profile of the requests I receive for the whole site. As you are able to apply fine grained controls to select exactly what you’ll see, you could get per-server overview stats as well as details for just a specific script’s performance and statuses.

At a regular interval the script creates a file that contains the performance data, presented as variable=value pairs. I will soon provide a Cacti and Nagios plugin to parse this output to ease integration into these tools.
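Until such a plugin exists, reading the file could be sketched along these lines. The sketch assumes one variable=value pair per line, and the variable names in the sample are made up for illustration; the real names produced by the script may differ.

```php
<?php
// Parse "variable=value" lines into an associative array.
// Assumption: one pair per line; blank or malformed lines are skipped.
function parse_stats($text)
{
    $stats = array();
    foreach (preg_split('/\r?\n/', $text) as $line) {
        $line = trim($line);
        if ($line === '' || strpos($line, '=') === false) {
            continue;
        }
        // Split on the first "=" only, so values may contain "=" themselves.
        list($name, $value) = explode('=', $line, 2);
        $stats[trim($name)] = trim($value);
    }
    return $stats;
}

// Hypothetical sample data; in practice this would come from
// file_get_contents() on the file the script writes.
$sample = "total_requests=1200\nget_requests=1100\npost_requests=100\n";
$stats = parse_stats($sample);

echo $stats['total_requests'], "\n"; // 1200
```

Values are kept as strings so the caller decides how to cast them before handing them to Cacti or Nagios.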

The performance data includes values such as:

Number of requests in total
Total size of requests separated by in and out bytes
Average response time
Total processing time
Counts of connections in Close, Keep Alive and Aborted states
Counts for each valid HTTP Status code, and aggregates for 1xx, 2xx, 3xx, 4xx and 5xx
The number of GET and POST requests
And detail for each and every unique request the server serves

See the Sample Stats for a good example; the variables are pretty self explanatory. To keep the data set small and manageable 2 selectors exist: one to choose which requests to keep details for and one to choose which to keep stats for. These can be combined with standard Apache directives such as Location to provide very fine grained stats for all or a subset of your site.

You would need some glue to plug this into Cacti and Nagios. I will provide a script for this as soon as I have time to write up some docs for it.

The install guide etc. can be found on my GitHub, which also has links for downloading the script. There is also extensive Perldoc documentation in the script itself.
