So that I don’t forget how to do this next time around. Worked for me, your mileage may vary.

First step is to get a working install of PHP.

  1. Download PHP 5.4.latest ZIP file from the PHP Windows website
  2. Unpack the ZIP file into c:\php. You should end up with c:\php\php.exe
  3. Copy c:\php\php.ini-development to be c:\php\php.ini
  4. Edit c:\php\php.ini to suit (e.g. set date.timezone)
  5. Make sure you add c:\php to your system PATH (via Computer’s Advanced Properties -> Environment Variables)
  6. Reboot (this is Windows, after all :)

At this point, you should be able to open up a Command Prompt, and type ‘php -v’, and see the response ‘PHP v5.4.latest …’ appear as expected.
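
If you want a little more detail than ‘php -v’ gives you, a small script can confirm which php.ini was picked up and whether date.timezone is set. This is only a sketch: describePhpInstall() is a made-up helper name, not something that ships with PHP. Save it as (say) check.php and run it with ‘php check.php’.

```php
<?php
// Sanity-check a fresh PHP install from the command line.
// describePhpInstall() is a hypothetical helper, not part of PHP.
function describePhpInstall()
{
    return array(
        'version'      => PHP_VERSION,
        'is_54_or_up'  => version_compare(PHP_VERSION, '5.4.0', '>='),
        'ini_file'     => php_ini_loaded_file(),     // false if no php.ini was loaded
        'timezone_ini' => ini_get('date.timezone'),  // whatever php.ini sets, if anything
    );
}

foreach (describePhpInstall() as $name => $value) {
    echo str_pad($name, 14), var_export($value, true), PHP_EOL;
}
```

If ‘ini_file’ comes back as false, PHP didn’t find c:\php\php.ini, and step 3 is the first place to look.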

Now for PEAR itself.

  1. Open http://pear.php.net/go-pear.phar in a browser, save this file into c:\php
  2. In a Command Prompt, cd to c:\php and then run “php c:\php\go-pear.phar”
  3. At the prompt, select ‘system’. A text menu of paths will appear
  4. Fix the default path for pear.ini (option 11) to be c:\php\pear.ini
  5. Fix the default folder where it looks for php.exe to be c:\php
  6. Make sure the binaries folder (option 4) is c:\php
  7. Check all of the other options, make sure they are prefixed with c:\php
  8. Press ENTER, and you should see PEAR downloading various PEAR packages onto your system
  9. Double-click the PEAR_ENV.reg file in c:\php
  10. Reboot again to make sure PEAR_ENV registry entries have taken effect

At this point, PEAR is installed and should be available to use in your own projects, or with something like Phix.
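
A quick way to check that PHP can actually see PEAR is to ask it to find PEAR.php on the include_path. This is a sketch that assumes the installer has put its PEAR folder onto your include_path; pearIsOnIncludePath() is a made-up helper name.

```php
<?php
// pearIsOnIncludePath() is a hypothetical helper name.
function pearIsOnIncludePath()
{
    // stream_resolve_include_path() returns the full path, or false if not found
    return stream_resolve_include_path('PEAR.php') !== false;
}

echo 'include_path: ', get_include_path(), PHP_EOL;
echo 'PEAR.php found: ', pearIsOnIncludePath() ? 'yes' : 'no', PHP_EOL;
```

If the answer is ‘no’, compare the include_path that is printed against the folders you chose in the go-pear menu.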

Personal Notes

Some reminders to myself for the next time I have to do this.

  • Documentation for PHP for Windows and PEAR for Windows both seem to be out of step with current downloads. There’s currently no Windows installer for PHP available, and the PHP .ZIP file doesn’t contain the ‘go-pear.bat’ file.
  • You have to pay close attention to the default folders offered when running ‘go-pear.phar’. They appear to use the current working directory as the prefix even when installing system-wide, except for the location of pear.ini and php.exe – neither of these defaults is sane, and both must be manually changed during the install :(
  • After install, the pear command doesn’t seem to be 100% compatible with its behaviour on Linux and OS X. The -D switch didn’t work, and there may be other problems that I haven’t yet found.
  • Both reboots are required – I’m not taking the piss there – for all running Windows apps to pick up the changes.


One of the questions I’ve been asked after yesterday’s blog post about Phix’s ContractLib is why not just use PHP’s built-in assert() function? I think it’s a great question, and the best way to answer it is to take a look at the key differences between the two solutions.

Side By Side Comparison

| Feature | assert() | ContractLib |
| --- | --- | --- |
| Implementation | PHP extension written in C (ships as standard part of PHP) | PHP library written in PHP |
| Enable / disable execution | Partial (there is an overhead when disabled, but it’s low) | Partial (there is an overhead when disabled, but it’s higher) |
| Issues PHP4-style warning when tests fail | Yes (configurable) | No (throws a ContractFailedException instead) |
| Terminate PHP script when tests fail | Yes (configurable) | Only if the ContractFailedException is never caught |
| Quiet eval of test expression | Yes (configurable) | No (not required; test expressions are pure PHP code, not eval() strings) |
| Callback on failed test | Yes (configurable) | No (unwinds the stack instead by throwing ContractFailedException) |
| Throws Exception when tests fail | No (but can emulate if you write your own assert() callback method) | Yes (standard behaviour) |
| Tests are pure PHP code | No – recommended way is to pass strings into assert() to be eval()’d | Yes |
| Error report includes original value that failed the test | No | Yes |
| Support for per-test custom failure messages | No | Yes – you are required to provide one |
| Support for Old Value storage and recall | No (but can emulate by hand) | Yes |

The Differences Explained

The key difference is one of philosophy. assert() sits well with the PHP4 style of error reporting and handling, whereas ContractLib is firmly in favour of the OO world’s approach to reporting errors.

It’s a personal preference, but I think that PHP4-style errors have no place in code that has any desire to be robust. Exceptions aren’t perfect, don’t get me wrong, but their core property of unwinding the call stack in an orderly fashion makes writing robust code much easier. And they also carry a payload – information about what went wrong and why – which PHP’s assert() cannot provide to the same extent.

It’s much quicker to debug something when there’s a record of the value that failed the test. For that reason alone, I’d always prefer something like ContractLib over the built-in assert() approach.
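
To make the ‘payload’ point concrete, here is a minimal sketch of an exception that carries the value that failed the test. The class ValueCheckFailed and the helper requireValue() are names made up for this illustration; ContractLib’s own exception and methods may well look different.

```php
<?php
// Hypothetical exception class; ContractLib's own exception may look different.
class ValueCheckFailed extends Exception
{
    private $failedValue;

    public function __construct($failedValue, $message)
    {
        parent::__construct($message);
        $this->failedValue = $failedValue;
    }

    public function getFailedValue()
    {
        return $this->failedValue;
    }
}

// Hypothetical check helper: throws if $passed is false,
// keeping hold of the value that was being tested.
function requireValue($value, $passed, $message)
{
    if (!$passed) {
        throw new ValueCheckFailed($value, $message);
    }
}

try {
    $params = 'not an array';
    requireValue($params, is_array($params), '$params must be an array');
} catch (ValueCheckFailed $e) {
    // prints: $params must be an array (got: 'not an array')
    echo $e->getMessage(), ' (got: ', var_export($e->getFailedValue(), true), ')', PHP_EOL;
}
```

Whoever catches the exception gets both the message and the offending value, which is the record that a PHP4-style warning can’t give you.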

But we can’t ignore the fact that these are tests that get shipped to, and executed in, the production environment. Unlike unit tests, adopting programming by contract will slow down your PHP code in production. The question is: by how much?

What About The Performance?

I’ve done some benchmarking between the two, using the five tests listed in the final example in yesterday’s blog post. It’s a real-world example of the kind of tests that I would look to add to code to improve robustness.

Here are the results I gathered, calling the tests 10,000 times in a tight loop. The tests were run from the command line, and the times do include PHP start-up / shutdown time and the time taken to parse each test file. I assumed a best-case scenario, where the tests would always pass.

| Test Approach | Time w/ Tests Disabled | Time w/ Tests Enabled |
| --- | --- | --- |
| Tests written using assert() | 1.103s (100%) | 5.989s (543%) |
| Tests written using ContractLib | 3.055s (277%) | 3.096s (281%)

When tests are disabled, using assert() is much cheaper than using ContractLib today. That’s to be expected, as assert() is written in C. I imagine that we could get close to the same performance if ContractLib was rewritten in C as a PHP extension.

But, when tests are enabled, assert() is much slower than ContractLib. Why? Because the recommended way to use assert() is to pass the test in as a string. PHP has to convert that string into bytecode to execute, and that conversion appears to be quite expensive.
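
If you want to see the string-versus-code difference for yourself, something like the following gives a rough feel for it. This is not the benchmark I ran for the table above: it is a sketch in which eval() stands in for what assert() has to do with a string, and timeLoop() is a made-up helper. The numbers will vary from machine to machine.

```php
<?php
// timeLoop() is a hypothetical helper: calls $fn $iterations times
// and returns the elapsed wall-clock time in seconds.
function timeLoop($iterations, $fn)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$params = array(1, 2, 3);

// the test as a string, which eval() has to compile on every call
$asString = timeLoop(10000, function () use ($params) {
    return eval('return is_array($params) && count($params) > 0;');
});

// the same test as ordinary PHP code, compiled once along with the file
$asCode = timeLoop(10000, function () use ($params) {
    return is_array($params) && count($params) > 0;
});

printf("string via eval(): %.4fs\n", $asString);
printf("plain PHP code:    %.4fs\n", $asCode);
```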

Given the choice, I’d rather trade things running a little slower in production for having much faster tests when I’m writing code, and that’s why I created ContractLib. Plus I get much better information to understand why the test failed, and if I wanted to run the tests in production, I can handle their failures in a much saner way.

Final Words

In my experience, the time it takes to develop and ship code is normally more critical than how fast the code runs in production. Developer time has become a scarcer resource than CPU time.

Used intelligently, these kinds of tests in your code can help your team deliver quicker, because the code they are using and reusing is more robust first time around. Programming by contract is different to, and complements, unit testing because contract tests catch errors in using the code.

Whether you use ContractLib, assert(), or you create your own solution, you should really consider how much it is costing you when you don’t use these kinds of tests.


In my last blog post, I introduced ContractLib, a simple programming by contract library that I’ve created for PHP 5.3 onwards. And I promised some examples :)

Installing ContractLib

ContractLib is available from the Phix project’s PEAR channel. Installing it is as easy as:

$ pear channel-discover pear.phix-project.org
$ pear install -a phix/ContractLib

At the time of writing, this will install ContractLib-2.1.0. We use semantic versioning, so these examples will continue to work with all future releases of ContractLib-2.x.

Adding ContractLib To Your Project

Assuming you’re using a PSR-0 compatible autoloader, just import the Contract class into your PHP file:

use Phix_Project\ContractLib\Contract;

Adding A Pre-condition Contract To Your Method Or Function

Take a trivial method like this:

class ActionToApply
{
    public function appendNow($params)
    {
        $params[] = time();
    }
}

This method works fine … until someone passes a non-array as the parameter. At that point, your code stops working – not because your code is wrong, but because someone used it in the wrong way. This is a classic cause of buggy PHP apps. Thankfully, it’s very easy to address using ContractLib.

If we were certain that the $params parameter was always an array, then we can keep the method itself extremely simple and clean. We can ensure that by adding a pre-condition using ContractLib.

use Phix_Project\ContractLib\Contract;

class ActionToApply
{
    public function appendNow($params)
    {
        Contract::Preconditions(function() use ($params)
        {
            Contract::RequiresValue(
                $params,
                is_array($params),
                '$params must be an array'
            );
        });

        // original method code continues here
        $params[] = time();
    }
}

Now, if someone passes in a non-array, the caller will automatically get an E5xx_ContractFailedException, which makes it clear that the fault is in the caller’s code … not yours.

PHP 5.4’s upcoming support for better type-hinting is another way to catch this kind of error. But ContractLib works today with PHP 5.3 (which means you don’t have to wait to migrate to PHP 5.4), and it lets you write tests for anything, not just the checking that’s built into PHP.

This means you can make your code much more robust, by tightening up on the quality of the parameter passed into your code by other programmers. To extend our example, we might decide that an empty array is also unacceptable:

use Phix_Project\ContractLib\Contract;

class ActionToApply
{
    public function appendNow($params)
    {
        Contract::Preconditions(function() use ($params)
        {
            Contract::RequiresValue(
                $params,
                is_array($params),
                '$params must be an array'
            );
            Contract::RequiresValue(
                $params,
                count($params) > 0,
                '$params cannot be an empty array'
            );
        });

        // original method code continues here
        $params[] = time();
    }
}

The point here is that we can go way beyond type-hinting checks (important as they are) and look inside parameters to make sure they are suitable.

Here’s a real example from Phix’s CommandLineLib:

use Phix_Project\ContractLib\Contract;

class CommandLineParser
{
    // ...

    public function parseSwitches($args, $argIndex, DefinedSwitches $expectedOptions)
    {
        // catch programming errors
        Contract::Preconditions(function() use ($args, $argIndex, $expectedOptions)
        {
            Contract::RequiresValue(
                $args,
                is_array($args),
                '$args must be array'
            );
            Contract::RequiresValue(
                $args,
                count($args) > 0,
                '$args cannot be an empty array'
            );

            Contract::RequiresValue(
                $argIndex,
                is_integer($argIndex),
                '$argIndex must be an integer'
            );
            Contract::RequiresValue(
                $argIndex,
                count($args) >= $argIndex,
                '$argIndex cannot be more than +1 beyond the end of $args'
            );

            Contract::RequiresValue(
                $expectedOptions,
                count($expectedOptions->getSwitches()) > 0,
                '$expectedOptions must have some switches defined'
            );
        });

        // method's code follows on here ...
    }
}

In this real-life code, we start off by checking for basic errors first (by making sure we’re looking at the right type for each parameter), and then we follow up with more specific tests that ensure we have data that we’re happy to work with. We’ve done these tests at the start of the method, so that the rest of the method isn’t cluttered with error checking, which makes our code much cleaner than it might otherwise be. And, because all the tests are in one really easy to spot block, anyone reading your code can immediately see what they have to do to meet the contract you’ve created.

Because these tests are just plain-old PHP code, and don’t rely on annotations or any other such nonsense, the contracts you create and enforce are limited only by your choices.

But Aren’t All Those Tests Slow?

They are. PHP’s getting better and better at this, but function/method calls have always been painfully slow in PHP. I’m afraid that if you want robust code, you can’t have it for free. (You can in C, but that’s a topic to discuss over a decent whiskey at a conference).

I’ve done two key things with ContractLib to keep the runtime cost down:

  1. Contract::Preconditions() accepts a lambda function as its parameter. Your contract’s tests go inside this lambda function, and Contract::Preconditions() only calls the lambda function if contracts are enabled.
  2. By default, ContractLib does not enable contracts. You have to choose to do so by calling Contract::EnforceWrappedContracts().

This keeps the overhead down to just one method call (to Contract::Preconditions()) when contracts are not enabled. It isn’t as good as having no overhead, but it’s cheaper than the developer time lost trying to track down bugs in code that always assumes the caller can be trusted to do the right thing every time.
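
To show why the lambda keeps the cost down, here is a stripped-down sketch of the idea. MiniContract is a stand-in written purely for this illustration; it is not ContractLib’s real code, and its method names and exception type are assumptions.

```php
<?php
// MiniContract is a hypothetical stand-in, NOT ContractLib's real code.
class MiniContract
{
    private static $enabled = false;   // off by default, as described above

    public static function enforce()
    {
        self::$enabled = true;
    }

    public static function preconditions($tests)
    {
        // when disabled, the only cost is this call and one comparison;
        // none of the tests inside the lambda are evaluated
        if (self::$enabled) {
            $tests();
        }
    }

    public static function requiresValue($value, $passed, $message)
    {
        if (!$passed) {
            throw new InvalidArgumentException(
                $message . ' (got ' . gettype($value) . ')'
            );
        }
    }
}

function appendNow($params)
{
    MiniContract::preconditions(function () use ($params) {
        MiniContract::requiresValue($params, is_array($params), '$params must be an array');
    });

    $params[] = time();
    return $params;
}

// contracts are off: the lambda is created but never called
echo count(appendNow(array('a'))), PHP_EOL;

// contracts are on: the bad call is caught before the method's own code runs
MiniContract::enforce();
try {
    appendNow('oops');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

The tests live inside the lambda, so expressions like is_array($params) are only evaluated when contracts are switched on.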

Any Questions?

I hope these examples have given you an idea on how to get started with ContractLib. If you have any questions or suggestions, please let me know, and I’ll do my best to answer them.


Introducing ContractLib

Posted by Stuart Herbert on January 11th, 2012 in 2 - Intermediate.

ContractLib is a simple-to-use PHP component for easily enforcing programming contracts throughout your PHP components. These programming contracts can go a long way to helping you, and the users of your components, develop more robust code.

ContractLib is loosely inspired by Microsoft Research’s work on the Code Contracts Library for .NET.

What Are Programming Contracts?

Programming contracts are tests around functions and methods, and they are normally used:

  1. to catch any ‘bad’ data that has been passed into the function or method from the caller, and
  2. to catch any ‘bad’ data generated by the function or method before it can be returned to the caller

These are pre-condition and post-condition tests, and they are tests that either pass or fail.

Why Have Programming Contracts?

Two reasons: code robustness and time saved.

Programming contracts catch errors early, and (unlike unit tests) they don’t just catch your errors, they catch errors made by programmers who reuse your code.

  • Catching errors early

    There is a class of bugs best described as garbage in, garbage out. The “garbage in” is data that is of the wrong type, or out of range, or missing (think empty arrays, empty strings, nulls). Often, the garbage being fed in is also garbage that has come out of a buggy function or method.

    Simple pre-condition checks at the start of your functions and methods quickly catch garbage data before it can propagate through your code. The more functions and methods contain pre-condition checks, the easier it becomes to catch garbage data closer to where it is being created. This allows you to spend less time tracking down the original source of a bug, and more time writing new code.

    These pre-conditions also greatly increase the chances of bugs in your code being caught in development, especially when combined with a healthy amount of unit testing.

    You can also add post-conditions at the end of your functions and methods, to make sure that you’re never returning any garbage back out of your function or method. There’s a lot of overlap between post-conditions and unit tests; the main difference is that your post-conditions will run 100% of the time, whereas your unit tests will only run when you run them and against the (often extremely limited) data you use in your unit tests.

  • Catching errors when code is reused

    Unit tests are great, and a very important part of creating high-quality code. But they’re your tests. They’re written to prove that your code does what you think it does.

    Unit tests don’t prove that someone else is reusing your code the way you meant them to.

    And neither do integration tests, because if someone is reusing your code, integration tests are their tests. Integration tests are tests to prove that they have glued their code on top of your code in a way that they are happy with.

    Simple pre-condition checks at the start of your functions and methods are your best opportunity to test how someone else is reusing your code, and to tell them if they’re getting it wrong.

Programming contracts are about building trust (just like unit tests). Code that you can trust is normally code that is quicker to work with. Contracts are really quick to write (normally far quicker than unit tests), and they can make it really quick to track down the origin of bugs in your code.

Don’t Programming Contracts Make Code Stupidly Strict?

An appropriate amount of strictness is a requirement of all high-quality code. The trick is knowing what to be strict about. Not strict enough, and you let in shitty data that causes your code to fail or be insecure. Too strict, and people will think that your code is too much trouble to work with.

As a general rule, pre-conditions should check for:

  • data that’s in an incorrect format
  • data that’s out of range
  • data that’s missing

Post-conditions should check the same things. They can also be used to check for data that should have been changed, but hasn’t been changed.
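
As a rough illustration of both kinds of check, here is a sketch written by hand in plain PHP, without ContractLib. The function appendTimestamp() is made up for this example; the post-condition remembers the old size of the array so that it can confirm the data really did change.

```php
<?php
// Hypothetical example; the checks are written by hand rather than with ContractLib.
function appendTimestamp($list)
{
    // pre-conditions: reject data that is the wrong type or missing
    if (!is_array($list)) {
        throw new InvalidArgumentException('$list must be an array');
    }
    if (count($list) === 0) {
        throw new InvalidArgumentException('$list cannot be an empty array');
    }

    $oldCount = count($list);      // remember the old value for the post-condition

    $list[] = time();

    // post-condition: the data should have changed by exactly one entry
    if (count($list) !== $oldCount + 1) {
        throw new LogicException('appendTimestamp() failed to grow $list');
    }

    return $list;
}

$result = appendTimestamp(array('first'));
echo count($result), PHP_EOL;      // prints 2
```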

Aren’t Programming Contracts Too Old-Fashioned For PHP?

The concept has been around for decades. As a C programmer, I first learned about programming contracts in the early 90s, when I was writing code that had to run for months at a time with zero downtime. We were debugging and improving code dating back to the 1980s, and introducing programming contracts played an important role in getting to the bottom of many of the bugs that users reported.

PHP (like other modern languages such as Java, Ruby and Scala) is fundamentally similar to older languages like C, although you may not realise that this is the case. It’s the same fundamental paradigm – data is passed into blocks of software, and blocks of software may also pass data out too.

The advantage we have with PHP is that our programming contracts don’t have to be as lengthy as they would for a C program, because PHP itself can enforce type checks through type hinting, and we don’t have to worry about low-level details like proper handling of null-terminated strings.

Examples

You can take a look at ContractLib’s unit tests on GitHub.

I’ll post some detailed examples in my next blog post.


If you’re building a web-based app, it’s always a good idea to build some instrumentation into your app. That way, you can see how your app is behaving, and how your users are interacting with your app over time.

I’m sure everyone who reads my blog is familiar with Google Analytics for tracking page hits. But what about what’s happening inside your app? Right now? Do you know?

Graphite is one way to graph the stats that you add to your app. Combine it with (say) statsd from Etsy, and adding any stats you want is easy. (Read this blog post from Etsy if you want to learn more about measuring your app, and how to add support for Statsd to your app).

Normally, you’ll probably be interested in looking at graphs that show your stats over a period of hours or days (for trend analysis), and both Graphite and Statsd are sensibly tuned for that. But what if you want to see what’s happening now, in real time? I couldn’t find any clear instructions on how to do that elsewhere, so here’s my take on how to do it. I’m assuming you’re already familiar with installing both Statsd and Graphite, and that you’ve already had both up and running successfully with their default configurations.

Making Statsd Forward Data In Real Time

By default, Etsy’s Statsd collects the data sent from your apps, and forwards it on to Graphite’s data collector (known as Carbon) every 10 seconds. For real time, we need the data forwarded on every second. To do that, edit your Statsd config, adding the ‘flushInterval’ value:

{
  graphitePort: 2003
, graphiteHost: "localhost"
, port: 8123
, flushInterval: 1000
}

A value of 1000 tells Statsd to forward data on every second.
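
To have something to watch once the rest of this is set up, you can fire a counter at Statsd from PHP. This is a minimal sketch: the helper names and the ‘myapp.logins’ stat are made up, and the port is the one from the config above. Statsd listens on UDP and expects counters in the form ‘name:value|c’.

```php
<?php
// statsdCounterPacket() and sendToStatsd() are hypothetical helper names.
function statsdCounterPacket($name, $delta = 1)
{
    // Statsd's counter format is "<name>:<value>|c"
    return $name . ':' . (int) $delta . '|c';
}

function sendToStatsd($packet, $host = 'localhost', $port = 8123)
{
    // UDP is fire-and-forget: opening the socket does not prove
    // that anything is listening at the other end
    $socket = @fsockopen('udp://' . $host, $port, $errno, $errstr, 1);
    if ($socket === false) {
        return false;
    }
    $written = fwrite($socket, $packet);
    fclose($socket);
    return $written !== false;
}

// 'myapp.logins' is an example stat name
sendToStatsd(statsdCounterPacket('myapp.logins'));
```

Because it’s UDP, your app won’t slow down or fall over if Statsd isn’t running; the packet simply goes nowhere.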

Making Graphite Store Data At One Second Resolution

Graphite’s default / sample configuration tells it to store incoming data at 60-second resolution; that allows us to look at the total stats recorded minute by minute, but we can’t drill down to see what happens second by second. To do that, we have to tell Graphite to store the data on a second-by-second basis.

Edit /opt/graphite/conf/storage-schemas.conf, and add the following clause:

[real_time]
priority = 200
pattern = ^stats.*
retentions = 1:34560000

This tells Graphite that we want all the data received from Statsd to be kept on a second-by-second basis for 400 days … plenty long enough for any sort of comparison you might need to do.
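
The retention value is ‘seconds per data point : number of data points’, so the arithmetic is easy to check. The sketch below does that, and also estimates the disk space per stat on the assumption that Whisper stores each data point in 12 bytes; that figure is my understanding of the file format rather than something taken from this setup. Both helper names are made up.

```php
<?php
// retentionDays() and approxWhisperBytes() are hypothetical helpers.
function retentionDays($retention)
{
    list($secondsPerPoint, $points) = explode(':', $retention);
    return ((int) $secondsPerPoint * (int) $points) / 86400;   // 86400 seconds in a day
}

function approxWhisperBytes($retention, $bytesPerPoint = 12)
{
    // ASSUMPTION: 12 bytes per stored data point, ignoring the file header
    list(, $points) = explode(':', $retention);
    return (int) $points * $bytesPerPoint;
}

echo retentionDays('1:34560000'), " days\n";   // prints "400 days"
printf("%.1f MiB per stat\n", approxWhisperBytes('1:34560000') / 1048576);
```

If that estimate is in the right area, each stat costs a little under 400 MiB on disk, so it’s worth being selective about the pattern if you track a lot of stats.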

I found that, to get Graphite to start using this new storage definition, I had to go and delete the existing data files by doing:

$ rm -rf /opt/graphite/storage/whisper/stats*

Getting Graphite To Draw Real-Time Graphs

Now all we need to do is to get Graphite showing you all the collected data in real-time. By default, Graphite will happily plot the data onto a graph, but will only generate an updated graph every 60 seconds. That’s perfect for an ops team looking for trends over hours, but it isn’t real-time.

If you’re using Memcache with Graphite, you’ll need to add this to your /opt/graphite/webapp/graphite/local_settings.py file, to tell Graphite to only cache data in Memcache for 1 second:

MEMCACHE_DURATION = 1

Is it worth caching the data at all at this resolution? Honestly, I don’t know. I guess that depends on how many people need to watch the data in real-time. Ideally, Graphite would set the Memcache timeout dynamically based on the data stored in the particular key, but for now, you need to either stop using Memcache, or set the cache duration to 1 second.

This now gives you graphs with 1-second resolution … now we just need to change the Graphite web-app’s auto-refresh feature, which only reloads the graph every 60 seconds by default, to load a new graph every second. To change this, we have to edit some of the app’s JavaScript code.

  • Open the file /opt/graphite/webapp/content/js/composer_widgets.js, and locate the function ‘toggleAutoRefresh’.
  • Locate the ‘interval’ variable inside that function. Change its value from ‘60’ to ‘1’.
  • Save the file, then refresh your web browser page.

Et voila. If you switch on auto refresh, you should now be able to see your app’s data being plotted second by second, giving you a real-time view of what your app is doing.
