In my Beyond Frameworks talk, I explained how a component-based architecture can help answer some of the important (i.e. expensive!) questions you might face when creating long-lived apps that rely on a PHP framework. In this series of blog posts, I’m going to look at how to go about creating and working with components.

I’m now going under the bonnet of our components, and looking at the different file roles that the PEAR installer expects to find when we distribute our component as a PEAR-compatible package. Although the vast majority of your components will simply be libraries of code, sometimes you might need to ship data files for them to operate on. Thanks to the PEAR installer’s data role, this is possible.

What Is A Data File?

A data file (the ‘data’ file role supported by the PEAR installer) is an arbitrary file that your code either reads from, or (in the case of images et al) serves up to other programs.

There are some important practices and limitations that you need to follow to avoid any disappointments or nasty surprises:

  • Data files should not be PHP code that you plan on executing.

    Your users expect your code to be in the standard place (which is /usr/share/php for popular Linux systems, or vendor/php/ in the per-app sandbox). Follow the rule of no surprises, and make sure you put your code where your users expect to look for it.

    For example, PEAR’s HTTP_Request2 component ships PHP code in its data folder: code to rebuild the Public Suffix List (which is, itself, a data file). The chances are that most of HTTP_Request2’s users are unaware that the data folder exists, or that it contains code they might need to use. Another example is ezComponents’ ConsoleTools, which ships a UML diagram for ezComponents in its data folder. This probably belongs in the docs folder, if it is meant to be read by developers using ConsoleTools. My local /usr/share/php/ contains many more relevant examples; yours probably does too.

    Our component skeleton offers an alternative approach: ship generate-list.php as a properly-installed command-line program, or perhaps as a drop-in command for phix.

  • Data files cannot be written to; your code should treat them as read-only.

    When your data files are installed for system-wide use, the files are owned by the root user. Unless there is an almighty security cock-up, your code will never ever actually get to run with the root user’s security privileges. If your code tries to write to these data files, it will generate a runtime error.

    But this won’t show up when you unit-test your code. So remember, and don’t write the code in the first place :)
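
If your code does need a copy of a data file that it can change, one option is to copy the file somewhere writable first and work on the copy. This is a minimal sketch of that idea; copyDataFileToWorkDir() is a hypothetical helper, not something the component skeleton provides, and the choice of working folder is left to you.

```php
<?php
// Hypothetical helper: copies a read-only data file into a writable
// working folder, and returns the path to the copy.
function copyDataFileToWorkDir(string $dataFile, string $workDir): string
{
    if (!is_readable($dataFile)) {
        throw new RuntimeException("Cannot read data file: $dataFile");
    }

    // Create the working folder if it does not exist yet.
    if (!is_dir($workDir) && !mkdir($workDir, 0700, true)) {
        throw new RuntimeException("Cannot create work folder: $workDir");
    }

    $target = $workDir . DIRECTORY_SEPARATOR . basename($dataFile);
    if (!copy($dataFile, $target)) {
        throw new RuntimeException("Cannot copy $dataFile to $target");
    }

    return $target;
}
```

The installed data file is never touched; all writes go to the copy.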

Where Do Data Files Go Inside The Component’s Structure?

If you take a look at the ComponentManagerPhpLibrary component, you’ll find the data files are inside the src/data/ folder. These are the skeleton files used for a PHP library component.

src/data/ is meant to be a folder that holds the data files that you want installed into the computer system.

Where Do Data Files Get Installed?

When you use the PEAR installer to install your component:

$ pear install phix/ComponentManagerPhpLibrary

all of the files in your component’s src/data/ folder get installed into /usr/share/php/data/<package-name> on your computer.

The PEAR installer’s behaviour here is different to both command-line scripts and PHP code; the installer creates a sub-folder with the name of your package, and then installs your data files into this sub-folder, and not into the main data_dir folder. This isn’t a problem in practice, as long as you are aware of the different behaviour here.

The data file src/data/php-library/README.md therefore ends up installed onto your computer as /usr/share/php/data/ComponentManagerPhpLibrary/php-library/README.md.
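
The mapping from source tree to installed location can be written down as a tiny function. This is only an illustration of the rule described above; installedDataPath() is a made-up name, and the data_dir value passed in is the common Linux default rather than something you should rely on.

```php
<?php
// Hypothetical helper: works out where a file from src/data/ ends up
// once the PEAR installer has installed the package.
function installedDataPath(string $dataDir, string $packageName, string $pathInsideSrcData): string
{
    // data_dir, then the package's own sub-folder, then the file's
    // path relative to src/data/
    return rtrim($dataDir, '/') . '/' . $packageName . '/' . ltrim($pathInsideSrcData, '/');
}

echo installedDataPath(
    '/usr/share/php/data',
    'ComponentManagerPhpLibrary',
    'php-library/README.md'
), PHP_EOL;
// prints /usr/share/php/data/ComponentManagerPhpLibrary/php-library/README.md
```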

There’s always the possibility that some Linux distros (and Mac OS X) will install data files into a different folder. You can see where the PEAR installer will put these files on your computer by running:

$ sudo pear config-show | grep data_dir
PEAR data directory            data_dir          /usr/share/php/data

If you want to read these data files from your PHP code, you cannot safely hard-code the final installation location into your scripts; it will vary from OS to OS, and will also be different again if your component is installed into a vendor directory. You’ll need to locate these files using a different technique.

How Do I Locate The Data Files From My PHP Code?

Take a look at the top of the LibraryComponentFolder class from the ComponentManagerPhpLibrary component:

class LibraryComponentFolder extends ComponentFolder
{
    const DATA_DIR="@@DATA_DIR@@/ComponentManagerPhpLibrary/php-library";

@@DATA_DIR@@ is a token that the PEAR installer expands, at install time, to be the fully-qualified path to the top of the computer’s data_dir. Underneath that, you need to remember to add your component’s name to the path, otherwise you’ll be scratching your head and wondering why you can’t find the data files!

(The full instructions that tell the PEAR installer to expand this token are added to your component’s package.xml file when we build the PEAR-compatible package. I’ll look at the final package.xml file in detail towards the end of this series of blog posts).

How Do I Unit Test PHP Code That Relies On Data Files?

There is one important downside to this technique; any unit tests that rely on loading data from your data directory are going to fail, because @@DATA_DIR@@ is only expanded when the PEAR installer installs your component. At the time of writing, I don’t have an easy solution for this, but leave it with me, and I’ll find a solution for this before the end of this series of blog posts.
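
In the meantime, one possible workaround is to notice when the token has not been expanded and fall back to the data folder inside the source checkout. Treat this as a sketch under assumptions, not as the approach this series settles on: resolveDataDir() is a hypothetical helper, and the checkout path you pass in depends on how your own component is laid out.

```php
<?php
// Hypothetical helper, not part of the component skeleton.
// $installedDir is the value of a constant such as DATA_DIR;
// $checkoutDir is where the same files live in a source checkout.
function resolveDataDir(string $installedDir, string $checkoutDir): string
{
    // An unexpanded token still contains the '@@' markers. We test for
    // '@@' rather than the full token name, because the installer would
    // also replace the full token if it appeared in this file.
    if (strpos($installedDir, '@@') === false) {
        // The installer has done its job; use the installed location.
        return $installedDir;
    }

    // Running from a checkout (for example, under unit test).
    return $checkoutDir;
}
```

Your code would then call resolveDataDir() with the DATA_DIR constant and the path to src/data/ in your checkout, instead of using the constant directly.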

7 Comments

  1. till says:
    April 11th, 2011 at 4:58 pm

    Glad you asked! ;-)

    [You know, we all hang out on EFNET in #pear and are always there for you.]

    Just a BTW – from what I can tell, we usually don’t do ‘@@foo@@’, but ‘@foo@’, regardless…

    The solution is simple, in a unit test to discover if e.g. you run from VCS checkout or similar, we usually do the following:

  2. till says:
    April 11th, 2011 at 5:00 pm

    The commenting on your blog is buggy, it stripped out my example code.

    Here is a gist:
    https://gist.github.com/913858

  3. KingCrunch says:
    April 11th, 2011 at 6:10 pm

    I test my packages by creating a local PEAR repository and install the package (including all dependencies) there

    pear config-create path/to/custom/library path/to/custom.pearrc
    pear -c path/to/custom.pearrc config-set devel
    pear -c path/to/custom.pearrc install package.xml

    My UnitTest-bootstrap appends the new library root to the include-path. Works fine :)

  4. Tco says:
    April 12th, 2011 at 7:58 am

    Hello Stuart!

    Could you please tell me what kind of tool do you use to generate the directory structure images?

    Thanks in advance!

  5. Brett Bieber says:
    April 12th, 2011 at 1:31 pm

    Just as you’ve outlined, the @@DATA_DIR@@ replacements can cause problems because the install is then tied to the filesystem. In Pyrus this problem has been resolved, but it does require different development practices than you’re recommending here.

    Packages that comply with the PEAR2 standards should reference their data files using dirname(__DIR__) . '/data/pear.example.com/MyPackage/datafile'; from the standard Pyrus registry layout:

    data/pear.example.com/MyPackage/datafile
    src/MyPackage.php

    This ensures your package can be installed and subsequently moved without causing problems or requiring re-installation. Keep in mind this is only for packages that use the PEAR installer version 2 (Pyrus).

  6. Stuart Herbert says:
    April 12th, 2011 at 10:03 pm

    @Tco: I created the images using OmniGraffle for OS X.

  7. Stuart Herbert says:
    April 12th, 2011 at 10:07 pm

    @Brett: from that example, it looks like you are relying on the data/ path always being calculable relative to the source file’s path?