As if the title doesn’t give away the whole plot, I’m currently working on a project that uses PHP–gasp! No, I’ve not abandoned ColdFusion, and too be completely honest, I’m kind of a PHP noob.

Sure, I can do a fair amount of things in PHP…you know, querying databases, echoing content, that kind of stuff. Where I really get hurt, though, is when things start to get complex.

NOTE: If you don’t particularly feel like reading and just want to see some code, be sure to check out my GitHub repo for this.

Some Context

For example, I recently needed to build out an ExtJS View, and I wanted to have some pretty heavily-nested data. My ideal outcome would print something to the page like so:

Main Group 1
   Category 1
      Name 1
      Name 2
         Fav Color 1
         Fav Color 2
Main Group 2
   Category 1
      Name 1
      Name 2
Main Group 3
   Category 2
      Name 1
   Category 3
      Name 1

And so on. In other words, each step in the “tree” can have multiple children, which can themselves have multiple children, ad infinitum.

Now in ExtJS, creating a View is dead simple. You simply have to pass the View a JSON structure wherein each level is an array of objects. My problem, however, was not with the ExtJS side. Rather, it was turning a MySQL query result set into a structure that could be serialized into the proper format and sent off to ExtJS.

An Initial Attempt

Unfortunately, there is nothing in PHP like ColdFusion’s amazing “group” attribute in the <cfoutput> tag. So not knowing what to do, I headed out to The Google, and asked it to tell me about “creating nested json” and “nested data” and the like. For the most part, the answers I found suggested doing something like this:

  • Set a “flag” for the group (this will be updated when you move to a different group)
  • Loop over record set. When you reach a new “group” (e.g., group value != flag), reset the group flag, create a new array for the group data, nest child data, etc.

While this approach certainly works (I used it in my first several *attempts*), it gets hairy really quickly. Sure, it’s not a huge deal when you have a simple “group” -> “item” relationship. However, when you have to start nesting data in more complex fashions, it becomes incredibly unwieldy. Not only is it unsustainable as you have to replicate the functionality for EVERY change in data, but it’s also a nightmare because there are only so many “flags” and “nested flags” that the human brain can keep track of at a single time.

A Better Approach…But Still Not Great

I quickly abandoned this process. it was a headache to write, and the thought of having to do it N times for each query was maddening. So I stepped back a bit, and tried to think a bit more critically about my approach. One big problem, in my estimation, is that I was getting tripped up by the conversion from the flat result set from MySQL to the nested-array model I needed. The flags became a mess really quickly, because they were each trying to keep track of their place within the whole data set, instead of a “parent” only being concerned about whatever children it might have.

With this in mind, my second approach took the form of trying to iteratively reduce the data being dealt with at any particular “level” of the process. My first step was to create a quick function to transform the result set from MySQL into a super-dumb array.

function convertquerytoarray($query) {
   while($row = mysql_fetch_assoc($query)) {
      $rows[] = $row;
   }
   return $rows;
}

With this done, I proceeded to leverage PHP’s foreach() to loop over each record:

$gidx = '';
$groups = array();
$allrows = convertquerytoarray($sqlresult);
foreach($allrows as $row) {
   if($row['group'] != $gidx) {
      $carr = filterarraybykey($allrows,'group',$row['group']);
      ...
   }
}

Now you’ll notice that in the midst of the “group” foreach() loop, I create a new array ($carr) and assign to it the value returned from the filterarraybykey() method. Here’s what that looks like:

function filterarraybykey($array,$key,$val) {
   return array_filter($array,function($row) use($key,$val) {
      return $val==$row[$key];
   });
}

As I mentioned, the idea here is that for each “parent”, evaluating the children (and children’s children) would seem to be much easier if we limit down the data being evaluated within that relationship to only that data which is contained within the relationship.

So, let’s say that we have 3 groups: “Cats, Dogs, Hamsters”. When the “Cats” group is being evaluated, the filterarraybykey() method will take the full result set ($allrows), and return an array that has only the values that have a “group” value of “Cats”…maybe 5 records instead of the whole 20 of the full result set.

Now that I’ve limited the result set, I can continue the pattern farther down, applying the same technique to “categories” and “names”:

$gidx = '';
$groups = array();
$allrows = convertquerytoarray($sqlresult);
foreach($allrows as $row) {
   if($row['group'] != $gidx) {
      $carr = filterarraybykey($allrows,'group',$row['group']);
      $cidx = '';
      $categories = array();
      foreach($carr as $row) {
         if($row['category'] != $cidx) {
            $narr = filterarraybykey($carr,'category',$row['category']);
            $nidx = '';
            $names = array();
            foreach($narr as $row) {
               if($row['name'] != $nidx) {
                  $name = array('name'=>row['name'];
                  array_push($names,$name);
                  $nidx = $row['name'];
               }
            }
            $category = array('category'=>$row['category'],'names'=>$names);
            array_push($categories,$category);
            $cidx = $row['category'];
          }
      };
      $group = array('group'=>$row['group'],'categories'=>$categories);
      array_push($groups,$group);
      $idx = $row['group'];
   }
}

Alright, this feels a little better. Even though it still has a bunch of “flags,” the logic is pretty easy to follow. Since the result set being evaluated at each “level” of the nesting is constantly getting whittled, it eases the burden of trying to figure out where the particular record exists within the WHOLE data set. In this approach, you only have to worry about the current parent and siblings. Plus, since we have nested foreach() loops, each “level” is more or less self-contained.

Although this is a bit better, it still suffers from many of the same issues as the first. Namely, every time this needs to be changed, or every time I need something like this elsewhere (perhaps a grouping of only groups and names, with no categories), I have to recreate this nesting nightmare. If at any point along the way my flags or row pointers get off (via typo, omission, etc.), things will break, and frustration will ensue.

A Final (Hopefully Better) Solution

So while the last approach is still kind of kludgy, I think it’s moving in the right direction. Since each “level” of nesting represents a thin, relationally isolated slice of the total data set, it’s easier to drop things in and out of it to get the results you want. Therefore, with this as a basis, my last shot at a solution was an attempt to abstract the philosophy of this approach. I’ll paste the whole example, and then explain what’s going on:

function makenestedarray($array,$model) {
   // check if array is a mysql result; if so, convert it to a flat array
   $array = gettype($array)=='resource' ? convertquerytoarray($array) : $array;
   // set blank index for tracking which rows we've already added
   $idx = '';
   // blank array to store all the data
   $master = array();
   // retrieve value of "key" for the current iteratin of the data model
   $key = $model['key'];
   // loop over each row in the passed data array
   foreach($array as $row) {
      // if the value is not equal to the index value, evaluate row
      if($row[$key] != $idx) {
         // retrieve data for row based on model definition
         $item = makerowdata($row,$model);
         // if current level has "children" defined, evaluate each child
         if(isset($model['children'])) {
            // loop over array of children
            foreach($model['children'] as $child=>$val) {
            // whittle it down based on the child's data key
               $childarr = filterarraybykey($array,$key,$row[$key]);
               // recursively call makenestedarray()
               $children = makenestedarray($childarr,$val);
               // add nested data to parent array
               $item[$child]=$children;
            }
         }
         // done with all the looping; add current data item to master array
         array_push($master,$item);
         // update the index and kip rows if they have the same index
         $idx = $row[$key];
      }
   }
   // yay! all finished; return the completed data object
   return $master;
}

function makerowdata($row,$struct) {
   $columns = array();
   $keys = array_keys($struct);
   foreach($keys as $pos=>$key) {
      if($key != 'key' && $key != 'children') {
         $columns[$struct[$key]] = $row[$key];
      }
   }
   return $columns;
}

$model = array(
   'key'=>'group',
   'group'=>'group',
   'groupid'=>'id',
   'children'=>array(
      'categories'=>array(
      'key'=>'category',
      'category'=>'category',
      'children'=>array(
         'names'=>array(
            'key'=>'names',
            'nameid'=>'id',
            'name'=>'name'
         )
      )
   )
);
$data = makenestedarray($sqlresult,$model);
$returnjson = array("groups" => $data);
// serialize this mug, and return the string
return json_encode($returnjson);

My solution involves creating a recursive method to do what the last approach did–whittle down the data set being evaluated as the “level” increases. The big difference, of course, is that the recursive method couldn’t care less what the data being thrown at it is, and doesn’t particularly care how many levels of nesting there might be.

And this is what I like about this approach. Instead of manually looping and nesting data, all I have to do is define a “model” of the data in the form I’d like it be prepared, and the recursive makenestedarray() will do the rest.

This has a couple big benefits.

First, if I need to create various nested data arrays for ExtJS of different configurations (and I definitely do), I no longer have to worry about the headache of creating the nested nightmares of the first two “solutions.” I simply plot out exactly the data model that I want to return, and the method takes care of it for me.

And second, the “model” provides a nice visual verification of how the data I’m preparing is going to look when it’s returned as JSON. And if I need to make a quick tweak (like adding an additional data column to be returned at some level within this structure), it’s easy to add as I only need to update the model, and not the mechanism creating the ultimate output.

Wrapping Up

So what are your thoughts? This is the result of a lot of pain, headaches, and trial and error. Since I’m a total PHP noob, I acknowledge that it’s entirely possible (if not probable) that there are much cleaner, easier, and elegant solutions to this seemingly routine need. However, I wanted to write this up, if for nothing else than for my own benefit of reminiscing about the process I went through to get from absolute frustration to a sustainable solution that I feel pretty happy about.