XML string to PHP array

UPDATE: the code is now on GitHub.

One common need when working in PHP is a way to convert an XML document into a serializable array. If you have ever tried to serialize() and then unserialize() a SimpleXML or DOMDocument object, you know what I’m talking about.
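
To make the problem concrete, here is a minimal sketch (the tiny document and the variable names are made up for illustration). PHP refuses to serialize a SimpleXMLElement and throws an exception, while a plain array of strings survives the round trip:

```php
<?php
// Hypothetical one-element document, only for illustration.
$xml = simplexml_load_string('<tv><show name="Family Guy"/></tv>');

$failed = false;
try {
    serialize($xml);
} catch (Exception $e) {
    // SimpleXMLElement does not allow serialization at all.
    $failed = true;
    echo $e->getMessage(), "\n";
}

// A plain array of strings comes back unchanged.
$plain = array('show' => array('@attributes' => array('name' => 'Family Guy')));
$copy = unserialize(serialize($plain));

var_dump($failed, $copy === $plain);
```

That is why the goal in the rest of this post is an array containing nothing but arrays and strings.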

Assume the following XML snippet:

<tv>
  <show name="Family Guy">
    <dog>Brian</dog>
    <kid>Chris</kid>
    <kid>Meg</kid>
  </show>
</tv>

There’s a quick and dirty way to convert such a document to an array, using type casting and the JSON functions to ensure there are no exotic values that would cause problems when unserializing:

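<?php
  // $s holds the XML string; the JSON round trip turns every SimpleXMLElement into plain arrays and strings
  $a = json_decode(json_encode((array) simplexml_load_string($s)), true);
?>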

Here is the result for our sample XML, e.g. if we print_r($a):

Array
(
    [show] => Array
        (
            [@attributes] => Array
                (
                    [name] => Family Guy
                )
            [dog] => Brian
            [kid] => Array
                (
                    [0] => Chris
                    [1] => Meg
                )
        )
)

Pretty nifty, eh? But maybe we want to embed some HTML tags or something crazy along those lines. Then we need a CDATA node…

<tv>
  <show name="Family Guy">
    <dog>Brian</dog>
    <kid>Chris</kid>
    <kid>Meg</kid>
    <kid><![CDATA[<em>Stewie</em>]]></kid>
  </show>
</tv>

The snippet of XML above would yield the following:

Array
(
    [show] => Array
        (
            [@attributes] => Array
                (
                    [name] => Family Guy
                )
            [dog] => Brian
            [kid] => Array
                (
                    [0] => Chris
                    [1] => Meg
                    [2] => Array
                        (
                        )
                )
        )
)

That’s not very useful. We got in trouble because the CDATA node, a SimpleXMLElement, is being cast to an array instead of a string. To handle this case while still keeping the nice @attributes notation, we need a slightly more verbose conversion function. Here is my version, hereby released under a do-whatever-but-dont-sue-me license.

<?php
/**
 * convert xml string to php array - useful to get a serializable value
 *
 * @param string $xmlstr
 * @return array
 * @author Adrien aka Gaarf
 */
function xmlstr_to_array($xmlstr) {
  $doc = new DOMDocument();
  $doc->loadXML($xmlstr);
  return domnode_to_array($doc->documentElement);
}

function domnode_to_array($node) {
  $output = array();
  switch ($node->nodeType) {
    case XML_CDATA_SECTION_NODE:
    case XML_TEXT_NODE:
      // text and CDATA nodes become plain (trimmed) strings
      $output = trim($node->textContent);
      break;
    case XML_ELEMENT_NODE:
      for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++) {
        $child = $node->childNodes->item($i);
        $v = domnode_to_array($child);
        if (isset($child->tagName)) {
          // child element: collect its value under its tag name
          $t = $child->tagName;
          if (!isset($output[$t])) {
            $output[$t] = array();
          }
          $output[$t][] = $v;
        }
        elseif ($v) {
          // non-empty text or CDATA: the element's value is that string
          $output = (string) $v;
        }
      }
      if (is_array($output)) {
        if ($node->attributes->length) {
          $a = array();
          foreach ($node->attributes as $attrName => $attrNode) {
            $a[$attrName] = (string) $attrNode->value;
          }
          $output['@attributes'] = $a;
        }
        // unwrap tags that occurred only once
        foreach ($output as $t => $v) {
          if (is_array($v) && count($v) == 1 && $t != '@attributes') {
            $output[$t] = $v[0];
          }
        }
      }
      break;
  }
  return $output;
}
?>
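
The first two cases of the switch are what rescue the CDATA content: DOM exposes a CDATA section as its own node type, and textContent returns its raw text. A minimal sketch of just that behaviour, using a one-element document made up for illustration:

```php
<?php
// Hypothetical one-element document, reduced from the Stewie snippet.
$doc = new DOMDocument();
$doc->loadXML('<kid><![CDATA[<em>Stewie</em>]]></kid>');

// The only child of <kid> is the CDATA section itself.
$child = $doc->documentElement->firstChild;

$isCdata = ($child->nodeType === XML_CDATA_SECTION_NODE);
$text = trim($child->textContent);

var_dump($isCdata);
echo $text, "\n";
```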

And the result of xmlstr_to_array() for our Stewie snippet:

Array
(
    [show] => Array
        (
            [dog] => Brian
            [kid] => Array
                (
                    [0] => Chris
                    [1] => Meg
                    [2] => <em>Stewie</em>
                )
            [@attributes] => Array
                (
                    [name] => Family Guy
                )
        )
)

Victory is mine! :D
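
As an aside that is not part of the original approach: the quick one-liner can also be made to keep CDATA text by passing the LIBXML_NOCDATA flag to simplexml_load_string(), which asks libxml to merge CDATA sections into ordinary text nodes. A minimal sketch, assuming the Stewie snippet is inlined as $s:

```php
<?php
// The Stewie snippet from above, inlined so the example is self-contained.
$s = '<tv><show name="Family Guy"><dog>Brian</dog><kid>Chris</kid>'
   . '<kid>Meg</kid><kid><![CDATA[<em>Stewie</em>]]></kid></show></tv>';

// LIBXML_NOCDATA merges CDATA sections into plain text nodes while parsing.
$xml = simplexml_load_string($s, 'SimpleXMLElement', LIBXML_NOCDATA);
$a = json_decode(json_encode((array) $xml), true);

// The result holds only arrays and strings, so it can be serialized.
$copy = unserialize(serialize($a));

print_r($a['show']['kid']);
```

This keeps the one-liner short, but the DOM-based function above gives you direct control over how each node type is converted.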

36 thoughts on “XML string to PHP array”

  1. Rasmus

    Last time someone asked me how to do this, this was my answer. The benefits here are that it handles xml namespaces nicely:


    function xmlToArray($xml,$ns=null){
     $a = array();
     for($xml->rewind(); $xml->valid(); $xml->next()) {
      $key = $xml->key();
      if(!isset($a[$key])) { $a[$key] = array(); $i=0; }
      else $i = count($a[$key]);
      $simple = true;
      foreach($xml->current()->attributes() as $k=>$v) {
       $a[$key][$i][$k]=(string)$v;
       $simple = false;
      }
      if($ns) foreach($ns as $nid=>$name) {
       foreach($xml->current()->attributes($name) as $k=>$v) {
        $a[$key][$i][$nid.':'.$k]=(string)$v;
        $simple = false;
       }
      }
      if($xml->hasChildren()) {
       if($simple) $a[$key][$i] = xmlToArray($xml->current(), $ns);
       else $a[$key][$i]['content'] = xmlToArray($xml->current(), $ns);
      } else {
       if($simple) $a[$key][$i] = strval($xml->current());
       else $a[$key][$i]['content'] = strval($xml->current());
      }
      $i++;
     }
     return $a;
    }

    $xml = new SimpleXmlIterator('./a.xml', null, true);
    $namespaces = $xml->getNamespaces(true);
    $arr = xmlToArray($xml,$namespaces);

  2. Pingback: Proofing Scribd PDF Using Mimeo Connect

  3. Pingback: Proofing a Scribd PDF Using Mimeo Connect | Mimeo Labs

  4. Marko

    Nice work. For some reason the foreach loop for attributes didn’t work in my environment, so I had to change that part of the code:


    if ($n = $node->attributes->length) {
     $a = array();
     for ($i=0; $i<$n; $i++) {
      $attribute = $node->attributes->item($i);
      $a[(string)$attribute->name] = (string)$attribute->value;
     }
     $output['@attributes'] = $a;
    }

  5. Aurelien

    That was very useful, thanks!
    I had some empty fields and they appeared as an empty array instead of an empty string, so I added this line: if(empty($v)) $v = '';

    Here:
    if(isset($child->tagName)) {
       $t = $child->tagName;
       if(!isset($output[$t])) {
          $output[$t] = array();
       }
        if(empty($v)) $v = '';
       $output[$t][] = $v;
    }

  6. Pingback: Convert XML string to PHP Array

  7. Nabeel Khan

    Thanks man! your first piece of code worked for my current situation though!

    I’ve made a post regarding this (stealing some of your code) but gave you a backlink too for further stuff! :)

  8. Ben

    Seriousf**kingly!!!!!!!!!! xmlstr_to_array() is the ONLY script that I found to work to convert XML to an array in PHP. I had searched for about two weeks and sifted through like 20 functions/classes and to be honest most were all shit.

    Look man, I have no idea why this is such a flipping major issue in PHP, but seriously your code gets a five star rating. And by the way, you should remove Rasmus’s xmlToArray() function because it just don’t work out of the box.

    I’ll repeat myself. Why is converting XML to an array so hard in PHP and why are there 1500 bunny trails to broken code to accomplish this (what should apparently be a) simple task? Many of them are written in such a way that they quickly exhaust server memory… yours seems to hold up very well under a severe XML beating.

    Anyways, thank you very much! I wish I would have found this two years ago. It should be easier to find! For me, now it is as a part of my permanent library, bundled with Curl as a viable SOAP tool.

    Ben


    // My real world PHP Curl SOAP solution utilizing xmlstr_to_array() :

    function curl_soap($xml, $api_url, $soap_action, $timeout = 60) {
      $user_agent = "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";

      $ch = curl_init();
      curl_setopt($ch, CURLOPT_URL, $api_url); // set url to post to
      curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
      curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); // max time for the whole request, in seconds
      curl_setopt($ch, CURLOPT_POSTFIELDS, $xml);
      curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); // turns off certificate checks; avoid in production
      curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);

      $header = array();
      $header[] = "SOAPAction: " . $soap_action;
      $header[] = "MIME-Version: 1.0";
      $header[] = "Content-type: text/xml; charset=utf-8";

      curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
      $xml = curl_exec($ch); // run the whole process

      if (curl_errno($ch)) {
        $error = array('curl_error' => 'Connection Error : ' . curl_error($ch));
        curl_close($ch);
        return $error;
      } else {
        curl_close($ch);
        $arr = xmlstr_to_array($xml);
        return $arr;
      }
    }

  9. Jonathan

    Given: <xml><empty></empty><zero>0</zero></xml>
    Expected: array("empty" => "", "zero" => "0" )
    Returned: array("empty" => array(), "zero" => array() )

    Aurelien’s comment above fixed the empty node issue. I added a line to fix the zero issue.

    if (isset($child->tagName)) {
     $t = $child->tagName;
     if (!isset($output[$t])) {
      $output[$t] = array();
     }
     if(empty($v)) {
      $v = "";
     }
     $output[$t][] = $v;
    } elseif ($v || $v === "0") {
     $output = (string) $v;
    }

    Awesome otherwise. Thanks for the function

  10. Pingback: PHP Converting XML to array | SeekPHP.com

  11. kontur

    Thanks for posting this code. I was struggling with PHP’s own SimpleXML, trying to wrap my head around how to deal with different namespaces. Your function, however, was all sufficient to get two simple fields from the XML!

    Thanks also to Aurelien for the fix on the empty string issue. ;)

  12. Ben

    One slight mod (added code) to the github.com code repository fixes both the “empty to array” issue and the “non printing zeros issue” for me (swiping @Jonathan’s code and modifying it slightly):

    From the clean code at github.com above the line “$output[$t][] = $v;” put the following :


    if(empty($v) && $v !== '0') {
    $v = "";
    }

    This enables the zeros to show up in the array, otherwise zeros in the xml like 0 show up as empty.

    Again, fantastic code. The best XML to Array PHP function out there, hands down. This code should be polished/vetted and incorporated into the PHP core functions at a lower level, really. Would save a lot of people a lot of frustration.

  13. Tim Lee

    I was wondering if you could modify this to support multiple parent nodes in the xml? Right now if I give it this xml:

    Brian
    Chris
    Meg
    <![CDATA[Stewie]]>

    0

    you

    I get this as a result:

    Array
    (
    )

    This code works fine if the “testing” node is within the tv node, like so:

    Brian
    Chris
    Meg
    <![CDATA[Stewie]]>

    0

    you

    Is this something you could look into?

    Thanks,
    Tim

  14. Werner

    @Rasmus

    So far, ad’s code handles more XML documents than yours. Be it ugly, it’s definitely more useful IMHO.

  15. Pingback: PHP: XML to Array | nickmallare.com

  16. Pingback: A Complete List of Code I Didn’t Write : Create Awesome

  17. jossif

    Thanks a lot! I ran into the same problem and appreciate you sharing your solution.

    Cheers,
    jossif
