How To: Parse XML with PHP5

Apr 17, 2007

One of the most common things web coders run into is the need to parse some type of XML file. Many web services return their API responses in XML, so it’s handy to know how to parse these results quickly. With PHP4 you usually had to rely on a large parsing library or deal with overly complicated PHP functions, but PHP 5 ships with a great extension called SimpleXML.

When I say parsing XML, I’m talking about navigating through XML markup to return data of interest. For example, let’s take a look at the Yahoo! geocoding API. With the geocoding API you can call a specially crafted request URL with parameters, such as city and state, to receive latitude and longitude coordinates which come in handy when creating mapping mashups.

Here is an example call to the geocoding service to get the latitude and longitude of Atlanta, GA.

http://api.local.yahoo.com/MapsService/V1/geocode?appid=demo&location=atlanta+ga

This is the XML output for that call:

Yahoo! Geocoding XML

Typical SimpleXML Usage

If we only want the latitude and longitude from the XML result, we can quickly get them with SimpleXML. First we need to load the XML file, which in this case is the special Yahoo! URL.

$request_url = "http://api.local.yahoo.com/MapsService/V1/geocode?appid=demo&location=atlanta+ga";
$xml = simplexml_load_file($request_url) or die("feed not loading");

The function simplexml_load_file() loads the external XML file. If the file cannot be reached or parsed, the function returns false, and die() then stops the script and displays the error message. At this point, you can check that the file has been loaded by running:

var_dump($xml);

This displays the SimpleXMLElement structure currently loaded from the XML file into the $xml variable. If you want to view it in a more orderly fashion, wrap that var_dump line with the pre tag:

echo "<pre>";
var_dump($xml);
echo "</pre>";
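If the load fails, die() only tells you that something went wrong, not what. A minimal sketch of one way to see the underlying parser errors is below. It uses an inline, deliberately broken string (my own made-up sample, not a real Yahoo! response) and simplexml_load_string() so it runs without network access; the same libxml functions apply after simplexml_load_file().

```php
<?php
// Collect libxml parse errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Deliberately broken sample (an assumption for illustration):
// the <Result> tag is never closed.
$broken = "<ResultSet><Result></ResultSet>";

$xml = simplexml_load_string($broken);

// Grab whatever the parser complained about, then reset the error buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

if ($xml === false) {
    foreach ($errors as $error) {
        // Each entry is a LibXMLError object with line and message properties.
        echo "XML error on line " . $error->line . ": " . trim($error->message) . "\n";
    }
}
```

Comparing with === false matters here, because an element with no content can also evaluate as false in a loose check.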

Now we can traverse the XML markup and pull out the latitude and longitude. This particular XML has a simple structure, with each result residing inside the Result tag, so we can access its child elements like this:

$latitude = $xml->Result->Latitude;
$longitude = $xml->Result->Longitude;

From here you can do whatever you want with the data, most likely display it with echo.
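One thing worth knowing before you echo or compare these values: each property is still a SimpleXMLElement object, not a plain string or number. The sketch below shows the casting I would typically do. The inline XML is a simplified, hypothetical stand-in for the geocoding response (the element names other than Result, Latitude and Longitude, and all of the values, are assumptions), so it runs without calling the API.

```php
<?php
// Hypothetical, simplified stand-in for the geocoding response.
$sample = <<<XML
<ResultSet>
  <Result>
    <Latitude>33.748315</Latitude>
    <Longitude>-84.391109</Longitude>
    <City>ATLANTA</City>
  </Result>
</ResultSet>
XML;

$xml = simplexml_load_string($sample);
if ($xml === false) {
    die("sample XML did not parse");
}

// Each property is a SimpleXMLElement; cast it to get a plain PHP value.
$latitude  = (float) $xml->Result->Latitude;
$longitude = (float) $xml->Result->Longitude;
$city      = (string) $xml->Result->City;

echo $city . ": " . $latitude . ", " . $longitude . "\n";
```

Casting early avoids surprises later, for example when using a value as an array key or storing it in a session.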

Alternate SimpleXML Method

It came to my attention while working on a group computer science project that some servers, such as those at Dreamhost, don’t allow PHP functions that require URL file access (typically because the allow_url_fopen setting is turned off), which is what our previous method relied on with simplexml_load_file. In that case we can resort to cURL, a library for transferring data with URL syntax that PHP exposes through its cURL extension; it is frequently used when scraping pages with PHP and for similar tasks.

First, we will grab the XML file’s content via a cURL transfer and store it in a variable ($data in this case). This time, I’ll be using a different request URL for a different API, Yahoo! weather, which takes in a zip code or location ID.

$request_url = "http://weather.yahooapis.com/forecastrss?p=USGA0028";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $request_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
curl_close($ch);

So now that we have $data filled with the Yahoo! weather XML result, we need to feed it to SimpleXML somehow. This should get the job done:

$xml = new SimpleXMLElement($data);
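Two things can go wrong at this step: curl_exec() returns false when the transfer fails, and the SimpleXMLElement constructor throws an Exception when the string is not valid XML. Here is a rough sketch of a guard for both. The function name parse_xml_or_null() is my own hypothetical helper, not part of SimpleXML or cURL, and the sample string stands in for $data so the snippet runs without a network call.

```php
<?php
// parse_xml_or_null() is a hypothetical helper, not part of SimpleXML.
function parse_xml_or_null($data)
{
    // curl_exec() returns false when the transfer fails; also skip empty bodies.
    if ($data === false || trim($data) === '') {
        return null;
    }

    // Keep parser complaints out of the page output.
    libxml_use_internal_errors(true);

    try {
        // The constructor throws an Exception when the string is not valid XML.
        return new SimpleXMLElement($data);
    } catch (Exception $e) {
        libxml_clear_errors();
        return null;
    }
}

// Stand-in for the $data returned by curl_exec() (made-up sample).
$data = "<rss><channel><title>Sample feed</title></channel></rss>";

$xml = parse_xml_or_null($data);
if ($xml === null) {
    die("feed not loading");
}
echo $xml->channel->title . "\n";
```

In a real script you would pass the result of curl_exec() straight into the helper.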

However, the XML markup from the Yahoo! weather API is considerably more complex. Load the request URL in a browser and view source to see the XML. Let’s say I’m looking for the temperature, which is stored in the temp attribute of the yweather:condition XML tag. This time, I will use another method for traversing the structure - XPath.

$temp_f = $xml->xpath('//yweather:condition/@temp');
$temp_f = $temp_f[0];

XPath is a query language that uses path expressions to select nodes and node-sets. The initial double forward slash selects the yweather:condition node wherever it appears in the document, without having to spell out its exact location (within the channel node). A further forward slash followed by an @ sign then grabs the temp attribute of yweather:condition. Since xpath() returns an array, here with one element in it, I use the second PHP line to select that element, hence the [0].
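To try this without calling the weather API, here is a small self-contained sketch. The inline feed is a cut-down, hypothetical imitation of the weather RSS (the namespace URI and attribute values are assumptions on my part). It also shows a second route to the same attribute using children() with a namespace prefix, which is one way to reach prefixed tags such as yweather:condition without XPath.

```php
<?php
// Cut-down, hypothetical imitation of the weather feed.
$sample = <<<XML
<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0">
  <channel>
    <item>
      <title>Conditions for Atlanta, GA</title>
      <yweather:condition text="Fair" temp="72" />
    </item>
  </channel>
</rss>
XML;

$xml = new SimpleXMLElement($sample);

// Route 1: XPath. xpath() returns an array of matches, so check it first.
$matches = $xml->xpath('//yweather:condition/@temp');
$temp_xpath = !empty($matches) ? (int) (string) $matches[0] : null;

// Route 2: children() with the prefix (second argument true means
// "the first argument is a prefix, not a namespace URI").
$condition = $xml->channel->item->children('yweather', true)->condition;
$temp_children = (int) (string) $condition->attributes()->temp;

echo "XPath: " . $temp_xpath . ", children(): " . $temp_children . "\n";
```

The !empty() check is there because xpath() gives back an empty array when nothing matches, and reading [0] from that would raise a notice.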

XPath (XML Path Language) is a powerful way to query XML - “xpath is the future” as Dustin told me.

Dealing with Intricate XML

So far, I have only dealt with relatively simple XML structures that pretty much have only one level of data. Not every API returns XML that is quite so easy to use. Take, for example, this snippet from the WordPress.com XML file that powers their public stats charts: http://wordpress.com/public-charts/common.php?d=posts.

<chart>
 <chart_data>
  <row>
    <string>2006-12-28</string>
    <string>2006-12-29</string>
  </row>
  <row>
    <number>51824</number>
    <number>56577</number>
  </row>
 </chart_data>
</chart>

Within the chart_data tag there are two row elements, the first for the date and the second for the number of posts on WordPress.com. If you wanted to access the second one and assuming you utilized the same SimpleXML methods above, you could do the following:

$posts = $xml->chart_data->row[1];

Here we used array-style brackets to specify which row we wanted. To access the date row instead, put a 0 in place of the 1. Whenever a node contains several child elements with the same name, use brackets and a number to pick one; if you leave the brackets off, SimpleXML gives you the first match.
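Here is a quick sketch of that bracket notation in action, using the snippet above as an inline string so it runs on its own. The variable names are just my choices for illustration.

```php
<?php
// The chart snippet from above, inlined so the example is self-contained.
$sample = <<<XML
<chart>
 <chart_data>
  <row>
    <string>2006-12-28</string>
    <string>2006-12-29</string>
  </row>
  <row>
    <number>51824</number>
    <number>56577</number>
  </row>
 </chart_data>
</chart>
XML;

$xml = simplexml_load_string($sample);

$rows = $xml->chart_data->row;

// count() reports how many row elements there are.
$row_count = count($rows);

// Brackets pick a specific row, then a specific item inside it.
$first_date   = (string) $rows[0]->string[0];
$second_posts = (int) $rows[1]->number[1];

// Without brackets you get the first match at each level.
$default_date = (string) $xml->chart_data->row->string;

echo $row_count . " rows, first date " . $first_date . ", second count " . $second_posts . "\n";
```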

If you want to read every element within a certain node (for example, if there were hundreds of items inside the first date row), you can use a foreach loop.

foreach($xml->chart_data->row[0] as $item){
   echo $item."<br/>";
}

Going a bit further, if you wanted to go through each row you could put an incrementing numeric variable in place of the number in brackets.

for($i=0;$i<sizeof($xml->chart_data->row);$i++){
   foreach($xml->chart_data->row[$i] as $item){
      echo $item."<br/>";
   }
}

You can also do more involved things like set up each row item in an array corresponding to the other row item. Since this XML file deals with a date that is related to a post number, it makes sense to create an array structure linking the two values. However, that’s a bit out of the scope of this article.
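For the curious, a minimal sketch of that pairing might look like the following. It assumes the two rows always line up item for item, as they do in the snippet above; the real feed may contain extra cells, so treat this as a starting point rather than a finished parser. The array name $posts_by_date is my own.

```php
<?php
// Same chart snippet as above, compacted onto fewer lines.
$sample = <<<XML
<chart>
 <chart_data>
  <row><string>2006-12-28</string><string>2006-12-29</string></row>
  <row><number>51824</number><number>56577</number></row>
 </chart_data>
</chart>
XML;

$xml = simplexml_load_string($sample);

$dates  = $xml->chart_data->row[0]->string;
$counts = $xml->chart_data->row[1]->number;

// Build an array keyed by date, assuming both rows have matching positions.
$posts_by_date = array();
$i = 0;
foreach ($dates as $date) {
    // Cast so the key is a plain string and the value a plain integer.
    $posts_by_date[(string) $date] = (int) $counts[$i];
    $i++;
}

foreach ($posts_by_date as $date => $posts) {
    echo $date . ": " . $posts . "<br/>";
}
```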

One last thing I need to cover is how to access attributes within an XML tag itself. For this I will use this feed as an example: http://deli.ckoma.net/stats/export_posts_daily. It’s a privately maintained XML file that contains the estimated number of links saved to Yahoo!’s del.icio.us bookmarking service per day. Here’s what the XML looks like for one day:

<stats>
    <stat date="2005-08-01" estimated_posts="34454" std_deviation="10034" tolerance_upper="50960" tolerance_lower="17948" recorded_posts="4290" tag_distribution="1681 939 649 382 234 173 94 54 21 17 8 5 5 0 2 4 10 1 1 8 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0"/>
</stats>

To grab the date I would simply do:

$date = $xml->stat['date'];

But what if I wanted the date for the 5th stat element? Just add on a bracket and specify the element.

$date = $xml->stat[4]['date'];

As with PHP arrays, SimpleXML counts elements from zero, so to get the 5th element I used the number 4. Overall, to grab an attribute (such as the date I just showed) you use the same bracket notation, but instead of a number you type the name of the attribute, wrapped in quotes. (You can also do that XPath stuff with @.)
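Putting the attribute access together, here is a short sketch you can run as-is. The first stat row is trimmed down from the sample above; the second row is a made-up placeholder I added purely so there is something to index and loop over.

```php
<?php
// First row trimmed from the sample above; second row is a made-up placeholder.
$sample = <<<XML
<stats>
    <stat date="2005-08-01" estimated_posts="34454" recorded_posts="4290"/>
    <stat date="2005-08-02" estimated_posts="0" recorded_posts="0"/>
</stats>
XML;

$xml = simplexml_load_string($sample);

// Attribute of the first stat element, then of the second (index 1).
$first_date  = (string) $xml->stat['date'];
$second_date = (string) $xml->stat[1]['date'];

// isset() tells you whether an attribute exists before you read it.
$has_bogus = isset($xml->stat[0]['no_such_attribute']);

// Loop over every stat element and collect one attribute from each.
$estimates = array();
foreach ($xml->stat as $stat) {
    $estimates[(string) $stat['date']] = (int) $stat['estimated_posts'];
}

echo $first_date . " / " . $second_date . "\n";
```

As with child elements, the attribute values come back as objects, so the casts keep the resulting array clean.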

Wrapup

Hopefully this gives you a good look into the world of XML parsing with PHP5’s SimpleXML extension. With the powerful ability to parse any XML file, you can start tinkering away at various APIs and mashups. I wrote this all while watching TV so let me know if you see any errors.

32 Comments

  1. Wow, great resource. Will really come in handy once I make the switch to PHP5.

  2. What a coincidence, I was just trying to figure out how to use feedburner’s api when this article popped into my feed reader. Thanks!

  3. Great timing Paul, many thanks Paul.

  4. PHP5 also introduces the class XMLReader which operates in streaming mode. This lets you implement pull parsing for huge XML documents.

  5. PHP5 has probably the best easy XML processor available. I love PHP 5 =)

  6. Great stuff! What about GeoRSS? Seems like it too would be a good resource for the given example.

  7. Awesome guide, Paul. If only my host had PHP5 — SimpleXML looks great!

  8. If you do happen to be stuck with PHP4 and thus need to use a parsing library, I’ve found the PEAR XML_Serializer package to be pretty interchangeable. For example, unserializer will give me data that I can handle in pretty much the same fashion as that delivered by SimpleXML.

  9. I wish my host had PHP5. I’m building my own parser because I’m stuck with PHP4. It is very basic, but tailored to my exact needs.

  10. Hello , you have a great blog here! I’m definitely going to bookmark you ………..

  11. Brilliant guide, covers everything I needed to know unlike some other guides elsewhere. Thanks!

  12. After searching for 3 or 4 days, you’ve solved every question I could have thought of in one page. Thank you so much.

  13. Glad I could be of help Iain!

  14. Thanks, Paul… You saved me hours of searching and testing with this post…

  15. Thank you very much for this guide. I had looked everywhere for the information I needed and only found it here.

    May I recommend you not use the curly apostrophes in your code examples? They are not copy and paste friendly and will not execute.

  16. @ers35 - thanks for the pointer, I usually avoid that with a web tool called Postable that makes my code webfriendly, but it appears that I forgot to use it with this post.

  17. Thank you so much! This has helped me get started on a rather malformed xml from an API.

    One question I have is how to get a little deeper attribute? I can get the “temp” example you gave above. What about this?

    <first>
    <second>
    <third this="something" />
    <third this="something else" />
    </second>
    <second>
    <third this="another something" />
    <third this="something else still" />
    </second>
    </first>

    I’m trying to get those third somethings. How do I know which “second” set I’m in, and how can I loop through each of these?

    Many thanks!

  18. Hello, I have a problem in receiving XML Post in PHP. I have a script that would receive an XML Post, but I do not know how to get or transfer the xml to a variable so I can parse it. Any help would be appreciated.

  19. Hey quick question - not sure if I saw this in the article above - how do you parse elements, for example?

    Within my foreach loop, I tried {$item -> media:thumbnail}, but that bombs out.

    Any ideas?

  20. Woops forgot to encode the tag:

    <media:thumbnail>

  21. Many THANKS !!!!

  22. Thanks buddy!!
    It is really very helpful info…

  23. Thanks for this great guide! You just helped me finish my first API project ever… Which I had been laxly working on for months! Truly, thank you very much…

    Peace.

  1. [...] How To: Parse XML with PHP5 - PaulStamatiou.com How to navigate through XML and pull things you are interested in out of it! (tags: php xml tutorial) [...]

  2. [...] Stamatiou details XML parsing in PHP 5 using SimpleXML. SimpleXML lets you navigate the XML document as a data structure. PHP 5 also [...]

  3. [...] How To: Parse XML with PHP5 - PaulStamatiou.com (tags: php xml tutorial programming blog paulstamatiou) [...]

  4. [...] his blog today, Paul has posted this quick guide to working with XML in PHP5, specifically how to parse it and use the data however you’d [...]

  5. [...] How To: Parse XML with PHP5 - PaulStamatiou.com (tags: php xml tutorial howto) This entry was written by thund3rbox and posted on April 23, 2007 at 12:23 am and filed under . Bookmark the permalink. Follow any comments here with the RSS feed for this post. Post a comment or leave a trackback: Trackback URL. « Let the music flow [...]

  6. [...] XML output of the API. Here is the raw XML page. I used a simple bit of PHP, cribbed largely from here (note: this requires PHP5) to read out the posts. <?php $request_url = [...]

  7. [...] Paul Stamatiou has a simple cURL method to access your xml file. It took a minute to implement and configure the code, and the result is perfect. [...]

Copyright © 2005 - 2008 PaulStamatiou.com