How to parse RSS feeds



dog
03-06-2012, 06:54 PM
Hello everyone,

I'm a PHP beginner and a keen learner.

I'm building a site in PHP and would like the latest blog post from a related site to appear on the front page of this one I'm building. The related site has a blog feed so I figure I need to know how to parse RSS.

The related site is built with Django and I've been provided with a URL that looks like this: http://www.relatedsite.com/feeds/blog

Any help would be much appreciated.

Thanks,
Dog

fastsol1
03-06-2012, 11:46 PM
Here is a link to a video by a great PHP tutorial maker; he has one on exactly what you want. It is a three-part series.
http://www.youtube.com/watch?v=HGwHJ6SF7UA&feature=plcp&context=C3b2513bUDOEgsToPDskLGvow6DaWSf5OZlxJTR4Vv

dog
03-07-2012, 03:06 PM
Thanks fastsol1

That was a pretty good introduction. I've got that working on the site now, but obviously it returns all the articles, not just the first one.

What's the best way for me to call in just the first article in the xml file?

I'll post the code I used during the tutorial to keep things simple. Though I'm open to other methods as well.

Cheers,
dog


<?php

// Fetches articles from the BBC news feed.
function fetch_rss()
{
    $data = file_get_contents("http://feeds.bbci.co.uk/news/rss.xml");
    $data = simplexml_load_string($data);

    $articles = array();

    foreach ($data->channel->item as $item) {
        // Media RSS elements such as <media:thumbnail> live in their own namespace.
        $media = $item->children('http://search.yahoo.com/mrss/');

        $image = array();

        // Copy the first thumbnail's attributes, skipping items that have no thumbnail.
        if (isset($media->thumbnail[0])) {
            foreach ($media->thumbnail[0]->attributes() as $key => $value) {
                $image[$key] = (string)$value;
            }
        }

        $articles[] = array(
            'title'       => (string)$item->title,
            'description' => (string)$item->description,
            'link'        => (string)$item->link,
            'image'       => $image,
        );
    }

    return $articles;
}

?>

Thanks for any help!

jscheuer1
03-07-2012, 04:33 PM
Quick and dirty could be to change:


return $articles;

to


$article[] = $articles[0];
return $article;

dog
03-07-2012, 05:49 PM
Thanks John,

That works fine when I'm using the BBC feed as per the tutorial. When I use the feed I'm actually working with it doesn't output anything (although it was working otherwise).

Interestingly, I also tried:

$article = $articles[0];
return $article;

That output just the first character in each of the strings that I echoed.

Perhaps the Feed that I'm working with is a bit messed up or perhaps I don't know what I'm doing ;)

I've actually found a solution in the form of http://simplepie.org/. Although I had wanted to do this from scratch, it's turning into too big a job to finish right now, so I think I'm going with the ready-made solution.

I might like to keep working on this anyway if anyone has any ideas but don't put yourself out as it's no longer urgent.

Thanks for all the help!

dog

jscheuer1
03-07-2012, 06:23 PM
SimplePie is a very good RSS feed parser. I've used it and found it fairly easy both to work with and to learn from.

I can understand your wanting to learn it from scratch though. All my suggestion does is reduce the array of arrays (articles) down to an array of the first array in articles (article). If the first item in your feed is missing or blank, perhaps you need to go for the second:


$article[] = $articles[1];
return $article;

or third. I'd try that, simply increasing the number by one until you get something from the feed.

If that doesn't work, just go back to the way it was when it worked and try reducing the number of items shown later on in the process.
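A minimal sketch of that "reduce it later" idea, assuming the page simply loops over whatever the function returns: array_slice() can trim the list at output time. The $articles array here is made-up sample data standing in for the real return value.

```php
<?php
// Hypothetical stand-in for the array returned by fetch_rss().
$articles = array(
    array('title' => 'First headline',  'link' => 'http://example.com/1'),
    array('title' => 'Second headline', 'link' => 'http://example.com/2'),
    array('title' => 'Third headline',  'link' => 'http://example.com/3'),
);

// Keep only the first item. The result is still an array of articles,
// so an existing foreach loop keeps working unchanged.
$latest = array_slice($articles, 0, 1);

$titles = array();
foreach ($latest as $article) {
    $titles[] = $article['title'];
}

// An empty feed simply gives an empty array, so the loop body never runs.
$none = array_slice(array(), 0, 1);
```

Unlike reading $articles[0] directly, this does not touch an undefined index when the feed has no items.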

All that function does is output the array of articles. It's later on that these are converted into something that appears on the page. You haven't shown that part.
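To illustrate the difference between the two return shapes, here is a small sketch with made-up data. It assumes the output side loops over the return value with foreach and reads $post['title'], as in the code posted later in this thread. If so, a likely explanation for the "first character" result is that the loop was walking over the strings of a single article rather than over a list of articles; as far as I know, PHP versions of that era treated a non-numeric key on a string as offset 0, while newer versions complain instead.

```php
<?php
// Hypothetical sample data standing in for the parsed feed.
$articles = array(
    array('title' => 'First post',  'summary' => 'One'),
    array('title' => 'Second post', 'summary' => 'Two'),
);

// Variant 1: wrap the first article in a new array (the suggestion above).
$wrapped = array($articles[0]);

// Variant 2: hand back the first article itself.
$bare = $articles[0];

// Looping over $wrapped yields one article array per iteration.
$wrappedTitles = array();
foreach ($wrapped as $post) {
    $wrappedTitles[] = $post['title'];
}

// Looping over $bare yields the article's values, which are plain strings.
$bareTypes = array();
foreach ($bare as $value) {
    $bareTypes[] = gettype($value);
}

// With the bare article, skip the loop and read the keys directly.
$directTitle = $bare['title'];
```

So either shape works, as long as the output code matches it: loop over the wrapped version, or read the keys straight from the bare one.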

dog
03-07-2012, 06:59 PM
Thanks again. It's working now and I don't quite know how to explain the previous error.

Here's the working code...

The function I'm using:


function readSzBlog()
{
    $data = file_get_contents("http://www.undisclosed.com/feeds/blog");
    $data = simplexml_load_string($data);

    $entries = array();

    foreach ($data->entry as $entry) {
        $entries[] = array(
            'title'   => (string)$entry->title,
            'summary' => (string)$entry->summary,
        );
    }

    $first[] = $entries[0];
    return $first;
}


And the output side:


<?php
foreach(readSzBlog() as $post)
{
?>
<h2><?php echo($post['title']); ?></h2>
<div><?php echo($post['summary']); ?></div>
<?php
}
?>

With this particular XML file I don't suppose it matters at the moment, because it's actually quite small, but in the tutorial (using the BBC News feed) there was quite a delay in receiving the feed. Considering I only want the contents of the first entry, would it not be better to load just that, rather than loading the whole file and then not using most of it? Or is that asking too much?

Cheers,
dog

jscheuer1
03-07-2012, 10:12 PM
I think it could be done, but it would have to happen earlier, when the file is fetched, and probably also when it is read once it has been fetched. The real time lag (when there is one) probably occurs here:


$data = file_get_contents("http://feeds.bbci.co.uk/news/rss.xml");

Your server must open a connection to the feed server and download the file. The size of the file and the bandwidth available on both servers and between them are all factors. The farther you are from the UK, the longer the delay, and it gets worse if their server or yours is slow or, as I say, if the file is large.

If the feed you're using is on your own server, it would be better to read it directly off the server, rather than downloading it first.

And you're only interested in the first article. In that case you should be able to use file_get_contents() with parameters to only read to a maximum number of bytes. However, you would then be left with an invalid XML file to parse. There probably are ways to deal with that. I'm just not sure what the best method would be. I'm also not sure if you can cherry pick how many bytes you want. Your server may still need to download the entire file before proceeding to read it.
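A rough sketch of what that looks like, under some assumptions: the fifth argument of file_get_contents() caps the number of bytes read, and the cut-off string is then no longer well-formed XML. The sample feed and the 50-byte limit are made up, and a local temporary file is used instead of a URL, so this says nothing about how much a remote server actually sends.

```php
<?php
// Made-up, simplified feed used only for this demonstration.
$xml = '<feed><entry><title>First</title></entry>'
     . '<entry><title>Second</title></entry></feed>';

// Write it to a temporary file so the example needs no network access.
$path = tempnam(sys_get_temp_dir(), 'feed');
file_put_contents($path, $xml);

// Arguments: file, use_include_path, context, offset, maximum length in bytes.
$partial = file_get_contents($path, false, null, 0, 50);
unlink($path);

// Collect parser errors quietly instead of printing warnings.
libxml_use_internal_errors(true);

// The truncated string stops in the middle of the second entry,
// so SimpleXML refuses it and returns false.
$parsedPartial = simplexml_load_string($partial);

// The complete string parses normally.
$parsedFull = simplexml_load_string($xml);
$firstTitle = (string)$parsedFull->entry[0]->title;
```

So a byte limit alone is not enough; the partial data would have to be repaired or read with a different kind of parser before the first entry could be pulled out.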

Once you have the file, I suppose some savings could be had by not iterating over the entire file. But I think most of the time is spent downloading it.
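For what it's worth, skipping the loop could look something like the sketch below. It assumes an Atom-style feed with entry elements like the one in readSzBlog(); the function name firstEntry() and the sample XML are made up for illustration. Note that simplexml_load_string() still parses the whole document, so this only saves the iteration, not the download.

```php
<?php
// Returns the first entry of an Atom-style feed as an array,
// or null if the XML cannot be parsed or has no entries.
function firstEntry($xmlString)
{
    $feed = simplexml_load_string($xmlString);

    if ($feed === false || count($feed->entry) === 0) {
        return null;
    }

    // Index 0 is the first <entry> element; no loop is needed.
    $entry = $feed->entry[0];

    return array(
        'title'   => (string)$entry->title,
        'summary' => (string)$entry->summary,
    );
}

// Made-up sample feed standing in for the downloaded data.
$sampleFeed = '<feed xmlns="http://www.w3.org/2005/Atom">'
            . '<entry><title>Newest post</title><summary>Summary one</summary></entry>'
            . '<entry><title>Older post</title><summary>Summary two</summary></entry>'
            . '</feed>';

$post = firstEntry($sampleFeed);
```

Because this returns a single article rather than a list, the output side would read $post['title'] and $post['summary'] directly instead of using foreach.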

I am not an expert in such matters though.