I want to retrieve and display a full article (full-text content, images). Which parts of the code need to be tweaked for this purpose?
Did you try the advice from that thread?
No, I didn't, as this tweak is for retrieving a description, whereas I need to retrieve full-text content (the full article). As far as I understand, the description is only one part of the XML structure, not the full content.
Thanks.
I'd say try it. If it doesn't get you what you want, it should at least be a jumping off point.
Feeds vary. Is that the feed you want to use?
I have tried the tweak from that thread, but it's not working for me; I didn't find any changes in the feed.
I thought it was a universal approach, but if it's feed-specific, then, for example, for these feeds:
http://rss.kenwood.co.jp/f6761/rss.xml
www.dpreview.com/feeds/latest.xml
This script cannot pull anything from the feed that isn't there. The Kenwood feed is very minimalistic. Here's a typical item in its raw xml format:
Code:
<item>
<title>Information on the Kenwood Booth at IWCE 2011</title>
<link>http://rss.kenwood.co.jp/item_140140_2670610_6761.html</link>
<description></description>
<pubDate>Mon, 07 Mar 2011 11:33:24 +0900</pubDate>
</item>
Notice, no description. So all the script can get on that is the title, link and date.
The Digital Photography Review feed is similar:
Code:
<item>
<title>Nikon Coolpix L120</title>
<pubDate>Wed, 09 Feb 2011 01:00:00 GMT</pubDate>
<link>http://www.dpreview.com/news/1102/11020909nikonsuperzooms.asp</link>
<guid>http://www.dpreview.com/news/1102/11020909nikonsuperzooms.asp</guid>
<description>Also: <a href="/reviews/specs/Nikon/nikon_cpl120.asp">Specifications</a> and <a href="/products/shop/nikon_cpl120">Prices</a>.</description>
</item>
It contains just a little more information per item.
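To check what a given feed actually carries before tweaking any template, something like the following rough sketch might help. It uses PHP's built-in SimpleXML rather than the SimplePie version bundled with the script. The helper name item_fields() is made up for illustration. The 'full' key assumes the feed would put full text in a content:encoded element, which is a common convention, but neither feed quoted above is known to do so.

```php
<?php
// Hypothetical helper (not part of RSS Display Box or SimplePie):
// report what text a single RSS <item> actually carries.
function item_fields(SimpleXMLElement $item): array
{
    // content:encoded, when a feed provides it, lives in the RSS "content" module namespace.
    $content = $item->children('http://purl.org/rss/1.0/modules/content/');

    return [
        'title'       => trim((string) $item->title),
        'link'        => trim((string) $item->link),
        'description' => trim((string) $item->description),
        'full'        => trim((string) $content->encoded),
    ];
}

// The Kenwood item quoted above, wrapped in a minimal channel so it parses on its own.
$xml = <<<XML
<rss version="2.0"><channel>
<item>
<title>Information on the Kenwood Booth at IWCE 2011</title>
<link>http://rss.kenwood.co.jp/item_140140_2670610_6761.html</link>
<description></description>
<pubDate>Mon, 07 Mar 2011 11:33:24 +0900</pubDate>
</item>
</channel></rss>
XML;

$feed   = simplexml_load_string($xml);
$fields = item_fields($feed->channel->item[0]);

// Print each field, marking the ones the feed leaves blank.
foreach ($fields as $name => $value) {
    echo $name, ': ', ($value === '' ? '(empty)' : $value), "\n";
}
```

For the Kenwood item, both the description and the full-text field come back empty, which is why no template change can make the article text appear.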
Because I caught your post before last before you edited it, I think I know what you're after - something like what feedex.net can give you. The feedex for the same item for Kenwood looks like so (portions wrapped):
Code:
<item><title>Information on the Kenwood Booth at IWCE 2011</title><link>http://www.kenwood.co.jp/en/news/2011/20110307_01.html</link><description><h1>
Information on the Kenwood Booth at IWCE 2011<br>
Offering total wireless communications systems based on the theme of<br>
“Digital Systems &amp; Multimedia Solutions.”
</h1>
<div>
<p>
<strong>Kanagawa, Japan, March 7, 2011 —</strong> Kenwood Corporation,
an operating company of the JVC Kenwood Group, will exhibit at this year’s International Wireless
Communications Expo (IWCE), the world’s largest exposition and trade fair for radio communication
equipment and systems, to be held March 9-11 in Las Vegas. Visitors to Kenwood’s booth will be able
to view total wireless communications systems optimized for a variety of applications ranging from the
business &amp; industry market to the public safety market. . . .
And it goes on and on. The way they get that information is by sending a spider to crawl the link for each item and then retrieve the data from those links and place it into the description field for each item.
The RSS Display Box script doesn't do that. It could parse the feedex version of the feed though.
However, the feedex terms of service allow only personal/non-commercial use unless one is willing to pay a minimal fee. And your responsibility doesn't stop there: you have to secure permission from the primary feed site to use its extended content.
It's possible that one could set up one's own service like feedex to crawl the link from each item and retrieve its data. That's beyond the scope of the RSS Display Box script, and probably of this forum. There would also be copyright considerations as to how much of the information from the feed site one can display. This would be true of a feedex feed as well if used for commercial or even just non-personal purposes.
If you want to try to develop a feedex type service with the help of folks in this forum, you can ask about it in the PHP section of this board. But I think that is, as I said, a bit beyond the scope of this forum.
If you're willing to do most of the work though, you might get some help with it in the PHP section.
Yes, currently I use the feedex.net service to get expanded content (the free service is limited to 5 items per feed). As far as I know, some such services use the SimplePie engine to parse the feed. First of all, I have no intention of creating my own online service like feedex.net; it's for my personal needs only, a personal web site.
I just wanted to adjust the SimplePie script to get full-length content without having to use an external service. There are no legal issues with using the primary feed site's extended content.
So, SimplePie can retrieve extended content only if it is present inside the <description></description> or <longdesc></longdesc> tags in the original XML feed? It neither searches nor parses the source URL inside the <link></link> tags?
What feedex is doing is something like (in PHP):
PHP Code:
$requested_page = file_get_contents($url);
Where $url is the item link. They then take just the body, strip out any remaining scripts and style, detect and convert to the proper encoding if necessary, convert links and src attributes that would be broken into either absolute paths or (in the case of certain links) links back to the extended feed (which depends upon some tests of these links), parse the remaining code into xml friendly tags and append that to the <description> tag in the new extended feed. Some special attention may be paid to image tags to ensure the images aren't beyond certain dimension limits. Other tests and tweaks of the content so gathered are done. I noticed with Dynamic Drive's feed, feedex stripped out ads and headers from the linked pages.
You cannot even do this if your server doesn't allow it; it's a security setting. It can be blocked by the remote site as well, if they so choose. Feedex might have a way around that.
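As a rough illustration of the first two steps described above (take just the body, strip scripts and style), here is a minimal sketch using PHP's built-in DOM extension. It is not feedex's actual code, which isn't public as far as I know. The function name extract_body_html() is made up. Encoding detection, link rewriting and image checks are left out entirely. The live fetch is commented out because it assumes allow_url_fopen is enabled on your server and that the remote site permits the request.

```php
<?php
// Hypothetical helper: return the inner HTML of <body>, minus <script> and <style> elements.
function extract_body_html(string $html): string
{
    if ($html === '') {
        return '';
    }

    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy each node list to an array first: removing nodes from a live list while iterating skips entries.
    foreach (['script', 'style'] as $tag) {
        foreach (iterator_to_array($doc->getElementsByTagName($tag)) as $node) {
            $node->parentNode->removeChild($node);
        }
    }

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize the children of <body>, not the <body> tag itself.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return trim($out);
}

// Offline demonstration on an inline page.
$page = '<html><head><style>p{color:red}</style></head>'
      . '<body><h1>Title</h1><script>alert(1)</script><p>Text</p></body></html>';
$clean = extract_body_html($page);
echo $clean, "\n";

// Against a live item link (assumes allow_url_fopen is on and the remote site allows it):
// $requested_page = file_get_contents($url);
// if ($requested_page !== false) { echo extract_body_html($requested_page); }
```

Everything after this point (fixing relative links, trimming ads, re-encoding) is where most of the real work would be, and it tends to be site-specific.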
So, to make a long story short:
Yes. SimplePie can retrieve extended content only if it's present inside the feed.
And when you say it's a personal site, that doesn't mean that you may use another site's content without their permission. Are you the only person who views this personal site? If not, showing their content without their permission could be seen as a violation of their copyright.
By personal non-commercial use, feedex means only you get to see the results. And that you aren't using that information for any kind of commercial purpose, like if you were a competitor of the feed site and wanted to grab certain key bits of information from them in an automated fashion.
Also, you have no right to republish all or most of another site's page unless they give you permission. It doesn't matter how small your audience is. Unless it is very very small and you can guarantee that it will remain very very small, you could get into trouble.
The only exception to that I know of is for a critical review. Even that has limitations as to how much you can garner and present.
1. It seems that if the full content is present inside the feed, we can retrieve it without changing the simplepie.inc file, by adding a custom template in outputbody.php, which defines the body output?
Code:
}
else if ($template=="Custom"){
?>
<DIV class="rsscontainer">
<div class="rsstitle"><a href="<?php echo $item->get_permalink(); ?>"><?php echo $item->get_title(); ?></a></div>
<div class="rssdate"><?php echo $item->get_date('d M Y g:i a'); ?></div>
<div class="rssdescription"><?php echo $item->get_description(); ?></div>
<div class="rsscontent"><?php echo $item->get_content(); ?></div>
</DIV>
<?php
}
Correct? If the full content does not exist inside the feed, the template will just return all the data prior to the "rsscontent" section?
2. What's the correct syntax for the JavaScript code on the HTML page when we want to fetch all items and display 5? Should we use zero?
showbbc.set_items_shown(0, 5) or
showbbc.set_items_shown(5)?
3. Is there another way to display the feed on the page that makes it search-engine friendly, by using PHP to return the content directly rather than via JavaScript code?
- No, there's no get_content() function for items in the version (1.0 b3.2) of simplepie.inc used by this script. If you get the date, title, link and description, you pretty much have it all. Many feeds only have these. Some don't even have all of them. If the feed has other stuff, you can get that if simplepie has a way to, or if you use another feed parser that does. Apparently all versions from 1.0 on have it, so you could update. But as I said before, that would necessitate changes to other files used by this script. And get_content still doesn't get you anything that isn't there: it will use the description if nothing more elaborate is available. In the case of the Kenwood feed, where there's nothing like that, not even a description, it would return nothing.
- showbbc.set_items_shown(0, 5)
For more info, see:
http://www.dynamicdrive.com/dynamici...laybox_ref.htm
- You can, but feeds are inherently not SEO friendly. They change too often. If you want to construct such a thing see:
http://simplepie.org/wiki/setup/sample_page
and:
http://simplepie.org/wiki/
Both use the most recent version of simplepie.
Or, if you hunt around, there may be something like that already. But there wouldn't be the type of real time updates and pagination like javascript affords.
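For a sense of what returning the content directly from PHP could look like, here is a minimal sketch that renders items on the server. It uses PHP's built-in SimpleXML instead of SimplePie so that it runs on its own. The function name render_feed() and the demo feed are made up for illustration, and the class names are borrowed from the template posted earlier. There is no caching or pagination, and the description is passed through untouched, so treat it as a starting point only.

```php
<?php
// Hypothetical helper (not part of SimplePie or RSS Display Box):
// render the first $limit items of an RSS 2.0 document as HTML on the server.
function render_feed(string $xml, int $limit = 5): string
{
    $previous = libxml_use_internal_errors(true); // keep XML parse warnings off the page
    $feed = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false || !isset($feed->channel)) {
        return '<p>Feed unavailable.</p>';
    }

    $html  = '';
    $shown = 0;
    foreach ($feed->channel->item as $item) {
        if ($shown++ >= $limit) {
            break;
        }
        // Plain-text fields are escaped before going into the page.
        $title = htmlspecialchars((string) $item->title, ENT_QUOTES, 'UTF-8');
        $link  = htmlspecialchars((string) $item->link, ENT_QUOTES, 'UTF-8');
        $date  = htmlspecialchars((string) $item->pubDate, ENT_QUOTES, 'UTF-8');
        // Feed-supplied markup is emitted unchanged; sanitize it for feeds you do not trust.
        $desc  = (string) $item->description;

        $html .= '<div class="rsscontainer">'
               . '<div class="rsstitle"><a href="' . $link . '">' . $title . '</a></div>'
               . '<div class="rssdate">' . $date . '</div>'
               . '<div class="rssdescription">' . $desc . '</div>'
               . "</div>\n";
    }
    return $html;
}

// Made-up two-item feed for an offline demonstration; only the first item is rendered.
$demo = <<<XML
<rss version="2.0"><channel><title>Demo feed</title>
<item><title>Cameras &amp; lenses</title><link>http://example.com/a?x=1&amp;y=2</link>
<pubDate>Wed, 09 Feb 2011 01:00:00 GMT</pubDate><description>First item</description></item>
<item><title>Second item</title><link>http://example.com/b</link>
<pubDate>Thu, 10 Feb 2011 01:00:00 GMT</pubDate><description>Second</description></item>
</channel></rss>
XML;

$html_out = render_feed($demo, 1);
echo $html_out;
```

On a real page the XML string would come from fetching the feed URL, which is subject to the same server and remote-site restrictions mentioned earlier for file_get_contents.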
Well, many feeds don't change too often.
The above SimplePie example is for PHP pages:
http://simplepie.org/wiki/setup/sample_page
Is there a way to call a PHP script from the HTML page? My pages are dynamically generated HTML pages. I don't want to change the AddType line for PHP files to send .html and .htm files through the PHP interpreter.
For some reason, RSS Display Box does not show this feed:
http://www.unsum.com/fetch?id=1475009
Where is the problem?