RSS feed aggregator/combiner in PHP with Magpie RSS (v2)

I have recently upgraded the code that combines multiple RSS feeds; for the original version, see this post.

To install, extract the following zip file to the directory where you want your combined feed to be displayed. I use thydzik.com/combinedfeed, as thydzik.com/feed is already used by WordPress.

Edit index.php for your site, and make sure the temp directory is writable by the web server (chmod 755). That should be it. Enjoy.
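If the feed comes out empty, the cache directory is a likely culprit. A minimal sketch of a check you could run before the script, assuming the cache folder is called temp/ and sits next to index.php; `cache_dir_problem` is a hypothetical helper, not part of the download:

```php
<?php
// Hypothetical helper: returns null when the directory can be used as a cache,
// otherwise a short message describing what is wrong.
function cache_dir_problem($dir) {
	if (!is_dir($dir)) {
		return $dir . ' does not exist';
	}
	if (!is_writable($dir)) {
		return $dir . ' is not writable by the user PHP runs as';
	}
	return null;
}

// Assumed location of the cache folder, relative to this file.
$problem = cache_dir_problem(__DIR__ . '/temp/');
```

A non-null `$problem` means the permissions (or the path in `$TMP_ROOT`) need another look.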

thydzik RSS feed aggregator v2.zip

Thanks to Magpie RSS and Feedcreator.

<?php
	$TMP_ROOT = "temp/"; //a temporary folder for storing the cached feeds; needs write access (chmod 755)
	$DOMAIN_NAME = "http://thydzik.com/";
	$SITE_TITLE = "Travis Hydzik's blog feeds";
	$SITE_DESRIPTION = "A collection of Travis Hydzik's blog feeds";
	$SITE_AUTHOR = "Travis Hydzik";

	$FEEDS_ARRAY  = array( //the collection of urls linking to individual feeds
		"http://hydzik.com/feed/",
		"http://sonyaandtravis.com/feed",
		"http://thydzik.com/feed/"
	);

	$MAX_ITEMS = 10;
	$SHOW_FULL_FEED = FALSE;

	//stop editing from here onwards

	define('MAGPIE_DIR', '');
	define('MAGPIE_CACHE_DIR', $TMP_ROOT);

	//include magpie rss http://magpierss.sourceforge.net/
	@require_once(MAGPIE_DIR.'rss_fetch.inc');

	//include universal feed creator http://sourceforge.net/projects/feedcreator/
	@include(MAGPIE_DIR.'feedcreator.class.php');

	//create the basic rss feed
	$rss = new UniversalFeedCreator();
	$rss->useCached();
	$rss->title = $SITE_TITLE;
	$rss->description = $SITE_DESRIPTION;
	$rss->link = $DOMAIN_NAME;
	$rss->syndicationURL = curPageURL();

	//get all items in all feeds
	$total_temp = 0; //temp total number of posts in all rss feeds
	foreach ($FEEDS_ARRAY as $single_url) {
		$array_temp[$single_url]['page_title'] = url_grab_title($single_url); //grab the page title

		$rss_temp = fetch_rss($single_url);
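		$items = ($rss_temp && is_array($rss_temp->items)) ? array_slice($rss_temp->items, 0, $MAX_ITEMS) : array(); //empty list if the feed could not be fetched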
		$array_temp[$single_url]['rss_data'] = $items;
		$total_temp += count($items);

		$array_temp[$single_url]['rss_pointer'] = 0;

		preg_match('@^(?:http://)?([^/]+)@i', $single_url, $matches);
		$array_temp[$single_url]['site_url'] = $matches[0];
	}

	while ($total_temp <> 0 && $MAX_ITEMS > 0){// loop while there are remaining posts to process
		$date_timestamp_temp = 0; //initialise to 0
		foreach ($FEEDS_ARRAY as $single_url) {
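			$current = $array_temp[$single_url]['rss_pointer']; //index of the next unused post in this feed
			$this_date_timestamp = isset($array_temp[$single_url]['rss_data'][$current]['date_timestamp']) ? $array_temp[$single_url]['rss_data'][$current]['date_timestamp'] : 0; //get the date stamp of this post, or 0 once the feed is exhausted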
			if ($this_date_timestamp > $date_timestamp_temp) { //if this date stamp is the newest, save where it came from
				$date_timestamp_temp = $this_date_timestamp; //update with this date stamp
				$temp_url = $single_url; //save the url feed
				$pointer_temp = $array_temp[$single_url]['rss_pointer']; //save the item number
			}
		}

		$total_temp --; //decrement total remaining posts to process
		$MAX_ITEMS --; //decrement number of posts to display
		$array_temp[$temp_url]['rss_pointer'] ++; //increment post index of used post rss

		//get the saved item
		$item = $array_temp[$temp_url]['rss_data'][$pointer_temp];

		//create the new item
		$item_new = new FeedItem();

		//add all the copied basics
		$item_new->title = $item['title'];
		$item_new->link = $item['link'];
		$item_new->date = $item['pubdate'];
		$item_new->author = $item['author'];
		$item_new->source = $temp_url;

		//to show full feed or blurb
		if ($SHOW_FULL_FEED) {
			$item_new->description = $item['content']['encoded'].'<p>Copyright &copy; <a href="'.$array_temp[$temp_url]['site_url'].'">'.$array_temp[$temp_url]['page_title'].'</a>. All Rights Reserved.</p>';
		} else {
			$item_new->description = $item['description']       .'<p>Copyright &copy; <a href="'.$array_temp[$temp_url]['site_url'].'">'.$array_temp[$temp_url]['page_title'].'</a>. All Rights Reserved.</p>';
		}

		$rss->addItem($item_new);
	}

	// a quick function to grab a page's title
	function url_grab_title($rss_url) {
  		$contents = file_get_contents($rss_url, TRUE, NULL, 0, 3072);
  		$contents = preg_replace("/(\n|\r)/", '', $contents);
		preg_match('/<title>(.*?)<\/title>/i', $contents, $matches);
		return isset($matches[1]) ? $matches[1] : ''; //empty string when no <title> is found
	}

	//get page url (for syndication), source http://www.webcheatsheet.com/PHP/get_current_page_url.php
	function curPageURL() {
		$pageURL = 'http';
		if (isset($_SERVER["HTTPS"]) && $_SERVER["HTTPS"] == "on") {$pageURL .= "s";}
		$pageURL .= "://";
		if ($_SERVER["SERVER_PORT"] != "80") {
			$pageURL .= $_SERVER["SERVER_NAME"].":".$_SERVER["SERVER_PORT"].$_SERVER["REQUEST_URI"];
		} else {
			$pageURL .= $_SERVER["SERVER_NAME"].$_SERVER["REQUEST_URI"];
		}
		return $pageURL;
	}

	// save the combined feed as RSS 2.0 in the temp folder
	$rss->saveFeed("RSS2.0", $TMP_ROOT."feed.xml");
?>
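The while loop in the script repeatedly picks the newest unused item across all feeds. Stripped of the Magpie and Feedcreator calls, the idea can be sketched as below. This is only an illustration, not the code from the zip: `merge_newest_first` and the sample data are hypothetical, and each feed's items are assumed to be ordered newest first and to carry a `date_timestamp` key.

```php
<?php
// Hypothetical helper: merge several item lists into one list, newest first.
// $feeds maps a feed key (e.g. its URL) to a list of items, each an array
// with a 'date_timestamp' entry; every list is assumed sorted newest first.
function merge_newest_first(array $feeds, $max_items) {
	$pointers = array_fill_keys(array_keys($feeds), 0); //next unused item per feed
	$merged = array();

	while (count($merged) < $max_items) {
		$best_key = null;
		$best_time = -1;
		foreach ($feeds as $key => $items) {
			$i = $pointers[$key];
			if (!isset($items[$i])) {
				continue; //this feed has no items left
			}
			$time = isset($items[$i]['date_timestamp']) ? (int) $items[$i]['date_timestamp'] : 0;
			if ($time > $best_time) { //newest candidate so far
				$best_time = $time;
				$best_key = $key;
			}
		}
		if ($best_key === null) {
			break; //every feed is exhausted
		}
		$merged[] = $feeds[$best_key][$pointers[$best_key]];
		$pointers[$best_key]++;
	}
	return $merged;
}

// Made-up sample data for illustration.
$feeds = array(
	'a' => array(
		array('title' => 'A1', 'date_timestamp' => 300),
		array('title' => 'A2', 'date_timestamp' => 100),
	),
	'b' => array(
		array('title' => 'B1', 'date_timestamp' => 200),
	),
);
$titles = array_column(merge_newest_first($feeds, 10), 'title');
```

Skipping exhausted feeds explicitly and stopping when no candidate is left keeps the loop from reading past the end of a feed that has fewer items than the others.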
  • Robert

    I have been looking for something like this. I did notice that the script does not retain the pubDate of the original article. It resets the pubDate to the time you run the script.

  • hmmm, would
    $item_new->date = $item['pubdate'];

    changed to
    $item_new->pubdate = $item['pubdate'];

    fix the issue?

  • Tyler

    Actually, the problem is twofold:

    Magpie converts pubDates to Unix timestamps, stored in each item as $item['date_timestamp']. So in your script this line:
    $item_new->date = $item['pubdate'];

    …should change to:
    $item_new->date = $item['date_timestamp'];

    Secondly, the copy of feedcreator.class.php has a little bug. The FeedDate object's rfc822() method uses time() instead of the stored Unix date. So $date within that function should be set like this instead:
    $date = gmdate("D, d M Y H:i:s", $this->unix);
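The second point can be tried in isolation. The sketch below is not the FeedDate class from feedcreator.class.php; `format_rfc822` is a hypothetical stand-in that only shows the difference between formatting a stored timestamp and formatting the current time, and the " +0000" suffix is an assumption about how the GMT offset is written.

```php
<?php
// Hypothetical stand-in for the date formatting step: format the item's own
// Unix timestamp (GMT) rather than the time the script happens to run.
function format_rfc822($unix_timestamp) {
	return gmdate('D, d M Y H:i:s', (int) $unix_timestamp) . ' +0000';
}

// Made-up item with a timestamp like the one Magpie stores per item.
$item = array('date_timestamp' => 86400);
$pub_date = format_rfc822($item['date_timestamp']); //"Fri, 02 Jan 1970 00:00:00 +0000"
```

Calling gmdate() without the second argument would format the current time, which matches the behaviour Robert reported.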

  • mboveiri

    Hi.
    How to add to generated feed.?

  • Glucose

    Hi
    How are duplicate feeds managed?