Yahoo Pipes adds support for serialized PHP

A few days ago I sent an email to Chad Dickerson, whom I’ve met at Yahoo! and had a chance to hang out with at Mashup Camp in Dublin.

Chad,

From what I can tell, if you create a Pipe and add additional fields (Shortcuts, Term Extraction), the only way to get to them in an API-like way is to use the JSON renderer. The RSS renderer removes those extra fields to follow the RSS spec. PHP supports JSON decoding, but you need a PEAR library or a quite recent version of PHP. If Yahoo supported serialized PHP with Pipes like you do with the other common APIs, it would be a lot easier for folks on shared hosting to work with Pipe data on the server side. I imagine with the new badge stuff you released that there’s a push to keep things client side, but there’s a huge advantage to rendering server-side to keep things nice and spiderable.

Short Version:

Expose Pipe results as serialized PHP. Pretty please.

Chad sent this along to the Pipes team, and less than three days later:
Pipes Blog » Blog Archive » New Yahoo Pipes PHP serialized output renderer

kick.
ass.

[Photo: John Herren and Chad Dickerson]

Two points to be made: first, I’m damn impressed that one of the largest sites on the ‘net would roll out a feature requested by an outside developer in less than three days. Second, developers should never resist the urge to ask for help from an API provider. If a company is taking the time to support an API, chances are very good that they will listen to developers and react. I can personally say I’ve gotten immediate results from Technorati, Dapper, and now Yahoo!. So blow off the idea that a big website would never listen to little ol’ developer you. With that negative attitude it’s guaranteed you’ll never get what you want. Ask, believe, receive, right?

So props to Chad, Jonathan Trevor, Paul Donnelly, and the rest of the Pipes team!

The Details

I’m a big fan of Yahoo Pipes. It’s an incredibly useful tool for putting together quick aggregators and filters for mashups. To integrate a Pipe on a webpage, you have a few options. You can go the cut and paste route and use a Badge, which works client side, or you can roll your own code to integrate a Pipe.

Put this in your pipe..

After you run a Pipe, you’re given a list of output formats. Copy the link location of these to get the URL of the output and tweak the parameters.
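If you would rather not hand-edit the query string, here is a rough sketch of assembling a pipe.run URL in PHP. It reuses the Pipe ID and feed address from the example further down. buildPipeUrl() is a made-up helper name, not something Yahoo ships, and I’m assuming the _render parameter is what selects the output format.

```php
<?php
// Build a pipe.run URL from its parts instead of hand-editing the query string.
// buildPipeUrl() is a hypothetical helper, not part of any Yahoo library.
function buildPipeUrl($pipeId, $render, array $inputs = [])
{
    $params = array_merge(['_id' => $pipeId, '_render' => $render], $inputs);
    // http_build_query() URL-encodes every value, including a nested feed URL.
    return 'http://pipes.yahoo.com/pipes/pipe.run?' . http_build_query($params, '', '&');
}

$url = buildPipeUrl(
    'Zli1l6UB3RG_l7ZvX0sBXw',
    'php',
    ['rssurl' => 'http://rss.news.yahoo.com/rss/topstories']
);
echo $url, "\n";
```

Passing the inputs as an array means a feed URL with its own query string gets encoded correctly without any manual escaping.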

Until yesterday, the output formats useful for mashups were JSON and RSS. JSON is great for client-side mashups, but as you know, search engines will not index client-side content, so you lose any SEO love you might get. RSS is easy to consume server side, but Pipes will normalize the output to conform to the RSS spec. That means if you’re adding term extraction, Shortcuts, or any other metadata to your pipe, you’ll lose it with RSS output unless you put that data into one of the RSS fields (title, description, etc.). So that leaves us with hacking JSON on the server side. The JSON output format retains all that sweet metadata. In PHP, the best options are a JSON PEAR module or, if you’re rocking 5.2 or above, the handy json_decode() function.
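For reference, the json_decode() route looks roughly like this. The JSON string is a made-up sample standing in for a response body, and the value/items layout is an assumption about how the Pipe output is nested, so treat it as a sketch rather than the real payload.

```php
<?php
// Hypothetical sample standing in for the body of a JSON-rendered Pipe response.
$json = '{"count":1,"value":{"items":[{"title":"Example story",'
      . '"tags":[{"content":"php"},{"content":"yahoo pipes"}]}]}}';

// Passing true as the second argument returns nested arrays instead of stdClass objects.
$response = json_decode($json, true);
if (!is_array($response)) {
    throw new RuntimeException('Could not decode JSON');
}

$titles = [];
foreach ($response['value']['items'] as $item) {
    $titles[] = $item['title'];
}
print_r($titles);
```

This only works where json_decode() exists, which is exactly the problem on older shared hosting.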

Now that Yahoo supports serialized PHP, using Pipe output just got a lot easier. I made a Pipe to add Term Extraction info from any RSS feed. Basically, it automatically tags all the posts in the feed. To retrieve the tags in your own script, all it takes is:

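<?php

// URL of the Term Extraction pipe; the feed to analyze is appended as rssurl.
$pipeURL = 'http://pipes.yahoo.com/pipes/pipe.run?_id=Zli1l6UB3RG_l7ZvX0sBXw&_render=php&rssurl=';
$feedURL = 'http://rss.news.yahoo.com/rss/topstories';

$tags = array();
// Fetch the serialized output and turn it back into a nested PHP array.
$response = unserialize(file_get_contents($pipeURL . rawurlencode($feedURL)));
foreach ($response['value']['items'] as $item) {
    // Each item carries its extracted terms under 'tags'.
    foreach ($item['tags'] as $itemTags) {
        $tags[] = $itemTags['content'];
    }
}
var_dump($tags);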

At this point $tags is an array of all of the terms from the feed. Now what could be done with that data?
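One possible answer is to count how often each term shows up, which is the raw material for a tag cloud. The sketch below also adds the error checking the demo skips: it refuses to instantiate objects while unserializing (the allowed_classes option needs PHP 7.0 or later) and tolerates items that have no tags. collectTags() is a made-up helper and the sample data is invented, so adapt both to whatever your Pipe really returns.

```php
<?php
// collectTags() is a hypothetical helper; the value/items/tags layout follows the example above.
function collectTags($serialized)
{
    // allowed_classes => false (PHP 7.0+) stops unserialize() from instantiating objects.
    $response = @unserialize($serialized, ['allowed_classes' => false]);
    if (!is_array($response)
        || !isset($response['value']['items'])
        || !is_array($response['value']['items'])) {
        return [];
    }

    $tags = [];
    foreach ($response['value']['items'] as $item) {
        // Items without extracted terms may have no 'tags' key at all.
        if (!isset($item['tags']) || !is_array($item['tags'])) {
            continue;
        }
        foreach ($item['tags'] as $tag) {
            if (isset($tag['content']) && is_string($tag['content'])) {
                $tags[] = $tag['content'];
            }
        }
    }
    return $tags;
}

// Invented sample data standing in for a fetched response.
$sample = serialize(['value' => ['items' => [
    ['title' => 'A', 'tags' => [['content' => 'php'], ['content' => 'yahoo']]],
    ['title' => 'B'],
    ['title' => 'C', 'tags' => [['content' => 'php']]],
]]]);

// Count how often each term appears, most frequent first.
$counts = array_count_values(collectTags($sample));
arsort($counts);
print_r($counts);
```

Returning an empty array on bad input keeps the calling page from blowing up when the Pipe is slow or returns something unexpected.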

Serialized PHP or JSON?

If you have json_decode() available in your PHP install, is there any advantage to using JSON over serialized PHP? Let’s find out.

File Size

Saving the output directly to disk gave me

JSON – 51192 bytes
Serialized PHP – 56885 bytes

Because of syntax and PHP’s type specification, serialized PHP is about 11% larger than JSON here. The overhead comes from the type and length prefix attached to every value, so expect the gap to depend on how many elements your output contains.
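If you want to check the size difference on your own data, a quick sketch like the following will do. The array is invented sample data, not real Pipe output, so the percentage it prints will not match the numbers above.

```php
<?php
// Invented sample data; real Pipe output will have different fields.
$data = ['value' => ['items' => []]];
for ($i = 0; $i < 50; $i++) {
    $data['value']['items'][] = [
        'title' => "Story number $i",
        'tags'  => [['content' => 'php'], ['content' => 'pipes']],
    ];
}

// Encode the same structure both ways and compare the byte counts.
$jsonBytes = strlen(json_encode($data));
$phpBytes  = strlen(serialize($data));

printf("JSON: %d bytes\n", $jsonBytes);
printf("Serialized PHP: %d bytes\n", $phpBytes);
printf("Serialized PHP is %.1f%% larger\n", ($phpBytes / $jsonBytes - 1) * 100);
```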

Decoding Speed

How long does it take to slurp these formats into PHP variables? My tests decode each format 100 times.

JSON
real    0m0.269s
user    0m0.264s
sys     0m0.004s

Serialized PHP
real    0m0.088s
user    0m0.088s
sys     0m0.000s

It’s clear that unwinding serialized PHP is faster than JSON, so it’s a better choice performance-wise despite being slightly bigger over the wire.
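To repeat the comparison on your own install, something along these lines should work. It is not the script behind the numbers above: it times the calls with microtime() rather than the shell’s time command, and it uses invented sample data. timeDecoder() is a made-up helper. Results will vary a lot by PHP version, so run it before drawing conclusions.

```php
<?php
// Invented sample data; swap in a saved Pipe response to test real output.
$data = ['value' => ['items' => array_fill(0, 200, [
    'title' => 'Example story',
    'tags'  => [['content' => 'php'], ['content' => 'pipes']],
])]];
$json = json_encode($data);
$php  = serialize($data);

// Call $decoder on $payload $runs times and return the elapsed seconds.
function timeDecoder($decoder, $payload, $runs = 100)
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $decoder($payload);
    }
    return microtime(true) - $start;
}

$jsonSeconds = timeDecoder(function ($s) { return json_decode($s, true); }, $json);
$phpSeconds  = timeDecoder('unserialize', $php);

printf("json_decode: %.4fs\nunserialize: %.4fs\n", $jsonSeconds, $phpSeconds);
```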

About John Herren

John Herren is a developer and technical consultant with a focus on web applications. He currently serves as Director of Development for Primetime US, the company behind the hit movie and book The Secret. John was formerly staff writer and developer community evangelist for Zend Technologies. Along with founding neat experiments like TagCloud.com, John is an active member of the mashup community, working with API providers and speaking at conferences. He is a published author of Linux certification study material. John enjoys using open source software like PHP and Ruby on Rails to bend the web into exciting new chimeras of hyperlinked goodness.

24 responses to “Yahoo Pipes adds support for serialized PHP”

  • Danielle

    NIIIIIIIIICE one Johnny!! 🙂

  • jtbarker

    Nice BRO I love it, keep the tech news coming, I eat it up as soon as it comes out. You should check out my tech section. Let me know what you think. I would like for you to be in my contest.

  • Glen

    A great story, congrats to Yahoo! and the Pipes team for such a fast turnaround.

  • Mag

    My experience is different. I asked them for a simple feature: randomize a list. I want to be able to choose n random items from a list, not the first or last n items.

    More than a year later, still no such feature. It would be really useful, but maybe they think it breaks their ads or something… I dunno.

  • Cristian George Strat

    There are some security issues to consider when unserialize()-ing data from 3rd parties.

    For one thing, try unserializing `a:10000000000:{}`. It will either hit the PHP memory limit in an instant, crashing the script, or it will hang indefinitely until it crashes.
    Also, it is rather easy to produce a fatal error. Try unserializing a PDO object with `O:3:"PDO":0:{}` or an object of an unknown class.

    Second, a 3rd party could trigger any __wakeup() method or __autoload() mechanism you may have. This is not a security flaw per se but becomes one if, for instance, your __wakeup() methods use up considerable resources.
    Let’s say you do 3 database queries during Widget::__wakeup() for integrity checking. Your 3rd party could easily make you run __wakeup() for a considerable number of times.

    There’s also the issue of input validation. Say you wanna check that Yahoo! Pipes sends a certain kind of nested array every time. Try writing the validation code before actually doing anything with the data. Most probably, it will be almost the same as decoding the data from a neutral format like JSON, XML or YAML.

    serialize() and unserialize() are really nice and convenient sometimes. Depending on the nature of your application and on how much you’re willing to trust Yahoo! or any other 3rd party, this may be the way to go.

  • John

    Does this mean that you will revive TagCloud?

  • John Herren

    @Cristian: That’s an excellent summary of the security concerns with the serialization functions. Remember, safety first kids 🙂

  • fumiNET

    Finally! Thanks for kicking this into motion.

  • taylordavis.com » Blog Archive » links for 2008-04-04

    […] Yahoo Pipes adds support for serialized PHP « John Herren’s Blog (tags: mashups programming) […]

  • Yahoo! Cool thing of the Day » Blog Archive » More Pipe Faucets

    […] if you’re on a shared service that doesn’t have JSON, don’t feel like dealing with XML, or just want to have a rather nice speed boost out of the deal, it’s now easier than […]

  • tecosystems » links for 2008-04-08

    […] Yahoo Pipes adds support for serialized PHP « John Herren’s Blog good on Yahoo for their behavior here (tags: johnherren php pipes yahoo developers community) […]

  • Community News: New Yahoo! Pipes PHP serialized output renderer | Development Blog With Code Updates : Developercast.com

    […] Zend Developer Zone and by John Herren, Yahoo! has added a new feature to its Pipes functionality – serialized PHP results. Until now JSON output has been the only way to obtain all the data flowing through a Pipe. […]

  • Cory Comer’s Personal Blog » Blog Archive » Yahoo Pipes

    […] I was browsing through a few blogs yesterday and I came across a post by John Herren about Yahoo […]

  • Internet Alchemy » links for 2008-05-21

    […] Yahoo Pipes adds support for serialized PHP « John Herren’s Blog A good comment on the pitfalls of unserialising PHP from a web service (tags: php security) […]

  • Pipes Blog » Blog Archive » Pipes badges in the wild and cool blog posts

    […] special thanks to John Herren for an awesome post on our newly added support for serialized php output. In his post he shows how to use Pipes […]

  • Niks

    hii, i like the article you wrote. i also wrote an article on Serialization here : http://kaniks.blogspot.com
    feel free to post your comments

    thanks
    cheers

  • CELLBAN

    I’ve been using Yahoo Pipes with WordPress syndication. You can actually take a mashup of several different feeds from Yahoo Pipes and have them post to your site. You can even have it randomize the RSS so the feeds get mixed in with each other.

  • CELLBAN

    These are the instructions for setting up an automatic RSS post to your self-hosted WordPress blog (see wordpress.org for details).
    You need to get the FeedWordPress plugin from wordpress.org…
    Run your RSS feed through Yahoo Pipes and get the URL.
    In your WordPress admin, go to the syndication settings. In the “add new source feed” box, paste the Pipes URL and press the syndicate button. Now you will see the feed in your list at the bottom of the page. Click on the edit button below the feed. Choose the settings for automatic update, select the category to post to, save it, and you’re done. You will now get posts from Yahoo Pipes and they will appear in your blog.

  • 網站製作學習誌 » [Web] 連結分享

    […] Yahoo Pipes adds support for serialized PHP […]

  • Luise Lee

    I tried the above, and I can’t get it to work for me…

    Your code, typed exactly as you typed above, works great as a test php page on my server. But any time I change the URLs to my pipe, it doesn’t work!

    What exactly do I put in the first two lines?

    I tried :
    $pipeURL = 'http://pipes.yahoo.com/pipes/pipe.run?_id=406d1664961fc2cce8f2d324fd4497a8&_render=php&rssurl=';
    $feedURL = 'http://pipes.yahoo.com/pipes/pipe.run?_id=406d1664961fc2cce8f2d324fd4497a8&_render=rss';

    But I get the following error message:
    Warning: Invalid argument supplied for foreach() in ../testing.php on line 16

    (line 16 was this:
    foreach ($item['tags'] as $itemTags){
    )

    I don’t really understand the technical aspect of this, I just want to get my feeds brought in correctly. Can anyone help? What have I done wrong?

  • Luise Lee

    I posted a comment to this, which is being held for moderation (probably because of the links I included). Hopefully someone can retrieve it, and answer my question 🙂

  • John Herren

    Luise, $pipeURL is the address of the Yahoo pipe I created. You don’t want to change that one. $feedURL is the address of the feed you want to analyze. In my example, I’m using Yahoo’s Top Stories feed. You can replace that with any address you like. You’ll definitely want to add some error checking to this code.. it’s just for demonstration purposes. Good luck!

    • Luise Lee

      Now you’ve totally confused me. If I’ve created my own Yahoo Pipe, then why wouldn’t I use my pipe in the $pipeURL feed? Maybe I’m misunderstanding the purpose of your code. I thought that it was to retrieve the data from the pipe so that I can manipulate it, include it in a web page and style it the way I want and make it look pretty. Is this not its purpose? I’m trying to retrieve the title, the author, the pubDate, the description, etc. from the data that my Yahoo pipe spits out.

  • Luise Lee

    FYI – my pipe is at: http://pipes.yahoo.com/bridgeblogging/mainfeed. (See http://www.bridgeblogging.com for its intended purpose).

    I want to try to include this pipe in my web page using php. I know how to include it with Javascript, but I want to make use of SEO and include it in the page itself. It needs to be fast too, as it will be on the main page of the website.
