John Herren’s Blog

Entries categorized as ‘mashups’

Yahoo Pipes adds support for serialized PHP

April 3, 2008 · 15 Comments

A few days ago I sent an email to Chad Dickerson, who I’ve met at Yahoo! and had a chance to hang out with at Mashup Camp in Dublin.

Chad,

From what I can tell, if you create a Pipe and add additional fields (Shortcuts, Term Extraction), the only way to get to them in an API-like way is to use the JSON renderer. The RSS renderer removes those extra fields to follow the RSS spec. PHP supports JSON decoding, but you need a PEAR library or a quite recent version of PHP. If Yahoo supported serialized PHP with Pipes like you do with the other common APIs, it would be a lot easier for folks on shared hosting to work with Pipe data on the server side. I imagine with the new badge stuff you released that there’s a push to keep things client side, but there’s a huge advantage to rendering server-side to keep things nice and spiderable.

Short Version:

Expose Pipe results as serialized PHP. Pretty please.

Chad sent this along to the Pipes team, and less than three days later:
Pipes Blog » Blog Archive » New Yahoo Pipes PHP serialized output renderer

kick.
ass.

John Herren and Chad Dickerson

Two points to be made: first, I’m damn impressed that one of the largest sites on the ‘net would roll out a feature request from an outside developer in less than three days. Second, developers should never resist the urge to ask for help from an API provider. If a company is taking the time to support an API, chances are very good that they will listen to developers and react. I can personally say I’ve gotten immediate results from Technorati, Dapper, and now Yahoo! So blow off the idea that a big website would never listen to little ol’ developer you. With that negative attitude it’s guaranteed you’ll never get it. Ask, believe, receive, right?

So props to Chad, Jonathan Trevor, Paul Donnelly, and the rest of the Pipes team!

The Details

I’m a big fan of Yahoo Pipes. It’s an incredibly useful tool for putting together quick aggregators and filters for mashups. To integrate a Pipe on a webpage, you have a few options. You can go the cut and paste route and use a Badge, which works client side, or you can roll your own code to integrate a Pipe.

Put this in your pipe..

After you run a Pipe, you’re given a list of output formats. Copy the link location of these to get the URL of the output and tweak the parameters.

Until yesterday, the output formats useful for mashups were JSON and RSS. JSON is great for client side mashups, but as you know, search engines will not index client side content, so you lose any SEO love you might get. RSS is easy to consume server side, but Pipes will normalize the output to conform to the RSS spec. That means if you’re adding term extraction or Shortcuts or any other metadata to your Pipe, you’ll lose it with RSS output unless you put that data into one of the RSS fields (title, description, etc.). So that leaves us with hacking JSON on the server side. The JSON output format retains all that sweet metadata. In PHP, the best options are a JSON PEAR module or, if you’re rocking PHP 5.2 or above, the handy json_decode() function.
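For reference, the json_decode() route looks roughly like this. It's a minimal sketch: the sample payload is made up for illustration and only mimics the value/items nesting of Pipe output, so a real response will carry more fields.

```php
<?php
// Hypothetical sample payload, shaped like the value/items nesting of Pipe output.
$json = '{"value":{"items":[{"title":"First post","tags":[{"content":"php"},{"content":"pipes"}]}]}}';

// Passing true as the second argument returns associative arrays instead of objects.
$response = json_decode($json, true);

$titles = array();
if (is_array($response)) {
    foreach ($response['value']['items'] as $item) {
        $titles[] = $item['title'];
    }
}
print_r($titles);
```

json_decode() returns null when the input can't be parsed, which is why the loop is guarded with is_array().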

Now that Yahoo supports serialized PHP, using Pipe output just got a lot easier. I made a Pipe to add Term Extraction info from any RSS feed. Basically what we’re doing is automatically tagging all the posts in the feed. To retrieve the tags in your own script, all it takes is:

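<?php

// Pipe that adds Term Extraction tags to any RSS feed; _render=php asks for serialized PHP.
$pipeURL = 'http://pipes.yahoo.com/pipes/pipe.run?_id=Zli1l6UB3RG_l7ZvX0sBXw&_render=php&rssurl=';
$feedURL = 'http://rss.news.yahoo.com/rss/topstories';

$tags = array();
// Fetch the Pipe output and turn it back into a PHP array.
$response = unserialize(file_get_contents($pipeURL.rawurlencode($feedURL)));
foreach ($response['value']['items'] as $item) {
    // Collect the content of every tag attached to this item.
    foreach ($item['tags'] as $itemTags) {
        $tags[] = $itemTags['content'];
    }
}
var_dump($tags);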

At this point $tags is an array of all of the terms from the feed. Now what could be done with that data?
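One small answer, as a sketch: count which terms show up most often across the feed. The $tags values here are a made-up stand-in for what the Pipe returns, and the cutoff of three is arbitrary.

```php
<?php
// Hypothetical stand-in for the $tags array built from the Pipe output.
$tags = array('election', 'economy', 'election', 'weather', 'economy', 'election');

// Count how often each term appears, then sort by count, highest first.
$counts = array_count_values($tags);
arsort($counts);

// Keep the three most frequent terms; the array keys are the terms themselves.
$top = array_slice($counts, 0, 3, true);
print_r($top);
```

That's enough to drive a simple tag cloud or a "hot topics" list rendered server side.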

Serialized PHP or JSON?

If you have json_decode() available in your PHP install, is there any advantage to using JSON over serialized PHP? Let’s find out.

File Size

Saving the output directly to disk gave me

JSON - 51192 bytes
Serialized PHP - 56885 bytes

Because of its syntax and PHP’s type specification, serialized PHP is about 11% larger than JSON. The gap in bytes will grow as the number of elements in your output increases.
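The 11% figure is just (56885 - 51192) / 51192. If you want to try the same comparison on your own data, something like this should do it. It's a sketch only: the sample array is invented and much simpler than real Pipe output, so the percentage it prints won't match mine.

```php
<?php
// Hypothetical sample data, shaped loosely like a list of feed items.
$data = array('items' => array());
for ($i = 0; $i < 100; $i++) {
    $data['items'][] = array('title' => "Item $i", 'tags' => array('php', 'pipes'));
}

// Encode the same structure both ways and compare the byte counts.
$jsonBytes = strlen(json_encode($data));
$phpBytes  = strlen(serialize($data));

printf("JSON: %d bytes\nSerialized PHP: %d bytes\n", $jsonBytes, $phpBytes);
printf("Serialized PHP is %.1f%% larger\n", ($phpBytes - $jsonBytes) / $jsonBytes * 100);
```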

Decoding Speed

How long does it take to slurp these formats into PHP variables? My tests decode each format 100 times.

JSON
real    0m0.269s
user    0m0.264s
sys     0m0.004s

Serialized PHP
real    0m0.088s
user    0m0.088s
sys     0m0.000s

It’s clear that unwinding serialized PHP is faster than JSON, so it’s a better choice performance-wise despite being slightly bigger over the wire.
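If you'd like to rerun this yourself, here's a rough in-process version. It is not the script I used (my numbers came from the shell's time command on saved Pipe output); the sample data and the timeDecode() helper are made up for illustration, and results on a different PHP version may not match the figures above.

```php
<?php
// Hypothetical sample data; the original test used saved Pipe output instead.
$data = array('items' => array());
for ($i = 0; $i < 500; $i++) {
    $data['items'][] = array('title' => "Item $i", 'tags' => array('php', 'pipes'));
}
$json = json_encode($data);
$php  = serialize($data);

// Run a decoding function over the payload repeatedly and return elapsed seconds.
function timeDecode($decode, $payload, $iterations = 100)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($decode, $payload);
    }
    return microtime(true) - $start;
}

$jsonSeconds = timeDecode('json_decode', $json);
$phpSeconds  = timeDecode('unserialize', $php);
printf("json_decode: %.4fs\nunserialize: %.4fs\n", $jsonSeconds, $phpSeconds);
```

Run it a few times before trusting the numbers, since a single pass of 100 iterations is noisy.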

Categories: PHP · mashups

MashupCamp Venice 2009

February 5, 2008 · 4 Comments

VeniceMashup - MashupCamp

Hot off the presses, and just a “concept,” but still…

Categories: mashups

Mashup the Mefi

January 22, 2008 · No Comments

Put this on your list of data to be mashed. The oh-so popular group blog Metafilter has released a data dump of site content. Andy Baio, as usual, is already off to a good start.

Metafilter Infodump

Categories: mashups

Spam me with your political blogs

January 8, 2008 · 2 Comments

I’ve got an idea for a new mashup, but I need some help. I usually get my Politik fix on Reddit, so I’m looking for a list of political blogs. General criteria:

  1. Should be original content, not aggregators
  2. RSS/Atom required, with full posts
  3. popular is better
  4. opinionated is better
  5. Group ‘em by left wing/right wing/centrist (does that exist?) if possible

Categories: mashups

Add functions to your mashups with Utility Mill

January 4, 2008 · 3 Comments

Utility Mill - Makes Utilities

Utility Mill looks like a kick ass hosted service for mashup development. It lets you create hosted web services that run your custom Python code, handling input and output in an easy way. This service fills a hole in the “mashup stack” by adding custom scripting functionality without the need for a server side scripting language. So, if you need to perform any kind of filtering or munging of your mashed up data, and you aren’t exactly a JavaScript guru, you can still keep everything client side by passing your data through Utility Mill first.

Other really nice features include commentary, revision control, and mandatory GPL licensing. It’s also not a bad way to pick up some Python knowledge–especially when you see contributors named “guido.” Here are a couple of fun/interesting/useful ones I found:

and of course:

  • MetaUtility Enter some Python. Run it. See the output.

I was able to register on the site and write a silly little UUID Generator in about three minutes :)

I’d really like to see this done for PHP or Ruby too.

Categories: mashups

YDN Theater: MashUp Camp — Dublin

December 2, 2007 · No Comments

Chad Dickerson and Tom Hughes-Croucher hit the streets in Dublin to ask, “what’s a mashup?”

Categories: mashups

Mashup University. I talk about AOL’s XDrive

November 11, 2007 · 5 Comments

One of the AOL guys couldn’t make it to Dublin, so Dave Berlind asked me to fill in. I took a different angle with this talk, and did it without any speaking. It was risky, but I think the reaction was great. This was probably the funnest ‘talk’ I think I’ve done.

I do want to add that the 128 bit part was shamelessly lifted from this awesome article about GUIDs.

Categories: mashups

Intro to Mashups - Mashup Camp 5 / Mashup University - Dublin

November 11, 2007 · 1 Comment

Here are my slides for my intro talk:

Categories: mashups

I made it to Dublin

November 9, 2007 · No Comments

Not a bad trip. The guy next to me in the plane managed to spill a full glass of water all down my right leg while I was sound asleep. I turned the air jet on full blast and after 30 very uncomfortable minutes everything was fine. However, let’s coin a new phrase:

Aeroaquacadophobia

And I now suffer from it.

After a delightful cab ride to the Morrison Hotel (no relation to the album), I have a room and an Internet connection, Skype enabled!

My room at the Morrison Hotel

My view from the hotel

Categories: mashups

Mashup Camp 5: Dublin!

November 7, 2007 · 2 Comments

 Mashup Camp 5

It’s on, bitches.

If I make it across the ocean, I’ll be 5 for 5 in Mashup Camp attendance!

For the third ‘Camp in a row, I’ll be teaching the Intro To Mashups talk, the very first session of Mashup University. It’s basically an ice-breaker talk that gets everyone excited and inquisitive about what we’ll be doing for the next three days.

I’m looking forward to seeing the friends I’ve made from earlier camps as well as meeting new folks. The challenge this ‘Camp will definitely be maintaining the balance between hacking on mashups during speedgeeking and a stable blood alcohol level. After all, this unconference will be hosted at the Guinness Storehouse!

“There can be no tradition without innovation.”
- Earle Hitchner, Irish music journalist

Categories: mashups