Author Archives: John Herren

About John Herren

John Herren is a developer and technical consultant with a focus on web applications. He currently serves as Director of Development for Primetime US, the company behind the hit movie and book The Secret. John was formerly a staff writer and developer community evangelist for Zend Technologies. Along with founding neat experiments like TagCloud.com, John is an active member of the mashup community, working with API providers and speaking at conferences. He is a published author of Linux certification study material. John enjoys using open source software like PHP and Ruby on Rails to bend the web into exciting new chimeras of hyperlinked goodness.

Google Adsense for Feeds

Looks like Google is now supporting Adsense ads for RSS feeds.

I remember the good ol’ days of Adsense. Now it’s about as ubiquitous as hit counters were in the 90s. Still, I sure don’t mind seeing those Google checks in the mail, although they are fewer and further between these days.

The lack of Adsense support for feeds has been a pain point for bloggers and sploggers alike. Several startups exist solely to plug that hole. I guess we should get ready to start seeing ads in our feeds as often as they appear on sites.

One more reason to love the keyboard shortcuts in Google Reader.

The announcement (if you must):

Inside AdSense: I feel the need… the need for feeds


Getting the number of posts per category in WordPress

This morning JTk presented me with a WordPress riddle:

What I would like to be able to do is ask the MySQL if a category has any posts in it or if it is empty.  So I can do one thing if there are posts there and another if that category is empty.  And try as I might, no matter how hard I beat my head against it, all I get is a bruised head.  And, yes, I have to do this conditional because WP is flawed and will break if I try and do certain things with empty categories…..

Here is the schema ( maybe it helps )
http://blog.kapish.co.in/wp-content/uploads/2008/01/wp_db.png

And the worst part of this, is – I know this is trivial so I hate to even ask, but I guess I don’t hate enough not to ask :)  I asked the “community” and got told that there is not a WP function for this ( um, yeah, thanks…. ) that I shouldn’t make straight db calls ( arrrggg ), and that there are plugins that accomplish similar things….  But no real help, so I decided to ask a ninja….

So we want to get the number of posts per category. JTk, you’re in luck! A quick look at the schema (if that diagram is current) tells me we could query the wp_post2cat table and do a COUNT and a GROUP BY query on the category of interest. Even better, if wp_categories.category_count is what I think it is, that’s a simpler query. But let’s see what the code gives us already. In category.php, there are some tasty sounding functions like get_categories(), get_category(), get_category_by_path(), get_category_by_slug() and so on. I installed a clean version of WordPress, added some categories and assigned some fake posts to them. Then I tried some of these functions to see what’s up.

<?php var_dump(get_categories()); ?>

Output:

array(3) {
  [1]=>
  object(stdClass)#68 (15) {
    ["term_id"]=>
    &string(1) "3"
    ["name"]=>
    &string(14) "First Category"
    ["slug"]=>
    &string(14) "first-category"
    ["term_group"]=>
    string(1) "0"
    ["term_taxonomy_id"]=>
    string(1) "3"
    ["taxonomy"]=>
    string(8) "category"
    ["description"]=>
    &string(0) ""
    ["parent"]=>
    &string(1) "0"
    ["count"]=>
    &string(1) "4"
    ["cat_ID"]=>
    &string(1) "3"
    ["category_count"]=>
    &string(1) "4"
    ["category_description"]=>
    &string(0) ""
    ["cat_name"]=>
    &string(14) "First Category"
    ["category_nicename"]=>
    &string(14) "first-category"
    ["category_parent"]=>
    &string(1) "0"
  }
  [2]=>
  object(stdClass)#69 (15) {
    ["term_id"]=>
    &string(1) "4"
    ["name"]=>
    &string(15) "Second Category"
    ["slug"]=>
    &string(15) "second-category"
    ["term_group"]=>
    string(1) "0"
    ["term_taxonomy_id"]=>
    string(1) "4"
    ["taxonomy"]=>
    string(8) "category"
    ["description"]=>
    &string(0) ""
    ["parent"]=>
    &string(1) "0"
    ["count"]=>
    &string(1) "2"
    ["cat_ID"]=>
    &string(1) "4"
    ["category_count"]=>
    &string(1) "2"
    ["category_description"]=>
    &string(0) ""
    ["cat_name"]=>
    &string(15) "Second Category"
    ["category_nicename"]=>
    &string(15) "second-category"
    ["category_parent"]=>
    &string(1) "0"
  }
  [3]=>
  object(stdClass)#90 (15) {
    ["term_id"]=>
    &string(1) "1"
    ["name"]=>
    &string(13) "Uncategorized"
    ["slug"]=>
    &string(13) "uncategorized"
    ["term_group"]=>
    string(1) "0"
    ["term_taxonomy_id"]=>
    string(1) "1"
    ["taxonomy"]=>
    string(8) "category"
    ["description"]=>
    &string(0) ""
    ["parent"]=>
    &string(1) "0"
    ["count"]=>
    &string(1) "1"
    ["cat_ID"]=>
    &string(1) "1"
    ["category_count"]=>
    &string(1) "1"
    ["category_description"]=>
    &string(0) ""
    ["cat_name"]=>
    &string(13) "Uncategorized"
    ["category_nicename"]=>
    &string(13) "uncategorized"
    ["category_parent"]=>
    &string(1) "0"
  }
}

OK, so it sounds like the person who told JTk there were no built-in WordPress functions for this was lying :)  This function tells us exactly what we need. A couple of interesting observations, though. I had a category named “Empty Category” with no posts, which does not show up here. The documentation page tells me that I can pass a parameter to include the empty ones easily enough:

<?php var_dump(get_categories(array('hide_empty'=>false))); ?>

array(4) {
  [0]=>
  object(stdClass)#67 (15) {
    ["term_id"]=>
    &string(1) "5"
    ["name"]=>
    &string(14) "Empty Category"
    ["slug"]=>
    &string(14) "empty-category"
    ["term_group"]=>
    string(1) "0"
    ["term_taxonomy_id"]=>
    string(1) "5"
    ["taxonomy"]=>
    string(8) "category"
    ["description"]=>
    &string(0) ""
    ["parent"]=>
    &string(1) "0"
    ["count"]=>
    &string(1) "0"
    ["cat_ID"]=>
    &string(1) "5"
    ["category_count"]=>
    &string(1) "0"
    ["category_description"]=>
    &string(0) ""
    ["cat_name"]=>
    &string(14) "Empty Category"
    ["category_nicename"]=>
    &string(14) "empty-category"
    ["category_parent"]=>
    &string(1) "0"
  }
  [1]=>
  object(stdClass)#68 (15) {
    ["term_id"]=>
    &string(1) "3"
    ["name"]=>
    &string(14) "First Category"
    ["slug"]=>
    &string(14) "first-category"
    ["term_group"]=>
    string(1) "0"
    ["term_taxonomy_id"]=>
    string(1) "3"
    ["taxonomy"]=>
    string(8) "category"
    ["description"]=>
    &string(0) ""
    ["parent"]=>
    &string(1) "0"
    ["count"]=>
    &string(1) "4"
    ["cat_ID"]=>
    &string(1) "3"
    ["category_count"]=>
    &string(1) "4"
    ["category_description"]=>
    &string(0) ""
    ["cat_name"]=>
    &string(14) "First Category"
    ["category_nicename"]=>
    &string(14) "first-category"
    ["category_parent"]=>
    &string(1) "0"
  }
  [2]=>
  object(stdClass)#69 (15) {
    ["term_id"]=>
    &string(1) "4"
    ["name"]=>
    &string(15) "Second Category"
    ["slug"]=>
    &string(15) "second-category"
    ["term_group"]=>
    string(1) "0"
    ["term_taxonomy_id"]=>
    string(1) "4"
    ["taxonomy"]=>
    string(8) "category"
    ["description"]=>
    &string(0) ""
    ["parent"]=>
    &string(1) "0"
    ["count"]=>
    &string(1) "2"
    ["cat_ID"]=>
    &string(1) "4"
    ["category_count"]=>
    &string(1) "2"
    ["category_description"]=>
    &string(0) ""
    ["cat_name"]=>
    &string(15) "Second Category"
    ["category_nicename"]=>
    &string(15) "second-category"
    ["category_parent"]=>
    &string(1) "0"
  }
  [3]=>
  object(stdClass)#90 (15) {
    ["term_id"]=>
    &string(1) "1"
    ["name"]=>
    &string(13) "Uncategorized"
    ["slug"]=>
    &string(13) "uncategorized"
    ["term_group"]=>
    string(1) "0"
    ["term_taxonomy_id"]=>
    string(1) "1"
    ["taxonomy"]=>
    string(8) "category"
    ["description"]=>
    &string(0) ""
    ["parent"]=>
    &string(1) "0"
    ["count"]=>
    &string(1) "1"
    ["cat_ID"]=>
    &string(1) "1"
    ["category_count"]=>
    &string(1) "1"
    ["category_description"]=>
    &string(0) ""
    ["cat_name"]=>
    &string(13) "Uncategorized"
    ["category_nicename"]=>
    &string(13) "uncategorized"
    ["category_parent"]=>
    &string(1) "0"
  }
}

Sure enough, there is the empty category. I also noticed that the post count appears in both ->count and ->category_count. I’ll just assume the cached version, ->category_count, stays correct throughout the code, so I’m guessing it doesn’t matter which member we use.

So, that’s fine if we want the whole collection, but what about individual categories? Fortunately, we can extract this same data using either the category ID or the name of the category. Here’s an example of each, querying my “First Category” category:

Using ->category_count

<?php var_dump( (int) get_category('3')->category_count); ?>

<?php var_dump( (int) get_category_by_slug('First Category')->category_count); ?>

<?php var_dump( (int) get_category_by_slug('first-category')->category_count); ?>

Using ->count

<?php var_dump( (int) get_category('3')->count); ?>

<?php var_dump( (int) get_category_by_slug('First Category')->count); ?>

<?php var_dump( (int) get_category_by_slug('first-category')->count); ?>

All of these return the correct result, int(4). Good on ya, WordPress.

Now let’s see how robust the function is. Will it choke on a non-existent category, or return a zero like it should?

<?php var_dump( (int) get_category('6969')->category_count); ?>

<?php var_dump( (int) get_category_by_slug('This Damn Category')->category_count); ?>

Both of these do indeed return zero, which is nice and correct.
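The casts above rely on PHP quietly turning a missing property into zero. Depending on the PHP version and error settings, reading ->category_count from a category that doesn’t exist can also raise a notice or warning. Here is a minimal sketch of a more defensive wrapper. Note that category_post_count() is a hypothetical helper, not a WordPress function, and the stdClass object is only a stand-in for what get_category() returned in the dumps above.

```php
<?php
// Hypothetical helper (not part of WordPress): returns the post count for
// whatever get_category() or get_category_by_slug() handed back.
function category_post_count($category)
{
	// A missing category may come back as null, false or an error object,
	// none of which carry a usable ->count member.
	if (!is_object($category) || !isset($category->count)) {
		return 0;
	}
	return (int) $category->count;
}

// Stand-in for a real category object, mirroring the dump above.
$first = new stdClass();
$first->count = '4';

var_dump(category_post_count($first)); // int(4)
var_dump(category_post_count(null));   // int(0)
var_dump(category_post_count(false));  // int(0)
```

Inside WordPress you would call it as category_post_count(get_category(3)) or category_post_count(get_category_by_slug('first-category')).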

In practical terms, you can use this code stub to do something with all categories depending on whether or not there are posts:

foreach (get_categories(array('hide_empty'=>false)) as $category){
	if ($category->count > 0){
		//has posts, do something
	}else{
		//no posts, do something else
	}
}

That’s all there is to it!  There’s definitely a built-in function to inspect categories, and even some nice examples to go with it. Happy WordPress hacking, and don’t forget that now’s the time to upgrade to PHP 5 if you haven’t already :)


The beginning of the end of PHP’s popularity?

“there will be no more PHP 4 releases, regardless of whether there are security issues found in PHP 4.”

Whoa. Is PHP shooting itself in the foot here?

Despite the recent hating on and negativity toward PHP as a programming language, there’s no debate that PHP still rules as the most widely deployed scripting language for the web. If you separate PHP developers from those just running PHP applications, that gap widens even further. Ease of deployment for PHP apps plays a huge role in its popularity versus other languages like Python and Ruby, despite the fact there are very capable web frameworks for both. It doesn’t get much easier than purchasing a $5/mo. shared hosting account, slapping WordPress on it and having a website. But could today’s last release of PHP version 4 bring about the beginning of the end of the language’s widespread popularity as a clear leader?

Today Derick Rethans, the release manager for PHP, announced version 4.4.9, which is to be the last release of version 4:

“Now, more than 3 years after the last major PHP 4 release, it is time to die down. With hardly any support for OO, sub-standard XML support and generally lots of other suckyness as well, it’s time to focus on the future: PHP 6. So please die PHP 4 – and quickly. Today, August 7th, 2008 is the last release of PHP 4 – PHP 4.4.9. After today there will be no more PHP 4 releases, regardless of whether there are security issues found in PHP 4. It’s time to upgrade now.” [link]

Now, there are reasons that popular software for PHP is to this very day still written for PHP 4. A large number of hosting providers have not made the jump to version 5. While both versions can be run simultaneously, the solution is not optimal, and a bit of a hack. A little over a year ago, Matt Mullenweg of WordPress stated his reason for not abandoning version 4. I personally disagree with his apparent “if it ain’t broke don’t fix it” attitude:

“WordPress works just as well with PHP 5 as 4, and there are no features on the roadmap (including ones on your list) that would require PHP 5. The only reason for us to break PHP 4 compatibility would be political, and our users without the ability to upgrade their server would be the ones who lose. WordPress doesn’t make PHP 4 interesting or not, it’s agnostic.” [link]

I certainly agree that Matt should have his user base in mind, and that yes, WordPress as an application is not novel because of the language it is written in, but to say there is no advantage to using the PHP 5 features in the development is a bit dubious.

But now there’s a new dilemma facing both the hosting providers and application developers such as WordPress; namely, the “vendor” of PHP 4 is closing shop. As far as official PHP releases are concerned, there will be no bug fixes, security updates, feature additions, or optimizations. What will happen if a serious security hole is found? I’m not preaching FUD, but being realistic based on history. At what point will hosting providers have to upgrade? Will the application developers follow suit and take advantage of the PHP 5 features and benefits? More importantly, will the ease of use of frameworks like Rails and Django bring about the “killer apps” for the world’s blogging, e-commerce, mailing list management, classified ads, and other applications that put PHP where it is today? Is Typo indeed the next WordPress?

It’s no secret I’m a fan of PHP, warts and all. PHP will do fabulous things for you as the glue language that it is. There are plenty of times when I don’t need a full-stack MVC that would just be wasteful. I wholeheartedly agree that Rails is a joy to develop with, even if it’s a pain to deploy, but I would challenge anyone to a pure Ruby vs. pure PHP hackathon to prove you can write “cowboy code” in any language.

As for my predictions for PHP’s popularity ranking, I’m really not worried. The upgrade is not bad; building and configuring the software follows the same conventions across versions. Porting applications, if they need it at all, is a modest task. For PHP to lose its foothold among hosting providers, better app solutions from other languages would have to exist, become wildly popular, and have good community support for updates. Simultaneously, the frameworks and server libraries would have to be stable enough for hosting providers to support and provision. So really the choice is: modest upgrade to PHP 5, or complete overhaul to trendier languages and frameworks? Of course, we’re talking open source here, so I guess there’s always a possibility of a grassroots PHP 4 maintenance posse.


A couple new APIs

Work’s been busy, so I haven’t had a chance to do a lot of mashup stuff lately. Via Chad, here are some APIs I just found out about that could make for some interesting mashing:

NPR API – “over 250,000 stories that are grouped into more than 5,000 different aggregations.”

Crunchbase API


Quote from Mark O’Connor

“Knowledge is after all a non-rivalrous good” – Mark O’Connor


SPFCCSMFT

My childhood icons are disappearing. First Mr. Rogers, then Mr. Wizard, and now the Hippy Dippy Weatherman with all my hippy dippy weather, man.

Aunt Dee-Dee let me see Carlin on Campus at way too young an age. I think George Carlin’s objective way of looking at all the stupid things we do as a collective really shaped my cynicism as a kid.

But now he’s dead, expired, perished, met his death, met his end, passed away, been taken, yielded his breath, resigned his being, ended his days, ended his earthly career, breathed his last, ceased to be, departed this life, is no more, gone off, dropped off, popped off, lost his life, sunk into the grave, dropped dead, given up the ghost, paid the debt to nature, shuffled off this mortal coil, taken his last sleep, gone the way of all flesh, handed in his chips, joined the greater number, crossed the Stygian ferry, crossed the bar, gone to Davy Jones’s locker, gone to the wall, received his death warrant, made his will, stepped out, gone out like a candle, come to an untimely end, caught his death, gone off the hook, kicked the bucket, bought the farm, turned up his toes, pushin’ up daisies,

and is, of course, stuck on the roof.

Thanks, George, for being the crotchety bastard we all needed to keep us in check.


Top post on WordPress.com

You gotta be kiddin’ me.

wphome.jpg


Yahoo Pipes adds support for serialized PHP

A few days ago I sent an email to Chad Dickerson, who I’ve met at Yahoo! and had a chance to hang out with at Mashup Camp in Dublin.

Chad,

From what I can tell, if you create a Pipe and add additional fields (Shortcuts, Term Extraction), the only way to get to them in an API-like way is to use the JSON renderer. The RSS renderer removes those extra fields to follow the RSS spec. PHP supports JSON decoding, but you need a PEAR library or a quite recent version of PHP. If Yahoo supported serialized php with Pipes like you do with the other common API’s, it would be a lot easier for folks on shared hosting to work with Pipe data on the server side. I imagine with the new badge stuff you released that there’s a push to keep things client side, but there’s a huge advantage to rendering server-side to keep things nice and spiderable.

Short Version:

Expose Pipe results as serialized PHP. Pretty please.

Chad sends this along to the Pipes team, and less than three days later:
Pipes Blog » Blog Archive » New Yahoo Pipes PHP serialized output renderer

kick.
ass.

John Herren and Chad Dickerson

Two points to be made: first, I’m damn impressed that one of the largest sites on the ‘net would roll a feature request from an outside developer in less than three days. Second, developers should never resist the urge to ask for help from an API provider. If a company is taking the time to support an API, chances are very good that they will listen to developers and react. I can personally say I’ve gotten immediate results from Technorati, Dapper, and now Yahoo!. So blow off the idea that a big website would never listen to little ol’ developer you. With that negative attitude it’s guaranteed you’ll never get it. Ask, believe, receive, right?

So props to Chad, Jonathan Trevor, Paul Donnelly, and the rest of the Pipes team!

The Details

I’m a big fan of Yahoo Pipes. It’s an incredibly useful tool for putting together quick aggregators and filters for mashups. To integrate a Pipe on a webpage, you have a few options. You can go the cut and paste route and use a Badge, which works client side, or you can roll your own code to integrate a Pipe.

Put this in your pipe..

After you run a Pipe, you’re given a list of output formats. Copy the link location of these to get the URL of the output and tweak the parameters.

Until yesterday, the output formats useful for mashups were JSON and RSS. JSON is great for client side mashups, but as you know, search engines will not index client side content, so you lose any SEO love you might get. RSS is easy to consume server side, but Pipes will normalize the output to conform to the RSS spec. That means if you’re adding term extraction or Shortcuts or any other metadata to your Pipe, you’ll lose it with RSS output unless you put that data into one of the RSS fields (title, description, etc.). So that leaves us with hacking JSON on the server side. The JSON output format retains all that sweet metadata. In PHP, the best options are a JSON PEAR module or, if you’re rocking 5.2 and above, the handy json_decode() function.

Now that Yahoo supports serialized PHP, using Pipe output just got a lot easier. I made a Pipe to add Term Extraction info from any RSS feed. Basically, what we’re doing is automatically tagging all the posts in the feed. To retrieve the tags in your own script, all it takes is:

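<?php

// Pipe URL with the PHP renderer selected; the feed URL is appended as rssurl.
$pipeURL = 'http://pipes.yahoo.com/pipes/pipe.run?_id=Zli1l6UB3RG_l7ZvX0sBXw&_render=php&rssurl=';
$feedURL = 'http://rss.news.yahoo.com/rss/topstories';

$tags = array();
// Fetch the serialized PHP and turn it back into a nested array.
$response = unserialize(file_get_contents($pipeURL.rawurlencode($feedURL)));
foreach ($response['value']['items'] as $item) {
	// Collect every extracted term from every item.
	foreach ($item['tags'] as $itemTags) {
		$tags[] = $itemTags['content'];
	}
}
var_dump($tags);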

At this point $tags is an array of all of the terms from the feed. Now what could be done with that data?
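One possibility is counting how often each term shows up, which is the raw material for a tag cloud. The following is only a sketch: the terms in $tags are made up and stand in for whatever the Pipe actually returns.

```php
<?php
// Sample data standing in for the $tags array built above (made-up terms).
$tags = array('election', 'economy', 'election', 'weather', 'economy', 'election');

// Count how often each term appears, then sort the most frequent first.
$counts = array_count_values($tags);
arsort($counts);

foreach ($counts as $term => $count) {
	echo $term . ': ' . $count . "\n";
}
```

With this sample data the loop prints election: 3, economy: 2 and weather: 1, in that order.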

Serialized PHP or JSON?

If you have json_decode() available in your PHP install, is there any advantage to using JSON over serialized PHP? Let’s find out.

File Size

Saving the output directly to disk gave me

JSON – 51192 bytes
Serialized PHP – 56885 bytes

Because of syntax and PHP’s type specification, serialized PHP is about 11% larger than JSON here. The gap in bytes will grow as the number of elements in your output increases.
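To see where the extra bytes come from, here is a small sketch that encodes the same structure both ways. The $data array is made up and only loosely shaped like Pipe output, so its byte counts will not match the figures above.

```php
<?php
// Made-up data shaped loosely like the Pipe output used earlier.
$data = array(
	'value' => array(
		'items' => array(
			array(
				'title' => 'Example story',
				'tags'  => array(
					array('content' => 'example'),
					array('content' => 'story'),
				),
			),
		),
	),
);

$asPhp  = serialize($data);
$asJson = json_encode($data);

// serialize() writes a type and length prefix for every value,
// e.g. s:7:"example"; while JSON writes only the quoted string.
printf("serialized PHP: %d bytes\n", strlen($asPhp));
printf("JSON:           %d bytes\n", strlen($asJson));
```

Echoing $asPhp and $asJson side by side makes the per-value overhead easy to spot.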

Decoding Speed

How long does it take to slurp these formats into PHP variables? My tests decode each 100 times.

JSON
real    0m0.269s
user    0m0.264s
sys     0m0.004s

Serialized PHP
real    0m0.088s
user    0m0.088s
sys     0m0.000s

It’s clear that unwinding serialized PHP is faster than JSON, so it’s a better choice performance-wise despite being slightly bigger over the wire.
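If you want to repeat the comparison on your own install, a rough sketch follows. It is not the script behind the numbers above: the payload is made up, and it times the loops with microtime() inside the script rather than from the shell. Results will depend on your PHP version and data, so treat the output as a data point rather than a verdict.

```php
<?php
// Made-up payload: 500 items, each with a title and two tags.
$items = array();
for ($i = 0; $i < 500; $i++) {
	$items[] = array(
		'title' => 'Story ' . $i,
		'tags'  => array(array('content' => 'tag' . $i), array('content' => 'news')),
	);
}
$data   = array('value' => array('items' => $items));
$asPhp  = serialize($data);
$asJson = json_encode($data);

// Decode each format 100 times, as in the test described above.
$start = microtime(true);
for ($i = 0; $i < 100; $i++) {
	$fromPhp = unserialize($asPhp);
}
$phpSeconds = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < 100; $i++) {
	$fromJson = json_decode($asJson, true);
}
$jsonSeconds = microtime(true) - $start;

printf("unserialize(): %.4f s\n", $phpSeconds);
printf("json_decode(): %.4f s\n", $jsonSeconds);
```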


MashupCamp Venice 2009

VeniceMashup – MashupCamp

Hot off the presses, and just a “concept,” but still…


Halloween == Christmas

Because

Oct 31 == Dec 25

It’s an octal joke. John Lim over at PHP Everywhere had to chase a bug report because, in PHP, tossing a zero in front makes your number octal, the way 0x makes it hexadecimal.

Octalpussy | PHP Everywhere
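For anyone who wants to check the joke, here is a quick sketch of how PHP reads those literals:

```php
<?php
// A leading zero marks an octal literal; 0x marks a hexadecimal one.
var_dump(031 === 25);   // bool(true): 3 * 8 + 1 = 25
var_dump(0x19 === 25);  // bool(true): 1 * 16 + 9 = 25
var_dump(010 + 1);      // int(9), not int(11)

// octdec() converts a string of octal digits to an integer.
var_dump(octdec('31')); // int(25)
```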

Octal notation can be useful for things like filemasks, but I can’t think of many other practical (non-bit fiddling) uses, so school me on it please.

A common PHP mistake is forgetting the chmod() function wants an octal number as the file mode. I wrote about this and some other fun screwups when I worked for Zend in an article called PHP Gotchas, which could really use a part two..
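A short sketch of that chmod() gotcha, using a scratch file (the 'oct' filename prefix is arbitrary). Passing 644 without the leading zero hands chmod() the decimal number 644, which is 1204 in octal and not the mode you meant.

```php
<?php
// 0644 (octal) and 644 (decimal) are different numbers.
var_dump(0644);        // int(420)
var_dump(decoct(644)); // string(4) "1204"

// Create a scratch file to try it on.
$file = tempnam(sys_get_temp_dir(), 'oct');

chmod($file, 0644);    // the intended rw-r--r--
clearstatcache();
// Mask off the file-type bits and print the resulting mode in octal.
echo decoct(fileperms($file) & 07777), "\n";

unlink($file);
```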

