Overhauling roles and capabilities, part 2

This is a follow-up to my initial overview of the roles and capabilities system in WordPress. I’d check out Part I first.

I’ve previously explained the complexities of the roles and capabilities system, but I haven’t adequately argued why they are too complex.

Numerous developers have weighed in at Trac ticket #10201, but I want to touch on this anyway. More or less, the current system is fine if you want to know what the current user can do. We load up their roles and merge the capabilities that make up those roles, add any other capabilities the user was granted, and remove any revoked capabilities. Then, we check if the capability we’re checking is in that list of capabilities.
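To make that check concrete, here is a minimal sketch in Python. It is a model of the logic only, not WordPress's actual PHP code. The `ROLES` table is trimmed to a few capabilities, and the names `effective_caps` and `user_can` are made up for illustration.

```python
# Illustrative role definitions; real roles carry many more capabilities.
ROLES = {
    "editor": {"edit_posts", "moderate_comments", "unfiltered_html"},
    "author": {"edit_posts"},
}


def effective_caps(roles, granted=(), revoked=()):
    """Merge the caps of every role, add per-user grants, then drop revoked caps."""
    caps = set()
    for role in roles:
        caps |= ROLES.get(role, set())
    caps |= set(granted)
    caps -= set(revoked)
    return caps


def user_can(user, cap):
    """Check one capability against the user's derived capability list."""
    derived = effective_caps(
        user["roles"], user.get("granted", ()), user.get("revoked", ())
    )
    return cap in derived


# Hypothetical user: an Editor granted activate_plugins, stripped of unfiltered_html.
user = {
    "roles": ["editor"],
    "granted": ["activate_plugins"],
    "revoked": ["unfiltered_html"],
}
```

For a single user this is cheap: a handful of set operations and one membership test.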

The problem is apparent when you want to know the answer to this question: How many users have capability X? Querying for abstract capability X strikes some as an edge case at first, so a better question might be: how many users should be considered at least an author?

To answer this question, we have to load up each user individually and build their derived capability list, by loading up their roles, add-on capabilities, and revoked capabilities. It should be fairly apparent that on a site with many users, the capabilities system can be a performance issue, to say the least.

Let’s remove user-level capabilities from the equation. Now, we can load up all of the roles, figure out which have capability X, and figure out which users have those roles. Clearly a huge performance benefit, as we’re running one query (to fetch the role definitions), unserializing its result, looping through the roles and checking for a cap, then querying the users for that role.
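A rough Python sketch of that role-first lookup, assuming one role per user and no user-level capabilities. The role table is abbreviated, the user IDs are invented, and `publish_posts` stands in for "at least an author"; none of this is WordPress's actual code.

```python
# Abbreviated, illustrative role definitions.
ROLES = {
    "administrator": {"manage_options", "moderate_comments", "publish_posts", "edit_posts"},
    "editor": {"moderate_comments", "publish_posts", "edit_posts"},
    "author": {"publish_posts", "edit_posts"},
    "contributor": {"edit_posts"},
    "subscriber": set(),
}

# Hypothetical users, each with exactly one role.
USER_ROLE = {
    1: "administrator",
    2: "editor",
    3: "author",
    4: "contributor",
    5: "subscriber",
    6: "author",
}


def roles_with_cap(cap):
    """Step 1: scan the role definitions once for the capability."""
    return {name for name, caps in ROLES.items() if cap in caps}


def users_with_cap(cap):
    """Step 2: select users by role; no per-user capability list is built."""
    wanted = roles_with_cap(cap)
    return sorted(uid for uid, role in USER_ROLE.items() if role in wanted)
```

The work in step 1 depends on the number of roles, not the number of users, which is where the saving comes from.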

If only it were that easy. We store a user’s roles and capabilities in a serialized array in usermeta. That means we have to again leave SQL with every user’s usermeta value, unserialize it, and check if the user has the role. Core does employ a hack in a few spots for performance reasons, by using LIKE "%editor%" against serialized data. This very unpoetic code is the epitome of the overly complex capabilities system.
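To see why matching against serialized data is fragile, consider this small Python sketch. The stored strings are written in the shape PHP's serialize() produces for a one-element array, the "copyeditor" role is hypothetical, and `like_contains` is only a rough stand-in for a LIKE pattern with wildcards on both sides.

```python
def like_contains(value, needle):
    """Rough stand-in for SQL's LIKE '%needle%': a plain substring test."""
    return needle in value


# Hypothetical usermeta values, keyed by user ID.
rows = {
    1: 'a:1:{s:6:"editor";b:1;}',
    2: 'a:1:{s:10:"copyeditor";b:1;}',  # custom role whose name contains "editor"
    3: 'a:1:{s:6:"author";b:1;}',
}

# Users the substring match would report as editors.
matches = [uid for uid, value in rows.items() if like_contains(value, "editor")]
```

User 2 is picked up even though they are not an editor, because the pattern cannot tell a whole role name from a fragment of one.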

But, we’ll no longer be storing capabilities, or multiple roles. Thus, we can now have our usermeta value be a simple text value with the name of the role. Two example queries:

SELECT * FROM $wpdb->users u INNER JOIN $wpdb->usermeta m ON u.ID = m.user_id WHERE m.meta_key = 'wp_capabilities' AND m.meta_value = 'Editor';

SELECT user_id FROM $wpdb->usermeta WHERE meta_key = 'wp_capabilities' AND meta_value = 'Editor';

Now that is poetic code, and it would scale much better in an environment with many users. (When I say many, I mean many, many thousands.) And I’ll point out here that there have been suggestions for a new table, to get this out of usermeta entirely.

The proposed overhaul is controversial, which explains its postponement for multiple releases, but it appears to be gaining traction for 3.1. As the wp-hackers thread progressed, I suggested that the proposed overhaul could always end up being watered down a bit before entering core as a compromise, notwithstanding rather solid support from the core developers for the proposal on the table.

That said, I cannot imagine that user-specific capabilities would remain in core. If there is a compromise, it would end up being multiple roles, because we could still avoid serialized data, by using the same meta key multiple times, with different meta values. The one thing I don’t believe we would want to do is take an overly complicated system and oversimplify it, especially given the growth and scale of WordPress as a CMS. If we can sanely implement multiple roles in the new schema, keeping them may be something to consider.
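Here is one way the multiple-roles compromise could look, sketched with Python's built-in sqlite3 module so it runs on its own. The table is a stripped-down stand-in for usermeta, and the `wp_role` meta key is an assumption made for this example, not a proposed name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usermeta (user_id INTEGER, meta_key TEXT, meta_value TEXT)"
)

# One row per role: user 1 holds two roles under the same (hypothetical) meta key.
conn.executemany(
    "INSERT INTO usermeta VALUES (?, ?, ?)",
    [
        (1, "wp_role", "editor"),
        (1, "wp_role", "author"),
        (2, "wp_role", "author"),
        (3, "wp_role", "subscriber"),
    ],
)


def users_in_role(role):
    """Plain equality match on meta_value; nothing to unserialize."""
    rows = conn.execute(
        "SELECT DISTINCT user_id FROM usermeta "
        "WHERE meta_key = ? AND meta_value = ? ORDER BY user_id",
        ("wp_role", role),
    )
    return [row[0] for row in rows]
```

With this layout, `users_in_role("author")` finds users 1 and 2 with an exact comparison, so multiple roles no longer force serialized data.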

The counter-argument is that we have a chance to simplify the capabilities API to make it manageable, and we should take advantage of that. We could sanely build a core plugin for role management. Multiple roles are not supported in the user interface, nor would they be supported in such a plugin, leaving them to be the only piece of the API that would not be exposed.

That brings me to a final question: Is anyone actually using multiple roles? Because I’m noticing that WP_User::remove_role() was broken and WP_User::add_role() had problems as well. Both were fixed in the 3.0 development cycle after having issues for a few versions. Makes you wonder.

Overhauling roles and capabilities

Read part two here.

Thanks to some Google Summer of Code proposals, there have been some conversations on the wp-hackers list about roles and capabilities, and how we can improve them. It’s important, though, to understand exactly what the current API allows — it’s much more complicated than many realize.

Here’s how an end user generally sees it:

  • A user can be given one of five roles: Administrator, Editor, Author, Contributor, Subscriber.

There’s much more of course behind that. Hopefully, a developer just getting started with WordPress does not notice the long-deprecated user levels. If all goes well, budding developers will also realize that:

  • Each role is made up of a number of capabilities, such as manage_options (only for Administrators), moderate_comments (for Editors and Administrators), and edit_posts (for Contributors, Authors, Editors, and Administrators).

Of course, a more experienced developer, or a blog administrator who has downloaded one of the various capability/role management plugins will also realize that:

  • You can create roles and give those roles a set of capabilities. Example: You can give the Editor role the activate_plugins capability.
  • You can grant individual users a specific capability that their role does not otherwise give them. Example: A user with the Editor role can be given the activate_plugins capability.
  • A user can have capabilities removed from them, negating capabilities they would otherwise have through their role. Example: A user with the Editor role can be stripped of the unfiltered_html capability.

But wait, there’s more:

  • A user can have more than one role. Thus, a user could have both the Editor role, and the Administrator role. (Since all capabilities in the Editor role are also in the Administrator role, then this example would have no effect.)

This last one is not supported by the core WordPress user interface. No publicly released plugins use it (that I know of). In fact, the main use case would be for a membership plugin (which is, incidentally, the rare use case for using your own table).

The problem is, this system cannot scale well due to the overhead. But, since most of our performance woes are from features that aren’t used, the best solution is to remove those features.

A diagram of the complexities of the roles and capabilities system. Right, the proposal currently on the table.

The current consensus — see Trac ticket #10201 — would be to eliminate user-specific capabilities (both additional capabilities, and negation), and we would force users to have only one role.

It would be an admission that the current roles/capability system, in a desire to be leaps and bounds over the original 0-to-10 user levels, went a little too far.

In a later post,* I’ll talk about the schema — how we store roles and capabilities now, how we may store roles and capabilities in the future, and how we’ll bridge the two on an upgrade. Some alternatives should also be discussed, but in the context of the schema.

* While drafting this, I’ve also been replying to that wp-hackers thread, so my thoughts on most of this are already out there.