2010-02-08

Aggregation vs. Collaboration

There was recently a Nieman Journalism Lab article comparing news coverage between Wikipedia and its smaller sister project Wikinews. It's an interesting topic, and certainly relevant to Wikimedia strategy in the long run: is Wikinews tenable? Would it be better to discontinue it, fold it into Wikipedia, keep it as is, or take some other path?

Andrew Lih, who was interviewed for the article, attributed the difference (and, apparently, a preference for Wikipedia) to a few factors. First, a formulaic structure in Wikipedia articles: an "inverted pyramid" going from general description to finer details. Second, a redundancy in Wikinews articles: new events in a series require an entirely new story, with a new narrative and a new summary of contextual information, which increases the workload and lessens the motivation of a Wikinewsie. Finally, he argues that the wiki process does not lend itself well to narrative processes like Wikinews, giving as an example the (essentially failed) A Million Penguins project by Penguin Books.

While Andrew Lih's commentary is good, I think there's an interesting generalization that can be made that I'm not sure is evident in what he says. My generalization is as follows:

Aggregative content production is easier than collaborative content production, but lacks the same quality.

This generalization is intended to highlight one of the biggest failings, in my opinion, of the "Web 2.0" shift: a lack of real collaboration. Take any of a number of Web 2.0 sites, and they can be roughly categorized as either primarily aggregative or primarily collaborative. For example, Flickr and Wikimedia Commons are primarily aggregative: add a decent image and you've improved the collection. There might be some collaborative elements, particularly in managing the metadata on Commons, but the broad thrust of these sites is aggregative. Content could in theory be added automatically.

I feel that most sites will tend towards aggregation over collaboration: aggregation is far simpler and easier. You don't worry about whether your funny cat videos really add to YouTube: you simply trust that enough people will upload enough videos of sufficient quality to keep you amused. People don't go around deleting bad YouTube videos, or solely improving other people's work. Aggregation doesn't require high-quality reviews, or any sort of endorsement of content, but instead generally relies on broad statistics and perhaps-ignorant numerical approximations that can be computed automatically. Search engines like Google are good examples of aggregative content: people create websites on the Internet, then Google aggregates most of those websites and applies an automatic process to rank their relevance to any given keyword.

I think that in the Wikinews vs. Wikipedia debate, we're missing a key component of the difference: which project is more collaborative, and which is more aggregative? I see that even if only one author writes a set of Wikinews articles, dozens of articles, each written by a person or team, may be required to convey the same information that would be present in a single paragraph of a decent Wikipedia entry. A Wikipedia article can be updated by simply adding a sentence with the update and perhaps a citation confirming it. Wikipedia is, in this sense, more aggregative than Wikinews. As ironic as that may seem (since Wikipedia is generally more collaborative than many "Web 2.0" sites), Wikipedia can more easily compile small contributions that might be worthless on their own into a high-quality aggregate article. Wikipedia is more formulaic, more automatic than Wikinews in some senses, especially given the lack of narrative. (That being said, we should not ignore the fact that Wikipedia's model is, in fact, essentially collaborative, but my argument is that Wikipedia has more of the low-input aggregative mode in this respect than Wikinews does.)

Now, automation and aggregation are not necessarily good or bad things: Google is the leader in the search market precisely because it does an aggregative task well. But aggregation has severe limitations. There is a certain lack of human oversight in many aggregative processes, and they can easily be gamed: for example, Google bombing or search engine optimization or even simple sock puppets on sites without close moderation.

Collaboration is highly desirable. Good collaboration produces much better-quality content than aggregation, in general, since there are few forces to cause aggregated content to be improved. Under collaboration, standards can be built and corrections can be made. Google Knol is a good example of the aggregative problem. Individual "knols" are essentially controlled by their original author(s) and those designated by them, and there can be any number of knols on the same subject. Edits are often only allowed when manually approved by the original author. It is thus usually much, much easier to create a new, mediocre knol than to attempt to improve on someone else's knol. On average, I would bet on the Wikipedia process more than the Knol process.

The primary problem with collaboration is apathy. It's very, very hard to get good collaboration going, and even when it does get off the ground you'll still see only a tiny fraction of users contribute meaningfully to the end result. Apathy is a powerful enemy of collaboration, and without any interest in collaboration, a collaborative project will die as a ghost town, be filled with irrelevant material, or perhaps simply be taken over as a soapbox for a vocal minority. Aggregation solves the apathy problem by taking a route around it: make the apathy irrelevant by bringing in as much content as possible. Some users will still contribute high-quality material, and if they are numerous enough, the service will ultimately be useful. One does not have to care very much in an aggregative environment, and that helps overcome apathy to reach the end goal: the product; the content.

There can, and probably should, be a balance between collaborative and aggregative processes. Wikipedia, for example, harnesses aggregative forces in small edits and new articles, which fuel a platform for collaborative production. The difference in ease-of-growth between aggregation and collaboration, I think, is best illustrated by Wikipedia's own content statistics. A tiny fraction of the English Wikipedia's articles get significant collaborative influence and see recognition and promotion to grades like "Good" or "Featured", while a sad majority of articles remain short (but still hopefully useful) "stubs" and "Start-class" articles that have not seen significant editing by people other than their creator (or bots, which tend to break statistics looking at human patterns). Now, I think that the balance that ought to be sought is one that continues to accept the powerful aggregative influence, but that greatly promotes collaboration where possible, since collaboration most reliably produces good results.

The long-term goal needs to be to foster collaboration. Whether this will, or should, occur at the expense of an aggregative process or be fueled by one remains an interesting question.