Do you want to know how Wikipedia was able to become such an incredible success? Who the people behind its success are? The best book to learn about the history and the culture of Wikipedia is Andrew Lih’s new book “The Wikipedia Revolution”, launched last week. He was at Harvard last night to give a talk and do an interview with Berkman Fellow and distinguished internet scholar David Weinberger.
Andrew shares with us the story of how he first came across Wikipedia. In many ways, his experience was very different from most people’s. On February 9, 2003, Andrew was looking for his next research project. He has been studying online journalism and new media for a long time and was instrumental in creating the new media program at the Columbia J-school. He was told that he should take a look at this new site called Wikipedia, this amazing site that “anyone can edit”. Unlike most people, he heard the principle first, before he saw the actual website. When he took the time to explore the site, he was immediately taken with it, thinking “the crowd could not have written this”. He looked at more pages, started using Wikipedia in class assignments, and became so fascinated with the project that he wanted to study it full-time.
“It works in practice, but not in theory” is often said of Wikipedia. And that’s definitely true if you consider its origin. Wikipedia grew out of a project called Nupedia, which in many ways was intended to be a conventional encyclopedia. Started by Bomis, it envisioned a rigorous seven-step peer review. It would recruit volunteers to write its articles, and the hope was that most of these volunteers would have a PhD degree. That is, the original vision of an online encyclopedia was one with very stringent requirements.
The big problem: after one year, Nupedia had the grand total of twelve (count ’em) articles. Even worse, they were written by someone on the payroll. This was clearly not sustainable. Larry Sanger decided to intervene, realizing they needed something radical to at least get seed material. He turned to something he had seen called wiki software, created by Ward Cunningham. The wiki was a way for programmers to share best practices, an online resource for programmers. The name came from the Wiki Wiki bus in Hawaii, “wiki wiki” meaning ‘quick’. The wiki software indeed produced quick results: as of recently, there are over 2.8 million entries in the English Wikipedia alone. So why does Wikipedia work? Andrew suggests five key factors: it was free, open, neutral, timely, and social.
Andrew describes the piranha effect – the idea that one change in one corner can inspire other changes and create a torrent in the community. For example, in one particular week, 33,800 (count ’em) articles were added to Wikipedia. This came largely from a huge body of US census data: a software robot was written to extract relevant information from this data and create an article for every possible town and city in Wikipedia. One such town was Apex, and it just happened that one day SethIlys visited this page. It was a dry article, so he thought: hey, why not put a map on there? A few keystrokes later, he had added his own handmade map and, in his own way, was able to contribute his knowledge to the world. Useless? Perhaps, but if he visited this page, why not someone else as well? This experience was really empowering to him. Once he started with one map, he figured, why not add others? And once he started, it did not make sense to stop, so like Forrest Gump, he kept on running. The strange thing was, others started running, too. Nearly all the US census location articles now have maps.
There’s a famous saying: “if you have a hammer, everything looks like a nail”. Andrew adds to that: “If there was ever a project that had lots and lots of unhammered nails, it was Wikipedia.” The dot map project was an inspiration – an exemplar – encouraging people to do things they never thought possible. And in many ways, Wikipedia itself is such a project as well – an exemplar.
David starts his interview with Andrew.
David: Let’s get this out of the way first, are you neutral about Wikipedia?
Andrew: No I’m not. I analyze neutrally. But I’m a big fan. I believe Wikipedia is one of the most fascinating creations man has ever made. But that doesn’t mean it doesn’t deserve scrutiny.
David: You think it was important enough to write a book about, which is an endorsement in itself. But let’s go to its origin myth, as all superheroes have one. The myth is often that idealists came together to do this democratic experiment and that the world’s greatest encyclopedia is the result. Is that right, or where does it go wrong?
Andrew: Telling Nupedia’s story helps debunk a lot of this. It started as a failure. There was no way anyone knew how to do this. Even though the founders were very internet savvy and big fans of open source, it was not apparent that doing an encyclopedia in that style was the way to go. Only after a full year did they decide to try it this way.
It’s also interesting that Wikipedia is always cited as an example of democracy, but the community itself never uses that word. It assumes good faith, it likes consensus, but it never ever uses the word democracy. As a matter of fact, a key thing in wiki is NOT to do voting. They discourage voting: they would rather decide through discussion than rely on hard measures like voting.
David: What’s wrong with hard measures?
Andrew: The problem is gaming the vote without having meaningful discourse. One of the most contentious issues was the Danzig/Gdansk edit war. An edit war is what happens when you don’t converge on a neutral point of view: the result is a constant flipping back and forth between different revisions of one article. This edit war was the catalyst of a lot of policy change, for example the Three-Revert-Rule. But in this case, after a year of brutal edit warring, voting was inevitable. It was a defining edit war in English Wikipedia history.
David: Can you talk about the flatness – that supposedly every voice is equal and there is no hierarchy – and its rules, the anti-rules and emergence of rules?
Andrew: The rule is that you shouldn’t have that many rules: with too many rules, people start to game the rules. There are rules nevertheless: neutral point of view, assume good faith, and the idea that your next contributor could be the most prolific one, so don’t bite the newbie. But these rules are soft ones, established during the early days, and the community has changed quite a bit since 2001. Today it is no problem to get people to contribute. The problem is to get rid of bad stuff. The concern: is the community still as vibrant as in the early days?
David: There is an antipathy towards rules – the idea that rules tend to breed bad behavior – yet at the same time it is a warm-hearted community – assume good faith. To what extent is Wikipedia free of a certain political mindset in the structure of Wikipedia as an emergent community?
Andrew: The English Wikipedia is a liberal progressive community, or libertarian. This is reflected in Wikipedia’s early roots: the founders met on Objectivist mailing lists. Jimbo (Jimmy Wales) is a straightforward libertarian, which is common in the geek community. The articles are generally of good quality nevertheless. But if you disagree, you can fork. One such response is Conservapedia.
David: Is it built along the same principles?
Andrew: No, but I wish it were. Articles are often written in direct opposition to Wikipedia articles.
David: Is it open to edit?
Andrew: Hmm, hard to say. More people are in control, they are not as inclusive.
David: What I like is the pragmatism of Wikipedia – a general dislike for rules, but if you need a rule to build an encyclopedia, then it’s fine.
Andrew: There are really five pillars, one of them is that Wikipedia is an encyclopedia. That might sound silly, but that wasn’t so in 2004. Wikipedia had grown as a community with lots of social aspects – there was a gaming lounge for example where people were playing virtual chess games. We had to shut that down – it was pretty cruel – but we are here to write encyclopedia articles and not to support MySpace activities.
David: It’s also a discussion about which articles are deleted – that Wikipedia is not an art project. It’s an encyclopedia, but sort of different – so the question becomes what an encyclopedia is in a digital age? It’s a sharp edged debate between the deletionists and inclusionists – what side do you fall on?
Andrew: The inclusionists’ argument is that wiki is not paper, so why not have articles about anything under the sun? An article on an obscure issue does not take away from your general experience. The deletionists, also called exclusionists, argue that the value of an encyclopedia is that it is a set of articles. It’s no good to have articles in which every single word is cross-linked, or articles that are not reliable. The key test here is: should we have an article on what we had for breakfast?
In the early days, I was considered an exclusionist. I argued that it does matter how selective you are, that articles need to be verifiable and high quality. Over the years, the community standards have shifted to the point that, although I don’t think I have changed my stance that much, I am now considered an inclusionist.
Now it is crucial to keep out the bad stuff. Wikipedia is now high profile, and recent policy changes are all about restrictions, restrictions, restrictions. The atmosphere is very different from the early days: much more stringent.
David: What gets people so passionate about this particular issue?
Andrew: It’s not just within one language – it’s across cultures as well – for example, the German Wikipedia has 900,000 articles – a long way to go before you hit the 2.8 million articles of the English Wikipedia. But the Germans are very happy with their 900,000 articles – they generally have a much more stringent standard. Wikipedia used to be known as the definitive guide to Pokemon – that would not fly in German Wikipedia. That’s their style. The German Wikipedia is more traditional – but also has a great reputation – the German government, libraries, and universities are all interested in working with Wikimedia Deutschland because their quality is so high.
That is to say, the inclusionist/exclusionist argument also varies widely depending on the cultural lens you use.
David: Is it a problem that neutrality happens only if there is enough homogeneity in the community? Or they will have to break off? Does Wikipedia reinforce a prevalent domain of discourse that everybody agrees on? And thus excluding other views?
Andrew: Certainly in some languages. The first twenty languages, the largest ones, have contributors who are fairly well educated and multilingual. Contributors to the English Wikipedia especially span the whole world, and there is a diversity of viewpoints. But beyond those twenty languages, the drop-off is bigger and people are more homogeneous.
David: Isn’t this the case in English Wikipedia as well? That is, neutrality hides a fork – people fork.
Andrew: Yes, but they create meaningless forks, that nobody links to, they fade away.
David: That is exactly the price that it exacts – marginalization of points of view out of mainstream – that they cannot get on the same page – lots of groups accuse Wikipedia of this.
Andrew: Jimbo said once that Neutral Point of View is a term of art. Most things that work are not razor sharp. There is a lot of faith in the actual ground troops, that they stay within the directive, and that hopefully the diverse community will take this into account and create reliable content.
David: Let’s talk about the changing roles of authority. Being a big prof doesn’t matter; it’s bad form even to mention it.
Andrew: Editorial authority is even more interesting in Japanese Wikipedia – most are anonymous – this is because the dominant internet culture in Japan is based on anonymity. You could be discussing with anyone, a housewife or a prof, what matters is the quality of edits.
David: Let’s talk about Essjay.
Andrew: That was one of the bigger crises. Essjay was a pseudonym, and his user page said, in effect: I can’t tell you who I am, but I have a PhD in Theology and I work at an academic institution, and I would get into trouble if I told you my real name. He was an incredibly prolific contributor, with over 10,000 edits, and everybody generally accepts that they were good quality. He eventually got access to admin privileges; that is, he could check the IP addresses of users behind the scenes, which only a dozen people can do, and had access to private data.
What happened was that the New Yorker was doing an article by Stacy Schiff, a Pulitzer Prize-winning reporter. She did an interview with Essjay and wrote a long piece. Then Essjay took a job with Wikia (EDIT: Andrew corrected me: he took a job with Wikia, the for-profit firm founded by Jimmy Wales and another Wikipedian, Angela Beesley, not with the Wikimedia Foundation), and to do so, he had to come clean: he was a 20-year-old with no PhD degree. This was a huge embarrassment to the New Yorker; it seemed that Stacy never even asked for Essjay’s name to fact-check it.
Some people argue that Essjay lied to a reporter but made good contributions. Others pointed to the fact that he sometimes used his credentials to win arguments. It was a real soul-searching moment for the community: a prized Wikipedian had lied to the outside world, to a Pulitzer Prize-winning reporter, and this raised issues about having faith in each other in the community.
David: The increasing use of credentials – or the German system that now allows for the marking, a flagging of pages that are considered reliable – is this a trend that will continue?
Andrew: Germans lead on quality issues. They have a tighter community of admins, who almost act like a council, whereas the admins in the English Wikipedia function more like janitors. So why not have a flagged version? You could flag the last version of an article that is stable, and you show people the latest checked version. You get better quality, but you lose instant updates. The Germans implemented this last year and it was quite a success: they flagged 89% in the first year. The English Wikipedia is interested in implementing this, but it is hard to get the community to reach consensus on anything at all. Right now it’s a total stalemate: there was a surge of initial support, but it has since tapered off.
David: The common complaint is that students go to Wikipedia and simply believe what is there. What is it that readers need to do not to be fooled by occasional vandalism? How scared should we be?
Andrew: Wikipedia should be the starting point, but not ending point. It should not be in citations, just like entries from the Britannica should not be cited.
David: How confident should we be when we use it to look things up?
Andrew: The critique that it is dangerous when 14-year-olds take it as gospel is not fair. Most people are media savvy. And then there is a whole range of things the community implemented, for example requiring sources. In 2003 and 2004 you never had any article that was tagged ‘citation needed’; now you do everywhere, and there is a team called the ‘citation needed patrol’. Standards have improved, but ultimately I think flagged versions should be put in place in some way. Right now it looks like they will be used for entries on living persons, for libel reasons. We start there and see what happens.
Question: Can you discuss failed Wiki projects?
Andrew: The battlefield of failed wikiprojects is vast. Wikitorial from the LA Times was a real disaster. There is an assumption that you put up a Wiki and the Wiki Magic will happen. The LA Times learned the hard way – if you have no robust community with admins that fight vandalism, it’s a recipe for disaster.
What you realize after all these failed projects: a wiki is perfectly suited to an encyclopedia. It’s like a bento box of writing.
It is very structured writing that lends itself to crowdsourcing. Very modular. This is not true for a novel, for example. Penguin had a contest where they put up a wiki and expected that the magic wiki crowd would write a novel. It did not happen.
Those that do work involve lots of sharing and a step-by-step, modular, structured style of writing. Certain types of content are like this, but lots aren’t. A lot of other organizations learn this the hard way.
Question: Why not make people use full names?
Andrew: There is always talk in the community along the lines of: we don’t need anonymous people anymore, they give us more problems than they are worth, let’s start requiring a higher standard. But the original culture from the beginning still dominates: Wikipedia tends to be inclusive, and anonymous users are at the core of the “anyone can edit” value.
David: What about pseudonyms?
Andrew: A pseudonym lets you converse with the person; it allows interaction, although you don’t know the authenticity. You can still see all the edits. Interestingly, pseudonymous users give less information than anonymous users: with anonymous users, an IP address is recorded, and that often provides geographic location, what organization you are part of, etc. The Wikiscanner used this to its advantage and found out that people in Congress, at Ogilvy, and in all kinds of organizations were editing articles they probably should not be editing. It was a typical example of sunshine being the best disinfectant, a kind of watchdogging of the crowd.
Question: If Wikipedia had been run by a company, would it have been different?
Andrew: If Wikipedia were a commercial company, there is no way it could have been successful. People contribute because it is under a free license, the same as with Linux; people know it wasn’t making a company rich. An example is the Spanish fork: in the early days, there were some rumours about the possibility of advertisements. The Spanish community went ballistic at the mention of ads. They literally took the ball and went home: they started Enciclopedia Libre and convinced all contributors to leave. This incident has set the tone for the community ever since.
Question: Is the bulk of content made by a small number of people?
Andrew: The idea behind the 80/20 rule is that 80% of the work is done by 20% of the people. But this is not necessarily true for Wikipedia. Aaron Swartz’s research shows that there is a wide swath of people that edit Wikipedia. While the distribution is still non-linear, it’s just not the case that there is an elite crowd who edits over a hundred hours a week.
David: Aaron’s work shows that the creation of new articles, the bulk of it is done by a broad range of users – which makes intuitive sense.
Andrew: As far as where the community is now, we don’t have good numbers. Since October 2006, there is no authoritative dump of Wikipedia anymore – it takes more than a month to do a monthly dump. This leaves Wikipedia vulnerable – and you also can no longer do statistical analysis.
David: We should each download one page!
Question: Can you talk about Larry Sanger?
Andrew: Sanger has an odd role. He did set up most of the basic rules of Wikipedia, but over time he also encouraged Wikipedia to be more elitist, and some started seeing him as a pariah, as the anti-Wikipedian. Citizendium is supposed to be Wikipedia done right, with a layer of expertise but still largely open. His main criterion seems to be maintainability. He thinks a lot of what is going on in Wikipedia is just bs, such as trying to turn vandals into productive members. He is saying, cut that out, work with experts who can cut through the junk. We’ll see what history will say about that.
(Question about the vote on license migration – got lost in the details)
David: Wikipedia experienced exponential growth – but what got us there may not be the right set of tools to move ahead.
Andrew: That’s why flagged versions are inevitable: not to grow further, but to maintain quality.
Question: How did the power structure evolve?
Andrew: The number of privileged positions has grown, but they tend to involve technical rather than editorial oversight. As an admin, you can block users, but only in narrow situations. You can lock articles, but only temporarily, for combating vandalism. Promotion is a community decision; there are no hard metrics. Things considered include the number of edits, the activities you engage in, and social capital. These are all intentionally left vague. The decision is made on a human-to-human basis; it’s not like there is an eBay rating or Amazon ranking.
Question: Why are there different forks and how do they exist – is there a possibility to have one global Wikipedia instead of all these divides?
Andrew: You’re right that it is too easy to see the 2.8 million English entries as the superset from which other Wikipedia languages should be translated. This set is missing lots of things on Chinese arts and history, things the Chinese Wikipedia has. But the problem is, you need bilingual folks, tools to discover which article is good in one language and has a bad counterpart in another ..
Question: Will the Wikimedia Foundation do this?
Andrew: They are a great engine to raise funds.