User:Statistics~enwiki

In my toddling years, I was delighted by children's author Richard Scarry's whimsical picture book What Do People Do All Day? That inspired me to wonder,

"What Do Wikipedians Do All Day?"

Here's an answer that's more number-crunching than whimsical, but don't be put off! There are plenty of pictures too. The pictures are ternary plots: triangles that depict the relative proportions, or percentages, of three quantities that make up a substance. In this case the substance is an editor's edits, and the quantities are

  • Edits to Articles and article Talk pages
  • Edits to User and User Talk pages, and
  • Edits to Wikipedia and Wikipedia Talk pages.

For each editor, we've broken down their edits into three parts, and calculated each part as a percentage of the total edits. We've only looked at editors with 500 or more total edits, to make sure what we're seeing is a consistent pattern over time. And we've limited our look to each editor's most recent 1000 edits, to make sure we're seeing what that editor is doing now, not what he or she did in the distant past. By "edit" we mean any edit shown on the editor's "Contributions" page, including new pages, minor edits, page moves, or anything else the "Contributions" page shows.

  • Article edits include edits to
    • anything that adds directly to the encyclopedia: articles, images, templates, categories, and MediaWiki
    • and any talk devoted to discussing and improving articles: not only article "Talk" pages, but also Talk pages for images, templates, categories, and MediaWiki Talk pages
    • and pages that are technically in the Wikipedia "namespace" but are about improving articles, specifically: Votes for Deletion, Votes for Undeletion, and Templates for Deletion
  • User edits include any edits to User pages or User Talk pages
  • Wikipedia edits include any edits in the "Wikipedia namespace" except for the deletion/undeletion votes explained above, which are counted as "Article" edits.
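
For readers who like to see rules as code, here is a minimal Python sketch of the three-way split described above. It is not the script used to produce the numbers on this page. The function name `classify_edit`, the namespace prefix strings, the matching of deletion pages by title prefix, and the choice to treat any unrecognized prefix as an article are all assumptions made for illustration.

```python
# Hypothetical sketch of the three categories described above.
# The prefix strings and matching rules are assumptions, not the
# method actually used for the figures on this page.

USER_NAMESPACES = {"user", "user talk"}
WIKIPEDIA_NAMESPACES = {"wikipedia", "wikipedia talk"}

# Wikipedia-namespace pages that the text counts as "Article" edits.
ARTICLE_PROJECT_PAGES = (
    "votes for deletion",
    "votes for undeletion",
    "templates for deletion",
)


def classify_edit(page_title):
    """Return 'article', 'user', or 'wikipedia' for a page title."""
    namespace, sep, rest = page_title.partition(":")
    namespace = namespace.strip().lower().replace("_", " ")
    rest = rest.strip().lower().replace("_", " ")

    if sep and namespace in USER_NAMESPACES:
        return "user"
    if sep and namespace in WIKIPEDIA_NAMESPACES:
        # Deletion and undeletion votes are about improving articles.
        if rest.startswith(ARTICLE_PROJECT_PAGES):
            return "article"
        return "wikipedia"
    # Assumption: everything else (articles, Talk, images, templates,
    # categories, MediaWiki and their talk pages) is an "Article" edit.
    return "article"
```

Because the function always returns exactly one label, every edit lands in one and only one category, which is what lets the three percentages add up to 100%.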

Every edit is counted, and each edit is placed into one and only one of these three categories. Three percentages are then calculated: the percentage of edits to Articles, the percentage of edits to User pages, and the percentage of edits to Wikipedia pages. Since every edit is counted, these three percentages always sum to 100%. So for example, we might describe an editor, Wally Wikipedian, as (70%, 20%, 10%) if out of his 1000 most recent edits, 700 were to articles or article talk, 200 were to his user page or to others' User Talk pages, and 100 were to Wikipedia pages like Village pump or policy.
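
The arithmetic for Wally can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `edit_percentages`, the category labels, and the idea that the edits arrive as a list with the most recent edit first are ours, not part of the original method.

```python
from collections import Counter


def edit_percentages(categories, limit=1000):
    """Percentages of (article, user, wikipedia) edits.

    `categories` is a list of category labels, most recent edit first.
    Only the first `limit` entries are counted, mirroring the
    "most recent 1000 edits" rule described in the text.
    """
    recent = categories[:limit]
    total = len(recent)
    if total == 0:
        raise ValueError("no edits to summarize")
    counts = Counter(recent)
    return tuple(
        100.0 * counts[name] / total
        for name in ("article", "user", "wikipedia")
    )


# Wally Wikipedian's 1000 most recent edits, as described in the text.
wally = ["article"] * 700 + ["user"] * 200 + ["wikipedia"] * 100
print(edit_percentages(wally))  # (70.0, 20.0, 10.0)
```

An editor with fewer than 1000 edits is handled the same way: the slice simply returns all of them, and the percentages are taken over that smaller total.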

Now for the Pictures!

Ternary diagram, showing intervals along the first axis

The position of the symbol in the ternary plot triangle shows the proportion, or percent, of that editor's edits to the different sections of Wikipedia described above. The closer the symbol is to the top of the triangle, the greater the proportion of edits to articles, and the smaller the proportion to User and Wikipedia pages. Likewise, the closer the symbol is to the left point of the triangle, the more edits made to User pages; the closer to the right point, the more edits to Wikipedia pages. Because these are proportions, as the proportion of one sort of edit increases, the proportion of the other sorts of edits decreases. In all cases, the three proportions sum to 100% of the edits counted (up to 1000 edits).
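
If you'd like to place a point yourself, the layout just described (Articles at the top, User pages at the left, Wikipedia pages at the right) can be turned into ordinary x, y coordinates. The sketch below assumes an equilateral triangle with sides of length 1 and its left corner at the origin; that choice of scale, and the function name `ternary_to_xy`, are assumptions for illustration only.

```python
import math


def ternary_to_xy(article, user, wikipedia):
    """Map three edit shares to a point inside a unit-side triangle.

    Corners (assumed): User at (0, 0), Wikipedia at (1, 0),
    Articles at (0.5, sqrt(3)/2). The inputs may be counts or
    percentages; they are normalized to fractions first.
    """
    total = article + user + wikipedia
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    a = article / total
    w = wikipedia / total
    # The point is the weighted average of the three corners. The User
    # corner sits at the origin, so it contributes nothing to x or y.
    x = 0.5 * a + 1.0 * w
    y = (math.sqrt(3) / 2) * a
    return x, y


# An all-Articles editor sits at the very top of the triangle.
print(ternary_to_xy(100, 0, 0))
# Wally Wikipedian from the example above, at (70%, 20%, 10%).
print(ternary_to_xy(70, 20, 10))
```

The height of the point depends only on the Articles share, which is why an editor who mostly edits articles ends up near the apex.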

Ternary diagram, three sample points and the lines they intersect

So what the ternary plot shows is where Wikipedia editors devote their time: do they mostly edit Articles, or do they spend their time chatting on user pages, or are they editing Wikipedia policy pages? In other words, these ternary plots tell us "What Do Wikipedians Do All Day?"

Details and methodology

In all cases,

  • only editors with at least 500 edits were considered
  • only the most recent 1000 of each user's edits were counted (or all edits, if the user's total edits were fewer than 1000)
  • no edits were omitted
  • all samples were taken from 6 to 8 May 2005

Twice on May 6, 2005, and once on each of the two following days, May 7 and May 8, we calculated proportions for the editors responsible for the last 50 "Recent Changes", as listed on the "Recent Changes" page. Because some of those editors showed up more than once, those four samples yielded a total of 67 unique editors, each with at least 500 total edits.
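
The de-duplication and the 500-edit cutoff can be sketched as follows. The editor names and edit counts below are made up purely to show the mechanics, and the function name `eligible_editors` and the dictionary of lifetime edit counts are assumptions; nothing here reproduces the actual samples.

```python
MIN_TOTAL_EDITS = 500  # threshold stated in the methodology above


def eligible_editors(samples, total_edits):
    """Unique editors across all samples with enough lifetime edits.

    `samples` is a list of lists of editor names (one list per look at
    Recent Changes); `total_edits` maps a name to a lifetime edit count.
    """
    # dict.fromkeys keeps the first occurrence of each name, in order.
    unique = dict.fromkeys(name for sample in samples for name in sample)
    return [
        name for name in unique
        if total_edits.get(name, 0) >= MIN_TOTAL_EDITS
    ]


# Hypothetical data: two tiny "samples" with one repeated editor.
samples = [
    ["EditorA", "EditorB", "EditorA"],
    ["EditorB", "EditorC"],
]
total_edits = {"EditorA": 1200, "EditorB": 499, "EditorC": 500}
print(eligible_editors(samples, total_edits))  # ['EditorA', 'EditorC']
```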

Case 1: Anon Surprise!

Surprisingly, five of those editors with 500 or more total edits weren't logged-in editors, but "Anonymous IPs". Even more surprisingly, given the negative reputation of non-logged-in editors, all five of these anons spend almost all of their time adding content to the encyclopedia. Here are their IPs and percentages, ordered from highest to lowest proportion of edits to Articles, and the ternary plot of their efforts. This plot is actually pretty boring, because all the points on it are crammed into the very apex of the triangle. But simple as it is, it's a good introduction to how ternary plots are drawn, so take a look at it.

Ternary diagram, showing edit proportions of randomly selected anonymous IPs
Editor | Edits to Articles | Edits to User pages | Other Edits (Wikipedia namespace)
131.204.194.154 | 100.00% | 0.00% | 0.00%
68.197.107.71 | 100.00% | 0.00% | 0.00%
62.252.0.9 | 97.74% | 1.02% | 1.24%
195.92.67.69 | 97.65% | 1.18% | 1.18%
62.252.64.18 | 96.70% | 0.80% | 2.50%

As you'll note, two of these anonymous IPs devoted all 1000 of their last 1000 edits to adding to the encyclopedia. And of the other three, the smallest proportion of Article edits was over 96%: 967 of 1000 edits went to articles, and only 33 edits didn't add to the encyclopedia. Great work, anons!

Case 2: Logged-In editors

In addition to the five anonymous users we found by looking at Special: Recent Changes four times over three days, we found 62 logged-in Wikipedians meeting our criterion of 500 or more lifetime edits.

Most edits by most editors are to articles

Ternary diagram, showing edit proportions of randomly selected logged-in users

Of these 62 logged-in Wikipedia editors, taken from four different samples of Recent Changes over three days, fully 59, or 95%, made 59% or more of their last 500-1000 edits to Article pages, article Talk pages, or Votes for Deletion. (It's mere coincidence that, of those 59 editors, the one with the fewest edits to articles and article Talk made 59% of his edits there.)

Fully 51 of the 62 editors (82% of the editors) made 70% or more of their edits to articles or article discussion.

Of those 62 logged-in editors, only three (or 5%) devoted less than 59% of their last 500-1000 edits to directly improving Wikipedia through editing articles or discussing articles. Of these three, only two spent less than fifty percent of their edits directly improving Wikipedia.

The table below summarizes our findings:

Sample # | 1 | 2 | 3 | 4 | All four samples combined
Number of logged-in editors with 500 or more edits | 18 | 11 | 13 | 20 | 62
Number of those editors whose percentage of edits to Articles plus Article Talk is in the range:
90.0% or more | 8 | 2 | 4 | 11 | 25
80.0% - 89.9% | 4 | 5 | 7 | 2 | 18
70.0% - 79.9% | 4 | 1 | 1 | 2 | 8
60.0% - 69.9% | 0 | 2 | 0 | 5 | 7
50.0% - 59.9% | 1 | 0 | 1 | 0 | 2
less than 50% | 1 | 1 | 0 | 0 | 2

This table tells us at least two things: first, that each of the four samples yielded pretty much the same results, making it likely that they are a representative sample. And second, that most logged-in editors spent most of their time directly improving articles, or improving the encyclopedia by discussing articles.
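
As a quick sanity check, the counts in the summary table can be added up in Python. The numbers below are copied from the table; the variable names are ours and are only for illustration.

```python
# Counts copied from the summary table above: one entry per percentage
# range, with one number per sample (samples 1 to 4).
table = {
    "90.0% or more": [8, 2, 4, 11],
    "80.0% - 89.9%": [4, 5, 7, 2],
    "70.0% - 79.9%": [4, 1, 1, 2],
    "60.0% - 69.9%": [0, 2, 0, 5],
    "50.0% - 59.9%": [1, 0, 1, 0],
    "less than 50%": [1, 1, 0, 0],
}

# Column sums give the number of logged-in editors in each sample.
per_sample = [sum(column) for column in zip(*table.values())]
total = sum(per_sample)

# Editors at 70% or more are the first three rows of the table.
at_least_70 = sum(sum(table[label]) for label in list(table)[:3])
share_70 = round(100 * at_least_70 / total)

print(per_sample, total)      # [18, 11, 13, 20] 62
print(at_least_70, share_70)  # 51 82
```

The column sums match the "Number of logged-in editors" row, and the 51 editors at 70% or more work out to 82% of the 62.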

The last 5% of editors

So what are the three editors who don't devote 59% or more of their edits to articles doing with their Wikipedia time?

The distribution of these three editors' edits is given in the following table:

Editor | Edits to Articles | Edits to User pages | Other Edits (Wikipedia namespace)
JRM | 51.40% | 20.80% | 27.80%
Beland | 41.00% | 4.70% | 54.30%
Jnc | 31.20% | 5.40% | 63.40%

User JRM

JRM's case is the simplest. Even though he falls behind most editors in the sample, he still spends just over 50% of his edits on articles and article discussion. Most of his Wikipedia-namespace edits were to Wikipedia_talk:Countdown_deletion, Wikipedia:Reference_desk, and Wikipedia:Countdown_deletion, all of which are concerned with improving Wikipedia content. In addition to those edits, about one edit in five, or 20%, is made to User pages.

User Beland

Beland makes just over half his edits in the Wikipedia namespace. About 4% of his edits are to Wikipedia:Auto-categorization/Wikipedia_namespace, and the remainder of his Wikipedia namespace edits are spread out over various pages like Wikipedia:Utilities, Wikipedia:Offline_reports, and Wikipedia:Multiple_redirects.

User Jnc

Jnc is the user in our sample with the fewest edits to articles and the most edits to the Wikipedia namespace. However, nearly a quarter of his total edits are to Wikipedia:Redirects_for_deletion, so it seems a strong case can be made that his efforts are devoted to improving the encyclopedia by doing particularly tedious work.


In summary, while we see three editors with significantly lower percentages of article edits than the Wikipedia mainstream, a closer look at their work reveals that, by and large, they, like all editors in our samples taken from Special: Recent Changes, are devoting their edits in one way or another to improving Wikipedia.

More to Come!

In our next installment, we'll take a look at the Wikipedia bureaucracy.