Collaboration in context: Comparing article evolution among subject disciplines in Wikipedia
First Monday

by Katherine Ehmann, Andrew Large and Jamshid Beheshti



Abstract
This exploratory study examines the relationships between article and Talk page contributions and their effect on article quality in Wikipedia. The sample consisted of three articles each from the hard sciences, soft sciences, and humanities, whose talk page and article edit histories were observed over a five–month period and coded for contribution types. Richness and neutrality criteria were then used to assess article quality and results were compared within and among subject disciplines. This study reveals variability in article quality across subject disciplines and a relationship between Talk page discussion and article editing activity. Overall, results indicate the initial article creator’s critical role in providing a framework for future editing as well as a remarkable stability in article content over time.

Contents

1. Introduction
2. Related literature
3. Conceptual framework
4. Method
5. Results
6. Discussion
7. Limitations and recommendations for future research
8. Conclusion

 


 

1. Introduction

Love it or hate it, Wikipedia has become an established part of the Web. With sites in 253 languages and containing over 2.5 million articles in the English version alone, Wikipedia now ranks seventh (as of October 2008) on the list of most visited sites on the Internet (Alexa, 2008), affirming its place in the virtual landscape.

In its relatively short history, this free, open source encyclopedia has elicited various controversies. Most recently, an online utility called WikiScanner (or Wikipedia Scanner; http://wikiscanner.virgil.gr) was developed to create matches between IP addresses from edits made to Wikipedia articles and computer networks of governments, companies, and organizations — most notably discovering the CIA’s edits to an article on Iranian President Mahmoud Ahmadinejad (Fildes, 2007). In another case, an active Wikipedian claiming to be a university professor was revealed as a 24–year–old without any experience in higher education (CBCNews.ca, 2007a). Inevitably, such incidents have led to a questioning of Wikipedia’s open editorial policy.

In spite of such controversies, Wikipedia itself is increasing in popularity and the site continues to expand. Most recently, a CD–ROM version of the encyclopedia has become available for those without access to the Internet (CBCNews.ca, 2007b), which contains about 2,000 stable versions of articles on topics that are not likely to change drastically over time, such as geography and literature. Additionally, a new offshoot of the site — Simple Wikipedia — “translates” existing Wikipedia articles into simplified English for “people with different needs, such as students, children, adults with learning difficulties” and those just learning language [1].

While Wikipedia’s currency is undeniable, it is difficult to determine its authority and credibility as a reference resource, because anyone can contribute to the open source encyclopedia and its content may be altered at any time. Indeed, Katz’s (2002) traditional criteria for evaluating print resources (e.g., purpose, authority, scope) are difficult to apply to Wikipedia. For instance, while traditional encyclopedias have a stated purpose which guides the focus of the reference work (e.g., age appropriateness), Wikipedia has no such objective, beyond stating it is free and editable to all (Wallace and Van Fleet, 2005).

Discussions of Wikipedia are often centred on a given individual’s experiences with the site as a user rather than a contributor. Studies of article quality have focused on well–established articles rather than newly created ones. This study will explore the article creation process from the point of view of a contributor and will compare both newly created entries and well–established articles on the site in order to examine the collaborative process which builds articles over time. Further, it will compare articles representative of the traditional academic divisions of the hard sciences, soft sciences, and humanities to both test Wikipedia’s article quality in these three broad areas and provide further insight into how knowledge is textually represented in these meta–disciplines. Specifically, this study will address the following questions: How does the collaborative process behind a Wikipedia article function? What kinds of contributions are made to an article and its accompanying Talk page, and with what frequency, and how do these contributions relate to the quality of an article as a whole? Does the collaborative process differ amongst articles within the hard sciences, soft sciences, and humanities, respectively, and if so, how? Examining these processes will contribute to a fuller understanding of the workings of Wikipedia itself and provide a broader basis for evaluating the encyclopedia as an information resource.

 

++++++++++

2. Related literature

Growth of Wikipedia articles

Viégas, et al. (2004) created a software tool called history flow [2] to visualize the edit history of Wikipedia articles, allowing patterns to be discovered and examined more readily. Several common types of article vandalism are identified in this study, including mass deletion and offensive copy. Viégas, et al. also observed what they term the “first–mover advantage,” where the “initial text of a page tends to survive longer and tends to suffer fewer modifications than later contributions to the same page.” [3] They also noted the difficulty of examining a well–established Wikipedia article due to the vast number of edit changes that occur over time, and observed that the topic of an article can influence the type of contributions made to it.

Assessing article quality through comparison

To better understand Wikipedia’s quality as a reference resource, several comparisons have been made between this collaborative effort and traditional encyclopedias (e.g., Giles, 2005; Chesney, 2006) which suggest Wikipedia has a high level of information accuracy. Comparing 10 entries from Wikipedia with the Columbia Encyclopedia, Emigh and Herring (2005) observed that Wikipedia articles become increasingly formalized and standardized the more they are contributed to. Conversely, from closely examining two biographical entries on Wikipedia, Duguid (2006) observed that individual contributors do not necessarily have an overall sense of an article or intend to maintain an article’s balance when contributing to it, which poses an issue for each article’s overall coherence.

Edit history and article quality

Several studies (Brändle, 2005; Dondio, et al., 2006; Lih, 2004) indicate that an article with a topic of high relevance is also likely to be of high quality, due to the higher number of contributions from various authors. Similarly, Wilkinson and Huberman (2007) observed a strong correlation between the number of edits an article receives, the number of unique editors for a given article, and article quality. Additionally, through a quantitative analysis of a random sample of Wikipedia regular articles and featured articles, Stvilia, et al. (2005a) noted that featured articles tend to have better readability scores on average than typical Wikipedia entries. Collectively, these studies suggest an increasing quality of Wikipedia articles as the number of participants and editors to the site increases.

Using grounded theory, Pfeil, et al. (2006) categorized the types of contributions made to articles (e.g., clarify information, add information), providing a means of classifying changes observed in articles’ edit histories.

Talk pages’ relation to article quality

In addition to research suggesting a correlation between increasing numbers of collaborators to a given article and its quality, several studies link discussion of articles to article quality itself. From a content analysis of 30 randomly selected article discussion pages, Stvilia, et al. (2005b) identified 10 quality issues faced by the Wikipedian community, including accuracy, completeness, and consistency, indicating Wikipedians’ concern for ensuring article quality. Similarly, from coding discussions on a random selection of 25 Talk pages, Viégas, et al. (2007) provided 11 dimensions for the types of discussions that occur (e.g., requests/suggestions for editing coordination, references to Wikipedia guidelines and policies), which may be applied to analyzing discussion pages. Collectively, these studies point to the potential role of Talk pages in Wikipedia article quality. The interaction between a given Talk page and the article it accompanies, both in terms of edit history and article changes, however, has not yet been examined.

Disciplinary differences

There is an established tradition of grouping academic disciplines by similarities in approaches to the pursuit and presentation of knowledge. Biglan (1973) created a dimension of disciplines, with the “hard” sciences on one end, the social sciences towards the middle, and the humanities at the opposite end of the dimension. The hard sciences encompass the physical or natural sciences (such as biology, chemistry, and physics) and mathematics. In contrast, the “soft” sciences include the social sciences, such as anthropology and sociology, and psychology. Lastly, the humanities include studies of “human thought and culture” [4] such as history, philosophy, art, and literature (Becher, 1987; Biglan, 1973).

From a content analysis of 30 abstracts each from chemistry, psychology, and history journals, Tibbo (1992) found that scientific abstracts on average have shorter sentences than social science and humanities abstracts, which share the same mean of words per sentence. From qualitatively examining scientific textbooks for elementary and secondary school students, Halliday (1993) noted several common types of grammatical difficulties found in scientific texts, including interlocking definitions, technical taxonomies, and syntactic ambiguity. Likewise, Hartley, et al. (2004) found that, compared with social sciences and humanities scholarly articles, science articles on average have the lowest readability scores. However, also considering the readability of texts aimed at a wider variety of audiences — school textbooks, textbook chapters designed for colleagues, magazine articles written for high school students, and magazine articles written for the general public — scientific writing overall was more readable than social sciences and humanities texts, in spite of the fact that science texts often contain more passive word constructions.

 

++++++++++

3. Conceptual framework

Purpose

The current study focuses on the collaborative process behind the creation of Wikipedia articles and the relationship between article contributions and article quality. For the purposes of this study, collaboration is defined as “the process of shared creation: two or more individuals with complementary skills interacting to create a shared understanding that none had previously possessed or could have come to on their own.” [5] Collaboration includes the individual contributions, in the form of edits, made to each Wikipedia article, which together make up the article as a whole. Coordination is also considered an aspect of collaboration (Montiel–Overall, 2005), as Wikipedia users may coordinate editing activities that contribute to each article.

This study is motivated by Viégas, et al.’s (2004) assertion that it is difficult to examine a well–established Wikipedia article, where myriad edits have likely occurred over time; specifically, this observation highlights the value of examining a small sample of Wikipedia articles from their initial creation, when their edit history is not as extensive, to gain a detailed understanding of the collaborative process from its inception.

While this study is exploratory in nature, it builds upon several established methods of Wikipedia article and Talk page categorization and analysis (Brändle, 2005; Pfeil, et al., 2006; Stvilia, et al., 2005b; Viégas, et al., 2007) and seeks to extend the observations of the consensus–building and collaborative processes that occur during the evolution of a given Wikipedia article (Brändle, 2005; Duguid, 2006; Emigh and Herring, 2005; Lih, 2004; Stvilia, et al., 2005a, 2008; Wilkinson and Huberman, 2007). It also builds upon the theory of disciplinary differences (Becher, 1987; Biglan, 1973), exploring if and how the collaborative processes for the hard sciences, soft sciences, and humanities articles differ from each other, what the effect on article quality is in each case, and how Wikipedia articles evolve over time. Although this exploratory study uses only a relatively small sample of articles, the methodology employed here could be used in future studies with larger samples to provide a stronger basis for generalizations about article evolution in Wikipedia.

Variables

This study examines the relationship between article contributions and:

  • article quality; and,
  • article subject discipline.

Article contribution is defined as an edit made to a given Wikipedia article, including the initial “edit” that created the article. Wikipedia provides the edit history for each entry, indicating what changes were made, when, and by whom. Individual contributors may be identified either by username or IP address. Article quality is an assessment of the overall value of the content and form of an article, according to the criteria discussed below. Article subject discipline refers to an article’s positioning within the hard sciences, soft sciences, or humanities, as defined above in the review of the related literature.

This study also explores the relationship between Talk page contributions and:

  • article contributions;
  • article quality; and,
  • article subject discipline.

A Talk page is a Web page that accompanies each Wikipedia article where users may post discussion about that article.

 

++++++++++

4. Method

4.1. Design and data collection instruments

The exploratory method employed in this study included observing article and Talk page contributions as well as article quality. The total number of edits and unique editors for each article and Talk page was also noted.

Article contributions

Drawing upon Pfeil, et al.’s (2006) 13 categorizations of the types of contributions made to Wikipedia articles, article edits were classified as: (1) Add information (including photographs, diagrams, audio, or other media related to the topic); (2) Add link; (3) Clarify information (rewording existing content without adding anything new); (4) Delete information; (5) Delete link; (6) Fix link; (7) Format (affecting structure of the whole page); (8) Grammar; (9) Mark–up language (changes that do not affect the page’s appearance or text); (10) Reversion (reversal of vandalism or a return to a previous article state); (11) Spelling; (12) Style/topography (affecting the presentation/appearance of text); and, (13) Vandalism.

Talk page contributions

Contributions to article Talk pages were categorized according to Viégas, et al.’s (2007) 11 dimensions:

  • Requests/suggestions for editing coordination;
  • Requests for information (related to the article but without the explicit intention to edit the article itself);
  • References to vandalism;
  • References to Wikipedia guidelines and policies;
  • References to internal Wikipedia resources;
  • Off–topic remarks;
  • Polls (where users vote on editing decisions);
  • Requests for peer review;
  • Information boxes (e.g., tagging a page as part of a WikiProject);
  • Images; and,
  • Other (contributions that do not fit in the above categories).

With the exception of edits classified as Requests for information or Off–topic remarks, Talk page contributions were then further classified according to Stvilia, et al.’s (2005b) 10 categories of quality issues faced by Wikipedians, where applicable, to gain a more detailed sense of the quality issues central to each article’s development (see Table 1). These categories of contributions, both to Wikipedia articles and to their accompanying Talk pages, are not mutually exclusive. For example, a user may add a paragraph of information to an article that also contains links, and therefore this contribution is classified under two different categories.
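The non-exclusive coding described above can be sketched as a simple multi-label tally. This is only an illustration: the four coded edits are hypothetical, and the choice of denominator (all coded contributions rather than all edits) is an assumption, since the study does not spell out how its percentages were normalized.

```python
from collections import Counter

# Hypothetical coded edits (not data from the study): each edit carries
# every contribution type that applies to it.
coded_edits = [
    {"Add information", "Add link"},  # a new paragraph that also contains links
    {"Spelling"},
    {"Format", "Add information"},
    {"Add link"},
]


def tally_contribution_types(edits):
    """Count how often each contribution type occurs across all edits."""
    counts = Counter()
    for categories in edits:
        counts.update(categories)
    return counts


def percent_by_type(edits):
    """Share of each type among all coded contributions (assumed denominator)."""
    counts = tally_contribution_types(edits)
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


counts = tally_contribution_types(coded_edits)
shares = percent_by_type(coded_edits)
```

Because one edit can fall under several categories, the tallied contributions (six here) can exceed the number of edits (four).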

 

Table 1: Information quality issues faced by Wikipedians.
Note: Adapted from Stvilia, et al., 2005b, pp. 8–9.
Problem types | Caused by
Accessibility
  • Language barrier
  • Poor organization
  • Policy restrictions imposed by copyrights, Wikipedia internal policies, and automation scripts.
Accuracy
  • Typing slips
  • Low language proficiency
  • Changes in real world states
  • Wording excluding alternate point of view
  • Differences in culture/language semantics
  • Garbled by software
  • Conflicting reports of factual information.
Authority
  • Lack of supporting sources
  • Lack of academic scrutiny of the sources
  • Known bias of the source
  • Unfounded generalization.
Completeness
  • Existence of multiple perspectives
  • Unbalanced coverage of different perspectives
  • Difference between encyclopedia article genre and genre from which text imported.
Complexity
  • Low readability
  • Complex language.
Consistency
  • Using different vocabulary for the same concepts within the article or within the collection
  • Using different structures and styles for the same type of articles
  • Non–conformance to suggested style guides.
Informativeness
  • Content redundancy.
Relevance
  • Adding irrelevant content or that is outside scope of the article.
Verifiability
  • Lack of references to original sources
  • Lack of accessibility of original sources.
Volatility
  • Lack of stability due to edit wars and vandalism.

 

Article quality

Article quality was gauged through both qualitative and quantitative observations. Specifically, building upon Brändle’s (2005) criteria, article quality was measured through noting the following:

Richness:

  • W–questions (Does the article answer who, what, when, where, and why?);
  • Lead section (Does the article have an introduction which situates the topic?);
  • Background section (Does the article have a section explaining the topic’s relevance and history?);
  • Degree of cross–linking (How many internal and external links does the article contain?);
  • Transparency (How many references are contained in the article? Is all information properly cited?);
  • Number of media objects (Including images, charts, and graphs);
  • Layout/structuring (Including number of subheadings and lists);
  • Comprehensibility (Readability);
  • Size (Length of article); and,
  • Bibliography (How many items are suggested for “Further Reading?”).

Neutrality:

  • Objectivity (Are viewpoints stated as fact?);
  • Diversity (Number of different viewpoints); and,
  • Balance (Balance of viewpoints).

Textual features

In measuring comprehensibility (richness), the readability of each article was calculated on the current version of each article on the final day of data collection. Headings, subheadings, table of contents, bulleted lists, references, see also links, external links, captions accompanying images, and text from charts or information boxes contained in a given article were removed from each article’s text so as not to impact the readability results. For example, a bulleted point or a listed source is indistinguishable from a sentence to the readability software and therefore has the potential to make the overall result of the test inaccurate. Additionally, as not all of the articles contained bulleted lists, a table of contents, or references, for example, removing these features ensured consistency with this readability measure, as all articles were prepared in the same way (Hartley, et al., 2004). These article features were also excluded from the word count for each article.

Readability scores were calculated using both the Flesch Reading Ease and the Flesch–Kincaid Grade Level tests. The results of these tests were used to complement one another, as the Flesch Reading Ease provides a measure of each article’s textual difficulty compared with the “plain” English standard (Wright, 2007) while the Flesch–Kincaid Grade Level test provides a corresponding grade level to compare to the average reading level of Americans.
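For reference, the two tests rest on the same inputs: average sentence length and average syllables per word. The sketch below uses the standard published formulas. The syllable counter is a rough vowel-group heuristic introduced here for illustration and is not the method used by the readability software in this study, so its scores will differ somewhat from those of a commercial tool.

```python
import re


def count_syllables(word):
    """Rough estimate: count groups of consecutive vowels (a heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text):
    """Return (sentences, words, syllables) for a plain-text passage."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(sentences, words, syllables):
    """Higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(sentences, words, syllables):
    """Approximate U.S. school grade level needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

Both formulas penalize long sentences and polysyllabic words, which is why list items and references were stripped before scoring: a fragment counted as a sentence would distort the words-per-sentence term.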

The average number of words per sentence and sentences per paragraph was calculated to provide a means of comparison between articles to complement the readability results. Further, as passive sentences are often cited as a hindrance to reading ease (Hartley, et al., 2004), the number of passive sentences per article was also noted to complement the readability scores for each article.

In measuring the first–mover advantage (Viégas, et al., 2004), the text of the original version of each article was compared with its current version on the final day of data collection. Each word from the original article (including headings, captions, text from charts, links, and citations/sources) that still remained in the current version of the article was noted, as were any surviving images, charts, or illustrations. An automated word count was used to calculate the total number of words in the original article text. The original text enduring in the current version of the article was then isolated from the remainder of the original article text and counted in the same way. The percentage of original text remaining in the current version was calculated as the ratio of the number of original words remaining to the total word count of the original article text.
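The surviving-text ratio can be approximated programmatically. The sketch below is an assumption-laden stand-in for the manual comparison described above: it uses Python's `difflib.SequenceMatcher` to find original words that persist, in order, in the current version. The function name is hypothetical, and an in-order match will undercount text that survives but has been rearranged.

```python
from difflib import SequenceMatcher


def surviving_original_percent(original_text, current_text):
    """Percent of original words that persist, in order, in the current text."""
    original_words = original_text.split()
    current_words = current_text.split()
    if not original_words:
        return 0.0
    # autojunk=False keeps frequent words (e.g. "the") from being ignored.
    matcher = SequenceMatcher(a=original_words, b=current_words, autojunk=False)
    surviving = sum(block.size for block in matcher.get_matching_blocks())
    return 100 * surviving / len(original_words)
```

For example, if four of six original words remain, the function returns roughly 66.7; if all original words remain with new text inserted around them, it returns 100.0.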

4.2. Population and sample

The sample for this study consisted of nine Wikipedia articles (see Table 2). Using the random article generator on the site, three pre–existing articles were randomly selected from the English Wikipedia, one each from the hard sciences, soft sciences, and humanities: “Pneumocystis pneumonia,” “Structural functionalism,” and “Aestheticism,” respectively.

 

Table 2: Articles selected for sample.
Academic discipline | Article topic | Subject | Date of first article entry
Soft sciences | Structural functionalism | Sociology | September 2003
Soft sciences | Dysrationalia | Psychology | February 2007
Soft sciences | Challenging behaviour | Psychology | February 2007
Hard sciences | Pneumocystis pneumonia | Medicine | October 2006
Hard sciences | Flying frog | Biology | February 2007
Hard sciences | Attelabidae | Biology | February 2007
Humanities | Aestheticism | Art and Literature | February 2003
Humanities | Romantic hero | Literature | February 2007
Humanities | Cesare Cremonini (Philosopher) | Philosophy | February 2007

 

Registering under the user name of “PrimarySource,” these researchers initiated three Wikipedia entries in these subject areas, hereafter referred to as experimental articles. Topics were selected from the requested articles page of the site (“Requested articles,” 2007) and comprised “Dysrationalia,” “Flying frog,” and “Romantic hero.” These article topics were not selected randomly, but instead were chosen based on the likelihood that they would generate enough interest to be well contributed to. Various sources were consulted in researching each topic and a brief, footnoted entry was made for each article, comprising a single paragraph. The length of each entry was designed to encourage other users to expand on the article, as not all key details on each topic were included. Three articles were also randomly selected from the page listing newly created articles (“New pages,” 2007) from the same date on which the experimental entries were created: “Cesare Cremonini (Philosopher),” “Attelabidae,” and “Challenging behaviour.”

4.3. Procedure

Data was collected over a five–month period, from the beginning of February 2007 to the end of June 2007. Data collection was facilitated by the edit history for each of the nine articles and Talk pages, which tracks each contribution made. The complete edit history of the pre–existing, randomly selected articles and Talk pages was also used for data collection and therefore included edits made prior to February 2007.

Data was gathered using the criteria outlined above to record and measure change to the articles over time. At the end of this five–month period, each article was tested for readability and textual features and examined according to the richness and neutrality criteria previously outlined. Each article’s data was then analyzed individually and comparisons made among the same subject disciplines and across subject disciplines.

 

++++++++++

5. Results

5.1. Observations of article contribution and evolution patterns

Among the sample sets in this study, consistencies were found among the subject disciplines in most frequent types of contributions made and the endurance of the original article text. Patterns were also discovered in the editing activity of the initial article contributor.

Features of article contributors and article edits

The mean number of edits per article was 40.3 and the median 14.0, while the mode was seven edits (occurring in two of the articles begun in February 2007). The gap between these measures indicates considerable variability in the number of edits each article received. Indeed, the standard deviation of 47.7 edits and the range of 122 edits highlight the extent of this variability, which is likely due to the small sample size of articles in this study.

Comparing results between the subject disciplines, the humanities articles had the highest average number of unique article contributors with a mean of 34, with the soft sciences falling closely behind with a mean of 32.7, while the hard sciences had the lowest average number of unique article contributors with a mean of 8.7, still falling within the range of standard deviation.

On average, the humanities articles received the highest number of article edits with a mean of 54. The soft sciences followed with 49.7 edits, while the hard sciences articles received the lowest number of edits at 17.3. It is important to note, however, that both the humanities and soft sciences data sets each contained an article — “Aestheticism” and “Structural functionalism,” respectively — with an exceptionally high number of editors and edits relative to the remaining articles in each set, raising each article’s respective subject discipline average for this measure. These two articles were initially begun in 2003 and therefore have had a greater opportunity to receive contributions than the other articles in the data sample.

Interestingly, the three experimental articles begun in February 2007 had a mean of 7.7 edits by the end of the data collection period, compared with a mean of 21.7 edits for the sample of articles also created on the same day. The low edit count of the experimental articles may be accounted for by the relative obscurity of these topics. That is, given the fact that there was a request for these articles to be written, it may be presumed that no Wikipedia contributor felt familiar enough with these topics to contribute. It may also be that these articles had simply not generated enough interest to draw others to read them and contribute to them significantly.

Repeat contributors

The humanities article set had the highest average proportion of repeat contributors (editing a given article two or more times), with a rate of 43.8 percent compared with 20.3 percent and 14.5 percent for the soft sciences and hard sciences, respectively. However, two of the articles from the humanities received extreme scores on this measure relative to the rest of the articles in the sample. Specifically, “Cesare Cremonini (Philosopher)” had 43.8 percent repeat contributors and “Romantic hero” had 40 percent repeat contributors — more than double the rate for the other articles. On the other hand, with a rate of 18.5 percent, “Aestheticism” was consistent with the other disciplines’ average.
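The repeat-contributor rate for a single article can be sketched from its edit history as follows. The edit log here is hypothetical (an IP address stands in for an unregistered editor), and the function name is introduced only for illustration.

```python
from collections import Counter


def repeat_contributor_rate(edit_authors):
    """Percent of unique editors who edited the article two or more times."""
    edits_per_editor = Counter(edit_authors)
    if not edits_per_editor:
        return 0.0
    repeaters = sum(1 for n in edits_per_editor.values() if n >= 2)
    return 100 * repeaters / len(edits_per_editor)


# Hypothetical edit history: one entry per edit, in chronological order.
history = ["UserA", "192.0.2.7", "UserA", "UserB", "UserC", "UserB"]
rate = repeat_contributor_rate(history)
```

In this made-up history, two of the four unique editors return to the article, so the rate is 50 percent. The denominator is unique editors, not edits.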

Frequency of contribution types

From examining the frequency of the contribution types identified by Pfeil, et al. (2006), several patterns of Wikipedia article contribution emerged. As indicated in Table 3, among the humanities articles, Add link was the most common type, followed by Add information, Format, and Spelling. Similarly, for the soft sciences articles, Add link was the most common editing contribution, again followed by Add information. The third most common contribution type in the soft sciences (Fix link) differed from that in the humanities, while the fourth (Spelling) was the same. Finally, the hard sciences shared the same top three contributions as the humanities (Add link, Add information, and Format), while the fourth most common contribution in this set was Style/topography. Across all articles, as Figure 1 illustrates, Add link, Add information, Format, and Spelling were the top contribution types.

 

Table 3: Average frequency (percent) of article contribution type per subject discipline and across subject disciplines (ranked in descending order by results across subject disciplines).
Contribution type | Hard sciences | Soft sciences | Humanities | Across subject disciplines
1. Add link | 31.3 | 32.5 | 31.5 | 31.8
2. Add information | 24.9 | 21.4 | 15.7 | 20.7
3. Format | 8.2 | 7.0 | 10.1 | 8.4
4. Spelling | 2.7 | 7.8 | 8.6 | 6.4
5. Style/topography | 7.3 | 5.9 | 4.4 | 5.9
6. Clarify information | 3.2 | 6.0 | 7.9 | 5.7
7. Fix link | 3.9 | 8.4 | 4.6 | 5.6
8. Vandalism | 3.7 | 4.1 | 4.8 | 4.2
8. Delete information (tied with Vandalism) | 5.7 | 2.3 | 4.7 | 4.2
9. Reversion | 3.7 | 3.1 | 5.6 | 4.1
10. Grammar | 2.5 | 1.4 | 2.7 | 2.2
11. Mark–up language | 1.5 | 0 | 3.4 | 1.6
12. Delete link | 1.5 | 0 | 1.7 | 1.1
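The final column of Table 3 appears to be the unweighted mean of the three discipline averages. The short check below tests that reading against the table's own figures, taking the soft sciences value for Mark-up language and Delete link to be 0. The equal-weighting interpretation is an inference from the numbers, not something the study states.

```python
# Table 3 rows: (hard sciences, soft sciences, humanities, published "across").
TABLE_3 = {
    "Add link": (31.3, 32.5, 31.5, 31.8),
    "Add information": (24.9, 21.4, 15.7, 20.7),
    "Format": (8.2, 7.0, 10.1, 8.4),
    "Spelling": (2.7, 7.8, 8.6, 6.4),
    "Style/topography": (7.3, 5.9, 4.4, 5.9),
    "Clarify information": (3.2, 6.0, 7.9, 5.7),
    "Fix link": (3.9, 8.4, 4.6, 5.6),
    "Vandalism": (3.7, 4.1, 4.8, 4.2),
    "Delete information": (5.7, 2.3, 4.7, 4.2),
    "Reversion": (3.7, 3.1, 5.6, 4.1),
    "Grammar": (2.5, 1.4, 2.7, 2.2),
    "Mark-up language": (1.5, 0.0, 3.4, 1.6),
    "Delete link": (1.5, 0.0, 1.7, 1.1),
}


def across_disciplines(hard, soft, humanities):
    """Unweighted mean of the three discipline averages, to one decimal."""
    return round((hard + soft + humanities) / 3, 1)


# Rows whose recomputed mean disagrees with the published value.
mismatches = {
    name: (across_disciplines(*row[:3]), row[3])
    for name, row in TABLE_3.items()
    if abs(across_disciplines(*row[:3]) - row[3]) > 1e-9
}
```

Under this reading every row reproduces the published figure, so each discipline contributes equally to the overall average regardless of how many edits its articles received.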

 

 

Figure 1: Frequency of article contribution types across subject disciplines.

 

The ‘First–mover’ effect on article content

From observing the proportion of the initial article text remaining in the final version of each article on the last day of data collection, a considerable amount of the initial article text remains, a finding consistent with Viégas, et al.’s (2004) observation of the first–mover advantage. The hard sciences articles had an average of 98.1 percent of the initial text remaining, while the soft sciences averaged 96.1 percent and the humanities 76.7 percent. In the humanities set, however, the final version of the “Cesare Cremonini (Philosopher)” article only contained 35.9 percent of the original article text, reflecting the substantial changes that occurred to it from the original version, and also representing an extreme value which served to lower the overall average of this set. The remaining humanities articles averaged 96.9 percent of surviving original article text, a result more consistent with the other subject disciplines’ averages.

It is important to note that both the “Structural functionalism” and “Pneumocystis pneumonia” articles began as redirects to other articles. For the purposes of this analysis, the point at which each article had its own text was considered the original text, rather than simply the redirect itself, and the Wikipedian who provided this text was considered the article’s initial creator, although the redirects are included in the total edit counts for each article.

Across the sample sets an average of 90.3 percent of the initial Wikipedia article text remained over time. A relationship was found between the age of a given article and the proportion of its original text remaining. Specifically, in each of the three experimental articles, all initial article text remained at the end of data collection. Similarly, for the “Attelabidae” and “Challenging behaviour” articles, also begun in February 2007, all original text remained. The “Cesare” article, however, was an exception; also begun at the same time, its original text was reduced considerably.

While the newer articles in the sample had on average 89.3 percent of the original article text remaining (and 100 percent when the “Cesare Cremonini” result is omitted), the older articles — “Structural functionalism,” “Aestheticism,” and “Pneumocystis pneumonia” — averaged 92.1 percent. From comparing the original text with the remaining text in each of the articles in this sample, it is evident that the initial text that does remain undergoes little (if any) modification from its original version. In several instances, minor changes such as spelling and grammar were made to the original text, but most sentences remained in their entirety, with their original order preserved. Interestingly, while 93.8 percent of the initial text of the “Aestheticism” article remains, several of the original sentences have been broken up and rearranged, and in some instances a considerable amount of text inserted between sentences as compared to the original. In all instances, however, the initial text of an article appears to function as the article’s backbone, providing a necessary framework for future article contributors to build on over time.

Observed patterns of range of contribution types per edit

In the majority of articles in the sample (with the exception of “Aestheticism” and the three experimental articles), initial article creators performed a series of edits on the day the article was created and/or in the several days following, with edits spanning a range of contribution types. This suggests that the initial creation of an article tends to span a series of edits, rather than comprising a single edit. Further, the initial article creator him or herself performs a range of contribution types, in contrast to the bulk of Wikipedia editors, who only perform one type of editing activity per contribution.

 

Table 4: Average frequency of range of contribution types per edit (as percent of total number of edits).

| | 1 contribution type | 2 contribution types | 3 or more contribution types |
|---|---|---|---|
| Humanities | 64.0 | 17.7 | 18.3 |
| Soft sciences | 72.3 | 15.7 | 12.0 |
| Hard sciences | 55.7 | 20.3 | 24.0 |
| Average across subject disciplines | 64.0 | 17.9 | 18.1 |

 

In each subject discipline, edits consisting of only one contribution type were the most common (see Table 4). For both the humanities and hard sciences, article edits consisting of three or more contribution types were the second most common, whereas for the soft sciences, edits of two contribution types were the second most common. In the case of edits of three or more contribution types, 75 percent in the hard sciences sample were made by repeat editors, as were 70 percent in both the humanities and soft sciences sets. For the hard sciences, repeat editors averaged a range of 5.7 contribution types, as compared with a range of 4.1 in the soft sciences and 3.5 in the humanities.
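
The cross-discipline averages and the rankings described above can be checked directly from the Table 4 values. The short sketch below assumes that the “Average across subject disciplines” row is an unweighted mean of the three discipline values, which is consistent with the published figures; the variable names are illustrative.

```python
# Percentages transcribed from Table 4: share of edits with 1, 2, or 3+ contribution types.
table4 = {
    "Humanities": (64.0, 17.7, 18.3),
    "Soft sciences": (72.3, 15.7, 12.0),
    "Hard sciences": (55.7, 20.3, 24.0),
}
labels = ("1 type", "2 types", "3 or more types")

# Unweighted mean of the three discipline values in each column.
column_means = [round(sum(col) / len(col), 1) for col in zip(*table4.values())]

# Label of the second most common range within each discipline.
second_most_common = {
    name: labels[sorted(range(3), key=lambda i: row[i], reverse=True)[1]]
    for name, row in table4.items()
}

print(column_means)
print(second_most_common)
```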

Isolating the edits consisting of two contribution types, the most frequent combination in all of the subject disciplines was Add information and Add link, a finding consistent with the top two types of editing contributions to Wikipedia itself, as mentioned above. The majority of links added to the articles in this sample were “wikifications” of words contained in the article (that is, links from a word to another Wikipedia page describing it) or “See also” links to Wikipedia articles related to the article topic at hand. Other common editing combinations discovered include Format and Add information, Clarify information and Grammar, and Spelling and Format.

5.2. Talk page contributions

Among the subject disciplines in this study, Talk pages were most commonly used to request and/or suggest editing coordination and the most common quality issue was article Completeness. Further, a relationship existed between discussion of a particular article concern and editing of a given article to address the issue at hand.

Discussion contribution types

From observing the Talk pages of each of the articles according to the dimensions provided by Viégas, et al. (2007), Requests/suggestions for editing coordination was the most common type of discussion posting in each of the subject disciplines, while Information boxes (including various WikiProject tags, requests for article expansion, and an image request) was the second most common type (see Table 5).

 

Table 5: Average frequency of talk page edits of each contribution type across subject disciplines (ranked by frequency of occurrence).

| Type of contribution | Percent of occurrences relative to total number of Talk page edits |
|---|---|
| Requests/suggestions for editing coordination | 57.2 |
| Information boxes | 35.4 |
| Other | 5.6 |
| Requests for information | 0.9 |
| References to internal Wikipedia resources | 0.9 |
| Off–topic remarks | 0 |
| Polls | 0 |
| Requests for peer review | 0 |
| Images | 0 |
| References to vandalism | 0 |
| References to Wikipedia guidelines and policies | 0 |

 

Requests for information and References to Wikipedia guidelines and policies also occurred in the humanities article set, on the “Cesare Cremonini (Philosopher)” discussion page. The Request for information concerned a user’s request for more details on an Aristotelean theory mentioned in the article, while the Reference to Wikipedia guidelines was a mention of the “Wikimedia Commons” — a portion of the site that hosts media provided by Wikipedians to use in the encyclopedia — where it was suggested that a user upload an image she felt might be appropriate for the article. The Other category was applied to several Talk page contributions in both the hard sciences and soft sciences article sets. In the case of “Structural functionalism,” these edits included vandalism (where a user deleted a previous post), a reversion of this deletion, and an explanation of information contained in the article itself. On the “Pneumocystis” Talk page, the Other category was applied to an edit which redirected the “Talk:Pneumocystis pneumonia (PCP)” page to “Talk:Pneumocystis pneumonia,” to coincide with the renaming of the article itself.

Article quality issues raised

Observing the quality issues raised by each discussion post, in accordance with Stvilia, et al.’s (2005b) categories of information quality issues, Completeness of the article was the primary concern among each of the subject disciplines in this sample (see Table 6). For the soft sciences, this issue was followed by Accessibility, while Authority and Consistency concerns were also raised. For the hard sciences, Consistency was the second most frequent issue raised, followed by Accessibility and Accuracy. Finally, in the humanities discussion, Accuracy was the second most common concern raised, followed by Accessibility and Complexity. Combining the results among the articles in this sample, Completeness was followed by Accessibility and Accuracy as the main quality issues. No observable difference was found between the nature of discussions occurring on the humanities, soft sciences, and hard sciences Talk pages.

 

Table 6: Average frequency of talk page edits of each quality issue type across subject disciplines (ranked by frequency of occurrence).

| Type of issue raised in contribution | Percent of occurrences relative to total number of Talk page edits |
|---|---|
| Completeness | 69.9 |
| Accessibility | 10.7 |
| Accuracy | 9.1 |
| Consistency | 5.7 |
| Authority | 1.6 |
| Complexity | 1.6 |
| Informativeness | 0 |
| Relevance | 0 |
| Images | 0 |
| Verifiability | 0 |
| Volatility | 0 |

 

Interaction between Talk page discussion and article evolution

In each subject discipline relationships were found between Talk page discussion and article contributions. For example, turning first to the humanities set, a user tagged the “Aestheticism” Talk page with a suggestion that this article be merged with the “Aesthetes” article and one week later, another Wikipedian agreed with this suggestion. More than two months after the merger proposal, another Wikipedian completed the merger, justifying the action by noting that the motion was “unopposed” on the discussion page. Also on this discussion page, another user remarked on a wiki link he added to the article that day and wondered if the statement containing the link was accurate. Eleven days later, a fellow Wikipedian deleted the entire sentence from the article, citing it as “unverifiable nonsense.” Further, a “Wikibot” automatically deleted the category of “Isms” listed in this same article, citing a Wikipedia discussion page where the usefulness of this category was hotly debated and where it was eventually decided the category would be removed (“Wikipedia:Categories,” 2007). It was removed from the article four hours after the debate came to a conclusion. Similarly, on the “Cesare Cremonini” discussion page, a user provided links to possible image choices and a fellow Wikipedian contributed an image to the article within the same day.

Just as in the humanities articles, the hard sciences sample also shows a relationship between Talk page contributions and article contributions. Within hours of the “Attelabidae” Talk page being tagged as part of the WikiProject on arthropods, and another Wikipedian ranking the article as “stub–class,” eight edits occurred to the article, six of which were from the article’s initial creator. Likewise, a contributor on the “Pneumocystis pneumonia” Talk page called on those with expertise to fix inconsistencies in the symptoms listed in the article. Three hours later a fellow Wikipedian edited this section of the article and the next day, the person who had made the request also edited it to further address the issue raised on the Talk page.

Several of the soft sciences’ Talk page discussions also coincide with article editing activity, most evidently in the “Structural functionalism” discussion. Between November 2005 and February 2007, four posts discussed the importance of including a particular social theorist (Parsons) in the article. After the final post, where a Wikipedian noted that the article was incomplete without this theorist, this individual added a lead section to the article including a mention of Parsons. Overall, the link between Talk page discussion and article editing was somewhat delayed, given the time lag in this discussion; however, the final post on this topic and editing of the article occurred on the same day. In another instance, one post suggested an article explaining the differences between structural functionalism and structuralism be written to alleviate confusion. Six days after this post, a “See also” link to the Wikipedia article “Functionalism (Sociology)” was added to the article, which may be viewed as a response to the issue the Talk page poster raised. Similarly, in the “Challenging behaviour” discussion page, the article’s initial creator posted a call for contributions, after making a series of edits herself that same day. This request coincides with two edits made to the article just over one week later, one of which added information to the article.

However, not all of the relationships between Talk page discussion of an article and its editing resulted in positive editing activity. Specifically, two instances of article vandalism coincided with Talk page discussion, both related to the “Structural functionalism” article. In the first case, an anonymous user deleted the last post to the Talk page, having also vandalized the article just five minutes prior, deleting several sentences over two edits. In another instance, an anonymous user maliciously deleted several sections of the article, including one specifically discussed on the Talk page just two days prior.

In other instances, within each of the subject discipline samples, several of the issues raised in the Talk page discussions did not directly translate into article editing activity. For instance, the requests made by the researchers for article expansion on the experimental articles’ discussion pages did not elicit responses. Similarly, on the “Pneumocystis” Talk page, one Wikipedian called for consistency between terms used in the article; however, at the end of the data collection period, this issue remained unaddressed by contributors. Likewise, in an “Aestheticism” discussion post in October 2006, a Wikipedian presented a concern with a parallel made in the article between aestheticism and symbolism. As of the end of the data collection period, this issue had not received any response on the Talk page or been addressed in contributions to the article itself.

5.3. Article quality

Applying Brändle’s (2005) criteria for richness and neutrality in article quality, in addition to the readability tests on the articles, this study revealed an inconsistent quality level among the Wikipedia articles. This suggests that article quality in this collaborative encyclopedia is primarily dependent upon the quality of contributions provided, rather than upon the quantity of contributions. As a result, article quality itself varied across articles in the sample (see Table 7).

Richness

Cross–linking

The humanities articles had the highest degree of cross–linking for internal links to the article from other Wikipedia pages, Wikipedia links contained within the article itself, and links to external sites, while the soft sciences had the lowest degree of cross–linking for each of these dimensions. Undoubtedly, the 31 external links in the sources listed in the “Cesare Cremonini” article contributed to the humanities articles’ high average of 13 external links relative to the hard and soft sciences, which averaged 1.7 and 1.3 external links, respectively. It should also be noted that each of the subject discipline samples contained an article with an unusually high degree of internal cross–linking relative to the other two articles in each respective sample (“Structural functionalism,” “Pneumocystis pneumonia,” and “Aestheticism”). These articles are the “oldest” in the sample and this result therefore suggests a relationship between article age and the degree of internal linking of articles. Considering the average results across the disciplines, a high degree of cross–linking was prevalent in these articles; however, these links were primarily to other Wikipedia pages rather than to external sites. Wikipedians, then, appear to place a higher priority on linking to other Wikipedia resources rather than to external sites, with the wikification of words within a given article text accounting for the majority of these links.

Transparency

The hard sciences articles in this sample had the highest average number of cited sources, while the soft sciences articles cited the lowest number of sources overall. However, from the hard sciences sample, the “Pneumocystis” article alone cited 31 sources, compared with just one for “Attelabidae” and three for “Flying frog,” therefore contributing to the subject discipline’s high average relative to the other sample sets. Similarly, while the humanities had an average of eight cited sources, “Aestheticism” contained none at all. Further, both the humanities and soft sciences samples contained a high proportion of “uncited” text. Overall, sources were not consistently cited in the articles in this sample, suggesting that transparency remains a key issue for Wikipedia articles.

 

Table 7: Averages of quantitative aspects of article richness.
Note: *Percentage of presence in articles is provided
(e.g., a table of contents in two out of three articles = 66.7 percent occurrence rate).
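
| | Hard sciences | Soft sciences | Humanities | Across subject disciplines |
|---|---|---|---|---|
| Degree of cross–linking: | | | | |
| • Internal links to | 49.7 | 26.7 | 71 | 49.1 |
| • External links | 1.7 | 1.3 | 13 | 5.3 |
| • Wiki links | 28 | 18 | 46 | 30.7 |
| Total | 79.4 | 46 | 130 | 85.1 |
| Transparency: cited sources | 11.7 | 3.7 | 8 | 7.8 |
| Number of media objects | 3.3 | 0 | 1 | 1.4 |
| Layout/structuring: | | | | |
| • Headings | 5 | 4.7 | 4.3 | 4.7 |
| • Subheadings | 0 | 0 | 1.7 | 0.6 |
| • *Table of contents | 33.3 | 66.7 | 66.7 | 55.6 |
| • *List(s) | 33.3 | 33.3 | 33.3 | 33.3 |
| Comprehensibility: non–linked foreign words or technical terms | 12.3 | 6.7 | 7 | 8.7 |
| Size (words) | 715.7 | 613.7 | 638 | 655.8 |
| Items listed in bibliography | 0 | 3.3 | 4 | 2.4 |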

 

Media objects

The hard sciences articles in this sample contained the highest number of media objects, while the soft sciences articles did not contain any. Media contained in the hard sciences articles included images, a diagram, and information boxes used to complement the information presented in each article. Specifically, charts were used to list scientific classifications and taxonomies (referred to as a “taxobox” by Wikipedians), while the diagram (in “Attelabidae”) visually represented the link between families of species and the Attelabidae’s place in this chain. Similarly, media in the humanities articles also included images and an information box (from “Cesare Cremonini,” using the philosopher “infobox” template).

Layout and structuring

Each subject discipline contained about the same average number of headings, while only the humanities articles also made use of subheadings. Tables of contents were used only in articles with five or more headings, likely to make navigation of these articles easier. Lists were used in only one of three articles in each of the samples and seem to be applied according to a contributor’s preference in the presentation of text.

Length

The hard sciences articles had the highest average length, with a mean of 715.7 words per article, followed by the humanities with a mean of 638 words and the soft sciences with a mean of 613.7 words. Within each discipline, however, there was a considerable range. For example, “Pneumocystis pneumonia,” with 1,866 words, and “Structural functionalism,” with 1,362, were the lengthiest of the articles; “Aestheticism,” with 725 words, and “Challenging behaviour,” with 292, ranked in the middle; and the shortest were “Flying frog,” with 159 words, and “Attelabidae,” with 122.

Bibliography

Of the nine articles in the sample, only two contained a bibliography listing further materials on the topic. “Structural functionalism” lists ten items, three of which are cited in the article itself. “Cesare Cremonini” includes both a concise and an extended bibliography of eleven items. From an act of vandalism, the “Structural functionalism” article’s list of references was removed in November 2005 and not restored for over two months in spite of numerous edits occurring to this article during that time frame and a request from a user on the Talk page for the list to be restored. In December 2006, these sources were again removed from the article and not restored until February 2007. These instances, combined with the overall observation of transparency in the articles in this sample, suggest that sources — either provided through citations or bibliography — are not of a high priority in terms of Wikipedia article development.

W–questions

Among the articles in this sample there was no consistency in answering who, what, when, where, and why questions. For example, while each of the articles in the humanities data set answered all five w–questions, only one article in each of the hard sciences and soft sciences sets did so. For the remaining articles, several of these questions were only answered partially. For example, as of the final day of data collection, “Structural functionalism” did not state when the theory was first developed or where. Similarly, the “Flying frog” article did not answer when this species was first discovered.

Lead section

All articles in the sample contained a lead section, which seems to be a standard Wikipedian convention. However, merely having this section did not guarantee a high–quality introduction to a given topic, as “Structural functionalism” indicates. This article’s lead paragraph, with such circular and dependent definitions as “a social systems paradigm is a sociological paradigm,” is likely to leave readers confused rather than enlightened.

Background section

All the articles in the humanities set contained a background section, while only two of the three articles in each of the soft and hard sciences set contained this section. Like the lead section, the background sections in this sample also ranged in quality. For example, the background section of “Romantic hero” did not fully explain the concept’s relevance, while the history in the “Structural functionalism” background section was difficult to follow and the topic’s relevance was not clearly stated.

Neutrality

Objectivity

The objectivity of each article was determined by assessing whether viewpoints are acknowledged as such or stated as fact. Specifically, the articles were examined for whether the information presented was supported through citation of sources, or direct quoting of sources, and whether or not the overall tone of the piece was neutral. For example, generalizations that were not supported by facts or cited with sources were considered as viewpoints presented as fact, such as in the “Cesare Cremonini” article where a Wikipedian declared Cremonini as one of the greatest philosophers of his time. In general, all the articles in the sample presented the topic information in a neutral tone. While both the humanities and soft sciences articles contained portions where viewpoints were presented as fact, the hard sciences articles were more transparent, clearly separating viewpoint from fact. The difficulty in separating fact from opinion in the humanities and soft sciences articles was amplified by the lack of citations supporting many of the assertions made.

Diversity

The diversity of each article was determined through noting the number of different viewpoints presented; the overall range of discussion on the given topic and whether or not any controversies were mentioned, if applicable, was also noted. The diversity criterion was difficult to apply to the hard sciences articles, as they seemed to be merely presenting “scientific truth,” although “Pneumocystis” clearly provided three differing viewpoints on the issue of this disease’s scientific name. Similarly, this criterion was also difficult to apply to the concept of “Romantic hero” from the humanities set, as all the characteristics of this literary type seemed to support one another. The remaining two articles of this humanities set, however, did offer discernibly varying views. In the soft sciences set, no alternatives to the “Structural functionalism” theory are given. Similarly, no controversies surrounding the labels associated with “Challenging behaviour,” such as Attention Deficit Hyperactivity Disorder (ADHD), or treatments for such behaviour are mentioned in the article. However, the “Dysrationalia” article presents three differing views on the topic.

Balance

The balance of each article was assessed through comparing the length (in sentences) of the discussion of each viewpoint presented. As with diversity, the balance criterion was also difficult to apply to the hard sciences articles, as there was an overall sense of presenting one view, that of “scientific fact.” For the remaining sample sets, two of the three humanities articles seem well balanced, while only one of the three soft sciences articles was balanced. Specifically, in the “Aestheticism” article from the humanities set, while the article is broken into three sections (literature, visual arts, and decorative arts), it provides the largest amount of information on the first section and contains only one sentence about visual art. For “Structural functionalism,” while a critique of an offshoot of this theory (unilineal descent) is presented, it is followed by three views supporting the relevance of structural functionalism to understanding current society. As “Challenging behaviour” did not provide varying views, as discussed above, it did not seem balanced in its presentation.

Textual features

The hard sciences had the highest average number of sentences per paragraph, with a mean of 5.5, while the humanities had the lowest, with a mean of 4.1. Conversely, the humanities were found to have the highest average of words per sentence (27.6), while the hard sciences had the shortest sentences, with a mean of 18.9 (see Table 8). On average, the hard science articles contained the largest proportion of passive sentences (30.7 percent), while the humanities contained the lowest proportion, with a mean of 28.3 percent. These results support Tibbo’s (1992) observation that scientific abstracts have shorter sentences on average than social science and humanities abstracts. Also consistent with Tibbo’s findings, the mean sentence lengths of the humanities and soft sciences texts were about the same (27.6 as compared with 27). Likewise, these results are also consistent with Hartley, et al.’s (2004) observations that science texts have shorter sentence lengths on average and tend to contain more passive sentence constructions.
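
The tool used to obtain these counts is not named in this section. As a rough illustration of how sentences per paragraph and words per sentence might be computed, the sketch below applies naive splitting rules; the function name `text_features`, the blank-line paragraph convention, and the sentence-boundary rule are all assumptions, and a real analysis would need to handle abbreviations, citations, and wiki markup.

```python
import re
from statistics import mean


def text_features(text: str) -> tuple[float, float]:
    """Return (mean sentences per paragraph, mean words per sentence)."""
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    sentences_per_paragraph = []
    words_per_sentence = []
    for paragraph in paragraphs:
        # Naive rule: a sentence ends at ., ! or ? followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        sentences_per_paragraph.append(len(sentences))
        words_per_sentence.extend(len(s.split()) for s in sentences)
    # statistics.mean raises StatisticsError if the text contains no paragraphs.
    return mean(sentences_per_paragraph), mean(words_per_sentence)


# Hypothetical two-paragraph sample text.
sample = "One two three. Four five.\n\nSix seven eight nine."
print(text_features(sample))
```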

Readability

On the Flesch Reading Ease scale, 0 indicates extreme difficulty, 30 indicates fairly difficult text, 65 is equivalent to plain English (language that is “simple and direct but not simplistic or patronizing” [6]), and 100 represents an extremely easy read. Of the sample sets, the hard science articles achieved the highest Reading Ease score (45.0), in spite of the higher proportion of passive sentences relative to the other subject disciplines, while the soft sciences achieved the lowest score (13.7). These results are consistent with Hartley, et al.’s study, where, in three out of five samples, science texts had the higher Flesch Reading Ease scores. The average results across the subject disciplines are provided in Table 9. While the hard sciences achieved the best readability score, it nevertheless indicates that the articles in this set are still fairly difficult to read. The humanities score of 29.5 indicates more textual difficulty than the hard sciences score, while the soft sciences score indicates extreme difficulty.

 

Table 8: Comprehensibility: Comparisons among subject disciplines.

| | Average number of sentences per paragraph | Average number of words per sentence | Proportion of passive sentences | Flesch Reading Ease score | Flesch–Kincaid Grade Level score |
|---|---|---|---|---|---|
| Humanities | 4.1 | 27.6 | 28.3% | 29.5 | 15.7 |
| Soft sciences | 5.4 | 27 | 28.7% | 13.7 | 18.3 |
| Hard sciences | 5.5 | 18.9 | 30.7% | 45.0 | 11.5 |

 

For the Flesch–Kincaid Grade Level scale, in contrast to the Reading Ease score, the lower the score, the more readable a text is since each score corresponds to a school grade level. Here, the hard sciences achieved the best readability score (11.5), while the soft sciences obtained the worst (18.3), echoing the Flesch Reading Ease results. Again, while the hard sciences achieved the best score, a Grade Level of 11.5 is well beyond the grade 6–8 reading level of the average American reader (The Informatics Review, 2004) and therefore still represents textual difficulty.
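
Both readability measures are functions of average sentence length and average syllables per word. The sketch below implements the standard published formulas for the two scores; it is not necessarily the software used in this study, the function names are illustrative, and the word, sentence, and syllable counts in the example are hypothetical.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula: lower scores mean easier text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage (not taken from any article in the sample).
words, sentences, syllables = 100, 5, 150
print(round(flesch_reading_ease(words, sentences, syllables), 1))
print(round(flesch_kincaid_grade(words, sentences, syllables), 1))
```

Since both formulas penalize long sentences and polysyllabic words, the two scales move in opposite directions on the same inputs, which is why the discipline rankings on the two measures mirror each other.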

 

Table 9: Comprehensibility results across Wikipedia articles.

| Average number of sentences per paragraph | Average number of words per sentence | Proportion of passive sentences | Flesch Reading Ease score | Flesch–Kincaid Grade Level score |
|---|---|---|---|---|
| 5 | 24.5 | 29.2% | 29.4 | 15.2 |

 

Overall assessment of article quality

Considering the sum of the quality measures, the hard sciences articles represent the highest quality among the sample sets. However, it is not possible to rate one set of articles as of a higher quality than the others in absolute terms, because each of the article sets was lacking in several of the quality dimensions. The hard science articles had the highest degree of transparency and the most media objects, were balanced and objective, presenting diverse views where applicable, and achieved the best readability scores. Yet, only one article in this sample set satisfactorily answered all five w–questions. The humanities article set was second in quality, as these articles contained a high degree of cross–linking, answered all five w–questions, and each contained a background and lead section; however, several portions contained assertions that were not cited and these articles demonstrated textual difficulty on the readability scales. The soft sciences articles performed the worst for quality, as they had the lowest number of media objects and cited sources, the lowest degree of cross–linking, and the fewest balanced viewpoints. Like the hard sciences set, only one article successfully answered all five w–questions. In addition, the overall mean of the soft sciences articles’ readability scores indicates the highest level of difficulty.

Interestingly, considering each article’s individual performance with respect to each dimension of article quality, the articles with the highest number of edits (129 for “Structural functionalism” and 115 for “Aestheticism”) were not of the highest quality. This result contrasts with previous findings (Brändle, 2005; Dondio, et al., 2006; Lih, 2004; Stvilia, et al., 2005a, 2008; Wilkinson and Huberman, 2007) concerning correlation between the number of contributors to an article and its quality. Instead, “Cesare Cremonini (Philosopher)” and “Pneumocystis pneumonia,” both of which were ranked in the middle of the data set with 38 and 31 edits respectively, performed the best overall on the article quality measure.

 

++++++++++

6. Discussion

In spite of concerns raised in the mainstream media and within library and information studies literature itself over the volatility of Wikipedia, questioning its credibility as an information resource, the data from this study suggests that deletion of information from articles occurs infrequently in relation to the other types of editing contributions. Likewise, article content remains remarkably stable and constant over time, as the first–mover effect indicates. These findings also suggest that initial article creators have a considerable amount of influence in shaping how Wikipedia articles evolve over time, as a significant proportion of the original text remains, with succeeding versions built around it. Some Wikipedians may be inclined to focus on only one or two types of editing contributions, while others perform a wider range of editing activities — particularly the individual who begins a given Wikipedia article. The cases where only one or two contribution types are made raise the question of whether or not each contributor has the totality of the article in mind when making changes or additions to an article. In turn, article quality itself may suffer from this narrow focus, in spite of the efficiency such an assembly–like production of an article likely produces. On the other hand, Wikipedians whose contributions span over a range of editing types may counterbalance these effects.

The data in this study also suggests that as each user builds upon previous contributions, and most frequently only provides one type of contribution per edit, the development of a Wikipedia article is not necessarily as quick a process as it may at first seem. While articles can be edited with relative ease and speed, edit histories reveal that editing activity happens in clusters, with weeks and sometimes months of editing dormancy. Therefore, Wikipedia articles truly are built over time and are in a continual process of improvement.

While the link between Talk page discussion and editing of an article is not necessarily an immediate interaction, combining the instances observed here with the results concerning article quality issues discussed by Wikipedians, it is evident that, as Viégas, et al. (2007) observed, Talk pages are a central site of editing coordination.

Turning to the readability scores across the disciplines (Table 9), it is evident that the articles examined in this study, at any rate, represent a high level of textual difficulty. None of the Wikipedia articles achieved a plain English level or better for the Flesch Reading Ease test and the lowest Grade Level score average (a mean of 11.5 for the hard sciences set) is well beyond the grade 6–8 reading level of the average reader. If these results are at all indicative of trends in Wikipedia as a whole, then the readability — and in turn, the accessibility — of articles presents an important area of concern.

Lastly, the overall quality of Wikipedia articles found in this sample, both within and among the subject disciplines, may best be summarized as inconsistent. While article quality is based entirely on the interest and knowledge of Wikipedia contributors, these results suggest potential areas for improvement among the various subject areas on the site.

 

++++++++++

7. Limitations and recommendations for future research

The current study has several limitations. Firstly, the sample size used was quite small and therefore the results obtained cannot be generalized to Wikipedia as a whole. However, this study was intended to provide a methodology which others might repeat using a larger sample of articles and to highlight potential features of article collaboration in the encyclopedia.

Secondly, the articles in each of the subject disciplines varied from each other in several ways, including article length, number of editors, and number of edits. The mean was chosen as the measure of central tendency to summarize results, but in each case it disguises the extreme values contained in each set as a result of these differences. A larger sample size of articles would likely provide clusters of articles sharing similar characteristics and values, thereby providing results which might better reflect patterns in Wikipedia as a whole.
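
The effect of extreme values on the mean can be seen in the hard sciences word counts reported in the Length results (1,866, 159, and 122 words). The sketch below compares the mean with the median for that set; the median is shown only for contrast and is not a measure used in this study.

```python
from statistics import mean, median

# Word counts reported for the three hard sciences articles.
hard_sciences_words = {
    "Pneumocystis pneumonia": 1866,
    "Flying frog": 159,
    "Attelabidae": 122,
}

mean_words = mean(hard_sciences_words.values())
median_words = median(hard_sciences_words.values())

# The mean is pulled upward by the single long article; the median is not.
print(round(mean_words, 1), median_words)
```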

Thirdly, the topics chosen for the experimental articles were somewhat obscure. Further, the way each experimental article was initiated was somewhat artificial, occurring in just one edit rather than in a series of edits as in the other articles in this sample. This may have influenced the number of contributions each experimental article received and made these articles atypical in terms of Wikipedia as a whole. Additionally, the researchers were required to evaluate their own work in observing article quality. Determining and selecting more well–known/popular topics that have not yet been written about — though somewhat difficult, considering the encyclopedia contains well over two million articles [7] — and inviting a third party to begin these Wikipedia articles would address these limitations. Alternatively, articles could be selected from the “New pages” section of the site, but with special attention paid to each topic’s likelihood of attracting contributions.

In the case of the “Pneumocystis pneumonia” and “Structural functionalism” articles, which first began as redirects to other articles, the edit histories involved in the respective merges could not be readily accessed from the Wikipedia site. However, the “final” product of each article was likely influenced by the content of each article it was built from and the full history of merged articles might have revealed additional patterns to explore.

Finally, the experimental articles only provided a five–month edit history, in contrast to the three–year edit history of “Structural functionalism,” the oldest article in the sample. Selecting a wider range of article ages and observing experimental articles over a longer time period may strengthen any observations of article collaboration that can be made from these page histories.

In future studies, a larger article sample size than the nine articles used here could be analyzed to produce more robust results concerning collaboration. It may also be beneficial to create a taxonomy of the types of editors to Wikipedia articles, based on the common combinations of contribution types: for instance, the “proofreader” whose edits focus on clarifying information and fixing grammar and spelling. The creation of this taxonomy would require a broader examination of the editing combinations than described in the current study, including edits of three or more contribution types. It may also be useful to explore a possible hierarchy of editing activity over an article’s development, as the data in this study suggest priorities for article development. Further, assigning values or ranks to each dimension of article quality would allow a more absolute comparison between articles. It would also be useful to represent a wider range of subject areas within each subject discipline to gain a broader sense of the encyclopedia’s strengths and weaknesses in each area and the patterns of collaboration shared throughout the site.

 

++++++++++

8. Conclusion

This study aimed to explore how the collaborative process behind the creation and development of Wikipedia articles functions, to provide a broader perspective from which to evaluate this open source encyclopedia. Particular focus was on the types and frequency of contributions made to each article and its accompanying Talk page as well as on the interaction between Talk page discussion and the editing activity of a given article. In turn, the relationship between individual contributions and article quality as a whole was also explored. Results were then compared within and among the hard sciences, soft sciences, and humanities sample sets.

Across the subject disciplines in this sample, Add link and Add information were the most common contribution types. Vandalism ranked eighth out of the 13 contribution types, suggesting its persistence as a quality issue on Wikipedia. For the accompanying article Talk pages, Requests/suggestions for editing coordination was the most common type of contribution, followed by Information boxes. Completeness was the most frequent article quality issue raised among these Talk page discussions, followed by Accessibility and Accuracy. Additionally, a relationship was observed between Talk page discussion and editing of the Talk page’s accompanying article; however, not all of the interactions between Talk pages and articles resulted in positive editing activity, as in several instances vandalism was an outcome. Several issues raised on the Talk pages remained unaddressed by Wikipedians, but nonetheless, it is evident that Talk pages have an integral role to play in the collaborative process, as Stvilia, et al. (2005b) and Viégas, et al. (2007) observed.

Across the articles in this sample, the majority of Wikipedia editors performed only one type of editing activity per contribution, in contrast to initial article creators, who performed a range of contribution types over a series of edits. Among edits combining two contribution types, Add information and Add link was the most frequent combination. A high percentage of repeat editors provided edits of three or more contribution types, suggesting a relationship between frequency of article editing and range in contribution types. The first–mover advantage, noted by Viégas, et al. (2004), was also observed among all articles in this sample, where a considerable amount of the original article text remained over time, and an inverse relationship was found between article age and the proportion of initial article text remaining.

Counter to previous findings (Brändle, 2005; Dondio, et al., 2006; Lih, 2004; Stvilia, et al., 2005a, 2008; Wilkinson and Huberman, 2007), the articles in this sample with the highest number of edits were not found to be of the highest quality. While the hard sciences articles had the highest level of article quality overall, they were still lacking in several key areas, such as successfully answering all w–questions. Similarly, while the hard sciences performed the best on both the Flesch Reading Ease and Flesch–Kincaid Grade Level tests, affirming previous findings (Hartley, et al., 2004; Tibbo, 1992), none of the articles in this sample achieved a plain English level of readability. In contrast to traditional encyclopedias — and even Simple Wikipedia itself — Wikipedia proper lacks an explicitly stated purpose and, correspondingly, lacks guidelines for writing articles at a reading level appropriate for its intended audience. In turn, the high level of textual difficulty observed in the articles in this study undermines Wikipedia’s more general aim of being accessible to everyone.

Taken together, these results suggest considerable variability in the quality level of Wikipedia articles across subject disciplines, likely resulting from the fact that each article does not receive an equal amount of attention or scrutiny. How much confidence one should have in an article whose content is continually under discussion and revision remains an open question. However, while the constant flux of Wikipedia is often cited, the articles in this sample suggest a remarkable consistency in article content over time and the considerable influence that the initial creator has over the article by essentially providing a framework on which future editing is built. Additionally, as the observations between Talk page discussion and editing of a given article reveal, raising one’s concerns about an article is likely to result in change to that article. Therefore, contributions to Talk pages — in addition to article edits themselves — provide Wikipedians with a powerful means of shaping the presentation of knowledge.

Undoubtedly, research into Wikipedia will become vital as the encyclopedia itself continues to evolve. As Wikipedia becomes increasingly relied upon as an information resource, further study will be crucial in providing insight into the patterns which shape this seemingly formless collaborative phenomenon.

 

About the authors

Katherine Ehmann graduated from McGill University’s Master of Information Studies program in 2008. She was awarded a Canada Graduate Scholarship from the Social Sciences and Humanities Research Council of Canada (SSHRC) in 2006, which funded this research on Wikipedia. Previously, she attended Queen’s University at Kingston, Ontario, where she graduated with a Bachelor of Arts (Honours) in English Literature and Sociology.
E–mail: katherine [dot] ehmann [at] mail [dot] mcgill [dot] ca

Andrew Large currently holds the CN–Pratt–Grinstad Chair in Information Studies at the School of Information Studies, McGill University, and from 1989 until 1998 was the School’s Director. He has been actively involved in funded research for many years, and his research over the last 15 or so years has focused upon the information–seeking behavior of children and the design of information technologies to support this behavior. He has authored, co–authored or edited a number of books and published extensively in refereed journals.
E–mail: andrew [dot] large [at] mcgill [dot] ca

Jamshid Beheshti has taught at the School of Information Studies at McGill University for more than twenty years, where he was also the Director for six years. He was appointed the Associate Dean (Administration) of the Faculty of Education in 2004, and currently holds the position of Interim Dean of the Faculty.
E–mail: jamshid [dot] beheshti [at] mcgill [dot] ca

 

Acknowledgements

The authors would like to thank the Wikitech mailing list for its advice on a technical point. This research was funded by a Canada Graduate Scholarship from the Social Sciences and Humanities Research Council of Canada (SSHRC) to the principal author.

 

Notes

1. “Wikipedia:Simple English Wikipedia,” 2007, para. 2.

2. See http://www.research.ibm.com/visual/projects/history_flow/.

3. Viégas, et al., 2004, pp. 580–581.

4. American Heritage Dictionary of the English Language, 2006. “Humanities,” para. 2.

5. Schrage, 1990, p. 40.

6. Wright, 2007, para. 5.

7. See http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons, accessed 6 October 2008.

 

References

Alexa, 2008. “Alexa top 500 sites,” at http://www.alexa.com/, accessed 16 October 2008.

American Heritage Dictionary of the English Language, 2006. “Humanities,” at http://dictionary.reference.com/browse/humanities, accessed 15 May 2007.

T. Becher, 1987. “The disciplinary shaping of the profession,” In: B.R. Clark (editor). The academic profession: National, disciplinary, and institutional settings. Berkeley: University of California Press, pp. 271–303.

A. Biglan, 1973. “The characteristics of subject matter in different academic areas,” Journal of Applied Psychology, volume 57, number 3, pp. 195–203. http://dx.doi.org/10.1037/h0034701

A. Brändle, 2005. “Too many cooks don’t spoil the broth,” Proceedings of Wikimania 2005 — The First International Wikimedia Conference (Frankfurt), at http://meta.wikimedia.org/wiki/Transwiki:Wikimania05/Paper-AB1, accessed 10 January 2007.

CBCNews.ca, 2007a. “Fake ‘expert’ scandal forces Wikipedia to review editor policy” (7 March), at http://www.cbc.ca/technology/story/2007/03/07/tech-wikipedia.html, accessed 11 May 2007.

CBCNews.ca, 2007b. “Wikipedia spins off reference disk,” (24 April), at http://www.cbc.ca/technology/story/2007/04/24/tech-wiki.html, accessed 8 May 2007.

T. Chesney, 2006. “An empirical examination of Wikipedia’s credibility,” First Monday, volume 11, number 11 (November), at http://www.firstmonday.org/issues/issue11_11/chesney/, accessed 22 May 2007.

P. Dondio, S. Barrett, S. Weber, and J.M. Seigneur, 2006. “Extracting trust from domain analysis: A case study on the Wikipedia project,” In: ATC (Autonomic and Trusted Computing) 2006: Proceedings of the Third international Conference (Wuhan, China, 3–6 September), Lecture Notes in Computer Science, volume 4158. Berlin: Springer, pp. 362–373.

P. Duguid, 2006. “Limits of self–organization: Peer production and ‘laws of quality’,” First Monday, volume 11, number 10 (October), at http://www.firstmonday.org/issues/issue11_10/duguid/, accessed 10 January 2007.

W. Emigh and S.C. Herring, 2005. “Collaborative authoring on the Web: A genre analysis of online encyclopedias,” Proceedings of the Thirty–Eighth Hawaii International Conference on System Sciences (HICSS–38); version at http://ella.slis.indiana.edu/~herring/wiki.pdf, accessed 13 January 2007.

J. Fildes, 2007. “Wikipedia ‘shows CIA page edits’,” BBC News (15 August), at http://news.bbc.co.uk/2/hi/technology/6947532.stm, accessed 15 August 2007.

J. Giles, 2005. “Internet encyclopaedias go head to head,” Nature, volume 438, number 7070 (15 December), pp. 900–901, and at http://www.nature.com/nature/journal/v438/n7070/full/438900a.html, accessed 6 October 2008.

M.A.K. Halliday, 1993. “Some grammatical problems in scientific English,” In: M.A.K. Halliday and J.R. Martin (editors). Writing science: Literacy and discursive power. Pittsburgh: University of Pittsburgh Press, pp. 69–85.

J. Hartley, E. Sotto, and C. Fox, 2004. “Clarity across the disciplines: An analysis of texts in the sciences, social sciences, and arts and humanities,” Science Communication, volume 26, issue 2, pp. 188–210. http://dx.doi.org/10.1177/1075547004270164

The Informatics Review, 2004. “Comprehension and reading levels,” at http://www.informatics-review.com/FAQ/reading.html, accessed 8 July 2007.

W.A. Katz, 2002. Introduction to reference work. Volume I: Basic information services. Eighth edition. Boston: McGraw–Hill.

A. Lih, 2004. “Wikipedia as participatory journalism: Reliable sources? Metrics for evaluating collaborative media as a news resource,” Proceedings of the Fifth International Symposium on Online Journalism (Austin), at http://jmsc.hku.hk/faculty/alih/publications/utaustin-2004-wikipedia-rc2.pdf, accessed 11 January 2007.

P. Montiel–Overall, 2005. “Toward a theory of collaboration for teachers and librarians,” School Library Media Research, volume 8, at http://www.ala.org/ala/aasl/aaslpubsandjournals/slmrb/slmrcontents/volume82005/theory.cfm, accessed 9 February 2008.

“New pages,” 2007. Wikipedia, at http://en.wikipedia.org/wiki/Special:Newpages, accessed 8 February 2007.

U. Pfeil, P. Zaphiris, and C.S. Ang, 2006. “Cultural differences in collaborative authoring of Wikipedia,” Journal of Computer–Mediated Communication, volume 12, number 1, pp. 88–133, at http://jcmc.indiana.edu/vol12/issue1/pfeil.html, accessed 6 October 2008.

“Requested articles,” 2007. Wikipedia, at http://en.wikipedia.org/wiki/Wikipedia:Requested_articles#Topic_areas_in_Natural_sciences, accessed 18 January 2007.

M. Schrage, 1990. Shared minds: The new technologies of collaboration. New York: Random House.

B. Stvilia, M.B. Twidale, L.C. Smith, and L. Gasser, 2008. “Information quality work organization in Wikipedia,” Journal of the American Society for Information Science and Technology, volume 59, number 6, pp. 983–1001. http://dx.doi.org/10.1002/asi.20813

B. Stvilia, M.B. Twidale, L.C. Smith, and L. Gasser, 2005a. “Assessing information quality of a community–based encyclopedia,” Proceedings of the International Conference on Information Quality — ICIQ 2005 (Cambridge, Mass.), pp. 442–454, and at http://mailer.fsu.edu/~bstvilia/papers/quantWiki.pdf, accessed 12 May 2007.

B. Stvilia, M.B. Twidale, L. Gasser, and L.C. Smith, 2005b. “Information quality discussions in Wikipedia,” submitted to the International Conference on Knowledge Management — ICKM 2005; version at http://mailer.fsu.edu/~bstvilia/papers/qualWiki.pdf, accessed 19 January 2007.

H.R. Tibbo, 1992. “Abstracting across the disciplines: A content analysis of abstracts from the natural sciences, the social sciences, and the humanities with implications for abstracting standards and online information retrieval,” Library and Information Science Research, volume 14, number 1, pp. 31–56.

F.B. Viégas, M. Wattenberg, J. Kriss, and F. van Ham, 2007. “Talk before you type: Coordination in Wikipedia,” Proceedings of the 40th International Conference on System Sciences (Big Island, Hawaii), pp. 78–87, and at http://researchweb.watson.ibm.com/visual/papers/wikipedia_coordination_final.pdf, accessed 12 May 2007.

F.B. Viégas, M. Wattenberg, and K. Dave, 2004. “Studying cooperation and conflict between authors with history flow visualizations,” In: E. Dykstra–Erickson and M. Tscheligi (editors). Proceedings from ACM CHI 2004 Conference on Human Factors in Computing Systems (Vienna), pp. 575–582, and at http://web.media.mit.edu/~fviegas/papers/history_flow.pdf, accessed 11 January 2007.

D.P. Wallace and C. Van Fleet, 2005. “The democratization of information? Wikipedia as a reference resource,” Reference & User Services Quarterly, volume 45, number 2, pp. 100–103.

“Wikipedia:categories for discussion/log/2007 May 9,” 2007. Wikipedia (1 July), at http://en.wikipedia.org/wiki/Wikipedia:Categories_for_discussion/Log/2007_May_9#Category:Isms, accessed 15 July 2007.

“Wikipedia:Simple English Wikipedia,” 2007. Simple English Wikipedia (16 August), at http://simple.wikipedia.org/wiki/Wikipedia:Simple_English_Wikipedia, accessed 19 August 2007.

D.M. Wilkinson and B.A. Huberman, 2007. “Assessing the value of cooperation in Wikipedia,” First Monday, volume 12, number 4 (April), at http://www.firstmonday.org/issues/issue12_4/wilkinson/, accessed 10 May 2007.

Nick Wright, 2007. “Clear writing and plain language,” at http://www.plainlanguage.gov/whatisPL/definitions/wright.cfm, accessed 10 July 2007.

 


Editorial history

Paper received 10 July 2008; accepted 9 September 2008.


Copyright © 2008, First Monday.

Copyright © 2008, Katherine Ehmann, Andrew Large and Jamshid Beheshti.

Collaboration in context: Comparing article evolution among subject disciplines in Wikipedia
by Katherine Ehmann, Andrew Large and Jamshid Beheshti
First Monday, Volume 13 Number 10 - 6 October 2008
http://journals.uic.edu/ojs/index.php/fm/article/view/2217/2034




