Enclosing the public domain: The restriction of public domain books in a digital environment
by Alex Clark and Brenda Chawner

This paper explores restrictions that are being applied to New Zealand public domain books once they have been digitized and hosted online. The study assesses access and usage restrictions within six online repositories, using a sample of 100 pre–1890 New Zealand heritage books. The findings indicate that new restrictions are being applied to works no longer protected by copyright. Out of the 50 titles that had been digitized, only three were hosted by repositories that do not restrict any type of subsequent use. Furthermore, 48 percent (24) were subject to access restrictions within at least one repository. Copyright law’s delicate balance between public and private interests is being eroded by the prevalence of online terms and conditions, which invoke the doctrine of contract law in an attempt to restrict the public domain and opt out of limitations upon copyright. In addition, ambiguity surrounding the copyright status of some books is encouraging digitizers to adopt restrictive access policies, even when a work is highly likely to be in the public domain. Unless clear rules of online curatorship are articulated within legislation, previously liberated public domain works are at risk of being restricted by online intermediaries.


1. Introduction
2. Literature review
3. Methodology
4. Limitations
5. Findings
6. Discussion
7. Conclusion



1. Introduction

Copyright law has traditionally sought to achieve a fair balance between the interests of content creators and the wider public. Copyright owners are provided with many exclusive rights relating to the use of their protected works. Copyright law, however, also limits the duration and scope of these rights in order to serve the public interest. One of the most important copyright limitations is on the duration of protection. Although copyright duration varies significantly throughout the world, all countries have chosen to establish full copyright protection as a time–limited right that eventually expires [1]. Once this occurs, creative works enter the public domain and can be used without restriction. This process allows works to become part of an intellectual commons, enriching society with a wealth of material that can form the basis for new knowledge and creativity.

In a digital environment, new forms of content distribution are disrupting copyright’s long established balance between public and private interests. Of particular significance, the shift from physical to digital media has changed the relationship between content distributors and the public. Digital book distribution is almost always subject to an additional layer of restrictions imposed by Web site owners, who seek to control the subsequent use of books in ways that go beyond the scope of copyright law. These restrictions are articulated within ubiquitous user agreements and terms of service, and are imposed as a precondition for access to online content. By invoking the doctrine of contract law, these Web site owners often seek to bypass the public interest limitations that are embedded within copyright law. In the case of public domain books, works are at risk of being perpetually locked up within rules articulated by online intermediaries.

This study assesses the extent to which public domain books are being restricted, in terms of both access and subsequent use. It also considers the reasons for these restrictions, and the degree to which restriction may be justified. A number of recommendations are made in the pursuit of a more equitable approach to digitization and hosting, with a focus on better balancing the needs of both digitizers and the wider public.



2. Literature review

2.1. Copyright law and its impact on the public domain

Copyright law has traditionally sought to achieve a fair balance between the economic and moral interests of content creators, and the wider public interest in being able to access knowledge and build upon it (Corbett, 2004). Copyright owners are protected by a range of time–limited exclusive rights that enable them to determine the terms upon which others can access, copy, publish, distribute, perform and create derivatives of their work (Copyright Council of New Zealand, 2009). Copyright owners often use licenses to specify permissible uses, and can also transfer their exclusive rights to others in part or in full via a contract. The public interest is promoted through a range of ‘fair–dealing’ exemptions embedded within copyright law (known as ‘fair use’ in the United States). These exemptions allow for certain uses to be undertaken without permission from the copyright owner, such as copying sections of text for the purposes of criticism, news reporting, research, private study and education (New Zealand. Parliamentary Counsel Office, 1994). When copyright expires, works are liberated into the public domain and are no longer subject to any statutory restriction.

A number of scholars assert that legal and technological developments have increasingly favored rights–holders at the expense of users, undermining the balance contained within copyright law (Sims, 2007). Of particular concern for the public domain has been the increasing duration of copyright protection, which has significantly lengthened the time before works can be freely used without restriction. When copyright was first introduced by Britain’s Statute of Anne in 1710, protection was limited to a period of only 14 years (Rose, 1993). In New Zealand, the first copyright law introduced in 1842 established a protection period of 28 years, extendable to the author’s death upon application (Frankel, 2011). Today, international agreements such as the Berne Convention and the Trade–Related Aspects of Intellectual Property Rights (TRIPS) agreement have established a minimum protection period of the author’s life plus 50 years, which has been adopted by many countries, including New Zealand. A number of jurisdictions, such as the European Union, the U.S., and Australia, go further by protecting works for 70 years after an author’s death. Leaked documents from ongoing negotiations for the Trans–Pacific Partnership show that New Zealand is facing strong pressure to increase its copyright term to the life of the author plus 70 years (Electronic Frontier Foundation, 2012).

In the United States, the Copyright Term Extension Act of 1998 has received significant criticism. In Eldred v. Ashcroft, 17 economists [2] presented an amicus curiae brief arguing that the limited benefits of copyright extension do not outweigh the social costs it creates (Akerlof, et al., 2002). Their research estimates that even if perpetual copyright were granted to authors, it would only result in a 0.12 percent increase in the present–day value of their work (Akerlof, et al., 2002). This increase assumes a constant royalty rate, and would be even lower if a depreciating royalty rate were incorporated into calculations. They assert that such a low increase in present–day value provides no significant additional incentive to create new works, and does not justify the social cost of increased copyright duration.

Rufus Pollock, an economist and co–founder of the Open Knowledge Foundation, has also undertaken research exploring the ideal term of copyright protection (Pollock, 2009). His economic modeling shows that as the cost of production, reproduction and distribution declines through the use of digital technology, so too does the ideal period of exclusive copyright protection, which he estimates as being just 15 years (Pollock, 2009). Although it is difficult to determine an exact period of protection that perfectly balances the diverse needs of all societal stakeholders, Pollock’s research does provide robust evidence in favor of copyright reduction.

Research conducted by Rappaport (1998) also supports a reduced copyright term. His findings highlight that the economic viability of works decreases significantly over time, with only a small proportion of works remaining economically viable when copyright expires. When formal registration requirements were in place in the United States, only one percent of copyrighted books had their copyright renewed. This is despite a low renewal fee of US$6, suggesting that the economic viability of these works was so low at the point of expiry that rights–holders could not justify the cost and effort required to extend their copyright (Rappaport, 1998). Furthermore, of those works that were renewed, only 11 percent were still in print at the time of his study [3]. These findings suggest that legislators need to reconsider whether it is in the public interest to restrict all works by default for several decades after the window of economic viability has passed.

Extending the duration of copyright is highly problematic for the public domain, as it increases the time that cultural and academic works are subject to statutory restriction. A longer period of protection also increases the likelihood that information about rights–holders or authors will become lost or difficult to obtain. Rights–holder information is essential for obtaining permission to use copyrighted works, while information about an author’s date of death determines when a work enters the public domain. The term ‘orphan works’ is used to describe works that have an unknown copyright status or an untraceable rights–holder (European Commission, 2013).

Orphan works are often prevented from being digitized and shared with global audiences, due to a digitizer’s fear that rights–holders may emerge at a later date. In many cases, if an author’s biographical information could be found to ascertain their date of death, these orphan works would be confirmed as being in the public domain. This would allow the works to be digitized and shared without fear of any legal action. Even if an orphan work were found to be protected by copyright, the vast majority are no longer economically viable (Rappaport, 1998). Consequently, it is highly likely that rights–holders would agree to their use if they could be identified and contacted. Anecdotal evidence provided by New Zealand digitizers shows that if a work is no longer being economically exploited, many rights–holders are happy to have their copyrighted works shared online without compensation [4].

A key contributing factor to the orphan works problem is the Berne Convention, which prohibits member countries from establishing any formal registration requirements as a prerequisite to copyright protection (Corbett, 2010). This has resulted in an international system of copyright that lacks any centralized database of rights–holder records. Consequently, even diligent digitizers are often unable to determine the copyright status of a book because biographical information about the author is unavailable (Corbett, 2010). As a signatory to the Berne Convention, New Zealand has no formal registration requirements for copyright. In addition, if biographical information about an author is unavailable there are currently no provisions within New Zealand law that allow users to make a reasonable assumption about copyright status, even in situations where it is highly likely that an author has been dead for more than 50 years (Corbett, 2010). This results in works falling into a legal limbo whenever biographical information cannot be found.

Several attendees at Berkeley’s 2012 Conference on Orphan Works and Mass Digitization recommended the reintroduction of formal registration requirements, emphasizing that a compulsory centralized database of reliable biographical and contact information could greatly enhance the rights–determination process (Brantley, 2012). Corbett (2010) highlights that even a voluntary centralized rights registry would substantially improve the current situation. Registry regimes have the potential to significantly streamline the rights–determination process, and would serve to benefit both content owners and consumers by connecting rights–holders with potential users of their content.

The transition to a digital environment has also created a situation where some digitizers feel empowered to claim rights after copyright has expired. Of particular concern, some digitizers claim copyright within their digital versions of public domain material (Crews, 2012). Crews condemns such practices as going “far beyond what copyright law specifically allows,” and advocates for repositories to end the practice [5]. Although book digitization does involve a significant investment of resources, the automated and mechanical nature of digitization is unlikely to meet the threshold of creativity required for copyright protection.

2.2. Beyond copyright: Licenses, contractual restrictions, and technological protection measures

In an online environment, there is widespread use of licenses and contracts that attempt to restrict books in the public domain. These restrictions go beyond the provisions of copyright law, and attempt to opt out of the public good provisions that are essential for balancing the public and private interests within creative works.

The application of contractual restrictions to public domain works has been subject to widespread criticism (Eschenfelder and Caswell, 2010; Association of Research Libraries, 2012; Lindsay, 2002; McManis, 1999). These scholars warn that unless there are robust limits on the extent that contracts can override copyright law, then private actors have full power to determine the scope of acceptable content use in an online environment. Unless clear legislative action is taken to limit the power of contract law, governments risk abdicating their ability to protect the public interest through copyright law.

Within the New Zealand context, current legal frameworks appear to allow contract law to be used as a tool to restrict public domain works (Corbett and Boddington, 2011). Such restrictions undermine the legislative intent behind the Copyright Act, and erode the government’s ability to use copyright law as a policy tool to achieve a fair balance between public and private interests (Sims, 2007). Sims (2007) highlights the dangers of the current system, which allows works to be restricted in perpetuity long after copyright has expired. She recommends the introduction of new legislation that would substantially limit the extent to which contracts could opt out of public good provisions within copyright law. In particular, she proposes a system where any restrictions applied to public domain books would need to be within a reasonable scope, meeting criteria clearly articulated in law (Sims, 2007). Such an approach would entrench a minimum framework of rights in all contexts, and would help address the current imbalance between creators and users.

In the United States, the legal status of contracts relative to copyright law is highly contested (Lindsay, 2002). Although restrictive contracts are commonplace, there is significant legal debate surrounding whether or not these restrictions are legally enforceable (Lindsay, 2002). In favor of the public domain, many legal scholars assert that the pre–emption clause contained within Section 301 of the Copyright Act prevents any state law or contract from adding restrictions that go beyond the provisions of copyright (Corbett and Boddington, 2011; Cornell University, 2012; Lindsay, 2002). Some courts have used this reasoning to invalidate contracts that restrict user rights created by copyright law. In Vault v. Quaid the Court of Appeals overturned a contract that sought to restrict users from reverse engineering software, an activity specifically permitted within the Copyright Act.

Other scholars, however, claim that restrictions within a contract cannot be seen as equivalent to copyright law, since a contract is a private agreement between two parties, as opposed to being a universal law applicable to all citizens (Lindsay, 2002). As such, they assert that contracts are not pre–empted by copyright law. This view of contract law has also been upheld in court. In the case of ProCD v. Zeidenberg, the Court of Appeals held that while “copyright is a right against the world”, contracts “generally affect only their parties,” and therefore do not create exclusive rights that are equivalent to copyright law [6]. The high level of legal ambiguity that currently exists suggests that legislation is required to explicitly clarify the relative status of contract law and copyright law. Regardless of the legal enforceability of contracts, their ubiquity implies a de facto legitimacy that is likely to discourage users from exercising their fair use rights when confronted with contract restrictions.

Another issue that legislatures throughout the world are confronted with is whether or not the terms and conditions articulated within Web sites constitute a valid contract. In order for contracts to be enforceable, a user needs to engage in affirmative conduct that shows they have acknowledged and assented to the terms of service (Bayley, 2009). There are many ways that Web sites attempt to impose contracts, each with a different level of legal legitimacy. One approach is to use a ‘click–wrap’ contract, which involves a user being presented with the terms and conditions of a Web site before being prompted to click a box confirming that they ‘agree’ (Bayley, 2009). Courts in the United States have upheld ‘click–wrap’ contracts as being legally enforceable, due to the way that users are engaged in affirmative conduct that assents to the terms presented [7].

More commonly, Web sites simply state that a particular action, such as browsing their site or accessing material, constitutes acceptance of their terms and conditions (Macdonald, 2011). This practice is commonly referred to as a ‘browse wrap’ contract, and there is significant debate surrounding its legal validity (Bayley, 2009; Pistorius, 2004; Macdonald, 2011). Courts assess ‘browse wrap’ contracts on a case by case basis to determine whether the user has actually consented to the terms presented on the Web site (Macdonald, 2011). In many U.S. cases, users have been able to successfully argue that they were not aware of the terms presented and had not agreed to them [8]. In cases where it could be shown that the user was provided with sufficient notice of the Web site’s terms and conditions, the user was unsuccessful [9]. Courts in a number of jurisdictions, including England, Wales, South Africa, Canada and New Zealand, have not yet specifically considered browse wrap contracts (Macdonald, 2011; Pistorius, 2004; Ngan, 2013). It is likely, however, that outcomes would be similar to those in U.S. courts.

Unlike traditional contracts, which are subject to meaningful negotiation between two parties, online contracts are ‘contracts of adhesion’ where the possibility of negotiation is excluded (Pistorius, 2004; Burgess, 1986). Contracts of adhesion have been heavily criticized due to the way that they favor suppliers at the expense of consumers, particularly when consumers are faced with non–competitive markets, or when a competitive market is populated by multiple suppliers offering a similar set of unfair terms (Lenhoff in Burgess, 1986). Burgess highlights that contracts of adhesion are “drafted for the public rather than specific individuals”, and argues that as such, these contracts should be subject to legal scrutiny under the public interest standard [10]. If the public interest standard is not met, he contends that a contract should be deemed invalid under the various laws that most Western countries have enacted to prevent unfair contract terms. Rather than impose the cumbersome task of defining the public interest within every contract, Burgess recommends upholding existing public interest legislation as having priority over contract provisions. In the case of digitized books, this would mean upholding public interest provisions within copyright law such as fair use and the public domain. In an online environment, where contracts of adhesion govern nearly every aspect of Internet use, Burgess’ arguments have potent force. Unless governments take clear legislative action to delineate the acceptable scope of contract law, they will renounce the ability to regulate many types of online activity.

Contractual restrictions become particularly potent when terms and conditions are actively enforced with digital locks, commonly referred to as Technological Protection Measures (TPM) or Digital Rights Management (DRM) [11]. Through the use of digital locks, rules surrounding content use can be technologically embedded into the work itself and enforced in real time with near absolute force. This is in stark contrast to an analogue environment, where content rules can only be enforced if an illegal use has been identified, taken to court, and upheld by a judge. The all encompassing potential of digital locks provides rights–holders with the prospect of controlling every aspect of content access and use, and has led to them being described as “the most powerful technologies of control that man has devised” [12]. Although digital locks can be removed with a range of circumvention tools available online, the act of circumvention is illegal in many jurisdictions throughout the world. Furthermore, many users are not aware of these tools or do not possess the technical expertise required to use them.

Laws that uphold the legitimacy of digital locks are widespread internationally, largely due to a 1996 WIPO treaty that requires members to criminalize the use of circumvention software (McDermott, 2012). In New Zealand, laws supporting the use of digital locks are balanced with a limited number of provisions that allow circumvention for the purpose of conducting activities permitted by copyright law, such as those covered by fair dealing exemptions and the use of public domain works. Circumvention, however, must be undertaken by approved parties such as librarians, archivists, or educational establishments (Corbett, 2010). Because of heavy restrictions preventing the promotion of circumvention technologies, it is difficult for approved institutions to inform the public about their ability to lawfully circumvent. This has led to criticism that circumvention provisions are “all but useless in practice,” and fail to address the issue of digital locks being used too broadly by content providers [13]. Furthermore, if a valid contract has been applied to a public domain book in conjunction with a digital lock, then the user does not have any legal remedy to facilitate full use of the work. Consequently, users are left in a Catch–22 situation: copyright law allows them to use the work without any restriction, but contract law has made it illegal to circumvent the technological barrier (Corbett, 2010).

While many forms of contractual restriction upon digitized public domain books are unacceptable, there are situations where a strong argument can be made for at least some form of limited restriction. A variety of reasons have been used to justify the application of restrictions to public domain works. While some restrictions are motivated by the desire for monetary gain and to maintain competitive advantage, others reflect more nuanced concerns held by online book repositories. These include the desire to: avoid the misuse and misrepresentation of sensitive cultural works; prevent commercial exploitation; ensure proper object description and metadata; maintain accurate usage statistics of material; meet donor requirements surrounding the use or reuse of a text; and facilitate monetization in order to fund the digitization process (Eschenfelder and Caswell, 2010). Although arguments favoring some forms of restriction are worthy of consideration, there are also strong arguments in favor of open access. Research shows that open repositories allow works to have far greater societal impact than when access and use are restricted (Cullen and Chawner, 2008). Open access also allows cultural works to be subject to a much wider range of interpretations and adaptations, which may otherwise be suppressed within a tightly controlled environment (Eschenfelder and Caswell, 2010).

As an alternative, a limited rights regime may be desirable for digitized public domain books, providing intellectual property protection that is limited in time and scope to incentivize the digitization process. Although no country has yet articulated this concept within copyright law, the European Union has introduced recommendations that seek to balance the needs of digitizers with the wider public. In 2011 the European Commission recommended that digitizers be allowed to enter exclusivity contracts with European libraries providing them with public domain books for digitization [14]. This recommendation outlines that preferential commercial use of digitized public domain books should last for a maximum period of seven years. The recommendations are based on the belief that this level of protection is “adequate to generate [...] incentives for [...] mass digitization [while allowing] sufficient control of the public institutions over their digitized material” [15]. Determining the line between acceptable and unacceptable restrictions is a complex process; however, legislation should be guided by the public interest standard. Legislatures should strive to balance the legitimate needs of a variety of societal stakeholders, with the ultimate goal of maximizing overall public welfare.



3. Methodology

In order to establish the extent to which digitized public domain books are being restricted, a sample of 100 pre–1890 books was selected from the New Zealand National Bibliography (NZNB). This sample was chosen on the assumption that these works had entered the public domain under New Zealand copyright law. Each book in the sample was searched for within six online repositories: Google Books, Hathi Trust, Internet Archive, Early New Zealand Books (ENZB), New Zealand Electronic Text Collection (NZETC) and Project Gutenberg. In addition, Google and Bing searches were conducted for all sample books that could not be located within these repositories. Searches were conducted via an Internet connection provided by a New Zealand ISP during January 2012.

When available, the following information was collected about books that had been digitized: stable URL, date of digitization, source of digitized text, access restrictions, and usage restrictions.

Access restrictions were classified using four categories:

i) No access to content;
ii) Partial access [16];
iii) Full view, no download; and,
iv) Full view and downloadable.

If content could be accessed, then usage restrictions were classified using three categories:

i) Technological Protection Measures (TPM) [17];
ii) Terms and conditions restricting use [18]; and,
iii) Unrestricted [19].
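
The two classification schemes above can be expressed as a small data model. The following is a minimal Python sketch for illustration only, not the instrument used in the study: the names `Access`, `Usage`, `Observation` and `is_access_restricted` are hypothetical, and the rule that any level short of full view with download counts as an access restriction is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Access(Enum):
    """The four access categories listed above."""
    NO_ACCESS = 1
    PARTIAL = 2
    FULL_VIEW_NO_DOWNLOAD = 3
    FULL_VIEW_DOWNLOADABLE = 4


class Usage(Enum):
    """The three usage categories, recorded only when content can be accessed."""
    TPM = 1
    TERMS_AND_CONDITIONS = 2
    UNRESTRICTED = 3


@dataclass(frozen=True)
class Observation:
    """One digitized copy of a sample book in one repository (hypothetical record)."""
    title: str
    repository: str
    access: Access
    usage: Optional[Usage] = None

    def __post_init__(self) -> None:
        # Usage is only classified if content could be accessed.
        if self.access is Access.NO_ACCESS and self.usage is not None:
            raise ValueError("usage cannot be classified when there is no access")


def is_access_restricted(obs: Observation) -> bool:
    # Assumption: anything short of full view with download is a restriction.
    return obs.access is not Access.FULL_VIEW_DOWNLOADABLE
```

Keeping access and usage as separate fields mirrors the two-step classification: usage is left empty whenever a copy cannot be accessed at all.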

Additional data were gathered to assess how each repository determined the copyright status of its books. This information was sourced from repository Web sites, as well as via email correspondence and informal face–to–face meetings. Access restrictions were reassessed in November 2013, and all titles originally identified as restricted in January 2012 were still subject to the same level of restriction.

In addition to this data analysis, preliminary research was conducted to identify whether copies of digitized sample books were available for purchase, either as digital downloads or as print–on–demand books. In order to explore the pay–per–download market, copies of each digitized title in the sample were searched for within the iTunes store and General Books LLC. Any books not found within these Web sites were also searched for via Google and Bing.

The print–on–demand market was assessed by searching for printed copies of digitized sample books within the Amazon store. Any books not found on Amazon were searched for via Google and Bing searches. List prices on Amazon were compared with the cost a consumer would pay for a single print–on–demand copy via an online publishing service (http://bookpatch.com/), based on the cost of a single printed soft cover copy with the same number of pages.



4. Limitations

This research used a sample of 100 early New Zealand books published before 1890. Although general trends from this study may apply to other countries, specific results are likely to vary across countries due to a number of factors. These include differences in the extent of digitization initiatives, as well as the varied legal frameworks and institutional policies that govern digitization in other jurisdictions. Furthermore, data were gathered using a New Zealand Internet Service Provider, so findings cannot be generalized to users accessing content from other countries.



5. Findings

It was promising to find that 50 percent (50) of public domain books within the sample have been digitized. The results, however, suggest that there is a high level of restriction being applied to public domain books after the act of digitization. Out of a sample of 100 public domain books, only three are hosted by repositories that do not seek to restrict any form of subsequent use. Many repositories also impose significant barriers to access, with 48 percent (24) of all digitized books being hosted by at least one repository that restricts access. An exploratory study into the paid market for public domain books also revealed that 72 percent of digitized books within the sample are offered as paid downloads on at least one merchant Web site, with prices as high as US$9.99.


Table 1


5.1. Copyright determination policies

Sample repositories use one of three approaches to determine the copyright status of their digitized books.

i) Restrictions based on actual U.S. copyright status
All works published before 1923 are classified as public domain, based on U.S. copyright law. Non–U.S. users are encouraged to check that texts are also public domain within their own country, but no regional access restrictions are applied.
Used by: Internet Archive, Project Gutenberg

ii) Restrictions based on actual N.Z. copyright status
Public domain status is determined using biographical information about the author to determine whether copyright has expired, which occurs 50 years after the author’s death.
Used by: Early New Zealand Books, New Zealand Electronic Text Collection

iii) Dual copyright determination (U.S. actual, New Zealand estimate)
All works before 1923 are classified as public domain for users within the U.S. For New Zealand users, books are restricted for 140 years after the date of publication (regardless of actual public domain status).
Used by: Google Books, Hathi Trust
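
The three approaches above amount to simple date rules, which can be sketched in Python as follows. This is an illustrative simplification rather than any repository's actual logic: the function names are hypothetical, years are treated as whole calendar years, and the handling of the boundary year in the 140–year estimate is an assumption.

```python
from typing import Optional

US_PUBLICATION_CUTOFF = 1923   # works published before this year (approaches i and iii)
NZ_YEARS_AFTER_DEATH = 50      # term after the author's death (approach ii)
ESTIMATE_WINDOW_YEARS = 140    # years after publication (approach iii, N.Z. users)


def us_actual(publication_year: int) -> bool:
    """Approach i: public domain if published before 1923."""
    return publication_year < US_PUBLICATION_CUTOFF


def nz_actual(death_year: Optional[int], current_year: int) -> Optional[bool]:
    """Approach ii: public domain once 50 years have passed since the author's death.

    Returns None when the death year is unknown, since the status
    cannot then be determined from biographical information.
    """
    if death_year is None:
        return None
    return current_year - death_year > NZ_YEARS_AFTER_DEATH


def dual_estimate(publication_year: int, current_year: int, user_in_us: bool) -> bool:
    """Approach iii: U.S. rule for U.S. users, 140-year estimate otherwise."""
    if user_in_us:
        return us_actual(publication_year)
    # Assumption: a book exactly 140 years old is treated as open.
    return current_year - publication_year >= ESTIMATE_WINDOW_YEARS
```

For example, under this sketch a book published in 1880 is treated as open by approach i, but as restricted for a New Zealand user in 2012 under approach iii, regardless of when its author died.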

5.2. The extent of digitization

Overall, 50 percent (50) of the sample has been digitized (see Table 1). Of these, 40 percent (20) are hosted by only one repository, 60 percent (30) are hosted by two or more repositories, 44 percent (22) by three or more, 12 percent (6) by four or more, and 6 percent (3) by five (see Table 2).
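
Most of the figures above are cumulative ("two or more", "three or more"), so the number of books hosted by exactly k repositories has to be derived by subtraction. A minimal Python sketch of that arithmetic follows, assuming that the final figure ("by five") is the top bucket and that no book is hosted by all six repositories; the names `AT_LEAST` and `exact_counts` are hypothetical.

```python
# Counts reported in the text: books hosted by at least k repositories.
AT_LEAST = {1: 50, 2: 30, 3: 22, 4: 6, 5: 3}


def exact_counts(at_least: dict) -> dict:
    """Convert 'at least k' counts into 'exactly k' counts by differencing."""
    ks = sorted(at_least)
    exact = {}
    for i, k in enumerate(ks):
        # The bucket above the highest reported k is assumed to be empty.
        next_count = at_least[ks[i + 1]] if i + 1 < len(ks) else 0
        exact[k] = at_least[k] - next_count
    return exact


EXACT = exact_counts(AT_LEAST)
TOTAL_DIGITIZED = AT_LEAST[1]

for k, n in EXACT.items():
    print(f"exactly {k}: {n} books ({100 * n / TOTAL_DIGITIZED:.0f} percent)")
```

Under these assumptions the breakdown is 20, 8, 16, 3 and 3 books for one to five repositories respectively, which sums to the 50 digitized titles and agrees with the 40 percent (20) reported for books hosted by only one repository.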


Table 2


5.3. Access restrictions

Forty–eight percent (24) of digitized sample books were subject to some form of access restriction within at least one repository (see Table 1). Hathi Trust was the most restrictive repository, with access to 91 percent (21) of its books restricted (see Table 3). Google Books was the second most restrictive repository, at 44 percent (14).

Of the sample works hosted within Google Books, 56 percent (18) were fully accessible to view and download. Another 34 percent (11) of books were subject to ‘snippet view’ limitations. Qualitative analysis revealed that all ‘snippet view’ books within the sample had been published after 1872, suggesting that Google is restricting all hosted content for a period of 140 years after publication as a method of copyright estimation. This was later confirmed by someone involved with the Google Books project, although the policy is not officially stated on the Google Books Web site. In addition to ‘snippet view’ restrictions, nine percent (3) of books hosted by Google were subject to complete restriction. These books had been published after 1872, and were all sourced from the collections of Harvard University Library. Harvard only allows its books to be digitized if they have been confirmed as public domain (Harvard University Library, 2008), so it is likely that it has chosen to restrict all overseas access to books where the foreign copyright status has not been established.

All books blocked within Hathi Trust had been originally digitized by Google (21), and were classified as ‘snippet view’ on the Google Books Web site [20]. Furthermore, none of the books originally digitized by Google Books were able to be downloaded within Hathi Trust, unless the user had an institutional login [21]. Hathi Trust had the highest proportion of books completely blocked to users based in New Zealand, at 48 percent (11), and the lowest proportion of books available for download, at nine percent (2) (see Table 4). When trying to view public domain books restricted by Hathi Trust, a message appears informing the user that “[t]his item is not available online due to copyright restrictions.” Clicking on a link for further information reveals Hathi Trust’s policy of copyright estimation: “due to the variations in copyright law in countries outside the U.S., it is estimated that 1874 is the earliest date foreign works may still be under copyright” (Hathi Trust, 2014).


Table 3


The other four repositories (the Internet Archive, Early New Zealand Books, New Zealand Electronic Text Collection and Project Gutenberg) did not impose access restrictions, and allowed all digitized books to be freely viewed and downloaded from their Web sites. The open policies of these four repositories helped improve overall accessibility rates, particularly the Internet Archive, which often hosted books that had been originally digitized by more restrictive repositories. These open repositories contributed to 96 percent (48) of digitized sample books being freely accessible from at least one repository. This shows that users are highly likely to find an unrestricted version of a digitized text if they make the effort to search all possible repositories.


Table 4


5.4. Usage restrictions

All but three of the digitized books were subject to terms and conditions imposing usage restrictions upon the work (see Tables 5 and 6). Usage restrictions imposed by repositories included: no hosting, no republication, no alterations, scholarly use only, personal use only, non–commercial use only, attribution required, or permission required for all uses. All repositories with restrictions articulated their terms and conditions within a dedicated section of their Web site, and also presented usage conditions alongside or embedded within the digitized book to varying levels of prominence. It is unclear whether these ‘browse wrap’ contracts provide sufficient user notice to be upheld as valid contracts by a court. No repositories adopted ‘click wrap’ contracts.

The New Zealand Electronic Text Collection (NZETC) used a Creative Commons Attribution Share–Alike license upon all books that it identified as being in the public domain. Creative Commons licenses, however, cannot be enforced upon public domain material (Creative Commons, n.d.). Despite this, NZETC would be able to enforce the same terms if they were instead presented as a valid contract. Early New Zealand Books asserted that its digitized versions of public domain books were protected under copyright as a typographical arrangement, with restrictions based upon the assumption that they could be enforced under copyright law. Digitized works are unlikely to meet the threshold of creativity and originality required for copyright protection; however, the same usage conditions would be enforceable if presented as a valid contract. All other repositories presented their terms and conditions without any assertion of copyright within the public domain books that they hosted.


Table 5


In addition to terms and conditions, Technological Protection Measures (TPM) had been applied to 83 percent (10) of books hosted by Hathi Trust, and 38 percent (11) of books hosted by Google Books (see Table 5). Hathi Trust blocked the cut and paste function, as well as the ability to download books as a single PDF file (a feature that is only available to users with login credentials from an institutional partner). Google Books used the ‘snippet view’ mode to prevent full–text access to public domain books published within the past 140 years, and also disabled the ability to copy and paste text from these snippets. No other repositories implemented TPM tools.


Table 6


5.5. Procedures for reporting blocked public domain books

Attempts were made to unblock the public domain books that were identified within the sample. On the Google Books Web site, there is a ‘Report an issue’ link at the bottom of every page. Once clicked, the link directs users to a landing page providing instructions for “author[s] or copyright holder[s] who would like to report an issue with a book”, as well as users who would like to “report content [to be] removed from Google’s services under applicable laws” (Google, 2014). The emphasis of the landing page is on content removal, and it does not explain that users can also report public domain books that are incorrectly blocked or restricted. Furthermore, the landing page caters to rights–holders, and does not specifically provide support for general users. This reporting feature was not available at the time of data collection in January 2012.

Incorrectly blocked books from the sample were reported to Google Books in November 2013, along with biographical information proving public domain status. These titles are still blocked at the time of writing this paper in May 2014. A message was also sent to Google Books in November 2013 recommending changes to the wording of the landing page. Sending a message to Google Books’ support team required a Partner Program login account. A support team member advised that the issue would be investigated; however, no changes have been implemented in response to this request. Follow–up correspondence with the Google Books support team was initiated in May 2014, and another support team member has advised that they would look into the issues raised. At the time of publication, no further changes have been made and all reported books remain restricted or blocked.

Hathi Trust provides rights–holders with a standardized take–down procedure for books mistakenly made available online, but does not provide users with a standardized procedure to report incorrectly restricted books. There is, however, a feedback e–mail address to which users can submit requests. An e–mail listing all incorrectly blocked books was sent to Hathi Trust in November 2013, along with biographical information showing their public domain status. In March 2014 a response was received from Hathi Trust, notifying that access had been enabled for all 11 of these books. While it is encouraging to have this small sample of books made accessible, many more books continue to be incorrectly restricted and large scale efforts are required to confirm the status of other public domain titles.

5.6. Paid copies of public domain works

Below are the results from preliminary research that explores the market for paid copies of public domain books.

   i) Pay–per–download

Overall, 72 percent (36) of digitized books are offered as paid downloads on at least one Web site. On the General Books LLC Web site, 68 percent (34) of digitized sample books are available for purchase. All downloads are offered at US$9.99 each, or users can obtain unlimited access for a monthly fee of US$14.99. Many of these digital downloads appear to be PDFs lifted from the Web sites of other digitizers, such as Google Books or the Internet Archive. On the iTunes Web site, 12 percent (6) of digitized sample books are available for download. Free access is available for two of these. The other four titles can be purchased for US$4.99. If a sample book could not be located within iTunes or General Books LLC, then it was searched for within Bing and Google. These searches yielded no results for paid copies.

   ii) Print–on–demand books

Overall, 72 percent (36) of digitized books are listed on Amazon as printed copies of digitized files. Figure 1 shows the cost of purchasing a copy from Amazon, compared with the cost of purchasing a print–on–demand copy from an online publishing service (http://bookpatch.com/). All Amazon prices are significantly higher than the cost of purchasing the book from a print–on–demand service. Half of Amazon’s copies cost more than twice the price of a single print–on–demand copy. One fifth cost at least five times more than the print–on–demand price. The largest price difference was US$25.12.


Figure 1


5.7. Copyright status of sample books

Out of the 77 sample books with identifiable authors, 66 percent (51) had biographical information that was easily located online via a search taking less than one minute. This biographical information was used to calculate statistics relating to the authors’ lifespans, their age at publication, how long they lived after publication, the length of copyright protection (based on N.Z. copyright law), and the year that copyright expired. The shortest length of copyright protection lasted for 51 years after publication, while the longest period of protection was 113 years. The median length of copyright protection was 75 years, and 95 percent of all books had copyright expire within 102 years (see Table 7). All of the sample books with biographical information (51) had been in the public domain for between 30 and 132 years, with the most recent copyright expiry occurring in 1982. Of these public domain books with biographical information, 35 percent (18) were published after 1872 so would be automatically blocked to New Zealand audiences if digitized by Google Books. Half of these public domain books would not become unblocked under Google’s current policy until after 2020, with the last not becoming unblocked until 2029.
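The calculations behind these figures can be sketched as follows. The sketch assumes a copyright term of the author's life plus 50 years, which is the term implied by the paper's own figures for New Zealand. The function names and the example years are hypothetical, and the authors' exact counting convention is not stated in the text.

```python
NZ_TERM_AFTER_DEATH = 50  # assumed term: life of the author plus 50 years


def copyright_expiry_year(death_year: int, term: int = NZ_TERM_AFTER_DEATH) -> int:
    """Year in which copyright expires under a life-plus-`term` rule."""
    return death_year + term


def protection_length(publication_year: int, death_year: int,
                      term: int = NZ_TERM_AFTER_DEATH) -> int:
    """Years of protection counted from publication to expiry."""
    return copyright_expiry_year(death_year, term) - publication_year


def years_in_public_domain(death_year: int, current_year: int,
                           term: int = NZ_TERM_AFTER_DEATH) -> int:
    """Years elapsed since expiry, or 0 if the work is still protected."""
    return max(0, current_year - copyright_expiry_year(death_year, term))
```

For a hypothetical author who published in 1880 and died in 1910, this gives an expiry year of 1960, a protection length of 80 years, and 52 years in the public domain as of 2012.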


Table 7




6. Discussion

6.1. Repository best practices for copyright determination

   i) Use of biographical data

All of the sample repositories restricting access to public domain books are estimating the copyright status of digitized works, rather than attempting to find accurate biographical information about the author. This has resulted in many public domain books being unnecessarily blocked. One way of addressing this problem would be for repositories to conduct a search for biographical information about authors prior to digitization. This is not an onerous task relative to the overall digitization process. A one–minute search located biographical information 66 percent of the time within the sample. In comparison, a 500–page book takes 30 minutes to scan, as well as further time to process and upload (Kelly, 2006). Any biographical information that is located should be recorded as metadata in a standardized manner, including links to information sources. This biographical data could then be used to automatically apply the copyright laws of each country in an accurate and reliable manner.
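One way such metadata might be structured is sketched below. This is an illustration only: the record fields, the jurisdiction labels and the function name are hypothetical, and the table of terms is a placeholder rather than a statement of any country's law (the "NZ" value follows the life plus 50 years term implied elsewhere in this paper).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Years of protection after the author's death, keyed by jurisdiction label.
# Illustrative values only; "LIFE_PLUS_70" is a generic placeholder label.
TERM_AFTER_DEATH: Dict[str, int] = {"NZ": 50, "LIFE_PLUS_70": 70}


@dataclass
class AuthorRecord:
    name: str
    death_year: Optional[int] = None  # None when no biography was located
    sources: List[str] = field(default_factory=list)  # links to the evidence used


def copyright_status(author: AuthorRecord, jurisdiction: str,
                     current_year: int) -> str:
    """Classify a work from recorded biographical data instead of estimation."""
    if author.death_year is None:
        return "unknown"  # estimation would only be needed in this case
    expiry_year = author.death_year + TERM_AFTER_DEATH[jurisdiction]
    return "public domain" if current_year > expiry_year else "in copyright"
```

Recording the death year once, together with its sources, allows the same record to be evaluated against different terms for users in different countries.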

   ii) Copyright estimation

Blocking access to digitized books for 140 years after the date of publication restricts many books that are in the public domain. Within the sample, 47 percent (47) of the books were published within this 140–year window, and would therefore be automatically blocked if hosted by Google Books or Hathi Trust. This is despite the fact that all sample books with locatable biographical information were confirmed to have entered the public domain between 30 and 132 years prior to data collection. An estimation period of 140 years is overly conservative, as it requires authors to have lived for 90 years after publication (under New Zealand copyright law). To put this in perspective, a 15–year–old author would need to live beyond the age of 105 for copyright to last 140 years. Furthermore, copyright protection would be meaningless unless there was a living rights–holder who objected to the book being accessible online. Considering that most books lose their economic viability within several decades (Rappaport, 1998), it seems unlikely that rights–holders would object to the use of older heritage works.
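The arithmetic in this paragraph can be checked with a short sketch. It assumes the life plus 50 years term implied by the figures above; the function names are hypothetical.

```python
def years_author_must_survive(window: int = 140, term_after_death: int = 50) -> int:
    """Years an author must live after publication for copyright to last `window` years."""
    return window - term_after_death


def age_at_death_required(age_at_publication: int, window: int = 140,
                          term_after_death: int = 50) -> int:
    """Age the author must reach for protection to span the whole window."""
    return age_at_publication + years_author_must_survive(window, term_after_death)
```

With the default values, `years_author_must_survive()` gives 90 and `age_at_death_required(15)` gives 105, matching the figures in the text.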

One possible way to improve copyright estimation would be to use statistical analysis to develop a more reasonable time period of restriction for works with missing biographical information. Actuarial tables could be used in conjunction with statistics about the age of authors at publication, to determine a more reasonable estimation of copyright duration. For example, 95 percent of books in the sample entered the public domain within 102 years of publication, and all entered the public domain within 113 years. This suggests that 120 years would be a more appropriate period of restriction in situations where copyright estimation is needed, as it still accommodates authors who published earlier and/or died later than statistical norms. Furthermore, improved feedback interfaces could allow rights–holders to easily request the removal of their content in the unlikely event of a copyright breach.
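A statistical approach of this kind could be sketched as follows. The durations listed are hypothetical values invented for illustration (they are not the study's data), the nearest–rank percentile is one of several possible definitions, and rounding the maximum up to the next decade is an assumed way of adding a safety margin rather than a method stated by the authors.

```python
import math
from typing import List


def nearest_rank_percentile(values: List[int], percentile: int) -> int:
    """Smallest value such that at least `percentile` percent of values are <= it."""
    ordered = sorted(values)
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def round_up(value: int, step: int = 10) -> int:
    """Round `value` up to the next multiple of `step`."""
    return math.ceil(value / step) * step


# Hypothetical protection lengths (years from publication to copyright expiry).
durations = [51, 58, 60, 63, 66, 68, 70, 72, 74, 75,
             75, 77, 80, 83, 86, 90, 94, 98, 102, 113]

p95 = nearest_rank_percentile(durations, 95)  # covers 95 percent of works
longest = max(durations)                      # covers every work in the list
suggested_window = round_up(longest)          # adds a margin above the longest case
```

A repository could choose a window between the 95th percentile and the rounded maximum, depending on how much risk of accidental infringement it is prepared to accept.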

   iii) Transparent practices

Repositories do not always clearly articulate the process they use to determine the copyright status of a work. Furthermore, feedback mechanisms are often either non–existent, non–responsive or complex, making it difficult for users to alert repositories to situations where copyright status is incorrect. Repositories should clearly indicate the copyright status of each book and inform users of the processes used to determine copyright, especially if a process of estimation has been used. Users and rights–holders should be provided with an easy way to submit feedback if they disagree with a repository’s copyright classification of a digitized book. For example, repositories should introduce standardized feedback forms that allow submission of biographical information for consideration. A set of standardized criteria could then be used to assess the validity of a user’s submission. Users should be sent an e–mail notification when their submission is received, and informed of the outcome once a decision has been made.

6.2. Copyright law

   i) Limiting the length of copyright protection

Extending the duration of copyright negatively impacts on the way that culture and knowledge are shared and built upon. Not only does it lengthen the time required for works to enter the public domain, but it increases the likelihood that information about rights–holders and authors will become lost over time. Furthermore, the benefits of copyright extension are of limited monetary significance and accrue to an extremely small proportion of rights–holders. A significant body of scholarly literature shows that the benefits of copyright extension do not outweigh its social costs. Legislators should refrain from increasing the period of copyright protection, particularly when negotiating trade agreements such as the TPPA.

   ii) ‘Safe harbor’ provisions for digitizers

Informal discussions with people associated with Google Books and Hathi Trust revealed that the fear of legal action has been a strong incentive for conservative access policies, particularly in regards to access restrictions upon works with an unconfirmed copyright status. In cases of accidental infringement, copyright law should provide digitizers with some form of ‘safe harbor.’ This should be grounded on evidence that good faith steps have been taken to determine the status of copyright. A standardized set of best practices should be articulated in regards to copyright determination, as well as a multifactor test that can be used to assess whether a digitizer has taken sufficient steps to determine the copyright status of a work. Safe harbor legislation could also be used to firmly establish an acceptable period of copyright estimation, assuming that reasonable methods had first been undertaken to source accurate biographical information. This would provide a legal framework that encourages repositories to adopt less conservative access policies.

   iii) Registration formalities

A lack of any registration formalities has contributed to a proliferation of orphan works with an unknown copyright status or an untraceable rights–holder. The orphan works problem could be addressed by introducing some form of international registration regime in conjunction with a centralized database of rights information. Even a voluntary registration regime would greatly improve the way that rights information is stored and accessed, providing a standardized mechanism that allows rights–holders to easily connect with potential users of their content. Searching a registry database could also be one way for digitizers to demonstrate ‘good faith’ when determining copyright status, in conjunction with a comprehensive system of best practices.

6.3. Contract law

A variety of terms and conditions are being applied to public domain books after the act of digitization, even though the underlying work is free of copyright protection. It is highly concerning that New Zealand law appears to allow contractual restrictions to govern the use of books that have entered the public domain. All repositories use ‘browse–wrap’ contracts to articulate their terms and conditions. Depending on the extent to which these conditions are prominent on the Web site, they may lack the level of user notice that courts require for a contract to be enforceable. A few minor changes in the presentation of terms and conditions, however, would allow digitizers to easily meet the threshold of contract validity and legally restrict the public domain.

New legal frameworks are needed to protect the public domain from being eroded by contracts. A public–interest balancing test could be used to assess the acceptable scope of contractual restrictions, in conjunction with an enumerated list of restrictions deemed acceptable. This would allow digitizers to only enforce restrictions that are fair and justifiable when considering a range of stakeholders. Of particular importance, legislators should strive to uphold the public interest in freely accessing public domain books, while also recognizing the importance of incentivizing the development of quality repositories. To achieve this goal, it may be desirable to establish a limited rights regime that provides digitizers with the option to enforce a narrow range of reasonable restrictions for a short period of time after digitization. A limited rights regime would also provide clear precedent against repositories wishing to claim full copyright protection within their digitized works.

6.4. Archival provisions

Some repositories prohibit any individual or organization from hosting files, even though there is a legitimate public interest in having public domain files archived by multiple repositories. Legislation should allow not–for–profit organizations and government institutions to archive any public domain work that has been digitized. To achieve this, these institutions should be allowed to circumvent DRM and overturn contracts restricting public domain works, and should also be permitted to make these works available to the public. To ensure a high quality of archival repositories, institutions could be required to abide by a set of best practices, while also being prevented from conducting illegitimate practices such as certain types of commercial exploitation.

6.5. Paid access

   i) Pay–per–download

High prices are being charged for access to digital copies of public domain books by repositories that have adopted a pay–per–download model. Although a substantial amount of work is undertaken by repositories that conduct mass digitization, the overall cost–per–unit is low and does not justify such large profit margins. The actions of General Books LLC are particularly concerning since they appear to be merely aggregating the content of other repositories, such as Google Books and the Internet Archive, and charging users for access. There are a number of innovative ways for repositories to make money from public domain books, without compromising user access. For example, targeted advertising could be embedded within repository Web sites based on text and metadata from the book. Repositories could also recommend relevant books for purchase to users, such as historical titles that provide context, or recently published books that explore similar themes. Web sites could also create premium paid content to complement the underlying public domain text, such as educational resources and multimedia files. Monetization of public domain books should be based on adding value for users, rather than charging for access to the underlying public domain content.

   ii) Print–on–demand

Using digital files to print copies of public domain books can be an extremely useful resource, providing an opportunity to purchase a physical copy at a lower price than an original. This study shows, however, that some publishers are charging rates that significantly exceed the costs of production. One possible solution is for repositories to provide direct links to several print–on–demand services, so that users can easily choose a print–on–demand service directly from the repository Web site. This would promote healthy competition between print–on–demand publishers, encouraging prices to reflect production costs more reasonably. A higher visibility of print–on–demand services may also result in higher profits through increased transactions. Online bookstores that sell physical copies of digitized public domain books should also provide a link to a free digital copy. This would ensure that consumers do not unnecessarily purchase a physical copy just because they could not find a digital file online.



7. Conclusion

The findings of this research suggest that a high proportion of digitized public domain books are being restricted by online repositories. Out of a sample of 100 public domain books, only three are hosted by repositories that do not impose any form of usage restriction. Furthermore, 48 percent (24) of all digitized books are hosted by a repository that restricts or blocks access, with the most restrictive repository limiting or blocking access to 91 percent (21) of sample books within its collection.

The widespread application of usage restrictions upon public domain books is characteristic of an online environment where user agreements and terms of service are ubiquitously imposed as a precondition to access content. Without legal frameworks articulating the scope of acceptable content restriction, governments are diminishing their ability to use copyright law as a tool to balance public and private interests. New legislative frameworks are needed to limit the power of contract law over public domain works, otherwise private actors will have the ability to perpetually control the scope of legal content use through the application of online contracts. Such legislation could be complemented with a limited rights regime, providing digitizers with the option to enforce a narrow range of reasonable restrictions for a short period of time after digitization. Initiatives like these would allow legislatures to achieve a better balance between public and private interests, benefitting a wider variety of stakeholders.

Almost all access restrictions applied to public domain books within the sample were the result of repositories using a process of estimation to assess copyright status. Within the sample, a one–minute search located accurate biographical information about authors two–thirds of the time. This task takes a fraction of the time required to digitize a book, which involves 30 minutes to scan 500 pages (Kelly, 2006). Digitizers should incorporate the sourcing of copyright information within the overall process of digitization, and copyright estimation should only be used as an option of last resort. Furthermore, copyright estimation periods should better reflect statistical norms regarding the actual duration of copyright protection. The current estimation period of 140 years, used by Google Books and Hathi Trust, is far too conservative. If hosted under this policy, 47 percent of sample books would be restricted. This is despite the fact that all books with locatable biographical information were confirmed as being in the public domain for between 30 and 132 years.

The fear of legal action has strongly influenced the conservative access policies of some repositories, especially when the copyright status of a title has not been confirmed. There is a strong argument for repositories to be protected by some form of ‘safe harbor’ in situations where accidental infringement has occurred. Legal protection, however, should be grounded upon evidence that several ‘good faith’ steps have been followed during attempts to determine copyright status. Such steps could include documented searches of known databases of biographical information, as well as the use of online search engines.

Unfortunately, a considerable amount of biographical information disappears or becomes difficult to locate over time. This problem is compounded by a lack of any formal database of biographical information, as well as an ever–increasing period of copyright protection. Some form of centralized database of biographical information about authors should be established to improve the way copyright status is determined. Legislators should also strongly resist any extension of copyright duration, particularly when negotiating international trade agreements such as the TPPA. A strong body of scholarly literature supports this stance, and shows that the benefits of copyright extension do not outweigh its social costs.

While it is promising that such a high proportion of public domain books have been digitized, significant work is needed to ensure that this material is freely available for access and use. The extent of restriction revealed by this study suggests that our cultural heritage is vulnerable, and risks becoming encompassed within a modern enclosure movement if action is not taken. Many improvements are needed to both policy and practice if we want the public domain to remain an intellectual commons, enriching society with resources for new knowledge and creativity.


About the authors

Alex Clark is a Research Assistant in the School of Information Management at Victoria University of Wellington. He has a Bachelor of Arts in Media Studies and International Relations from Victoria University. He completed his undergraduate studies while on exchange at the University of Texas at Austin and the University of California, Berkeley, where he studied Internet and communications law. His research explores the future of online content distribution, and uses qualitative and quantitative methods to assess emerging trends within different types of media including books, journalism, music and academic research.
E–mail: alex [dot] clark [at] vuw [dot] ac [dot] nz

Brenda Chawner is a Senior Lecturer in the School of Information Management, Victoria University of Wellington. She has over 30 years experience in using and developing library applications of IT, and currently teaches courses on metadata and advanced information technology for information managers in Victoria University of Wellington’s Master of Information Studies programme. Her research interests focus on the ways in which technology enhances or restricts access to information, and she has published on open access, free/libre and open source software, the use of social media by information professionals, and copyright and licensing of digital information. Brenda has a B.Sc. and an M.L.S. from the University of Alberta, and a Ph.D. from Victoria University of Wellington.
E–mail: Brenda [dot] chawner [at] vuw [dot] ac [dot] nz



1. In some countries, such as Norway, moral rights within copyright are granted in perpetuity. Economic rights within copyright, however, remain time–limited (Hannemyr, 2009).

2. The economists were: George A. Akerlof, Kenneth J. Arrow, Timothy F. Bresnahan, James M. Buchanan, Ronald H. Coase, Linda R. Cohen, Milton Friedman, Jerry R. Green, Robert W. Hahn, Thomas W. Hazlett, C. Scott Hemphill, Robert E. Litan, Roger G. Noll, Richard Schmalensee, Steven Shavell, Hal R. Varian and Richard J. Zeckhauser.

3. Rappaport, 1998. Books within Rappaport’s sample were originally copyrighted between 1923–1942, and were renewed between 1951–1970. At the time of his study, between 28 and 47 years had passed since renewal.

4. Information ascertained from conversations with those involved with the New Zealand Electronic Text Collection and Early New Zealand Books.

5. Crews, 2012, p. 826.

6. Lindsay, 2002, p. 42.

7. Hotmail v. Van Money Pie Inc.; Moore v. Microsoft Corp.; I. Lan Systems, Inc. v. Netscout Service Level Corp.; Forrest v. Verizon Communications, Inc.; Scherillo v. Dun & Bradstreet, Inc.; Feldman v. Google.

8. Specht v. Netscape Communications Corporation; Ticketmaster Corporation v. Tickets.com Inc.; Williams v. America Online Inc.; Pollstar v. Gigamania Ltd.

9. Southwest Airlines Co. v. BoardFirst, L.L.C.; Major v. McCallister.

10. Burgess, 1986, p. 258.

11. Technological Protection Measures are generally viewed as a subset of Digital Rights Management; however, the terms are often used interchangeably. In New Zealand law, the term Technological Protection Measure is used.

12. Lessig, 2006, p. 56.

13. Corbett, 2010, p. 195.

14. 2011/711/EU, at https://ec.europa.eu/digital-agenda/sites/digital-agenda/files/PPPs%20for%20digitisation.doc_0.pdf, accessed 20 May 2014.

15. Niggemann, et al., 2011, p. 7.

16. Partial access refers to repositories that enable the ability to view small segments of text within the book.

17. Includes any technical restrictions placed upon the use of digitized books.

18. Includes licenses, contractual restrictions, and any form of request attempting to limit use.

19. Not subject to any TPM or terms and conditions.

20. Although access to the book is completely blocked, users can search for the frequency and page number of keywords within the text.

21. There are currently no New Zealand libraries that are institutional partners with Hathi Trust.



Enclosing the public domain: The restriction of public domain books in a digital environment
by Alex Clark and Brenda Chawner.
First Monday, Volume 19, Number 6 - 2 June 2014
doi: http://dx.doi.org/10.5210/fm.v19i6.4975

