|
Project Gutenberg (PG) was launched by Michael Hart in 1971 in order to provide a library, on what would later become the Internet, of free
electronic versions (sometimes called e-texts) of physically existing books. The texts
provided are mostly in the public domain, either because they were never
under copyright, or because their copyrights have expired. There are also a few
copyrighted texts that Gutenberg has made available with the authors' permission. The project was named after the 15th-century German printer Johannes Gutenberg, who propelled the movable type
printing press revolution.
General information
For the most part, Project Gutenberg concentrates on historically significant literature and reference works. The slogan of the
project is "break down the bars of ignorance and illiteracy", chosen because the project hopes to continue the work of spreading
public literacy and appreciation for our literary heritage that public libraries began in the early 20th century. Whenever possible, Gutenberg releases are available in plain ASCII text. Other formats may be released as well, when submitted by volunteers. For years, there has been discussion
of using some type of XML, although progress on that has been slow. Proprietary formats that
are not easily editable, such as PDF, are generally
not considered to fit the goals of Project Gutenberg, although a few have been added to the collection.
While most Project Gutenberg releases are in English, there are now significant numbers in German, French, Italian, Spanish, Dutch, Finnish, and Chinese, as well as a few
in other languages. All Project Gutenberg texts may be obtained and redistributed by readers for no fee: the only restriction
placed on redistribution is that the unaltered text must contain the Project Gutenberg header. If the redistributed text
has been modified, the file must not be labelled as a Gutenberg text.
As of 2004, the project has released over twelve thousand electronic books, almost entirely produced by volunteers, and remains active.
Anyone can become a proofreader by signing up at the Distributed Proofreaders site and volunteering to proofread one
page at a time.
History
In 1971, Michael Hart was attending the University of
Illinois. Hart obtained access to a Xerox Sigma V mainframe computer in the
university's Materials Research Lab, as his best friend and his brother's best friend were two of the four operators of that
particular machine. He was given an operator's account with a virtually unlimited amount of computer time; that access has since
been variously estimated to have been worth $100,000 or $100,000,000. Hart spent the next hour and a half trying to think of
something to do with the computer that would be worth that much money. This particular computer happened to be one of the 15
nodes on the computer network that would become the Internet. Hart believed that computers would one day be accessible to the general public
and decided to make works of literature available for free in electronic form. He happened to have a copy of the United States Declaration of
Independence in his backpack, and this became the first Project Gutenberg e-text.
By the time the University of Illinois stopped hosting Project Gutenberg in the mid-1990s, Hart was
running it from Illinois Benedictine College. Later he came to a similar arrangement with Carnegie Mellon University, which agreed to administer
Project Gutenberg's finances. It was not until the year 2000 that Project Gutenberg was formally organized as an independent
legal entity, and it is now a non-profit corporation
chartered in Mississippi with an IRS ruling that donations to it are
tax-deductible.
Since the Project's early days, the time required to digitize a book has decreased dramatically. Books are generally not typed
in, but are instead converted into text with the aid of optical character recognition (OCR) software. Despite these advances, books still need to be
heavily proofread and edited before they can be added to the collection.
Other projects inspired by Project Gutenberg
Literature
Project Gutenberg of Australia is an official sister project
of PG. The primary Gutenberg site is bound by U.S. copyright
law, and especially by the Sonny Bono Copyright Term Extension Act, which in some cases has retroactively
extended the duration of copyright to ninety-five years. PG Australia instead produces
e-texts in accordance with Australian copyright
law, which differs from U.S. law in defining when works enter the public domain. Thus, PG Australia is able to produce and host
e-texts that would otherwise be illegal for Project Gutenberg in the United
States to host, while some texts from the U.S. project cannot be hosted there. PG Australia
also focuses on digitizing Australian material. However, because of copyright changes included in a renewed free trade agreement
being negotiated between Australia and the United States, these texts may not remain
available.
PG-EU is a new sister project that operates under the copyright law of the European Union. One of its aims is to include as many languages as possible in Project Gutenberg. It
operates in Unicode to ensure that all alphabets can be represented easily and
correctly.
Aozora Bunko is a similar project in Japan, which focuses on digitizing
non-copyrighted texts under Japan's copyright law and
distributing them for free. Most of the texts provided are Japanese literature and translations from English literature.
Project Runeberg is a similar project for Nordic-language texts, begun in 1992.
Project Ben-Yehuda brings public domain Hebrew texts to the
internet, and was inspired by Project Gutenberg. It was begun in 1999. A project by the
National Yiddish Book Center in Amherst, Massachusetts is attempting to produce digital
versions of its entire collection of Yiddish books.
In 2000, Charles Franks
founded Distributed Proofreaders, which allows the
proofreading of scanned texts to be distributed among many volunteers over the Internet. To make this possible, volunteers scan and run optical character recognition software on books, then
place the results on a website for volunteer "proofers" to check. The book passes through two proofreading rounds. With thousands
of volunteers each working on one or more pages, a reasonably sized book can be proofed in several hours. Other volunteers
post-process the books and post them to Project Gutenberg.
The Million Book Project aims to digitize a million
public domain books by 2005. In order to process such a large number of books in such a short time, the project generally skips the
time-consuming transcription process and stores its books as compressed image files.
Music
There is a sub-project within Project Gutenberg working on digitizing sheet
music.
The Mutopia project attempts to do for music what Project
Gutenberg does for literary works.
Related projects
See list of digital library
projects for a more comprehensive list of digital library efforts.
See also open content.
|