This page tells you how to find and get Project Gutenberg eBooks if:
Our RSS feed is in the cache/feeds location. It is updated daily after 2 a.m. U.S. Eastern time.
The “posted” list is where every new eBook is announced as it is being uploaded to the Project Gutenberg servers. New books are then available for download, typically within 2 hours. The list offers a once-daily digest option and has public online archives.
The Project Gutenberg collection is available from dozens of sites offering access via http/https, ftp, rsync, and a few other methods. See our listing of mirror sites to choose the location, access method, or speed. Mirrors generally do not have a friendly Web-based front end, but do have the collection. See the mirroring how-to for details.
The GUTINDEX files are updated at least monthly. These plain text files provide the basic information about each eBook, and are good for searching from your own system (for example, use Ctrl+F in a Web browser or word processor). They are the accession lists for Project Gutenberg. Note that these files are not recommended for automation (that is, as input for generating a computerized database). Instead, use one of the catalog files mentioned below.
If GUTINDEX.ALL is too big for you or you prefer separate annual lists, you can download GUTINDEX files by year.
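For searching a downloaded GUTINDEX file from your own system without a browser or word processor, a few lines of Python may be enough. The following is a minimal sketch, not an official tool: the function names `search_lines` and `search_index` are hypothetical, the sample lines are made up, and the file is assumed to be readable as UTF-8 text. It only searches; it does not try to parse the file into records, since these files are not recommended as database input.

```python
from pathlib import Path
from typing import Iterable, List


def search_lines(lines: Iterable[str], term: str) -> List[str]:
    """Return the lines that contain `term`, ignoring case."""
    needle = term.casefold()
    return [line.rstrip("\n") for line in lines if needle in line.casefold()]


def search_index(path: str, term: str) -> List[str]:
    """Search a locally saved index file (for example a downloaded GUTINDEX.ALL)."""
    # errors="replace" is a precaution: the file's encoding is an assumption here.
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return search_lines(handle, term)


if __name__ == "__main__":
    # Made-up sample lines for illustration; the real file layout may differ.
    sample = [
        "Example Title One, by An Author            00001\n",
        "Another Example Book, by Someone Else      00002\n",
    ]
    for hit in search_lines(sample, "another"):
        print(hit)
```

To search a real download, call `search_index` with the path where you saved the file and the word or phrase you are looking for.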
Not part of Project Gutenberg. Check the laws of the country where you are before accessing or redistributing any eBooks.
You can navigate the directory/folder contents starting at /dirs, although this is not very user-friendly.
All Project Gutenberg metadata is available in the XML/RDF format. It is updated daily (except for the legacy format mentioned below). Please use one of these files as input to a database or other tools you may be developing, instead of crawling the website or accessing it with robots.
Note that the exact same metadata is available as a per-eBook .rdf file. These are found in the cache/epub (i.e., cache/generated) directory, accessible by mirroring or by the directory/folder listings above. The large XML/RDF file is simply a concatenation of all the per-eBook metadata.
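As an illustration of using a per-eBook .rdf file as input to your own tools, the sketch below reads a title, creator names, and release date with Python's standard library. Treat it as a rough example under stated assumptions: the namespace URIs, the element names (`pgterms:ebook`, `dcterms:title`, `dcterms:creator`, `dcterms:issued`), and the hand-written `SAMPLE_RDF` document reflect a guess at the file layout and should be checked against a real file before you rely on them. The function name `read_ebook_metadata` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Namespace URIs are assumptions; verify them against an actual .rdf file.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "pgterms": "http://www.gutenberg.org/2009/pgterms/",
}

# Hand-written, illustrative record. It is not copied from the real catalog.
SAMPLE_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:pgterms="http://www.gutenberg.org/2009/pgterms/">
  <pgterms:ebook rdf:about="ebooks/00000">
    <dcterms:title>An Example Title</dcterms:title>
    <dcterms:issued>2000-01-01</dcterms:issued>
    <dcterms:creator>
      <pgterms:agent>
        <pgterms:name>Author, Example</pgterms:name>
      </pgterms:agent>
    </dcterms:creator>
  </pgterms:ebook>
</rdf:RDF>
"""


def read_ebook_metadata(rdf_text: str) -> Dict[str, Any]:
    """Extract a few fields from one per-eBook RDF document."""
    root = ET.fromstring(rdf_text)
    ebook = root.find("pgterms:ebook", NS)
    if ebook is None:
        raise ValueError("no pgterms:ebook element found")
    creator_path = "dcterms:creator/pgterms:agent/pgterms:name"
    return {
        # rdf:about carries the record identifier as an attribute.
        "about": ebook.get(f"{{{NS['rdf']}}}about"),
        "title": ebook.findtext("dcterms:title", namespaces=NS),
        "creators": [n.text for n in ebook.findall(creator_path, NS) if n.text],
        # The Project Gutenberg release date, not a print publication date.
        "release_date": ebook.findtext("dcterms:issued", namespaces=NS),
    }


if __name__ == "__main__":
    print(read_ebook_metadata(SAMPLE_RDF))
```

To process a file obtained by mirroring, read its contents into a string and pass that to `read_ebook_metadata`. Because the large XML/RDF file is a concatenation of the per-eBook metadata, the same field lookups should apply there, applied to each eBook record in turn.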
Project Gutenberg metadata does not include the original print source publication date(s). Because Project Gutenberg eBooks are substantially different from the source book(s), we track the Project Gutenberg publication date (“release date”), but do not include print source information in the metadata. Differences almost always include dehyphenation, removing page headers/footers, changes to typography during markup, and sometimes relocation of images, footnotes, captions, etc. In addition, Project Gutenberg eBooks sometimes come from multiple print editions.
Many eBooks include scans of the title page or other pages, which may indicate original print publication. If matching a Project Gutenberg eBook to a particular print edition is important to you, it is likely this will need to be done by direct comparison of a print source with the eBook.
Project Gutenberg formerly distributed the catalog in MARC format, but discontinued it when server upgrades left our software non-functional. In addition, a legacy program prepared by a volunteer, pgrdf2marc.pl, worked with a previous version of the XML/RDF data, but does not work with the current version.
Kiwix is an application that lets you download a large collection and use it locally. A copy of the Project Gutenberg content was made available in November 2018, and may be updated periodically.