NLM Digital Collections

About Digital Collections

Mission

Digital Collections provides access to the National Library of Medicine's distinctive digital content in the areas of biomedicine, health care and the history of medicine. Our unique digital collections are freely available for download worldwide and in the public domain unless otherwise indicated.

About the Collections

Collection Development Policy

The policies and guidelines for building NLM’s Digital Collections are described in the Collection Development Guidelines of the National Library of Medicine. The Guidelines define the range of subjects to be acquired and the extent of the Library's collecting effort within these subjects. They also address selection issues presented by a range of formats and literature types. The Guidelines are reviewed and updated periodically to reflect emerging changes in health care and advances in medical research.

Texts

The texts within Digital Collections that are held in NLM's collection were digitized at NLM using docWorks (dW) image processing software, which produces several files per page and per book. After the source images are cropped, deskewed and reviewed, NLM-defined scripts create additional image derivatives and metadata. A small number of texts were digitized from originals or microfilm by an offsite vendor.

The texts comprising the Medicine in the Americas collection were digitized for a multi-institutional digital library project, the Medical Heritage Library, which uses Internet Archive to host its collection. Therefore, NLM routinely deposits copies of its digitized books to Internet Archive. More information on the Medical Heritage Library can be found here.

Films

Most of the films available in Digital Collections originate from NLM's motion picture reel and videotape holdings. Streaming versions of these titles are derived from a range of master formats. The films cover numerous medical and scientific topics, including public health, surgical procedures, mental health, child development, cancer, infectious disease, and substance abuse. Each film with sound is transcribed, and time-coded captions are created to satisfy Section 508 accessibility requirements and provide enhanced search functionality.

Still Images

The majority of still images in Digital Collections come from the Images from the History of Medicine collection, including fine art, photographs, engravings, and posters that illustrate the social and historical aspects of medicine from the 15th to the 21st century. Still images are available as a master TIFF or a standard JPG file.

Archival Materials

The more than 30 fully digitized Archives and Personal Papers Collections feature thousands of archival materials covering public health and health policy, mental health, child development, and molecular biology in the 19th and 20th centuries. Digital Collections also provides access to the more than 30,000 archival items that comprise the curated Profiles in Science collections documenting the modern trailblazers of science, medicine, and public health.

Software

The software available in the collection includes historical software developed by NLM, such as the interactive tutorial for Grateful Med, HowTo Grateful Med.

Digitization

The specifications for NLM Digital Repository objects can be found here.

Repository Technical Overview

Software

The Digital Collections repository is primarily composed of open-source technologies. The Fedora Repository Software provides an underlying XML-based framework for structuring, managing, preserving and disseminating digital content. Apache Solr and Lucene are used to index our content and drive the full-text and faceted metadata searching within Digital Collections.

The website's homepage, search functionality and resource summary pages are provided using the Blacklight open-source discovery interface.

Digitized texts are presented via Universal Viewer, an open-source community project developed by Digirati. The repository's stored JPEG 2000 page images are dynamically converted to standard JPEG for display via the IIIF AWS Serverless Application.

Images are provided by the IIIF AWS Serverless Application and presented using the OpenSeadragon image viewer.
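
As an illustration of how an image viewer requests a JPEG derivative from an IIIF image service, the sketch below builds a request URL following the IIIF Image API pattern of identifier/region/size/rotation/quality.format. The base URL and identifier are placeholders, not NLM's actual endpoint, and the function name is an assumption made for this example. The "max" size keyword belongs to recent versions of the Image API; older deployments may expect "full" instead.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    base_url and identifier are hypothetical values supplied by the caller;
    this page does not publish the real service address.
    """
    # The identifier is percent-encoded so that slashes inside it are not
    # mistaken for path separators.
    encoded_id = quote(identifier, safe="")
    return (
        f"{base_url.rstrip('/')}/{encoded_id}/"
        f"{region}/{size}/{rotation}/{quality}.{fmt}"
    )


if __name__ == "__main__":
    # Placeholder server and identifier, for illustration only.
    print(iiif_image_url("https://iiif.example.org/iiif/2", "book1/page 1"))
```

Requesting the "jpg" format is what causes the image service to convert a stored master, such as a JPEG 2000 page image, into a JPEG at delivery time.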

Digitized films are presented using Video.js embedded on the page.

Preservation

Digital Collections uses the following strategies to help ensure the durability of the managed content:

  • Every master file (source images and videos) is stored with an MD5 checksum, a numerical value computed from the file's contents that can be recomputed to confirm the file has not been altered. Checksums are verified periodically to ensure the integrity of the content.
  • All repository content and services are replicated at a secondary data center capable of taking over all repository functions if NLM's main data center is unavailable. A third copy of master content is stored off-site at a third-party location.
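
The checksum verification described in the first strategy above can be sketched as follows. This is a minimal illustration using Python's standard hashlib module, not NLM's actual tooling; the function names and the idea of passing in the stored checksum as a hex string are assumptions made for the example.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it in chunks so that
    large master files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, stored_checksum):
    """Recompute the checksum and compare it with the stored value.

    stored_checksum is assumed to be a hex string; surrounding whitespace
    and letter case are ignored.
    """
    return md5_of_file(path) == stored_checksum.strip().lower()
```

A periodic integrity check would call verify_file for each master file and flag any file for which it returns False, so the file can be restored from one of the replicated copies.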

Web Service

Digital Collections offers a Web service that facilitates programmatic search of the Dublin Core metadata and full-text OCR in the repository, with search requests and responses in XML format. More information, including the specifications of the service request and output, is available here.
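
To give a sense of how a client might consume XML search results carrying Dublin Core metadata, the sketch below parses a small sample response. The response layout (the results and result elements) and the sample records are invented for illustration and do not reflect the service's real output, which is defined in the specification linked above. Only the Dublin Core namespace URI is standard.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element namespace, in ElementTree's {uri}tag form.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical response body; the real service's element names may differ.
SAMPLE_RESPONSE = """<results xmlns:dc="http://purl.org/dc/elements/1.1/">
  <result>
    <dc:title>Example title one</dc:title>
    <dc:identifier>example-id-1</dc:identifier>
  </result>
  <result>
    <dc:title>Example title two</dc:title>
    <dc:identifier>example-id-2</dc:identifier>
  </result>
</results>"""


def extract_records(xml_text):
    """Return a list of {'title', 'identifier'} dicts, one per result.

    Assumes each hit is wrapped in a <result> element (an assumption made
    for this example). Missing fields come back as empty strings.
    """
    root = ET.fromstring(xml_text)
    records = []
    for result in root.iter("result"):
        records.append({
            "title": result.findtext(f"{DC}title", default=""),
            "identifier": result.findtext(f"{DC}identifier", default=""),
        })
    return records


if __name__ == "__main__":
    for record in extract_records(SAMPLE_RESPONSE):
        print(record["identifier"], "-", record["title"])
```

A real client would first send the search request over HTTP and pass the returned XML text to a function like extract_records, adjusting the element names to match the published output specification.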

History

For more information on the history of NLM's repository development, including the initial functional requirements and software evaluations, see the digital repository project history page.

For information about Digital Collections, contact us.

Page last updated: August 2021