What Can AI Do for Special Collections? Improving Access and Enhancing Discovery
Presenters: Sonia Yaco and Bala Singu
In this enlightening presentation, Sonia Yaco and Bala Singu explore the transformative potential of Artificial Intelligence (AI) in the realm of special collections. Drawing from a year-long study conducted at Rutgers University, they delve into how AI can significantly improve access to and enhance the discovery of rich archival materials.
Introduction
Special collections in libraries house a wealth of historical and cultural artifacts. However, accessing and extracting meaningful insights from these collections can be challenging due to the nature of the materials, which often include handwritten documents, rare photographs, and other hard-to-process formats.
The presenters highlight a "golden opportunity" at the intersection of rich collections, an ever-expanding set of AI tools, and a strong desire to maximize the utility of these collections. By applying AI in meaningful ways, they aim to mine this wealth of information and make it more accessible to scholars and the public alike.
The William Elliot Griffis Collection
The focal point of the study is the William Elliot Griffis Papers at Rutgers University. This extensive collection documents the lives and work of the Griffis family, who were educators and missionaries in East Asia during the Meiji period (1868-1912). The collection includes manuscripts, photographs, and published materials and is heavily utilized by scholars from Asia, the United Kingdom, and the United States.
Margaret Clark Griffis
The study specifically focuses on Margaret Clark Griffis, the sister of William Elliot Griffis. She holds historical significance as one of the first Western women to educate Japanese women. By centering on her diaries, biographies, and photographs, the presenters aim to shed light on her contributions and experiences.
Strategies for Mining the Collection
To unlock the wealth of information within the Griffis collection, the presenters employed several strategies:
- Extracting Text to Improve Readability: Utilizing AI tools to transcribe handwritten and typewritten documents into machine-readable text.
- Finding Insights in Digitized Text and Photographs: Applying natural language processing and image analysis to gain deeper understanding.
- Connecting Text to Images: Linking textual content with corresponding images to create a richer narrative.
Software Tools Utilized
The project explored a variety of AI tools, categorized into:
- Generative AI for Text and Images
- Natural Language Processing Tools
- Optical Character Recognition (OCR) Tools
- Other Analytical Tools
In total, they examined some 26 software tools, assessing each on cost and learning curve. The tools ranged from free, user-friendly applications like ChatGPT 3.5 to more complex, subscription-based services like ChatGPT 4.0 and the DALL·E API.
Project Demonstrations
The presenters showcased three key demonstrations to illustrate the capabilities of AI in handling special collections:
1. Improving Readability
One of the primary challenges with special collections is the difficulty of reading handwritten and typewritten documents, especially those written in older cursive hands. To address this, the team used OCR tools to convert these documents into machine-readable plain text, opening them up to computational analysis.
Handwritten Material
The team focused on transcribing Margaret Griffis's handwritten diary entries. They used tools like eScriptorium, Transkribus (AM Digital), and ChatGPT-4 to process the text. Each tool had varying levels of accuracy and challenges:
- eScriptorium: A free tool with a moderate learning curve, it achieved an initial accuracy of around 89%.
- Transkribus (AM Digital): A commercial tool with a higher cost that offered competitive accuracy.
- ChatGPT-4: While powerful, it faced issues with "hallucinations," generating text not present in the original material.
By combining these tools, they improved the transcription accuracy significantly. For instance, feeding the eScriptorium output into ChatGPT-4 enhanced the accuracy to approximately 96%.
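A minimal sketch of this post-correction step, assuming the OpenAI Python client and a transcription exported from eScriptorium; the file name, model name, and prompt wording are illustrative rather than the presenters' exact setup:

```python
# Sketch: post-correcting HTR/OCR output with a large language model.
# Assumes the OpenAI Python client (openai>=1.0) and an API key in OPENAI_API_KEY;
# the file name, model choice, and prompt are illustrative.
from openai import OpenAI

client = OpenAI()

# Raw transcription exported from an HTR tool such as eScriptorium.
with open("diary_page_ocr.txt", encoding="utf-8") as f:
    raw_text = f.read()

prompt = (
    "The following text is an imperfect transcription of a 19th-century "
    "handwritten diary entry. Correct obvious transcription errors, but do "
    "not add, remove, or invent any content:\n\n" + raw_text
)

response = client.chat.completions.create(
    model="gpt-4",           # model name is an assumption
    messages=[{"role": "user", "content": prompt}],
    temperature=0,           # keep the model conservative to reduce hallucinations
)

print(response.choices[0].message.content)
```

Keeping the temperature at zero and instructing the model not to invent content is one way to curb the hallucination problem noted above, though human review of the corrected text remains essential.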
Typewritten Material
For typewritten documents, such as William Griffis's biography of his sister, tools like Adobe Acrobat provided efficient OCR capabilities with high accuracy. These documents were easier to process compared to handwritten materials.
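For institutions without an Acrobat license, a comparable scripted OCR pass over typewritten pages can be approximated with open-source tools. The sketch below assumes the pytesseract and pdf2image packages plus the Tesseract engine; the file name is hypothetical:

```python
# Sketch: a scripted OCR alternative for typewritten pages.
# Assumes pytesseract, pdf2image, and the Tesseract engine are installed;
# the PDF file name is illustrative.
from pdf2image import convert_from_path
import pytesseract

pages = convert_from_path("griffis_biography.pdf", dpi=300)

for i, page in enumerate(pages, start=1):
    # Typewritten text generally OCRs well at default settings.
    text = pytesseract.image_to_string(page)
    with open(f"biography_page_{i:03d}.txt", "w", encoding="utf-8") as out:
        out.write(text)
```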
2. Finding Insights with AI
Once the text was extracted, the next step was to derive meaningful insights using AI techniques:
Translation
To make the content accessible to international scholars, the team utilized translation tools:
- Google Translate: A free tool suitable for smaller text volumes.
- Googletrans: An unofficial Python library that wraps Google Translate, which had reliability issues and limits on volume.
- Google Cloud Translation API: A paid service offering high reliability for large-scale translations.
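A minimal sketch of the paid, large-scale option, assuming the google-cloud-translate package and configured application credentials; the file name and target language are illustrative:

```python
# Sketch: translating extracted diary text with the Google Cloud Translation API
# (the paid service the team found reliable for larger volumes).
# Assumes google-cloud-translate is installed and credentials are configured.
from google.cloud import translate_v2 as translate

client = translate.Client()

with open("diary_1871_transcribed.txt", encoding="utf-8") as f:
    english_text = f.read()

# Translate into Japanese for scholars working with the Meiji-era context.
result = client.translate(english_text, target_language="ja")
print(result["translatedText"])
```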
Text Analysis and Visualization
Using natural language processing tools, the team performed analyses such as named entity recognition and topic modeling. They employed Voyant Tools, a free, open-source platform that offers various analytical capabilities:
- Identifying key entities like names, places, and dates.
- Visualizing word frequencies and relationships.
- Creating interactive geographic maps based on the text.
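Voyant runs in the browser, but the same kind of named entity recognition can also be scripted. The sketch below assumes spaCy with its small English model; the file name is hypothetical:

```python
# Sketch: named entity recognition on the transcribed diary text with spaCy,
# as a scripted complement to the browser-based Voyant Tools.
# Assumes spaCy and the en_core_web_sm model are installed.
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")

with open("diary_1871_transcribed.txt", encoding="utf-8") as f:
    doc = nlp(f.read())

# Count the people, places, and dates mentioned in the diary.
entities = Counter(
    (ent.text, ent.label_)
    for ent in doc.ents
    if ent.label_ in {"PERSON", "GPE", "LOC", "DATE"}
)

for (text, label), count in entities.most_common(20):
    print(f"{label:7} {text:30} {count}")
```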
Photographic Grouping
With more than 400 photographs in the collection, the team sought to group images programmatically by content similarity. Using Python scripts and AI models, they clustered photographs that shared visual characteristics such as shape, subject, and theme.
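One way to sketch this kind of grouping in Python is to extract generic visual features with a pretrained network and cluster them; the directory, model, and cluster count below are illustrative assumptions, not the presenters' exact pipeline:

```python
# Sketch: grouping digitized photographs by visual similarity.
# Uses an ImageNet-pretrained ResNet as a generic feature extractor and k-means
# clustering; assumes torch, torchvision, scikit-learn, and Pillow are installed.
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from sklearn.cluster import KMeans
from torchvision import models, transforms

# Pretrained backbone with the classification head removed.
weights = models.ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights)
model.fc = torch.nn.Identity()
model.eval()

preprocess = weights.transforms()

def embed(path: Path) -> np.ndarray:
    """Return a 2048-dimensional feature vector for one photograph."""
    image = Image.open(path).convert("RGB")
    with torch.no_grad():
        return model(preprocess(image).unsqueeze(0)).squeeze(0).numpy()

paths = sorted(Path("griffis_photos").glob("*.jpg"))
features = np.stack([embed(p) for p in paths])

# Cluster into visually similar groups; the cluster count is a guess to tune.
labels = KMeans(n_clusters=12, n_init=10, random_state=0).fit_predict(features)

for path, label in zip(paths, labels):
    print(label, path.name)
```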
3. Connecting Text and Images
One of the most innovative aspects of the project was linking textual content with corresponding images to enrich the narrative:
Describing Photographs Using AI
The team used ChatGPT to generate detailed descriptions of photographs. For example, given a photograph with minimal metadata labeled "small Japanese print," ChatGPT produced an extensive description, identifying elements like traditional attire, expressions, and possible historical context.
This process significantly enhances the discoverability of images, providing researchers with richer information than previously available.
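A hedged sketch of this step, assuming the OpenAI Python client and a multimodal GPT model; the model name, file name, and prompt are illustrative rather than the presenters' exact configuration:

```python
# Sketch: asking a multimodal GPT model to describe a scanned photograph.
# Assumes the OpenAI Python client (openai>=1.0); file name and prompt are illustrative.
import base64
from openai import OpenAI

client = OpenAI()

with open("small_japanese_print.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",  # model name is an assumption
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Describe this 19th-century photograph for an archival "
                     "catalog: subjects, attire, setting, and likely context."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)

print(response.choices[0].message.content)
```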
Adding Metadata and Generating MARC Records
Beyond descriptions, the AI tools were used to generate metadata and even create MARC records for cataloging purposes. This automation can streamline library workflows and improve access to collections.
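As an illustration, a minimal MARC record could be assembled from AI-generated metadata with the pymarc library (version 5+ syntax); the field choices and values below are hypothetical, and a cataloger would still review and enrich the result:

```python
# Sketch: turning AI-generated description text into a minimal MARC record
# with pymarc (>=5). Field choices and values are illustrative.
from pymarc import Record, Field, Subfield

ai_title = "Portrait of a woman in traditional Japanese dress"
ai_description = "Studio photograph, late Meiji period; sitter unidentified."

record = Record()
record.add_field(
    Field(tag="245", indicators=["0", "0"],
          subfields=[Subfield(code="a", value=ai_title)]),
    Field(tag="520", indicators=[" ", " "],
          subfields=[Subfield(code="a", value=ai_description)]),
    Field(tag="710", indicators=["2", " "],
          subfields=[Subfield(code="a",
                              value="William Elliot Griffis Collection (Rutgers University)")]),
)

# Write the record in binary MARC for import into a catalog.
with open("griffis_photo.mrc", "wb") as out:
    out.write(record.as_marc())
```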
Generating Images from Text and Matching to Real Images
Taking the connection a step further, the team explored generating images based on extracted text and then matching these AI-generated images to real photographs in the collection:
- Extract Text Descriptions: Using ChatGPT to identify descriptive passages from the diary.
- Generate Images: Employing tools like DALL·E to create images based on these descriptions.
- Match to Real Images: Programmatically comparing AI-generated images to actual photographs in the collection to find potential matches.
While not perfect, this method opens up new avenues for discovering connections within archival materials that might not be immediately apparent.
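A sketch of the generate-and-match idea, assuming the OpenAI images API for generation and CLIP embeddings (via sentence-transformers) for similarity ranking; the diary passage, paths, and model names are illustrative:

```python
# Sketch: generate an image from a diary passage with DALL-E, then rank real
# collection photographs by visual similarity to it using CLIP embeddings.
# Assumes openai, requests, Pillow, and sentence-transformers are installed.
from pathlib import Path

import requests
from openai import OpenAI
from PIL import Image
from sentence_transformers import SentenceTransformer, util

client = OpenAI()

passage = "We walked through the temple garden; the plum trees were in blossom."

# 1. Generate an image from the diary passage.
generated = client.images.generate(model="dall-e-3", prompt=passage,
                                   size="1024x1024", n=1)
image_bytes = requests.get(generated.data[0].url, timeout=60).content
Path("generated.png").write_bytes(image_bytes)

# 2. Embed the generated image and every photograph in the collection.
clip = SentenceTransformer("clip-ViT-B-32")
query = clip.encode([Image.open("generated.png")])[0]
photo_paths = sorted(Path("griffis_photos").glob("*.jpg"))
photo_vecs = clip.encode([Image.open(p) for p in photo_paths])

# 3. Rank photographs by cosine similarity to the generated image.
scores = util.cos_sim(query, photo_vecs)[0]
for score, path in sorted(zip(scores.tolist(), photo_paths), reverse=True)[:5]:
    print(f"{score:.3f}  {path.name}")
```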
Limitations and Takeaways
Limitations
- Infrastructure Needs: AI requires significant resources, including computational power, software costs, and staff time.
- Technical Expertise: A background in programming and software development is highly beneficial. Collaboration with technical staff is often necessary.
- Learning Curves: Many AI tools, even free ones, come with steep learning curves that can be challenging to overcome.
- Human Intervention: AI tools are not fully autonomous and require human oversight to ensure accuracy and relevance.
Takeaways
- Combining Tools Enhances Effectiveness: Using multiple AI tools in conjunction can yield better results than using them in isolation.
- Start with Accessible Tools: Begin with user-friendly software like Adobe Acrobat for OCR and Google Translate for initial forays into AI applications.
- Incorporate AI into Workflows: Integrate AI tools into existing library processes to improve efficiency and output quality.
- Partnerships are Crucial: Collaborate with technical staff, data scientists, and computer science departments to leverage expertise.
Recommendations for Libraries
The presenters offer practical advice for libraries interested in leveraging AI for their special collections:
- Begin with Easy-to-Use Software: Tools like Adobe Acrobat and Google Translate can have an immediate impact with minimal investment.
- Experiment with Text Analysis: Use platforms like Voyant Tools to gain insights into your collections and explore new research possibilities.
- Enhance Metadata Creation: Utilize AI to generate or enrich metadata, improving searchability and access.
- Seek Funding Opportunities: Apply for grants to support more extensive AI projects, such as large-scale photograph organization.
- Collaborate with Technical Experts: Engage with technical staff within or outside your institution to support complex AI initiatives.
Conclusion
The presentation underscores the significant potential of AI in unlocking the hidden treasures within special collections. By improving readability, finding insights, and connecting text with images, AI tools can make collections more accessible and enhance scholarly research.
The journey involves challenges, particularly in terms of resources and expertise, but the rewards can be substantial. As AI technology continues to evolve, libraries have an opportunity to embrace these tools, transform their workflows, and open their collections more fully to the world.
Questions and Further Discussion
During the Q&A session, attendees posed several insightful questions:
- Tools for MARC Records: The presenters used ChatGPT-4 to generate MARC records from photographs, finding it effective for creating initial catalog entries.
- Batch Processing: When asked about processing multiple images, they noted that while interactive interfaces might limit batch sizes, using APIs and programmatic approaches allows for processing larger volumes.
- Applying Techniques to Other Formats: The techniques discussed are applicable to manuscripts, maps, and even video materials. Tools like Whisper can transcribe audio and video content, enhancing accessibility.
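As a brief illustration of that last point, transcribing a media file with the open-source openai-whisper package might look like the following; the model size and file name are assumptions:

```python
# Sketch: transcribing audio or video from a collection with the open-source
# Whisper model (openai-whisper package). Requires ffmpeg to read media files;
# the model size and file name are illustrative.
import whisper

model = whisper.load_model("base")
result = model.transcribe("oral_history_interview.mp4")

print(result["text"])

# Segment-level timestamps can feed captions or finding-aid time indexes.
for segment in result["segments"]:
    print(f"[{segment['start']:7.1f}s] {segment['text'].strip()}")
```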