Thursday, November 28, 2024

Breaking Down Barriers: How Automated Tools Can Increase Faculty Participation in Open Access

Build Your Own AI Tool: Scripting with Google's PaLM and Python for Libraries

Presented by Eric Silverberg, Librarian at Queens College, City University of New York



Introduction

In this presentation, Eric Silverberg shares his journey in developing an automated tool to assist faculty at Queens College in depositing their scholarly articles into the institutional repository. Recognizing the low participation of faculty in the School of Education, he sought to simplify the process by leveraging Google's PaLM API and Python scripting.

Background and Motivation

The Importance of Open Access

  • Personal Commitment: Eric emphasizes the significance of making educational research openly accessible, aligning with his values and background as a classroom teacher.
  • University Mission Alignment: As a public institution, the City University of New York aims to make its research available to the public.
  • Impact on Education: Open access to research empowers policymakers, administrators, and teachers by providing them with valuable insights and data.

Challenges with Faculty Participation

  • Faculty were generally unaware of the institutional repository or found the process too cumbersome.
  • Understanding open access policies for each journal can be complex and time-consuming.
  • Manually checking policies via Sherpa Romeo for numerous publications is inefficient.

Problem Statement

The core challenge was to extract journal names automatically from faculty citations so that each journal's open access policy could be retrieved from Sherpa Romeo's API without manual lookups.

Initial Approach

  • Coding APA Rules: Attempted to parse citations by coding the rules of APA formatting.
  • Encountering Exceptions: Faculty citations varied significantly, with inconsistencies and creative deviations from standard formats.
  • Limitations: The approach became impractical due to the numerous exceptions, leading to excessive coding for edge cases.

Leveraging Google's PaLM API

Discovering PaLM

  • Eric learned about Google's PaLM API, which provided access to the language model that powered Bard (now Gemini).
  • Recognized its potential for natural language understanding and processing.

Implementing PaLM for Journal Extraction

  • Simple Prompting: Used straightforward prompts like "What is the name of the journal in this citation?"
  • High Accuracy: PaLM effectively extracted journal names even from inconsistently formatted citations.
  • Automation: Enabled batch processing of citations without manually coding for formatting exceptions.

Technical Implementation

Setting Up the Environment

  1. API Key Connection: Established a connection to PaLM's API using a free API key.
  2. Selecting the Model: Chose the text generation model suitable for processing text inputs.
  3. Python Scripting: Used Python to write functions for automating the process.

Key Components of the Script

Part A: Connecting to PaLM

# Connect to the PaLM API (note: Google has since deprecated PaLM in favor of the Gemini API)
import google.generativeai as palm
palm.configure(api_key='YOUR_API_KEY')

# Select a model that supports text generation
models = [model for model in palm.list_models() if 'generateText' in model.supported_generation_methods]
model = models[0].name

Part B: Extracting Journal Names

# Function to extract the journal name from a citation
def get_journal_name(citation):
    prompt = f"What is the name of the journal in this citation?\n{citation}"
    completion = palm.generate_text(
        model=model,
        prompt=prompt,
        temperature=0,          # deterministic output
        max_output_tokens=800,  # cap the response length
    )
    return completion.result

  • Temperature Parameter: Set to 0 to minimize randomness and ensure consistent outputs.
  • Max Output Tokens: Defined to control the length of the response.

Automating the Entire Process

  1. Input Data: Collected faculty citations in a spreadsheet.
  2. Journal Extraction: Used the `get_journal_name` function to populate journal names next to citations.
  3. OA Policy Retrieval: Sent journal names to Sherpa Romeo's API to get open access policies.
  4. Output Report: Generated a comprehensive report detailing OA policies for each publication.
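
Step 3 above can be sketched in a few lines. This is a minimal illustration based on Sherpa Romeo's v2 REST API; the endpoint and parameter names should be verified against the current documentation, and the API key shown is a placeholder.

```python
import json

# Sherpa Romeo v2 retrieve endpoint (verify against the current API docs)
SHERPA_URL = "https://v2.sherpa.ac.uk/cgi/retrieve"

def build_sherpa_params(journal_name, api_key):
    """Build query parameters for a Sherpa Romeo title search."""
    return {
        "item-type": "publication",
        "format": "Json",
        "api-key": api_key,
        "filter": json.dumps([["title", "equals", journal_name]]),
    }

params = build_sherpa_params("African Journal of Teacher Education", "YOUR_SHERPA_KEY")

# To run the live query (requires the requests package and a valid key):
#   import requests
#   items = requests.get(SHERPA_URL, params=params).json().get("items", [])
print(params["filter"])
```

Each journal name extracted by PaLM would be fed through this lookup, and the returned records assembled into the report.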

Example Output

An example of the output report includes:

  • Citation: Full citation provided by the faculty.
  • Journal Name: Extracted using PaLM.
  • OA Policies: Detailed information on preprint, accepted manuscript, and final version policies.

Citation 4:
[Full Citation Here]

Journal: African Journal of Teacher Education

OA Policies:
- Submitted Manuscript: [Policy Details]
- Accepted Manuscript: [Policy Details]
- Final Version of Record: [Policy Details]

Challenges and Considerations

Dealing with Sherpa Romeo's API

  • Data Structure: The API returns data nested in complex ways, requiring careful parsing.
  • Error Handling: Implemented to manage cases where OA data was missing or incomplete.
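
A defensive parsing helper along these lines can tame the nesting. The field names (publisher_policy, permitted_oa, article_version) reflect the shape of Sherpa Romeo's v2 JSON responses, but the record shown is a simplified mock, not real API output.

```python
def extract_oa_versions(item):
    """Collect permitted article versions from one Sherpa Romeo record.

    The .get() defaults guard against branches that are missing or
    incomplete, which is common in real responses.
    """
    versions = set()
    for policy in item.get("publisher_policy", []):
        for permitted in policy.get("permitted_oa", []):
            for version in permitted.get("article_version", []):
                versions.add(version)
    return sorted(versions)

# A minimal mock record in the assumed shape:
record = {
    "publisher_policy": [
        {"permitted_oa": [
            {"article_version": ["submitted", "accepted"]},
            {"article_version": ["published"], "embargo": {"amount": 12}},
        ]}
    ]
}
print(extract_oa_versions(record))  # → ['accepted', 'published', 'submitted']
```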

Faculty Engagement

  • Planned to share the generated reports with faculty to encourage repository deposits.
  • Recognized the need for feedback to refine the tool and process.

Next Steps and Potential Enhancements

  • User Feedback: Gather input from faculty like Professor N'Dri T. Assié-Lumumba, who agreed to pilot the tool.
  • Automation of Deposits: Consider scripting the submission of articles into the repository, pending faculty permission.
  • Exploring Other APIs: Investigate alternatives like OpenAlex for OA policy data, potentially simplifying the process.
  • Improving PDF Handling: Explore methods to reverse engineer formatted PDFs back into Word documents for easier repository submissions.

Audience Questions and Responses

Is there a template available?

Answer: Yes, the code shared is largely based on Google's documentation. You can access Eric's script on GitHub and modify it for your needs.

How are citations received from faculty?

Answer: Currently, citations are obtained directly from faculty CVs. The process may evolve based on faculty feedback and scalability considerations.

Does the tool handle abbreviated journal names?

Answer: Yes, PaLM effectively recognizes and extracts abbreviated journal names, which is particularly useful in fields where abbreviations are common.

Why use Sherpa Romeo instead of OpenAlex?

Answer: Familiarity with Sherpa Romeo's API led to its initial use. OpenAlex may offer a more streamlined API, and exploring it could be beneficial for future iterations.

Can ChatGPT be used for journal name extraction?

Answer: While ChatGPT could perform similar tasks, using PaLM's API allows for automation within the script, eliminating the need for manual input and handling larger batches efficiently.

Could the process be further automated to deposit articles?

Answer: Automating the entire submission process is an intriguing idea. It would require careful consideration of repository submission protocols and faculty permissions.

Conclusion

Eric Silverberg's innovative approach demonstrates how AI tools like Google's PaLM can address practical challenges in academic libraries. By automating the extraction of journal names and retrieval of OA policies, the process becomes more efficient, encouraging greater faculty participation in open access initiatives.

The project underscores the potential of AI in streamlining workflows and enhancing access to scholarly research. Ongoing feedback and collaboration with faculty will be essential in refining the tool and maximizing its impact.

Resources and Contact Information

Eric welcomes questions, collaborations, and feedback on the project.

Acknowledgments

Special thanks to Natalie Swanberg for participating in the pilot and to all attendees for their insightful questions and engagement.

Building AI Competency in Library Staff: The Key to Success

At the Helm of Innovation: Librarians at the Forefront of AI Engagement and Integration

Presented by the Library Team at Georgetown University's International Campus in Qatar



Introduction

The advent of artificial intelligence (AI) has ushered in a new era of opportunities and challenges in the academic landscape. Recognizing the transformative potential of AI, the library team at Georgetown University's International Campus in Qatar embarked on a proactive journey to engage with and integrate AI tools across the campus. This article delves into their comprehensive approach, highlighting staff development initiatives, experimentation with AI, faculty outreach, and the incorporation of AI into daily operations.

Staff Development: Building AI Competency

The foundation of the library's AI integration strategy was robust staff development. The acting director of library services emphasized the importance of equipping the team with the necessary resources, time, and training to navigate the evolving AI landscape.

Workshops and Training Sessions

  • ALA's AI Literacy Workshop: The team participated in the American Library Association's workshop on "AI Literacy Using ChatGPT and Artificial Intelligence in Instruction," which provided valuable insights into AI applications in educational settings.
  • Collaborative Learning: The library facilitated special sessions and collaborations with colleagues to foster a culture of continuous learning and shared expertise.

Access to AI Tools

  • ChatGPT Account: A dedicated ChatGPT account was secured for the librarians, serving as a sandbox environment to explore and understand the capabilities and limitations of AI language models.
  • Skilltype Investment: The library invested in Skilltype, a talent management and development platform that provided personalized learning paths, including AI-related courses through LinkedIn Learning.

Experimenting with AI: Collaborative Exploration

Understanding the importance of hands-on experience, the library team engaged in active experimentation with AI tools.

Inter-Institutional Collaboration

The team collaborated with other institutions within Education City, including the Qatar National Library and neighboring universities like Texas A&M and Virginia Commonwealth University. These collaborative sessions focused on:

  • Demonstrating AI Tools: Sharing knowledge about various AI applications and how they can be utilized effectively.
  • Discussing Challenges: Identifying pitfalls and limitations of AI tools to develop best practices for their use.

Creative Applications of AI

The library leveraged AI creatively to enhance their services and outreach efforts:

  • Marketing Initiatives: AI tools were used to develop innovative marketing campaigns and materials, showcasing the library's commitment to embracing new technologies.
  • Workshop Development: AI was utilized to design a series of workshops aimed at exploring AI's creative potential, catering to faculty members who were hesitant to integrate AI directly into their courses.

Faculty Outreach: Bridging the Gap

Recognizing the varying levels of acceptance and familiarity with AI among faculty, the library undertook a strategic outreach initiative.

Understanding Faculty Perspectives

The team reached out to faculty members to gauge their plans and comfort levels regarding AI integration in their courses. They discovered that:

  • Some faculty were resistant to incorporating AI, often due to a lack of familiarity or concerns about academic integrity.
  • There was a trend toward eliminating traditional research papers in favor of in-class assessments to mitigate potential misuse of AI tools.

Adaptive Support and Resources

In response, the library developed alternative strategies to support faculty and students:

  • New Workshop Offerings: They created workshops that complemented and supplemented existing information literacy sessions, focusing on ethical and effective use of AI in research.
  • Alternative Assignments: The library assisted faculty in designing alternative assignments, such as podcasting and video discussions, that leveraged technology while addressing concerns about AI misuse.

Incorporating AI into Daily Operations

The library team integrated AI tools into their everyday workflows to enhance efficiency and innovation.

Brainstorming and Content Creation

  • Utilizing AI Language Models: Tools like ChatGPT and Claude were used for brainstorming ideas, drafting content, and refining communications.
  • Enhancing Marketing Efforts: AI-generated content and images were incorporated into marketing materials, increasing engagement and showcasing the library's forward-thinking approach.

AI-Driven Projects

One notable project involved using AI to recreate book covers for a library display:

  • Image Generation: Using tools like Leonardo AI, the team reimagined existing book covers, demonstrating the creative capabilities of AI.
  • Community Engagement: The display sparked interest among students and faculty, serving as a conversation starter about the role of AI in creativity and design.

Instructional Integration: AI in the Pre-Research Process

The Instructional Services Librarian took significant steps to integrate AI into the research instruction provided to students.

Addressing Citation and Academic Integrity

By the summer of 2023, the major citation styles (APA, MLA, and Chicago) had issued guidelines on citing AI tools. In response, the library took several steps:

  • Collaboration with the Writing Center: Partnered to create a cheat sheet on how to cite AI content and tools correctly.
  • Resolving Citation-Tool Challenges: Addressed issues with citation management tools like Zotero, which lacked specific item types for AI-generated content.
  • Promoting Ethical Use: Emphasized the importance of attribution and academic integrity when using AI tools in research.

Overcoming Faculty Resistance

Some faculty members prohibited the use of AI in their syllabi. To navigate this:

  • Educational Frameworks: Utilized the CLEAR framework and UNESCO publications to demonstrate ethical and effective ways to incorporate AI into academic work.
  • Non-Generative AI Tools: Introduced tools like Research Rabbit, which assist in literature mapping without generating text, alleviating concerns about plagiarism.

Integrating AI into Lesson Plans

The librarian incorporated AI tools into instruction sessions, focusing on:

  • Free and Privacy-Conscious Tools: Selected AI applications like Copilot in Microsoft Edge that protect student data and are accessible without cost.
  • Parallel with Existing Tools: Demonstrated how AI can perform similar functions to familiar tools like Credo's concept mapping, easing the transition for both faculty and students.

AI Workshop Series: Empowering the Campus Community

To further AI literacy on campus, the library launched a futuristic-themed workshop series titled "AI's Creative Edge."

Workshop Offerings

  1. Advanced Prompt Engineering: Taught participants how to use AI for brainstorming keywords and concepts to enhance database searches.
  2. Citing AI Content: Provided hands-on training on using Zotero and Grammarly to correctly cite AI-generated material.
  3. Student Perspectives: Invited students to share their experiences and discuss ethical uses of AI tools.

Engagement and Outcomes

The workshop on citing AI content saw the highest attendance, indicating a strong interest in understanding how to use AI ethically within the bounds of academic integrity. This response highlighted the need for ongoing education and support in navigating AI's role in academia.

AI Across the Research Process

The library team developed a comprehensive framework illustrating how AI tools can be integrated at various stages of the research process:

  • Brainstorming: Tools for organizing tasks, defining topics, and generating ideas (e.g., Copilot, ChatGPT).
  • Literature Review: Non-generative AI tools for mapping literature and identifying key sources (e.g., Research Rabbit).
  • Evaluation: Using AI to verify sources, assess credibility, and filter results based on journal rankings (e.g., Consensus).
  • Citing: AI-assisted citation tools for proper attribution (e.g., Grammarly add-on with ChatGPT, integrated with Zotero).

Leadership in AI Engagement: A Collaborative Effort

The Data, Media, and Web Librarian discussed the library's leadership role in advancing AI engagement on campus.

Proactive Initiatives

  • AI Literacy Development: Embraced AI as an area of intellectual curiosity and practical application, positioning the library as a knowledge hub.
  • Workshop Series: Expanded offerings to include topics like generative AI in images, music, and video, as well as AI's impact on career development.

Creative Projects and Experimentation

  • AI-Generated Book Covers: Created a library display featuring AI-generated reimaginings of existing book covers, engaging the community in discussions about AI and creativity.
  • Teaching AI Skills: Offered instruction on prompt engineering and image generation, enabling students and staff to interact effectively with AI tools.

Advanced AI Applications

  • GPT-4 and Claude 3 Vision Features: Explored the use of AI to transcribe and analyze handwritten historical documents, enhancing access to primary sources.
  • Support for Course Development: Participated in a pilot course on learning processes and AI, addressing the ethical considerations and potential of AI in education.

Campus Collaboration and Conversations

The library facilitated campus-wide discussions and collaborations regarding AI:

  • Campus Conversations: Organized events where faculty, IT staff, admissions officers, and finance team members shared perspectives on AI's impact in their areas.
  • Faculty Workshops: Engaged with faculty to discuss AI's role in teaching and learning, offering support and resources for integration.
  • Increased Course Support: Provided enhanced support for courses incorporating AI, ensuring that students and faculty have the necessary tools and knowledge.

Overcoming Challenges and Resistance

Throughout their journey, the library encountered challenges, including resistance from faculty and staff hesitant to adopt AI tools.

Addressing Faculty Concerns

  • Demonstrating Value: Showed faculty how AI could enhance research and learning without compromising academic integrity.
  • Alternative Assignments: Assisted in designing assignments that leveraged technology while mitigating concerns about AI misuse.

Engaging Resistant Staff

  • Demonstrations and Training: Conducted sessions to showcase the practical benefits of AI, highlighting efficiency gains and new capabilities.
  • Collaborative Approach: Encouraged open dialogue and shared experiences to ease apprehensions and build confidence in using AI tools.

Conclusion

The library team at Georgetown University's International Campus in Qatar exemplifies proactive leadership in AI engagement and integration. Through dedicated staff development, innovative experimentation, strategic faculty outreach, and the incorporation of AI into daily operations, they have positioned themselves at the forefront of academic innovation.

Their efforts not only enhance the library's services but also contribute significantly to the campus's overall readiness to navigate the evolving landscape of AI in education. By fostering a culture of ethical use, continuous learning, and collaborative exploration, they are shaping a future where AI is harnessed to enrich learning, research, and creativity.

Questions and Engagement

During their presentations and workshops, the library team actively engaged with students and faculty, addressing questions such as:

  • How can AI tools be used ethically in academic work?
  • What are effective strategies for citing AI-generated content?
  • How can resistance to AI adoption among staff and faculty be overcome?

Their willingness to share resources, such as cheat sheets for citing AI content, and to collaborate across departments underscores their commitment to supporting the campus community in embracing AI responsibly and effectively.

Unlocking Hidden Treasures: The Transformative Potential of AI in Special Collections

What Can AI Do for Special Collections? Improving Access and Enhancing Discovery

Presenters: Sonia Yaco and Bala Singu



In this enlightening presentation, Sonia Yaco and Bala Singu explore the transformative potential of Artificial Intelligence (AI) in the realm of special collections. Drawing from a year-long study conducted at Rutgers University, they delve into how AI can significantly improve access to and enhance the discovery of rich archival materials.

Introduction

Special collections in libraries house a wealth of historical and cultural artifacts. However, accessing and extracting meaningful insights from these collections can be challenging due to the nature of the materials, which often include handwritten documents, rare photographs, and other hard-to-process formats.

The presenters highlight a "golden opportunity" at the intersection of rich collections, an ever-expanding set of AI tools, and a strong desire to maximize the utility of these collections. By applying AI in meaningful ways, they aim to mine this wealth of information and make it more accessible to scholars and the public alike.

The William Elliot Griffis Collection

The focal point of the study is the William Elliot Griffis Papers at Rutgers University. This extensive collection documents the lives and work of the Griffis family, who were educators and missionaries in East Asia during the Meiji period (1868-1912). The collection includes manuscripts, photographs, and published materials and is heavily utilized by scholars from Asia, the United Kingdom, and the United States.

Margaret Clark Griffis

The study specifically focuses on Margaret Clark Griffis, the sister of William Elliot Griffis. She holds historical significance as one of the first Western women to educate Japanese women. By centering on her diaries, biographies, and photographs, the presenters aim to shed light on her contributions and experiences.

Strategies for Mining the Collection

To unlock the wealth of information within the Griffis collection, the presenters employed several strategies:

  1. Extracting Text to Improve Readability: Utilizing AI tools to transcribe handwritten and typewritten documents into machine-readable text.
  2. Finding Insights in Digitized Text and Photographs: Applying natural language processing and image analysis to gain deeper understanding.
  3. Connecting Text to Images: Linking textual content with corresponding images to create a richer narrative.

Software Tools Utilized

The project explored a variety of AI tools, categorized into:

  • Generative AI for Text and Images
  • Natural Language Processing Tools
  • Optical Character Recognition (OCR) Tools
  • Other Analytical Tools

In total, they examined 26 software tools, assessing each based on cost and learning curve. The tools ranged from free, user-friendly applications like ChatGPT-3.5 to more complex, subscription-based services like ChatGPT-4 and the DALL·E API.

Project Demonstrations

The presenters showcased three key demonstrations to illustrate the capabilities of AI in handling special collections:

1. Improving Readability

One of the primary challenges with special collections is the difficulty of reading handwritten and typewritten documents, especially those written in old cursive styles. To address this, the team used OCR tools to convert these documents into machine-readable plain text, making them accessible for computational analysis.

Handwritten Material

The team focused on transcribing Margaret Griffis's handwritten diary entries. They used tools like eScriptorium, Transkribus (AM Digital), and ChatGPT-4 to process the text. Each tool had varying levels of accuracy and challenges:

  • eScriptorium: A free tool with a moderate learning curve, it achieved an initial accuracy of around 89%.
  • Transkribus (AM Digital): A commercial tool with a higher cost but offered competitive accuracy.
  • ChatGPT-4: While powerful, it faced issues with "hallucinations," generating text not present in the original material.

By combining these tools, they improved the transcription accuracy significantly. For instance, feeding the eScriptorium output into ChatGPT-4 enhanced the accuracy to approximately 96%.
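
One simple way to quantify accuracy figures like these is a character-level similarity ratio between a tool's transcript and a hand-checked reference. The snippet below is an illustrative sketch using Python's standard library, not the metric the presenters used, and the diary line is invented for demonstration.

```python
import difflib

def transcription_accuracy(reference, hypothesis):
    """Approximate character-level accuracy of an OCR/HTR transcript
    as a similarity ratio against a hand-checked reference (0.0-1.0)."""
    return difflib.SequenceMatcher(None, reference, hypothesis).ratio()

# Invented example lines, standing in for a checked diary transcription:
reference  = "Went to the school at nine; the girls read well today."
ocr_output = "Went to the schol at nime; the girls read well to-day."
print(round(transcription_accuracy(reference, ocr_output), 2))
```

Running each tool's output through a function like this against a small gold-standard sample makes before/after comparisons (such as 89% versus 96%) straightforward to track.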

Typewritten Material

For typewritten documents, such as William Griffis's biography of his sister, tools like Adobe Acrobat provided efficient OCR capabilities with high accuracy. These documents were easier to process compared to handwritten materials.

2. Finding Insights with AI

Once the text was extracted, the next step was to derive meaningful insights using AI techniques:

Translation

To make the content accessible to international scholars, the team utilized translation tools:

  • Google Translate: A free tool suitable for smaller text volumes.
  • Googletrans: An unofficial Python library wrapping Google Translate, which had reliability issues and limitations on volume.
  • Google Cloud Translation API: A paid service offering high reliability for large-scale translations.
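
Volume limits like those mentioned above are typically handled by batching the text before it is sent to a paid API. The sketch below splits text at sentence boundaries; the commented-out client call assumes the google-cloud-translate package and valid credentials, and the sample text is a stand-in.

```python
def chunk_text(text, limit=5000):
    """Split text into pieces no longer than `limit` characters,
    breaking at sentence ends where possible (translation APIs
    typically cap the size of a single request)."""
    chunks, current = [], ""
    for sentence in text.replace("\n", " ").split(". "):
        piece = sentence if sentence.endswith(".") else sentence + "."
        if len(current) + len(piece) + 1 > limit and current:
            chunks.append(current.strip())
            current = ""
        current += piece + " "
    if current.strip():
        chunks.append(current.strip())
    return chunks

pages = "First entry. " * 400  # stand-in for digitized diary text
print(len(chunk_text(pages, limit=1000)))

# With the google-cloud-translate client (paid, needs credentials):
#   from google.cloud import translate_v2 as translate
#   client = translate.Client()
#   translated = [client.translate(c, target_language="en")["translatedText"]
#                 for c in chunk_text(pages)]
```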

Text Analysis and Visualization

Using natural language processing tools, the team performed analyses such as named entity recognition and topic modeling. They employed Voyant Tools, a free, open-source platform that offers various analytical capabilities:

  • Identifying key entities like names, places, and dates.
  • Visualizing word frequencies and relationships.
  • Creating interactive geographic maps based on the text.
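
The word-frequency side of this analysis can be approximated with the standard library alone. The sketch below mimics one of Voyant's frequency views; the diary sample and stopword list are invented for illustration.

```python
from collections import Counter
import re

# Toy stopword list -- a real analysis would use a fuller one
STOPWORDS = {"the", "a", "and", "to", "of", "in", "at", "was"}

def top_terms(text, n=5):
    """Rank the most frequent substantive words in a passage --
    a stdlib approximation of Voyant's word-frequency views."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS).most_common(n)

# Invented sample standing in for a transcribed diary passage:
diary_sample = (
    "School at nine. The girls read English at school. "
    "Walked to the temple after school."
)
print(top_terms(diary_sample, n=3))
```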

Photographic Grouping

With 427 photographs in the collection, the team sought to group images programmatically based on content similarities. By leveraging Python scripts and AI algorithms, they clustered photographs that shared visual characteristics, such as shapes, subjects, and themes.
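
The presentation did not detail the clustering code, but the grouping step might be sketched as follows. A real pipeline would compare embeddings produced by a vision model; this dependency-free illustration greedily groups toy feature vectors by cosine similarity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def group_by_similarity(vectors, threshold=0.9):
    """Greedily cluster feature vectors: each image joins the first
    group whose exemplar it resembles closely enough."""
    groups = []  # list of (exemplar vector, [member indices])
    for i, vec in enumerate(vectors):
        for exemplar, members in groups:
            if cosine(exemplar, vec) >= threshold:
                members.append(i)
                break
        else:
            groups.append((vec, [i]))
    return [members for _, members in groups]

# Toy 3-d "feature vectors" standing in for real image embeddings:
features = [(1, 0, 0), (0.98, 0.1, 0), (0, 1, 0), (0, 0.97, 0.2)]
print(group_by_similarity(features))  # → [[0, 1], [2, 3]]
```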

3. Connecting Text and Images

One of the most innovative aspects of the project was linking textual content with corresponding images to enrich the narrative:

Describing Photographs Using AI

The team used ChatGPT to generate detailed descriptions of photographs. For example, given a photograph with minimal metadata labeled "small Japanese print," ChatGPT produced an extensive description, identifying elements like traditional attire, expressions, and possible historical context.

This process significantly enhances the discoverability of images, providing researchers with richer information than previously available.

Adding Metadata and Generating MARC Records

Beyond descriptions, the AI tools were used to generate metadata and even create MARC records for cataloging purposes. This automation can streamline library workflows and improve access to collections.

Generating Images from Text and Matching to Real Images

Taking the connection a step further, the team explored generating images based on extracted text and then matching these AI-generated images to real photographs in the collection:

  1. Extract Text Descriptions: Using ChatGPT to identify descriptive passages from the diary.
  2. Generate Images: Employing tools like DALL·E to create images based on these descriptions.
  3. Match to Real Images: Programmatically comparing AI-generated images to actual photographs in the collection to find potential matches.

While not perfect, this method opens up new avenues for discovering connections within archival materials that might not be immediately apparent.

Limitations and Takeaways

Limitations

  • Infrastructure Needs: AI requires significant resources, including computational power, software costs, and staff time.
  • Technical Expertise: A background in programming and software development is highly beneficial. Collaboration with technical staff is often necessary.
  • Learning Curves: Many AI tools, even free ones, come with steep learning curves that can be challenging to overcome.
  • Human Intervention: AI tools are not fully autonomous and require human oversight to ensure accuracy and relevance.

Takeaways

  • Combining Tools Enhances Effectiveness: Using multiple AI tools in conjunction can yield better results than using them in isolation.
  • Start with Accessible Tools: Begin with user-friendly software like Adobe Acrobat for OCR and Google Translate for initial forays into AI applications.
  • Incorporate AI into Workflows: Integrate AI tools into existing library processes to improve efficiency and output quality.
  • Partnerships are Crucial: Collaborate with technical staff, data scientists, and computer science departments to leverage expertise.

Recommendations for Libraries

The presenters offer practical advice for libraries interested in leveraging AI for their special collections:

  1. Begin with Easy-to-Use Software: Tools like Adobe Acrobat and Google Translate can have an immediate impact with minimal investment.
  2. Experiment with Text Analysis: Use platforms like Voyant Tools to gain insights into your collections and explore new research possibilities.
  3. Enhance Metadata Creation: Utilize AI to generate or enrich metadata, improving searchability and access.
  4. Seek Funding Opportunities: Apply for grants to support more extensive AI projects, such as large-scale photograph organization.
  5. Collaborate with Technical Experts: Engage with technical staff within or outside your institution to support complex AI initiatives.

Conclusion

The presentation underscores the significant potential of AI in unlocking the hidden treasures within special collections. By improving readability, finding insights, and connecting text with images, AI tools can make collections more accessible and enhance scholarly research.

The journey involves challenges, particularly in terms of resources and expertise, but the rewards can be substantial. As AI technology continues to evolve, libraries have an opportunity to embrace these tools, transform their workflows, and open their collections more fully to the world.

Questions and Further Discussion

During the Q&A session, attendees posed several insightful questions:

  • Tools for MARC Records: The presenters used ChatGPT-4 to generate MARC records from photographs, finding it effective for creating initial catalog entries.
  • Batch Processing: When asked about processing multiple images, they noted that while interactive interfaces might limit batch sizes, using APIs and programmatic approaches allows for processing larger volumes.
  • Applying Techniques to Other Formats: The techniques discussed are applicable to manuscripts, maps, and even video materials. Tools like Whisper can transcribe audio and video content, enhancing accessibility.

Exploring the Possibilities of Generative AI: A Deep Dive into Research Tools

Exploring Research-Focused Generative AI Tools for Libraries and Higher Education

Hello everyone, and thank you so much for joining today's session on research-focused generative AI tools. In this presentation, we'll delve into various types of generative AI, with a particular emphasis on research tools like Consensus, Elicit, and Research Rabbit. We'll also discuss the challenges associated with generative AI and consider how these tools impact instruction and library services.



Types of Generative AI

Generative AI is a rapidly evolving field with a variety of applications. Some of the main types include:

  • Chatbots: Conversational AI systems like ChatGPT that can generate human-like text responses.
  • Image Generation and Synthesis Tools: Tools like Midjourney and NightCafe that can create images based on textual prompts.
  • Research Tools: Our focus today is on research tools such as Consensus, Elicit, and Research Rabbit, which aim to enhance the research process.
  • Music and Video Generation Tools: AI systems that can compose music or generate videos.
  • Others: The field is continually expanding, and new tools are being developed as we speak.

Research Generative AI Tools

1. Consensus

Consensus is a search engine that utilizes language models to surface and synthesize insights from academic research papers. According to their website:

"Consensus is not a chatbot, but we use the same technology throughout the product to help make the research process more efficient."

Source Material: The content comes from the Semantic Scholar database, which provides access to a wide range of academic papers.

Mission: Their mission is to use AI to make expert information accessible to all.

Example Usage:

When prompted with the question:

"How do faculty and instructional designers use Universal Design for Learning in higher education?"

Consensus provides a summary at the top of the page, analyzing the top eight papers related to the query. Below the summary, it lists the eight papers, including details like the title, authors, publication venue, and citation count.

Features:

  • Save, Cite, Share: Users can save articles, generate citations, and share them.
  • Citation Generation: Similar to many databases, Consensus can generate citations, though users should verify for minor errors.
  • Study Snapshot: Offers a synthesized overview of a paper's key points and outcomes. Note that generating a snapshot may require AI credits.

AI Credits and Premium Features:

  • AI Credits: Users have a monthly limit of 20 AI credits in the free version, which are used for premium features like generating study snapshots.
  • Premium Version: Offers additional features beyond the free version.

2. Elicit

Elicit is a research assistant that uses language models like GPT-3 to automate parts of the research workflow, especially literature reviews.

Functionality:

  • When asked a question, Elicit shows relevant papers and summarizes key information in an easy-to-use table.

Example Usage:

With the prompt:

"How should generative AI be used in libraries and higher education?"

Elicit provides a summary of the top four papers, including in-text citations that link to the sources.

Features:

  • Paper Details: Includes paper information, citations, abstract summaries, and main findings.
  • Additional Columns: Users can add more columns to the results table to customize the information displayed.

Source Material:

Elicit pulls content from Semantic Scholar, searching over 175 million papers.
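Because Elicit's corpus is Semantic Scholar's, librarians can also query that corpus directly through Semantic Scholar's free public Graph API. A minimal sketch that just constructs the search URL (the endpoint and parameter names reflect the public API; the actual HTTP fetch is left out to keep the example self-contained):

```python
from urllib.parse import urlencode

# Public paper-search endpoint of the Semantic Scholar Graph API.
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"

def build_search_url(query, fields=("title", "abstract", "citationCount"), limit=10):
    """Return a ready-to-fetch search URL for the paper-search endpoint.
    `fields` controls which metadata the API includes in each result."""
    params = {"query": query, "fields": ",".join(fields), "limit": limit}
    return f"{SEARCH_URL}?{urlencode(params)}"
```

For example, `build_search_url("universal design for learning")` yields a URL that returns the top ten matching papers with titles, abstracts, and citation counts.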

3. Research Rabbit

Research Rabbit is a research platform that enables users to discover and visualize relevant literature and scholars.

Mission:

To empower researchers with powerful technologies.

Features:

  • Visualization: Provides visual representations of how papers are interconnected.
  • Explore Options: Users can explore similar work, earlier work, later work, and linked content.
  • Authors: Allows exploration of authors and suggested authors in the field.
  • Export Papers: Users can export lists of papers for further use.

Example Usage:

Starting with one or more articles, users can find similar articles, explore cited works, or see which papers cite the original article. The platform creates a network graph showing the relationships between articles.
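The "earlier work" and "later work" views described above are, at bottom, traversals of a citation graph. A tiny in-memory sketch of that idea (illustrative only, not Research Rabbit's implementation):

```python
from collections import defaultdict

class CitationGraph:
    """Minimal citation network in the spirit of Research Rabbit's
    earlier-work / later-work exploration. Paper IDs are plain strings."""

    def __init__(self):
        self.references = defaultdict(set)  # paper -> papers it cites

    def add_citation(self, citing, cited):
        self.references[citing].add(cited)

    def earlier_work(self, paper):
        # Papers this paper cites (its reference list).
        return sorted(self.references[paper])

    def later_work(self, paper):
        # Papers that cite this paper.
        return sorted(p for p, refs in self.references.items() if paper in refs)
```

Seeding the graph with one article and expanding along both directions is exactly the workflow the platform visualizes as a network diagram.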

Personal Experience:

The presenter found Research Rabbit particularly useful for organizing dissertation literature reviews.

Why Use Generative AI in Libraries?

Generative AI technology is not going away; it's becoming a mainstay in our culture and professional practices. Libraries and librarians need to consider how to respond to this technology.

Supporting Patrons

  • Should we support patrons in using these new tools or try to prevent them from using them?
  • It's a balancing act, considering the benefits and challenges.

Advancing Effectiveness and Efficiency

  • Generative AI tools claim to make research more effective and efficient.
  • Teaching students how to use and evaluate these tools prepares them for future workplaces where such technologies may be prevalent.

Personal Uses of Generative AI

  • Making Paragraphs More Concise: Using AI to refine writing.
  • Rephrasing Assistance: Helping with tricky paraphrasing tasks.
  • Creating Titles: Generating titles for presentations or programs.
  • Organizing Articles: Managing literature for dissertations or research projects.
  • Brainstorming: Generating ideas and exploring new concepts.

Challenges with Generative AI

While generative AI offers many benefits, there are significant challenges to consider.

Privacy and Lack of Transparency

  • Uncertainty about where these tools get their information and how they process data.
  • Users may unknowingly input sensitive information.

Quality and Hallucinations

  • AI can produce inaccurate information or "hallucinations," including ghost sources that don't exist.
  • Some are beginning to refer to these as "fabrications."

Biases and Blind Spots

  • AI models can perpetuate biases present in the training data.

Date Range of Content

  • Some AI tools may have outdated information, as their training data cuts off at a certain point.

Plagiarism and Academic Integrity

  • Students may misuse AI tools, leading to academic integrity violations.
  • Detection tools exist but may produce false positives.

Detection Tools and False Positives

  • Tools designed to detect AI-generated content are not foolproof.

Evaluating Generative AI Tools

The AI ROBOT Test

Developed by Hervieux and Wheatley, the ROBOT test is a framework for evaluating AI tools, focusing on:

  • Reliability
  • Objective
  • Bias
  • Ownership
  • Type

This framework can be used in information literacy instruction to help students and patrons critically assess AI tools.

Additional Resources

The presenter has compiled a LibGuide with articles, videos, podcasts, and other resources on generative AI.

Poll Results

In a previous presentation, attendees were polled on their views regarding generative AI.

Should Librarians Embrace Generative AI?

Most respondents believed librarians should either embrace it or respond somewhere in between embracing and avoiding. Only one person suggested that librarians should avoid it.

Which Generative AI Tools Are Potentially Useful for Your Library?

  • ChatGPT: 134 responses
  • Elicit: 3 responses
  • Perplexity: 118 responses
  • Research Rabbit: 189 responses
  • NightCafe: 40 responses
  • Other: 22 responses
  • Consensus: 103 responses

Upcoming GAL Virtual Conference

The presenter is organizing an upcoming GAL (Generative AI in Libraries) virtual conference titled:

Prompter or Perish: Navigating Generative AI in Libraries

Dates: June 11th, 12th, and 13th

Time: 1 PM to 4 PM Eastern Time

Call for Proposals: Librarians are encouraged to submit proposals and participate in the conference. For more information, visit the conference website.

Contact Information

For further questions or to continue the conversation, you can contact the presenter at:

Email: brienne.dow@briarcliff.edu

Conclusion

Generative AI is a transformative technology with significant implications for libraries and higher education. By understanding and critically engaging with these tools, librarians can better support their patrons and prepare for the future.

Thank you for attending today's session. We look forward to continuing the conversation at the upcoming GAL Virtual Conference.

Wednesday, November 27, 2024

Overcoming Challenges: How NPR Digitized Their Music Collection with AI

Practical Application of AI: Evaluating Music to Build a Music Library

Presented by Jane Gilvin, NPR's Research Archives and Data Team



Introduction

Jane Gilvin delivered a presentation on how her team at NPR utilized artificial intelligence (AI) to automate the identification of instrumental and vocal music to build a digital music library more efficiently. The session focused on the practical application of AI in music cataloging, the challenges faced, and the solutions implemented.

About Jane Gilvin and the RAD Team

  • Jane Gilvin:
    • Member of NPR's Research Archives and Data (RAD) Team for nearly 13 years.
    • Educational background in music and library science.
    • Alumna of San Jose State University's Information Science program.
    • Experience in radio since she was a teenager.
  • The RAD Team:
    • Formerly known as the NPR Library, established in the 1970s.
    • Responsible for collecting NPR programming archives.
    • Provides resources for production, including a comprehensive music collection.

NPR's Music Collection Evolution

The NPR music collection has evolved alongside technological advancements:

  • Vinyl Records: The initial collection comprised vinyl records across various genres.
  • Transition to CDs: Shifted to compact discs (CDs) as CD players became standard in production.
  • Digital Music Files: Moved towards digital files to meet the expectations of quick and remote access to music.

Challenges in Digitizing the Collection

The transition to digital presented several challenges:

  • Converting thousands of physical CDs into digital files for immediate access.
  • Ensuring metadata accuracy and consistency, especially for instrumental and vocal classification.
  • Lack of resources for continuous large-scale ingestion and cataloging of new music.

Solution: Automation with AI

The Robot and ORRIS

  • The Robot: A batch processing system capable of ripping CDs, identifying metadata from online databases, and delivering MP3 and WAV files with embedded ID3 tags.
  • ORRIS (Open Resource and Research Information System): A new database developed to allow users to search, stream, and download songs for production.

Implementing Essentia

  • Essentia: An open-source library and collection of tools for audio and music analysis, description, and synthesis.
  • Capabilities: Predicts genre, beats per minute, mood, and most importantly, classifies tracks as instrumental or vocal.
  • Training the Algorithm: Used NPR's extensive archive of over 300,000 tracks with existing instrumental and vocal tags to train the algorithm.

Accuracy and Testing

  • Human Cataloging Accuracy: Estimated between 90% and 98%, with human error and practical limitations keeping the average near the low end of that range.
  • Algorithm Accuracy Goal: Set at 80% to balance the usefulness and the efficiency of the process.
  • Results: The algorithm achieved an accuracy of 86%, meeting the team's criteria.
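The acceptance test described above boils down to comparing the algorithm's tags against the existing human-assigned tags and checking the result against the 80% goal. The presentation doesn't show the team's evaluation scripts; this is a minimal stdlib sketch of that comparison:

```python
def accuracy(predicted, actual):
    """Fraction of tracks where the algorithm's instrumental/vocal tag
    matches the existing human-assigned tag."""
    assert len(predicted) == len(actual)
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)

def meets_goal(predicted, actual, threshold=0.80):
    # The team accepted the classifier once accuracy cleared the threshold.
    return accuracy(predicted, actual) >= threshold
```

Run over a held-out sample of already-tagged tracks, an accuracy of 0.86 clears the 0.80 bar, matching the result the team reported.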

Integration and Quality Control

Building into the Ingest Process

  • Automated the instrumental/vocal tagging during the ingest process of new tracks.
  • Applied the algorithm to existing tracks that lacked instrumental/vocal classification.

User Feedback Mechanism

  • Added a feature allowing users to report incorrectly tagged songs directly from the ORRIS interface.
  • Provided a quick way for the RAD team to receive notifications and correct metadata errors.

Quality Control Measures

  • Automated spreadsheets generated during the algorithm's run allowed for immediate review of results.
  • Periodic checks to ensure the algorithm continues to perform within the acceptable accuracy range.
  • Addressed any shifts in algorithm performance due to changes in the type of music being ingested or other factors.
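The review spreadsheets mentioned above can be generated with nothing more than the standard library. A minimal sketch (column names and row shape are illustrative, not NPR's actual schema):

```python
import csv
import io

def qc_report(rows):
    """Render one algorithm run's predictions as CSV for human review.
    Each input row: (track_id, title, predicted_tag, confidence)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["track_id", "title", "predicted_tag", "confidence"])
    for track_id, title, tag, conf in rows:
        writer.writerow([track_id, title, tag, f"{conf:.2f}"])
    return buf.getvalue()
```

Sorting or filtering such a file by the confidence column is a quick way to focus reviewer attention on the tracks the classifier was least sure about.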

Demonstration

Jane provided a live demonstration of how the process works:

  1. Showed the ORRIS search interface and how users can search for and listen to tracks (e.g., Thelonious Monk, David Bowie).
  2. Demonstrated the ingestion of new albums and how the algorithm processes them to classify tracks as instrumental or vocal.
  3. Illustrated the use of the user feedback feature to report incorrect classifications.

Benefits and Outcomes

  • Significantly reduced the time and resources required for music cataloging.
  • Enabled continuous addition of new music to the library despite limited staff time.
  • Improved user satisfaction by providing a reliable data point distinguishing instrumental from vocal tracks.

Challenges and Considerations

  • Training Data Limitations: Ensuring the training data was representative and free from bias or errors.
  • Algorithm Bias: Addressing the overrepresentation of certain genres (e.g., jazz and classical) in the training data to avoid skewed results.
  • Metadata Accuracy: Dealing with inconsistent or incorrect metadata from external sources.

Future Plans

Jane discussed potential future projects:

  • Revisiting other algorithms from Essentia, such as those predicting timbre and mood.
  • Implementing user testing and UX projects to improve data research and user experience.
  • Continuing to refine the algorithm and processes to maintain or improve accuracy.

Questions and Answers

During the Q&A session, several topics were addressed:

Copyright and Licensing Considerations

  • NPR has licenses with major performing rights organizations for the use of music in production.
  • Other libraries considering building a music collection should review legal permissions and terms of use.

Data Labeling and Ongoing QA/QC

  • The team performs periodic quality checks but does not engage extensively in data labeling projects.
  • Emphasis on monitoring algorithm performance and making adjustments as needed.

User Testing and UX Improvements

  • Future plans include conducting user testing to evaluate the effectiveness of additional algorithms (e.g., mood taxonomy).
  • Goal is to enhance the search and discovery experience for users.

Conclusion

Jane concluded by emphasizing how the application of AI allowed the RAD team to develop a less time-consuming ingest and cataloging process. This enabled the continuous growth of the music library, providing valuable resources to production staff while efficiently managing limited staff time.

Contact Information

For further information or inquiries, you can reach out to Jane Gilvin through NPR's Research Archives and Data Team.