Saturday, November 23, 2024

Data Science 101: Understanding Statistical Concepts and Analysis

From Couch to Jupyter: A Beginner's Guide to Data Science Tools and Concepts



Introduction

  • Host: Manogna, Senior Data Scientist at Slalom.
  • Presenter: Kiko K., Analytic Scientist at FICO on the Scores Predictive Analytics team.
  • Background:
    • Graduated from UC Berkeley in 2019 with a degree in Applied Mathematics and Data Science.
    • Led teams integrating data science into non-traditional curricula.
    • Passionate about data science's power and community.

Workshop Overview

  • Title: "From Couch to Jupyter—A Beginner's Guide to Data Science Tools and Concepts"
  • Objective: Provide foundational knowledge and tools for beginners in data science.
  • Structure:
    • Introduction to Jupyter Notebook.
    • Basics of Python programming.
    • Understanding data structures and statistical concepts.
    • Interactive code demonstrations.
  • Resources:
    • GitHub repository with tutorial notebooks and datasets.
    • Anaconda installation guide for environment setup.

Key Topics Covered

  • Using Jupyter Notebook
    • Understanding markdown and code cells.
    • Running cells and writing code.
  • Python Basics
    • Data types: integers, floats, strings, booleans.
    • Variables and functions.
    • Arithmetic operations and function calls.
  • Data Structures
    • Arrays with NumPy.
    • Pandas Series and DataFrames.
    • Indexing and slicing data.
  • Data Manipulation and Analysis
    • Importing libraries and reading data files.
    • Handling missing data (NaN values).
    • Filtering and selecting data.
    • Basic statistical calculations: mean, median, standard deviation.
  • Practical Demonstrations
    • Working with a stroke prediction dataset from Kaggle.
    • Visualizing data distributions.
    • Imputing missing values.
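The pandas workflow the workshop walks through (reading data, spotting NaN values, imputing them, and computing basic statistics) can be sketched as below. The column names and values are illustrative stand-ins, not rows from the actual Kaggle stroke dataset:

```python
import numpy as np
import pandas as pd

# Toy data standing in for the stroke prediction dataset (illustrative values only)
df = pd.DataFrame({
    "age": [34, 58, np.nan, 72, 45],
    "bmi": [22.1, np.nan, 27.4, 31.0, 24.8],
})

# Count missing values (NaNs) per column
print(df.isna().sum())

# Impute missing values with each column's mean
df["age"] = df["age"].fillna(df["age"].mean())
df["bmi"] = df["bmi"].fillna(df["bmi"].mean())

# Basic statistics: mean, median, standard deviation
print(df["age"].mean(), df["age"].median(), df["age"].std())

# Filtering and selecting: rows where age exceeds 50
older = df[df["age"] > 50]
```

Mean imputation is the simplest strategy covered in beginner material; the tutorial notebooks in the GitHub repository go into more depth.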

Additional Resources

  • Anaconda Installation Guide: For setting up the Python environment.
  • Tutorial Notebooks: Covering various topics in more depth.
  • External Links: Videos and other learning materials for further study.

Conclusion

  • Q&A Session: Addressed audience questions on topics like:
    • Differences between Jupyter Notebook and JupyterLab.
    • Handling missing data and NaN values.
    • Differences between arrays and series.
    • Recommendations for beginners starting with data sets.
  • Final Remarks:
    • Encouraged attendees to explore provided resources.
    • Emphasized continuous learning in data science.
    • Thanked the audience for participation.

Note: The workshop aims to make data science accessible to beginners by providing hands-on experience with tools like Jupyter Notebook and Python, using practical examples and interactive code demonstrations.

Transforming Tutorials: The Impact of AI in University Education

Integrating ChatGPT into Tutorial Sessions to Enhance Critical Thinking in University Students



Introduction

  • Presenters:
    • Sandra Morales: Digital Education Advisor at the Center for Teaching and Learning, Oxford University.
    • Co-Presenter: A colleague also working at Oxford University.
  • Session Overview:
    • Context of tutorials at Oxford University.
    • Experience using AI in psychology tutorials.
    • Recommendations for integrating AI.
    • Time for questions if available.

Context of Tutorials at Oxford University

  • Tutorial Structure:
    • Small group teaching sessions with one tutor and 1-3 students.
    • Tutors encourage analytical and critical thinking to deepen subject knowledge.
    • Different types of tutorial sessions based on student needs:
      • Feedback sessions.
      • Problem-solving activities.
      • Questioning techniques.
      • Collaborative discussions.
      • Content knowledge exploration.
  • Organizational Diversity:
    • Tutorials are organized independently by different programs and divisions.
    • Tutors tailor sessions according to their students' specific needs.

Authority and Knowledge in AI

  • Key Discussion Points:
    • Questioned who holds authority and expertise in the rapidly evolving field of AI.
    • Considered the challenges of making recommendations in a new and developing area.
    • Noted that AI's disruptive impact is comparable to significant events like Brexit and COVID-19.
    • Highlighted the difficulty in identifying reliable authorities on AI.

Experience Using AI in Tutorials

  • Learning Pathways Development:
    • Developed during the pandemic to integrate AI tools into teaching.
    • Utilized platforms like Canvas and Microsoft Teams.
    • Integrated ChatGPT at different stages:
      • Knowledge application.
      • Online and in-class collaboration.
      • Personalized learning experiences.
  • Example from Language Center Tutor:
    • Applied the learning pathway structure in tutorials.
    • Included ChatGPT in various learning stages for enhanced interaction.
    • Both tutor and student engaged with ChatGPT during sessions.
  • Student Feedback:
    • Students appreciated tutor support while working with ChatGPT.
    • Valued the collaborative process involving AI tools.

Enhancing Critical Thinking with AI

  • Central Question: Is critical thinking the answer to effectively utilizing generative AI?
  • Approach:
    • Aimed to use AI tools to support analysis, evaluation, decision-making, and reflection.
    • Sought to familiarize students with AI to enhance critical engagement.

Implementing ChatGPT in Psychology Tutorials

  • Methods:
    • Introduced ChatGPT to students unfamiliar with the tool.
    • Used ChatGPT during one-on-one tutorial sessions.
    • Observed students' interactions, focusing on prompt engineering.
    • Assigned tasks such as designing a curriculum or preparing a lecture.
  • Observations:
    • Students' prompting styles varied based on personality.
    • Language used in prompts included:
      • Imperative commands (e.g., "Write me a university-level...").
      • Polite requests (e.g., "Hello, can you please...").
      • Directives specifying roles (e.g., "I want you to be an expert...").
    • Noted that prompting language mirrored students' personalities.

Developing an AI Competency Framework

  • Inspiration: Based on the Common European Framework of Reference for Languages.
  • Competency Levels: Ranged from novice to expert users.
  • Five Modes of Engagement:
    • Tool Selection: Choosing appropriate AI tools.
    • Prompting Techniques: Crafting effective prompts.
    • Interpreting Outcomes: Understanding AI-generated responses.
    • Integrating AI: Applying AI in professional practice.
    • Tool Development: Making decisions about AI tool development.
  • Self-Evaluation Tool:
    • Created for students and staff to assess their AI proficiency.
    • Helps identify current competency level before engaging with AI tools.

Proposed Framework for Tutorials

  • Integration of ChatGPT:
    • Recommended using ChatGPT as a companion in tutorial sessions.
    • Applicable across various session types (feedback, problem-solving, etc.).
  • Implementation Process:
    • Self-Evaluation:
      • Students assess their initial proficiency with AI.
      • Facilitates personalized support from the tutor.
    • Prompting Practice:
      • Focus on developing effective communication with AI.
      • Emphasizes the importance of prompt language and structure.
    • Reflection and Awareness:
      • Encourage students to document their AI interaction process.
      • Discuss successes and areas for improvement.
    • Self-Monitoring:
      • Promote autonomy in controlling AI usage.
      • Foster critical thinking about AI's role in learning.
  • Objective:
    • Enhance critical thinking skills.
    • Empower students to use AI tools effectively and responsibly.

Student Perspective

Quote: Emphasized taking control over AI tools rather than allowing AI to dictate the learning process.

Insight: Highlights the importance of maintaining critical oversight when using AI.

Ongoing Work

  • Canvas Course Development:
    • Creating online resources for academics and students.
    • Aimed at educating users about AI integration in learning.
    • Courses are currently under development and not yet widely available.

Conclusion

  • Acknowledgments:
    • Thanked the audience for their attention.
    • Noted that the proposed framework is a starting point for discussion.
  • Future Considerations:
    • Recognized the need for ongoing dialogue about AI's role in education.
    • Invited feedback and collaboration to refine approaches.

Note: The presenters emphasized that the framework and recommendations are preliminary and subject to further refinement based on collective input and evolving understanding of AI in educational contexts.

Exploring the Role of Technology in Curriculum Design: A Collaborative Project

Presentation on Digital Tools and Technologies in Curriculum Design



Introduction

  • Presenters:
    • Jess Humphries: Deputy Director of WIHEA (Warwick International Higher Education Academy) and Academic Developer at the University of Warwick.
    • Aishwarya: Master's student at Warwick Business School and Project Officer on the team.
    • Emily Hater: Learning Technologist from the University of Brighton.
    • Lucy Childs: Senior Lecturer from the University of Brighton.
    • Other Team Members: Matt, Hita Parsi (Academic Developer), and Ola (student).

Background and Rationale

  • Project Initiation: Started in October of the previous year as a collaboration between the University of Warwick and the University of Brighton.
  • Funding: Supported by WIHEA to explore collaborative projects between institutions.
  • Aim: To investigate the role of technology in curriculum design and address existing research gaps.

Existing Work in the Field

  • Key References:
    • JISC Reports (2015/2016): Highlighted the role of technology in enabling curriculum design and stakeholder engagement.
    • QAA's Digital Taxonomy for Learning: Provided a framework for digital learning.
    • "Beyond Flexible Learning" by Advance HE: Discussed flexible learning approaches.
    • Recent JISC Report: "Approaches to Curriculum and Learning Design across UK Higher Education" focusing on post-COVID strategies.
    • Padlet Board by Danielle Hinton: Compiled over 100 universities' curriculum design approaches.
  • Vocabulary Importance: Clarified terms like hybrid, HyFlex, asynchronous, and synchronous learning.

Project Aims

  • Exploration: How technology is used in curriculum design for inclusivity and accessibility.
  • Gap Filling: Addressing specific gaps in existing research.
  • Focus: The role of technology in the curriculum design process, not just delivery.

Institutional Approaches

  • University of Warwick
    • Workshops for Course Leaders: Offering resources for departmental collaboration.
    • Moodle Site Development: "Curriculum Development Essentials" for asynchronous learning.
    • Technology Use: Padlet, Miro, Moodle, and online ABC workshops.
  • University of Brighton
    • Collab Curriculum Design Process: A light-touch approach developed two years ago.
    • Process Components:
      • Planning meetings with course teams.
      • Teams area and Padlet board for collaboration.
      • Two course design workshops focusing on aims, rationale, and assessment strategies.
      • Two module design workshops on learning outcomes and learning activities.
    • Key Tools: Microsoft Teams, Padlet, OneNote, and an online toolkit.

Methodology

  • Survey Design: Created to fill research gaps identified in previous studies.
  • Distribution: Nationwide via various channels.
  • Participants: 27 respondents, including module leaders, professional staff, and course leads.
  • Survey Focus Areas:
    • Post-pandemic modes of delivery and space usage.
    • Preferred digital tools and technologies at different curriculum design stages.
    • Collaborators and stakeholders involved.
    • Time and workload allocations for curriculum design.
    • Benefits, opportunities, barriers, and challenges.
    • Reward and recognition in the curriculum design process.

Survey Findings

  • Digital Tools Used:
    • AI Tools: ChatGPT, Midjourney.
    • Collaboration Tools: Microsoft Teams, SharePoint, Padlet, OneDrive, Miro.
    • Presentation Tools: PowerPoint, Google Slides, Prezi.
    • Others: Animation apps, community-building apps, data analysis tools.
  • Modes of Delivery:
    • Online vs. On-Campus: 39% prefer online delivery, 33% prefer on-campus.
    • Hybrid Models:
      • Hybrid: Staff decide the mode of engagement.
      • HyFlex: Students decide the mode of engagement (less common but growing).
  • Stakeholders Involved:
    • Primary: Academic colleagues, professional staff in quality enhancement.
    • Others: Students, alumni, external bodies (PSRBs), employers, marketing, and communications teams.
  • Accessibility and Flexibility:
    • Needs Addressed:
      • Remote work accommodations.
      • Students with part-time jobs or varying schedules.
    • Technological Solutions:
      • Collaborative platforms accessible to external participants.
      • Features like collaborative document editing, version history, security measures.
  • Workload and Time Allocation:
    • Discrepancy Noted: Actual time spent often exceeds allocated time.
    • Examples: Some allocated 80 hours but spent 200 hours.
    • Lack of Formal Allocation: Many lacked official time allotments for curriculum design.
  • Use of AI in Curriculum Design:
    • High Interest: 95% would use AI tools.
    • Applications:
      • Brainstorming ideas.
      • Generating content and learning outcomes.
      • Image generation.
    • AI Tools Mentioned: Generative text models (e.g., ChatGPT), AI image generators, subject-specific AI like Math GPT and Music LLM.
  • Barriers and Challenges:
    • Top Barriers:
      • Limited time to learn and implement new technologies.
      • Licensing and subscription issues for preferred tools.
    • Other Challenges:
      • Technical difficulties.
      • Lack of training and support.
      • Resistance to change among staff.
  • Reward and Recognition:
    • Concerns:
      • Time allocation for curriculum design tasks.
      • Recognition in promotions and leadership opportunities.
      • Compensation methods for student involvement.
    • No Clear Solutions: Highlighted as areas needing attention.

Next Steps

  • Interviews: Conducting in-depth interviews to build on survey findings (two completed so far).
  • Focus Areas:
    • Use of digital technology and AI in curriculum design.
    • Strategies for inclusivity and flexibility.
  • Invitation: Open call for participation from other institutions and individuals.

Discussion Questions

  • Examples Sought:
    • Digital technologies that have made curriculum design more inclusive, flexible, or collaborative.
    • How these technologies were implemented.
  • AI Usage:
    • Do you use AI tools like ChatGPT in your curriculum design?
    • What are the opportunities and challenges associated with AI in this context?

Conclusion

  • Project Status: Ongoing with evolving insights.
  • Collaborative Effort: Involvement of both staff and students enriches perspectives.
  • Community Engagement: Encouraged attendees to share experiences and insights.

Note: The presenters emphasized the importance of technology in enhancing the curriculum design process and are actively seeking collaborations and discussions to further this research.

Friday, November 15, 2024

The Rise of Open Source AI in Libraries

Open Source AI in Librarianship: A New Path Forward

Introduction

Artificial intelligence (AI) is reshaping various fields, and librarianship is no exception. The emergence of open-source AI models is not just a passing trend but a potent tool that equips library professionals with new, adaptable solutions to revolutionize collection management, reference services, and research support. Open-source AI, emphasizing transparency and accessibility, provides robust AI solutions without proprietary restrictions or exorbitant costs. This development presents thrilling opportunities and challenges in an environment where budgets are often tight and user needs vary.

This blog explores the pros and cons of open-source AI in libraries, highlighting how these technologies can enhance services such as digital literacy programs and patron privacy. However, it is essential to consider whether libraries are fully prepared for the responsibilities that accompany these advances, including potential challenges such as the need for technical expertise, data security, and ethical considerations.

Pros of Open-Source AI in Librarianship

1. Cost Efficiency and Accessibility

Libraries frequently operate under limited budgets, making it difficult to invest in advanced proprietary AI tools. Open-source AI changes this dynamic by providing robust and low-cost solutions that libraries of all sizes can afford. For instance, running models like GPT-Neo or BLOOM on local servers, rather than paying for ongoing subscriptions to proprietary models, can significantly lower operational costs. This makes AI accessible to smaller libraries and those in under-resourced areas.

Furthermore, open-source AI allows libraries to offer more advanced services. From machine learning-driven cataloging to AI-powered reference support, libraries can now implement features previously only available through expensive external platforms. AI-based recommendation systems, for example, can be integrated directly into library catalogs, enabling patrons to discover related materials and resources without relying on costly services.
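As a toy illustration of the catalog-recommendation idea, a bag-of-words cosine similarity over item descriptions can surface related materials. The titles and descriptions below are invented for the sketch; a production system would use richer metadata or an open-source embedding model rather than raw word counts:

```python
import math
from collections import Counter

# Hypothetical catalog records (titles and descriptions are illustrative)
catalog = {
    "Intro to Python": "python programming beginners code",
    "Data Science Handbook": "python data analysis statistics pandas",
    "Gardening Basics": "plants soil garden seasonal care",
}

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(title: str, top_n: int = 2) -> list[str]:
    """Rank other catalog items by similarity of their descriptions."""
    query = Counter(catalog[title].split())
    scores = {
        other: cosine(query, Counter(text.split()))
        for other, text in catalog.items()
        if other != title
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("Intro to Python"))
```

Even this minimal version runs entirely in-house, which matters for the privacy points discussed later in the post.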

2. Flexibility and Customization

Every library serves a unique community with specific needs. Open-source AI models allow librarians to customize technology to meet these needs. By fine-tuning AI on local collections and community-specific data, libraries can create more personalized experiences for their patrons. For example, an open-source model trained on a library's unique collection metadata can enhance catalog search systems to understand local search habits better and provide more relevant results.

This customization is particularly beneficial for specialized libraries, such as medical or legal libraries, where tailored AI models help curate and provide access to specialized knowledge. By utilizing open-source AI, these libraries can adapt the model's language processing capabilities to include field-specific terminology, thus enhancing their value as information hubs.
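A lightweight stand-in for that idea, short of actually fine-tuning a model, is a domain synonym map that expands lay patron queries with field-specific terminology so catalog search matches records indexed under specialist vocabulary. The medical terms below are made-up examples for the sketch:

```python
# Hypothetical medical-library synonym map (illustrative entries only)
MEDICAL_SYNONYMS = {
    "heart attack": ["myocardial infarction", "mi"],
    "high blood pressure": ["hypertension"],
}

def expand_query(query: str) -> list[str]:
    """Expand a lay-language query with domain terminology
    before it is sent to the catalog search index."""
    q = query.lower()
    terms = [q]
    for lay, specialist in MEDICAL_SYNONYMS.items():
        if lay in q:
            terms.extend(specialist)
    return terms

print(expand_query("heart attack treatment"))
```

A fine-tuned open-source model generalizes this far beyond a hand-built map, but the principle is the same: the library, not a vendor, decides which terminology its search understands.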

3. Enhanced Patron Privacy

Libraries have a long-standing commitment to protecting user privacy, a value that aligns with the transparency and autonomy of open-source AI. Unlike proprietary AI models that operate on third-party servers, open-source AI allows libraries to run models in-house. This ensures that sensitive patron data remains within the library's secure network, which is crucial as libraries increasingly handle data-intensive services like reading histories, research habits, and personal information through online portals and digital lending platforms.

With open-source models, libraries can also modify their data collection practices to anonymize patron interactions and delete unnecessary records, aligning with best data privacy practices and protecting patron rights.
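One way to implement the anonymization and record-deletion practices described above is to replace patron identifiers with salted one-way hashes and purge interaction records past a retention window. The field names, salt handling, and 90-day retention period here are assumptions for the sketch, not a recommended policy:

```python
import hashlib
from datetime import date, timedelta

RETENTION_DAYS = 90  # assumed retention policy
SALT = b"library-local-secret"  # in practice, stored and rotated securely

def anonymize_id(patron_id: str) -> str:
    """Salted one-way hash so usage can be aggregated
    without storing real patron identifiers."""
    return hashlib.sha256(SALT + patron_id.encode()).hexdigest()[:16]

def purge_old(records: list[dict], today: date) -> list[dict]:
    """Drop interaction records older than the retention window."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return [r for r in records if r["date"] >= cutoff]

records = [
    {"patron": anonymize_id("card-12345"), "date": date(2024, 1, 5)},
    {"patron": anonymize_id("card-67890"), "date": date(2024, 10, 1)},
]
kept = purge_old(records, today=date(2024, 11, 15))
```

Because the model and the data pipeline both run on library infrastructure, nothing in this flow ever leaves the library's secure network.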

4. Supporting Digital Literacy and Equity

As open-source AI becomes more accessible, libraries have a unique opportunity to spearhead digital literacy initiatives and bridge the digital divide. The potential of AI-driven tools and resources to boost digital literacy is vast. Through programs designed to introduce patrons to these tools, libraries can help foster essential digital skills within their communities. For instance, a library could use open-source AI tools to educate patrons about data privacy, the workings of AI algorithms, and the role of AI in everyday technologies.

By offering workshops and creating resources that demystify AI, libraries empower patrons—especially those from underserved communities—to navigate an increasingly digital world. Such educational efforts are a testament to libraries' unwavering commitment to promoting equitable access to information and closing technological gaps within communities.

5. Creating Open Educational Resources (OER)

Libraries have long embraced open educational resources (OER) to provide free and accessible learning materials. With open-source AI, libraries can contribute innovatively to OER by developing AI-assisted instructional materials or personalized learning guides. For example, libraries could leverage AI to create language-specific tutorials or interactive learning modules that enhance educational offerings.

This strategic integration of open-source AI into library services enriches the learning experience and reinforces libraries' roles as vital educational partners in their communities.


Wednesday, November 13, 2024

Pros and Cons of Using Large Language Models (LLMs) in National Security

LLMs present promising tools for enhancing operational efficiency and data handling in national security. However, their shortcomings in reliability and strategic reasoning, along with the ethical implications of influence operations, underscore the necessity for cautious and well-regulated usage.


Pros

1. Operational Efficiency and Data Processing:  

Large Language Models (LLMs) are recognized for quickly processing and summarizing vast amounts of unstructured data, streamlining operations in national security environments. This efficiency enables analysts to concentrate on more complex tasks instead of organizing data.


2. Enhanced Decision Support:  

Proponents argue that LLMs can assist decision-makers by providing historical insights and identifying patterns across large datasets, which might be overwhelming for human operators alone. This capability could offer a significant strategic advantage, particularly in intelligence and strategic planning.


3. Cost Efficiency for Psychological Operations:  

LLMs present a scalable and cost-effective alternative for information influence campaigns, potentially replacing more labor-intensive human efforts in psychological operations (psyops). Utilizing LLMs could strengthen national influence without requiring extensive resources.


Cons

1. Lack of Reliability in Chaotic and High-Stakes Environments:  

Critics point out that LLMs cannot generate reliable probability estimates in unpredictable situations like warfare. Unlike meteorology, which is grounded in physics and dependable data, military decision-making takes place in the "fog of war," rendering LLM outputs unpredictable and risky.


2. Bias and Hallucinations:  

LLMs can produce "hallucinations"—pieces of misleading or incorrect information—without any inherent means to verify their accuracy. This limitation is especially concerning in national security contexts, where decisions based on false data could result in catastrophic consequences.


3. Ethical Concerns Regarding Influence Operations:  

Using LLMs in influence operations raises ethical questions, particularly about whether the technology is employed to mislead or manipulate foreign populations. Critics argue that this undermines democratic values and has the potential to damage international relations, even if it serves national interests.


4. Limitations in Strategic Reasoning:  

LLMs primarily analyze historical data and may struggle to formulate innovative strategies for unprecedented situations. Military strategy often requires intuition and adaptability—qualities that LLMs lack, limiting their suitability for high-level strategic decision-making.


5. Risk of Adversarial Use and Escalation:  

There are concerns that adversarial nations may exploit LLMs in cyber operations, including disinformation campaigns or psychological warfare, potentially leading to escalated AI-based conflicts. Robust countermeasures would be necessary to mitigate these risks.