From Couch to Jupyter: A Beginner's Guide to Data Science Tools and Concepts

Introduction

Host: Manogna, Senior Data Scientist at Slalom.
Presenter: Kiko K., Analytic Scientist at FICO on the Scores Predictive Analytics team.
Background:
- Graduated from UC Berkeley in 2019 with a degree in Applied Mathematics and Data Science.
- Led teams integrating data science into non-traditional curricula.
- Passionate about the power of data science and the community around it.

Workshop Overview

Title: "From Couch to Jupyter: A Beginner's Guide to Data Science Tools and Concepts"
Objective: Provide foundational knowledge and tools for beginners in data science.
Structure:
- Introduction to Jupyter Notebook.
- Basics of Python programming.
- Understanding data structures and statistical concepts.
- Interactive code demonstrations.
Resources:
- GitHub repository with tutorial notebooks and datasets.
- Anaconda installation guide for environment setup.

Key Topics Covered

Using Jupyter Notebook
- Understanding markdown and code cells.
- Running cells and writing code.
Python Basics
- Data types: integers, floats, strings, booleans.
- Variables and functions.
- Arithmetic operations and function calls.
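
As a rough illustration of the Python basics listed above, the sketch below assigns one value of each basic type to a variable, then defines and calls a small function. The names (`count`, `price`, `total_cost`) are made up for this example and are not taken from the workshop notebooks.

```python
# One variable for each of the four basic data types.
count = 3           # integer
price = 2.5         # float
label = "visit"     # string
is_ready = True     # boolean


def total_cost(quantity, unit_price):
    """Return the quantity multiplied by the unit price."""
    return quantity * unit_price


# Arithmetic and a function call: an int times a float gives a float.
total = total_cost(count, price)

# type() reports the data type of any value.
print(type(count).__name__, type(price).__name__,
      type(label).__name__, type(is_ready).__name__)
print(total)
```

In a Jupyter code cell, running this prints the four type names followed by the computed total.
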
Data Structures
- Arrays with NumPy.
- Pandas Series and DataFrames.
- Indexing and slicing data.
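
The data structures above could be compared side by side with something like the following minimal sketch. The values, labels, and column names are invented for illustration. It also hints at one difference between arrays and Series: a NumPy array is indexed by position, while a pandas Series additionally carries an index of labels.

```python
import numpy as np
import pandas as pd

# A NumPy array: same-typed values, indexed by position.
ages = np.array([34, 51, 67, 45])
first_two = ages[:2]            # positional slice, end excluded

# A pandas Series: the same values plus a labeled index.
age_series = pd.Series(ages, index=["a", "b", "c", "d"], name="age")
by_label = age_series["c"]      # look up a value by its label

# A DataFrame: a table whose columns are Series sharing one index.
df = pd.DataFrame(
    {"age": ages, "smoker": [False, True, True, False]},
    index=["a", "b", "c", "d"],
)
first_row = df.iloc[0]          # select a row by position
older = df.loc["b":"c", "age"]  # label slices include both endpoints

print(first_two)
print(by_label)
print(older)
```
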
Data Manipulation and Analysis
- Importing libraries and reading data files.
- Handling missing data (NaN values).
- Filtering and selecting data.
- Basic statistical calculations: mean, median, standard deviation.
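
A minimal sketch of the steps above, assuming a tiny made-up CSV rather than the workshop's actual data. To keep the example self-contained it reads the CSV from an in-memory string; with a real file, a path would be passed to `pd.read_csv` instead. The column names `id`, `age`, and `bmi` are assumptions for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV content; the empty bmi field is read as NaN.
csv_text = """id,age,bmi
1,40,22.0
2,55,
3,63,30.0
4,70,26.0
"""

df = pd.read_csv(StringIO(csv_text))

# Count missing values in one column.
missing_count = df["bmi"].isna().sum()

# Filter rows with a boolean condition, then select a column.
over_50 = df[df["age"] > 50]
over_50_ages = over_50["age"]

# Basic statistics; pandas skips NaN values by default.
bmi_mean = df["bmi"].mean()
bmi_median = df["bmi"].median()
bmi_std = df["bmi"].std()       # sample standard deviation (ddof=1)

print(missing_count, len(over_50))
print(bmi_mean, bmi_median, bmi_std)
```
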
Practical Demonstrations
- Working with a stroke prediction dataset from Kaggle.
- Visualizing data distributions.
- Imputing missing values.
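
One common way to impute missing values is to replace each NaN with the column median. The sketch below shows that approach on a small invented table; it is not necessarily the method used in the workshop, and the `bmi` column name and its values are assumptions, not data from the Kaggle dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing bmi value.
patients = pd.DataFrame({
    "age": [40, 55, 63, 70, 48],
    "bmi": [22.0, np.nan, 30.0, 26.0, 41.0],
})

# The median is computed from the non-missing values only.
median_bmi = patients["bmi"].median()

# Replace NaN with the median and assign the result back to the column.
patients["bmi"] = patients["bmi"].fillna(median_bmi)

remaining_missing = patients["bmi"].isna().sum()
print(median_bmi, remaining_missing)
```

The median is often preferred over the mean here because a few extreme values shift it less.
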

Additional Resources

Anaconda Installation Guide: For setting up the Python environment.
Tutorial Notebooks: Covering various topics in more depth.
External Links: Videos and other learning materials for further study.

Conclusion

Q&A Session: Addressed audience questions on topics like:
- Differences between Jupyter Notebook and JupyterLab.
- Handling missing data and NaN values.
- Differences between arrays and series.
- Recommendations for beginners starting to work with datasets.
Final Remarks:
- Encouraged attendees to explore provided resources.
- Emphasized continuous learning in data science.
- Thanked the audience for participation.

Note: The workshop aims to make data science accessible to beginners by providing hands-on experience with tools like Jupyter Notebook and Python, using practical examples and interactive code demonstrations.

ChatGPTLibrarian: Bridging ChatGPT and Librarianship

Saturday, November 23, 2024

Data Science 101: Understanding Statistical Concepts and Analysis