KAFP 6265 - Data science and public policy

Explore the intersection of data science and public policy in this comprehensive, hands-on course. Learn how to leverage data to inform evidence-based decision-making and address complex societal challenges (e.g., environmental, educational). Over twelve sessions, you will develop the technical skills needed to extract public policy insights from large datasets. More specifically, you will learn:
- Data science methodologies such as data collection, cleaning, and analysis.
- How to conduct exploratory data analysis and apply basic statistical concepts.
- How to use machine learning techniques for predictive modeling.
Join us on a journey where data meets the most challenging public policy topics, empowering you to make a meaningful impact on the public sector.
Session 1: Introduction to Data Science for Public Policies
Session 2: Data Manipulation and Analysis with Spreadsheets
Session 3: Geospatial Data Visualization with Tableau
Session 4: Introduction to Python Programming
Session 5: Introduction to Python Programming
Session 6: Gathering Data from Publicly Available Sources with Scrapers
Session 7: Data Cleaning and Manipulation with Python
Session 8: Exploratory Data Analysis with Python
Session 9: Data Visualisation with Python
Session 10: Inferential Statistics with Python
Session 11: Introduction to Machine Learning
Session 12: Introduction to Machine Learning and Course Summary
Noah FRÖHLICH, Silvia TULLI, Arnaud WEISS
Lecture only
English
- Homework: Students should allocate 30 minutes to 1 hour per week to practice and complete assigned tasks.
- Final Project: Approximately 5 to 8 hours, including time set aside during classes. Since it is a group project, the workload should be distributed evenly among team members. Students are encouraged to incorporate personal interests or other coursework into their projects.
No prior knowledge is required; the course is designed for students who are new to data science. Keep in mind that you will learn programming during the course, which requires consistent practice.
Spring 2023-2024
Evaluation will be based on:
- Class Participation (10%): Active participation in class and engagement with the course material.
- Homework (20%): Bi-weekly homework assignments.
- Final Project (70%): The final project is a group project in which you will apply data science techniques to a real-world public policy issue. Your project should include data collection, analysis, and presentation of findings. The topic is chosen by the group. We welcome project ideas on a wide range of topics, e.g., public health, criminal justice, education, immigration, reproductive rights, drug use, adoption of emerging technologies, climate change.
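As a worked illustration of the weighting above, the short Python sketch below combines the three components into a single mark. It is only an illustration: the function name `final_grade`, the sample scores, and the 0 to 20 scale are assumptions made for the example, not part of the official grading procedure.

```python
# Weights taken from the evaluation breakdown above; they sum to 100%.
WEIGHTS = {"participation": 0.10, "homework": 0.20, "final_project": 0.70}


def final_grade(scores: dict[str, float]) -> float:
    """Return the weighted average of component scores on a shared scale."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores on an assumed 0 to 20 scale.
example = {"participation": 16.0, "homework": 14.0, "final_project": 15.0}
print(round(final_grade(example), 2))
```

Because the final project carries 70% of the weight, a one-point change in the project score moves the overall mark seven times as much as a one-point change in participation.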
The proposed session structure is as follows:
- ~5 minutes: warm-up with interactive questions.
- ~10 minutes: recap of the previous class and homework discussion.
- ~20 minutes: introduction to a new data science concept.
- ~20 minutes: introduction to a data science tool.
- ~5 minutes: break.
- ~60 minutes: hands-on work on data.
Kahneman, D., Sibony, O., & Sunstein, C. R. (2021). Noise: A Flaw in Human Judgment. London: William Collins. 464 pp.
VanderPlas, J. (2022). Python Data Science Handbook: Essential Tools for Working with Data (2nd ed.). O'Reilly Media. ISBN: 9781491912058.
Christian, B., & Griffiths, T. (2016). Algorithms to Live By: The Computer Science of Human Decisions.
Banerjee, A. V., & Duflo, E. (2019). Good Economics for Hard Times.
Pearl, J., & Mackenzie, D. (2019). The Book of Why. Penguin Books.