Machine Learning in Policy (90-904, crosslist 10-830) will be taught in the Spring semester of 2018. Classes begin 1/18/18 and end 5/3/18. Spring break is observed the week of 3/12. Time: MW 10:30-11:50 AM. Room: Hamburg Hall 1007
Jeremy C. Weiss, M.D. Ph.D., Assistant Professor of Health Informatics; jeremyweiss@cmu.edu
Office hours: Wednesdays at 4pm, 2101F (E)
TA Dylan Fitzpatrick, Ph.D. student in Machine Learning and Public Policy; djfitzpa@gmail.com
Office hours: Tuesdays at 2pm, Location Hamburg Hall 3034
Faculty assistant: Carole McCoy, HBH 2102
Machine learning, a field derived primarily from computer science and statistics, has matured and gained wide adoption. Alongside exponential increases in data measurement and availability, the ability to develop appropriate and tailored analyses is in demand. As practitioners in the social sciences consider machine learning methods, however, limitations and externalities of the applications of machine learning techniques are being identified, such as overconfidence in settings with concept drift, lack of generalizability due to selection bias, and magnification of inequities. Machine Learning in Policy seeks to (1) demonstrate motivations and successes of machine learning, to (2) contrast them with more classical methods, and to (3) investigate the promise and cautions of machine learning for public policy.
The course will cover a variety of topics, including:
For policy students, Machine Learning in Policy will develop your skills in machine learning methods motivated by policy applications. For machine learning students, Machine Learning in Policy will develop your ability to formulate machine learning techniques that inform real-world policy and will demand that the formulations address the applications, consequences and limitations of existing techniques.
Students will present mathematical formulations in TeX and markdown and implement algorithms in Python. Machine Learning in Policy will also involve a substantial discussion component. Approximately 25% of class time will be devoted to discussions of recent applications of machine learning in policy settings. Therefore, attendance in class is required. Many readings will come from the health care field; however, the methods will apply across policy domains.
Background in either machine learning or policy is required. This is a PhD-level course. Experience with Python is highly recommended.
(Tentatively,) Grades will be based on:
All grades are tallied and at the end of the course they are scaled to meet the Heinz grading policy.
The project that is submitted for grading is to be the work of the individual or team alone. Similarly, completed homework assignments are to be your work alone, although you are encouraged to discuss the problems with your classmates. Results that are identical or nearly identical across projects may be regarded as cheating. Penalties for cheating include lowering your grade, up to and including failing the course. In extreme cases, the instructors may recommend the termination of your enrollment at CMU.
Homework Policy: The lowest homework grade will be dropped. If the project grade is lower than any homework grade, all homeworks will be counted and the project grade will count for 15% less of the total grade.
Late Work Policy: You are expected to turn in all work on time (at the start of class on the due date). Assignments turned in within 48 hours of the deadline will be marked down 20% per day. Additional late assignments will not be accepted. Extenuating circumstances (such as illness) that result in lateness beyond the flexibility in the homework policy will entail partial credit and additional assigned work.
Wellness Policy: Take care of yourself and take care of others around you. There are resources to help you both in Heinz and around the University. The Counseling and Psychological Services (CaPS) help line is 412-268-2922. If the situation is life threatening, call the police.
Concepts of machine learning
Topics in machine learning and policy
There is no required textbook. Readings will come from many sources and will be provided in Canvas, on the schedule, or in class. Useful references include: Efron and Hastie's Computer Age Statistical Inference, Bishop's Pattern Recognition and Machine Learning, Murphy's Machine Learning: A Probabilistic Perspective, Mitchell's Machine Learning, Hastie et al.'s Elements of Statistical Learning, and Hastie et al.'s Statistical Learning with Sparsity. To complement Heinz offerings in econometrics, we will provide relevant sections of Greene's Econometric Analysis.
(Tentatively,) Python, numpy, scikit-learn, pandas, matplotlib, pdb, pytorch; git; LaTeX; markdown
git cheat sheet
git interactive tutorial