
Econometrics and Data Analysis I

Yale University, Spring 2018

Updated on: February 16, 2018

ADMINISTRATIVE
INSTRUCTOR
Professor John Eric Humphries (john.humphries@yale.edu)
Office: 37 Hillhouse, Room 32.
Office Hours: Wednesdays 2pm-3pm and by appointment.

TIME AND LOCATION


Lecture Time: Tuesdays and Thursdays, 1:00pm–2:15pm.
Lecture Location: Harkness Hall, 100 Wall Street, Room 119.

TEACHING FELLOWS
• Kritika Narula (kritika.narula@yale.edu)
Sections: Friday, 2:30 pm WLH 114 and 3:30 pm WLH 113
Office hours: Tuesdays 7:00pm-9:00pm in Bass Library Rm L34A or Bass Cafe
• Eduardo Pinheiro Fraga (eduardo.fraga@yale.edu)
Sections: Friday, 10:30 am and 11:35 am, WLH 113
Office hours: Mondays 6:30pm-8:30pm in Bass Library Rm L30A
• Soumitra Shukla (soumitra.shukla@yale.edu)
Sections: Thursday, 7:00 pm and 8:00 pm, WLH 114
Office hours: Wednesdays 4:00pm-6:00pm in Bass Library Rm L34A

TUTORS AND OTHER RESOURCES


• Peer Tutors:
– Han-ah Sumner (han-ah.sumner@yale.edu), Monday and Wednesday 7:30-8:45pm
in Saybrook Dining Hall.
– Zhaochen Tai (zhaochen.tai@yale.edu), Thursdays 9:00am-11:30am in the Public Area
of Hillhouse 17.
• R Tutor: Saul Downie. Office hours:
– Mondays 7:30-9:00pm (Saybrook Dining Hall)

Yale University Econ 131, Spring 2018

– Wednesdays 7:30-9:00pm (Saybrook Dining Hall)


– Fridays 12:00-2:00pm (Bass Cafe).
• The residential college math and science tutors are another great resource for questions
(three are Econ PhD students).
• Statlab walk-in hours are available for R and computer help.

WEBSITE
On Canvas at: https://yale.instructure.com/courses/34777

ABOUT THE COURSE

This course will teach you how to judge quantitative information and how to use data to
answer quantitative questions, with a focus on questions in the social sciences. We will
cover three areas:
1. Probability, the study of uncertainty, such as the uncertainty faced by investors,
insurers, and people in everyday life.
2. Statistics, the science of analyzing and interpreting data, such as what a marketing
department might know about past consumer purchases.
3. Linear regression, a statistical method used to estimate links between two or more
variables. For example, how does setting a minimum wage affect employment and
earnings?
The prerequisites for this course are introductory microeconomics and familiarity with
single-variable calculus. This course fulfills the econometrics requirement for the economics major.

In most econometrics classes, mathematical methods are introduced and then applied to a
few examples. This class turns that around. We will focus on substantive questions first,
and then introduce mathematical methods that will help us answer them. By the end of
the class, you will have acquired several concrete skills. Specifically, you will:
1. Understand the strengths and weaknesses of different methods.
2. Be able to choose appropriate methods to answer real-world questions.
3. Understand the math behind methods like linear regression and hypothesis testing.
4. Understand the intuition behind these methods.
5. Be able to apply these methods to analyze real data with a powerful statistical
analysis package (R).
The methodology covered in this course is broadly applicable throughout the social
sciences, and numerous applications will be discussed, though we will focus on substantive
applications in economics, particularly in the following areas:
1. Environmental/Natural Resource Economics
2. Intergenerational Mobility
3. Discrimination (loans, job market, police force)
4. Finance – Asset diversification, CAPM


GRADES

I reserve the right to change this breakdown if it becomes necessary. Your grade will be
based on the following components:
1. Online Quizzes (15%)
You will have four short online quizzes, two before the midterm and two after. The quizzes
are open book/open notes, but you cannot collaborate with other students on the quizzes
or discuss the quizzes with other students until after their solutions are posted. The lowest
quiz score will be dropped.
Quizzes will be posted on Tuesdays after class during weeks when there is no homework
due and no midterm. Quizzes will remain live for 48 hours once posted and will
not include content from the class on the day the quiz is due.
• posted 01/30/18, due 02/01/18
• posted 02/13/18, due 02/15/18
• posted 04/03/18, due 04/05/18
• posted 04/17/18, due 04/19/18

2. Problem Sets (30%)


There will be 7 problem sets during the semester. These problem sets will be primarily
empirical and based on academic research papers. You may work in groups of up to four
people on the problem sets, but you must turn in your own individual assignment, and
must indicate on your submission the other members of your group. Your problem set
with the lowest grade will be dropped.
The problem sets must be submitted through the course web site by 11:00am on the due
date. Any problem set not submitted by 11:00am on the due date will not be accepted
without a note from your residential dean emailed to me before the due date.
The teaching assistants will grade a randomly selected subset of the problems from each
assignment, check for completion of the other parts of the assignment, and post complete
solutions after the problem sets are due.

3. Midterm Exam (20%)


The midterm will be held in class on Tuesday, 02/27/2018. Exams are closed book, but you may
bring one double-sided page of notes for the midterm.

4. Final Exam (30%)


The final exam is cumulative and is scheduled for 05/08/2018. Exams are closed book, but
you may bring two double-sided pages of notes for the final.

5. Participation (5%)
Participation will be based on attendance, engagement in class, coming to office hours, or
asking and answering questions on Piazza. Students can access Piazza from the course’s
Canvas page (asking anonymous questions on Piazza does not count for participation).

6. OPTIONAL Empirical Project


The empirical project is your opportunity to use the tools you learn to answer a question
you come up with and that you care about. You may work in groups of up to four students,
and the expectations for the paper will be scaled based on the number of students in the group.
The empirical project should be 8 to 20 pages, addressing a research question of your
choosing, and applying the methods from this course to a relevant data set. As part of the
course, students will submit proposals for the project (as a group, if you are working in a
group). The proposal should be one to two pages long, and detail your research question,
which data set you will use to address your research question, an explanation of why the
data set is appropriate for your research question, and a statement that you have confirmed
that you have access to the data set. Proposals are due on Tuesday 03/06/18 (the week
before spring break). Proposals will be reviewed for feasibility, and once a proposal is approved,
students cannot change their mind about completing the project.
The empirical project is optional and will add work to an already challenging course. If
completed, it will count for 15% of your total grade, and the other components of the course
will be scaled down to account for the remaining 85% of your grade.
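As a rough illustration of this rescaling, the short Python sketch below starts from the component weights listed in this section and assumes that every other component shrinks proportionally once the project is added. The syllabus does not spell out the exact rule, so the proportional scaling, the function name, and the dictionary keys are all assumptions made for illustration (Python is used only for this sketch; the course itself uses R).

```python
# Base weights from the GRADES section, in percent of the total grade.
BASE_WEIGHTS = {
    "quizzes": 15.0,
    "problem_sets": 30.0,
    "midterm": 20.0,
    "final": 30.0,
    "participation": 5.0,
}

# Weight of the optional empirical project, in percent.
PROJECT_WEIGHT = 15.0


def weights_with_project(base=BASE_WEIGHTS, project=PROJECT_WEIGHT):
    """Return hypothetical component weights when the project is completed.

    Assumption: the other components are scaled by a common factor so that
    together they account for (100 - project) percent of the grade.
    """
    scale = (100.0 - project) / sum(base.values())
    scaled = {name: weight * scale for name, weight in base.items()}
    scaled["project"] = project
    return scaled


if __name__ == "__main__":
    for name, weight in weights_with_project().items():
        print(f"{name}: {weight:.2f}%")
```

Under this assumption the problem sets and the final exam would each count for about 25.5% of the grade, the midterm for about 17%, the quizzes for about 12.75%, and participation for about 4.25%.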
This course will use Piazza as a discussion board and at times to receive feedback on the
course. Please post questions to Piazza rather than emailing me or the TAs whenever
possible.

Curve and Grading


The course follows the department-mandated curve set for all introductory courses in the
Economics Department.
If a student believes that there has been a mistake in their grading, the student must
prepare a written statement describing the mistake in detail, which should then be emailed
to all three teaching fellows. If the student still believes a mistake has been made after
hearing back from the teaching fellows, they may submit to me a written statement
describing the mistake in detail along with the response from the TFs.
Changing assigned grades is extremely unlikely and reserved for clear errors made by me
or the teaching fellows. Re-grading will not be considered unless submitted in writing via
email as described above.

LECTURES

Lectures will be interactive. Lecture slides will be posted on the course webpage, but are
not designed as a substitute for attending lecture. If you cannot attend a particular lecture,
please augment the lecture slides with the notes from a student who did attend the lecture.
I strongly encourage students not to use laptops, phones, and tablets during class. If
students choose to use laptops or tablets to take notes, they may not use them to access
non-course-related websites, apps, or email during class. The teaching assistants will take
note of students who are misusing technology during class (e.g., on Facebook, on the NYT,
or on their phones) and students who persistently misuse technology will not receive full
participation credit for the course.

SECTIONS/LABS

Discussion sections in this class may be quite different from what you are used to as your
TFs will not be summarizing the week’s lectures or going over problems in front of the
class. Instead, each week you will be using the methods you learned in lecture to analyze
real data and answer real research questions. The sections will be designed to help you
with your problem sets, with some sections directly helping you with a problem set and
other sections not directly helping with a problem set but covering skills that you will need
for future problem sets. Everyone should bring their laptop with R installed and ready to
go. Your entire hour will be spent interacting with your computer and each other, with an
expert (your TF) nearby to answer any questions.

ACCEPTABLE USE POLICY

You are free to use any published materials (e.g., a textbook) in preparing Econ 131
assignments or for learning the material more generally. Similarly, you are free to use online
resources such as Stack Overflow questions or R tutorials. You are also strongly encouraged
to work with others in your class. This is particularly helpful for learning to program.
Each person must turn in their own assignment.

The use of any solution materials prepared in a previous year for Econ 131, other than
materials distributed this academic year by the course faculty, is strictly prohibited and
constitutes cheating. This includes 1) any notes, spreadsheets, or handouts distributed by
me in a prior term of Econ 131; and 2) any notes, solutions, or spreadsheets prepared by
former students of Econ 131, in either written or electronic form.
This policy means you should not solicit or use solutions to previous years’ problem sets.
The reason for this policy is that access to previous years’ materials can create serious
inequities between fellow students, and jeopardize the integrity of the academic environment.
Any potential violation of this policy will be reported. Academic disciplinary
actions will be taken against those who violate this policy.
We take cheating and plagiarism VERY seriously. Every class you have ever taken probably
states that cheating will not be tolerated, but we mean it.
Cheating or plagiarism will result in a 0 on the assignment and will be reported to the department.
You are welcome to work together in groups of up to four, but you are required to submit your
own write-up and your own code. Please take precautions to avoid putting the Teaching
Assistants or me in a situation where we are forced to decide if two documents are
“too similar”. As future researchers, consultants, bankers, entrepreneurs, etc., learning to
do honest work in a timely manner is more important than getting everything correct.
If you are uncertain, please add proper citation. For example, if you relied heavily on
a group member’s code for one part of an assignment, then you should make a footnote
highlighting this fact. This may result in a slightly lower grade, but as long as proper
credit is clearly given, it does not constitute cheating. The one exception to this rule is
using past material from any previous version of this course.


SOFTWARE

Much of the course work in Econ 131 will involve analysis of data using R, an open-source
implementation of the object-oriented programming language S. It is widely used
by applied statisticians and its libraries implement a wide variety of statistical and graphical
techniques with applications to a range of disciplines, such as the agricultural and
biological sciences, genetics, neuroscience and economics. R can be downloaded from
https://cran.r-project.org. We will provide some handouts on the use of R, the TFs
will help you with R in sections, and the program documentation is excellent. There are
also many excellent and free R references available online, for example, Econometrics in
R by G. Farnsworth. If your time permits and you want to dig
deeper, there are also more programming-oriented references such as An Introduction to R
by W. N. Venables, D. M. Smith and the R Core Team. However, I recommend learning
by trial and error, as it is the most time-efficient approach and sufficient for the type of
coding problems that we will consider.
If you have never used R (and have never used another programming language), I would
recommend completing one or both of these free online introduction tutorials:
• www.codeschool.com/courses/try-r
• www.datacamp.com

TEXTBOOKS

There is no required textbook for this course.


While there is no required textbook, you may consider some optional textbooks if you are
having trouble following the material. An excellent econometrics textbook is Introduction
to Econometrics, 2nd or 3rd edition, by Stock and Watson (Addison-Wesley, 2010). Its
coverage of probability and statistics is somewhat rudimentary, but its treatment of
regression methods is excellent and the book should serve you well as a reference in the
future.
Students without a strong mathematical background may also find the following
(optional) text useful: Probability and Statistical Inference, 8th or 9th ed., by Robert Hogg,
Elliot Tanis, and most recently Dale Zimmerman (Pearson, 2010 or 2015). Hogg et al.
provide much deeper coverage of the concepts covered in the first half of the course than
do Stock and Watson. The most important method we will cover during the course is
linear regression and I highly recommend Paul Allison’s Multiple Regression: A Primer.
The writing is extremely clear and he covers both the intuition and mathematics behind
the method.

ACKNOWLEDGEMENTS

This class is in large part derived from the econometrics course that Professor Edward
Vytlacil taught at Yale in Fall 2017 and Professor Nicholas Ryan taught at Yale in Spring
2017. These courses were in turn derived in large part from the course Professor Douglas
McKee taught at Yale in Fall 2015, which in turn was heavily influenced by the course
Professor Lanier Benkard taught at Yale in Fall 2010. The course structure, slides, and
problem sets are also influenced by discussions with Dr. Rebecca Toseland and the course
material from Professor Raj Chetty’s Econ 45 at Stanford University, as well as discussions
with Professor Lisa Kahn from Yale School of Management. I’m extremely grateful to
them for sharing their syllabus, lecture slides, assignments, handouts, exams, and advice.
In addition, Majed Dodin and Eduardo Fraga prepared the data sets for this course and
helped tremendously with the construction of the problem sets and revisions to the slides.
I am grateful to them as well. I also thank Winnie van Dijk for sharing material from the
econometrics class she taught at the University of Chicago. The diversity statement on
this syllabus draws heavily from publicly posted statements by Michelle Morgan and Nancy
Niemi. I take full responsibility for any mistakes that I may have added to the material.

Do not redistribute any of these materials without written permission.

CLASSROOM POLICIES

• Any student with a documented disability needing academic adjustments or
accommodations is requested to speak with me during the first two weeks of class. All
discussions will remain confidential. Students with disabilities should also contact
Judy York in the Resource Office on Disabilities, 203-432-2324.
• This class is committed to an inclusive learning environment. All students, teaching
staff, and the professor are expected to treat each other with respect and dignity at all
times. This includes posts on Piazza. At this time, I am allowing anonymous posts
to lower the barrier to asking questions, but this should not be used to make
negative statements of any kind.
• All community members should enjoy an environment free of any form of harassment,
sexual misconduct, discrimination, or intimate partner violence. If you encounter
sexual harassment, sexual misconduct, sexual assault, or discrimination based on
race, color, religion, age, national origin, ancestry, sex, sexual orientation, gender
identity, or disability, please contact the Title IX Coordinator, Stephanie Spangler,
at stephanie.spangler@yale.edu (203.432.4446) or any of the University Title IX
Coordinators, who can be found at: http://provost.yale.edu/title-ix/coordinators

MISC

Note that the economics program has changed its CIP (Classification of Instructional
Programs) code. The new code assigned by the National Center for Education Statistics
at the Department of Education is 45.0603 (Econometrics and Quantitative Economics)
rather than the old code, 45.0601 (Economics, General).

This syllabus serves as a road map for the course and is subject to change as needed.


SCHEDULE (subject to change)

PART I: PROBABILITY AND STATISTICS

Week 1: Introduction
Lecture: January 16 and 18.
Lab: Open office hours (in Harkness) for help installing R and questions about
the course or first homework.
PS: Problem set 1 assigned, due January 25.
Topics:
– Course overview
– Terminology and concepts: experiments, outcomes, and events
– Brief introduction to probabilities and conditional probabilities.
– Introduction and overview of R

Week 2: Probability and Random Variables


Lecture: January 23 and 25.
Lab: Introduction to working with data and visualization in R.
PS: PS2 assigned and PS1 due on January 25.
Topics:
– Multiple Events, Probability Rules
– Probability Tables and Venn Diagrams
– Conditional Probability: Definition and Intuition
– Independence of Events and Information sets
– Bayes’ Rule
– Probability trees
– Definition of Random Variables
– Dummy Variables
– Expected Value, Variance and Covariance

Week 3: Random Variables and Moments of Distributions.


Lecture: January 30 and February 1
Lab: Descriptive Statistics – Discrimination in Mortgage Loans, based on “Mort-
gage Lending in Boston: Interpreting HMDA Data.”
Quiz: Quiz 1 assigned January 30 and due by February 1.


Topics:
– Discrete vs. Continuous Random Variables
– Cumulative Distribution Functions
– Probability Density and Probability Mass Functions.
– Expected Values and Variances of Continuous Random Variables
– The Bernoulli and the Binomial Distribution
– The Uniform and the Normal Distribution
– Joint, Marginal and Conditional Distributions
– Correlation and covariance

Week 4: The Central Limit Theorem and the Normal Distribution.


Lecture: February 6 and 8.
Lab: Review of main concepts and PS2.
PS: PS3 assigned and PS2 due on February 8.
Topics:
– Population and Sample
– Estimators as Sequences of Random Variables
– Properties of Estimators
– The Weak Law of Large Numbers
– Distributions of Estimators
– Convergence in Distribution
– Asymptotic Distributions as Approximations
– The Central Limit Theorem
– Linear Transformations of Normals

Week 5: Sampling and Uncertainty.


Lecture: February 13 and 15.
Lab: Simulations – Law of Large Numbers (and visualization in R)
Quiz: Quiz 2 assigned February 13 and due February 15.
Topics:
– Standard Errors
– Exact and Asymptotic Confidence Intervals
– One-sided and two-sided T-Tests
– Type I and Type II Errors


– P-values and the Interpretation of Test Results


– Testing Differences in Means
– Small Sample Situations and T-tests

Week 6: Hypothesis Testing and Conditional Expectations.


Lecture: February 20 and 22.
Lab: Review of main concepts and PS3.
PS: PS4 assigned and PS3 due on February 22.
Topics:
– Hypothesis testing continued
– Conditional expectations
– Law of iterated expectations

Week 7: Midterm and Causality.


Lecture: February 27 and March 1.
Lab: T-Tests – Labor Market Discrimination, based on “Are Emily and Greg
More Employable Than Lakisha and Jamal? A Field Experiment on Labor
Market Discrimination.”
Topics:
– Randomized Control Trials
– Interpreting observational and experimental data
– Causality

PART II: REGRESSION AND CAUSALITY

Week 8: Univariate Regression.


Lecture: March 6 and 8
Lab: Review of main concepts and PS4.
PS: PS5 assigned and PS4 due on March 8.
Topics:
– Mechanics of univariate regression
– Correlation vs. slope
– Interpreting regression estimates
– Goodness of fit
– Confidence intervals and statistical significance


– Prediction

SPRING BREAK

Week 9: Multivariate Regression 1.


Lecture: March 27 and 29
Lab: Review of main concepts and PS5.
PS: PS6 assigned and PS5 due on March 29.
Topics:
– Mechanics of multiple regression
– Interpreting multiple regression results
– Controlling for categorical variables with sets of dummy variables

Week 10: Multivariate Regression 2.


Lecture: April 3 and 5
Lab: The Mincer Model
Quiz: Quiz 3 assigned April 3 and due April 5.
Topics:
– Regression F-test
– Joint Tests
– Restricted and Unrestricted models
– Tests of linear restrictions in regression models
– Linear probability models

Week 11: Instrumental Variables


Lecture: April 10 and 12
Lab: Review of main concepts and PS6.
PS: PS7 assigned and PS6 due on April 12.
Topics:
– Estimating causal effects with instrumental variables
– What is an instrument?
– Evaluating instrumental variables

Week 12: Diff-in-Diff and Regression Discontinuity


Lecture: April 17 and 19


Lab: TBD
Quiz: Quiz 4 assigned April 17 and due April 19.
Topics:
– Difference in differences
– Regression discontinuity
– Relation to instrumental variables

Week 13: Machine Learning and Review


Lecture: April 24 and 26
Lab: Review of PS7 and Review for final.
PS: PS7 due April 26.
Topics:
– Machine learning with LASSO
– Prediction
– Hold-out samples and over-fitting
