IDS/ACM/CS 157
Statistical Inference
Syllabus
[pdf]
Lectures | Tue & Thu 9:00am-10:25am in Baxter Lecture Hall |
Instructor | |
Office | 114 Annenberg |
Email | kostia@caltech.edu (please include “157” in the subject line) |
Office Hour | Thu 1pm-2pm, or by appointment (please send an email to schedule) |
Head TA | Harsh Gandhi (hgandhi@caltech.edu) |
TA Office Hours | |
Course Goals
Statistical Inference is a branch
of Mathematical Engineering that studies ways of extracting reliable
information from limited data for learning, prediction, and decision
making in the presence of uncertainty. The main goals of this course
are:
• Develop statistical thinking and an intuitive feel for the
subject,
• Introduce the most fundamental ideas, concepts, and methods
of Statistical Inference, and
• Explain how and why they work, and when they don’t.
If you do well in the class, you should be able to read (and understand)
most contemporary papers that use statistical inference and perform
statistical analysis yourself.
|
Prerequisites
This is an introductory course on
statistical inference. No prior knowledge of statistics is assumed.
However, a solid understanding of Probability is required. Ma 3
or ACM/EE/IDS 116 (or equivalent) is a “hard” prerequisite.
A key part of the course is problem sets, where you will get experience
in using the learned methods and models in applications via simulations
in MATLAB. So, some familiarity with MATLAB (and programming in
general) is desired, but this is a “soft” prerequisite:
MATLAB is easy to pick up on the fly, especially for the purposes
of this course.
|
Textbooks
There is no single book that the
course is based on (I am writing one!). The good news: I will provide
comprehensive lecture
notes. After each lecture, I will upload the corresponding
notes to the course Piazza page together with supplementary materials
for further reading. The course was developed using the following
books, which can be used as supplementary (but not required) textbooks:
• G. Casella & R.L. Berger, Statistical Inference,
2002.
• A.C. Davison, Statistical Models, 2003.
• L.A. Wasserman, A Concise Course in Statistical Inference,
2005.
• M. Lavine, Introduction to Statistical Thought,
2013.
• S.L. Lohr, Sampling: Design and Analysis, 2010.
• D.C. Montgomery, E.A. Peck, & G.G. Vining, Introduction
to Linear Regression Analysis, 2006.
• D. Nolan & T. Speed, Stat Labs: Mathematical Statistics
Through Applications, 2000.
• S. Weisberg, Applied Linear Regression, 2005.
|
Course Plan
The following is a tentative outline
of the topics that I plan to cover this term.
Week 1 | Introduction, Summarizing Data |
Week 2 | Classical Statistics: Fundamentals of Survey Sampling |
Week 3 | Modeling and Inference: A Big Picture, Statistical Functionals |
Week 4 | Jackknife, Bootstrap, Method of Moments |
Week 5 | Maximum Likelihood Estimation |
Week 6 | Hypothesis Testing: General Framework, p-Values |
Week 7 | The Wald, t-, Permutation, and Likelihood Ratio tests |
Week 8 | Regression Function, Scatterplots, Simple Linear Regression, Ordinary Least Squares |
Week 9 | Properties of OLS Estimates, Interval Estimation, Prediction, Graphical Residual Analysis |
|
Grading
Your final grade will be based on
your total score. Your total score is a weighted average of Problem
Sets (60%), Midterm exam (20%), and Final exam (20%). You can increase
your total score by up to 5% if you participate actively in Piazza
discussions in the Q&A
section. Every answer that you submit before a TA or the instructor answers,
and that a TA or the instructor later endorses as a “good answer”,
earns 1% of the total score. There are no fixed thresholds for grades,
but if your total score is at least 90% (80%, 70%, 60%), you are guaranteed
at least an “A” (“B”, “C”, “D”).
Problem Sets | 60% |
Midterm | 20% |
Final | 20% |
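The grading arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the official grading procedure: the function names `total_score` and `guaranteed_grade` are hypothetical, all scores are assumed to be percentages from 0 to 100, the Problem Sets that remain after dropping the lowest one (see “Problem Sets”) are assumed to be weighted equally, and each endorsed Piazza answer is assumed to add one percentage point, capped at five.

```python
def total_score(problem_sets, midterm, final, endorsed_answers=0):
    """Return the total score in percent; all inputs are percentages (0-100)."""
    # Drop the lowest Problem Set, then average the rest.
    # Equal weighting of the remaining sets is an assumption.
    kept = sorted(problem_sets)[1:]
    ps_avg = sum(kept) / len(kept)

    # Weighted average: Problem Sets 60%, Midterm 20%, Final 20%.
    weighted = (60 * ps_avg + 20 * midterm + 20 * final) / 100

    # Assumed: one percentage point per endorsed answer, at most 5 in total.
    bonus = min(endorsed_answers, 5)
    return weighted + bonus


def guaranteed_grade(score):
    """Return the letter grade guaranteed by the score, or None below 60%."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return letter
    return None
```

For example, under these assumptions, Problem Set scores of 100, 90, 80, 70, 60, and 50, a midterm score of 80, a final score of 70, and two endorsed answers give a total of 80, which guarantees at least a “B”. The actual grade may be higher, since there are no fixed thresholds.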
|
Problem Sets
There will be six Problem Sets.
Problems (and solutions) will be posted on Piazza.
For assignment and due dates see “Important
Dates” below. Late submissions will not be accepted for
any reason,
but the Problem Set with the lowest score will be dropped and not
counted toward your total score. Submitting the wrong files or files
in the wrong format is considered a late submission. Extensions
may be granted for academic, personal, or medical reasons. For extensions,
please email the Head TA.
|
Exams
There will be two
exams: Midterm (based on Lectures 1-9) and Final (based on Lectures
10-16). The Head TA will provide a review session before each
exam. Both exams are take-home, self-timed, and “open-book”:
you can use notes and books, but not your classmates or the Internet.
You can use your computer only as a typing device and for basic
arithmetic operations. No other electronic devices are permitted. |
Collaboration Policy
Here is
a detailed collaboration
policy. In general, collaboration is encouraged everywhere
except for the
exams. Let’s help each other and learn together! If
you get stuck with a homework problem, I encourage you to discuss
it with other students (offline or online on Piazza).
But remember that you will have to prepare and submit your solution
by yourself. No collaboration is allowed on the exams.
|
Important Dates
| Available | Due |
Problem Set 1 | 1pm Thu, Apr 11 | 9pm Thu, Apr 18 |
Problem Set 2 | 1pm Thu, Apr 18 | 9pm Thu, Apr 25 |
Problem Set 3 | 1pm Thu, Apr 25 | 9pm Thu, May 02 |
Head TA Review | 9am Thu, May 02 | |
Midterm Exam | 1pm Thu, May 02 | 9pm Tue, May 07 |
Problem Set 4 | 1pm Tue, May 07 | 9pm Tue, May 14 |
Problem Set 5 | 1pm Tue, May 14 | 9pm Tue, May 21 |
Problem Set 6 | 1pm Tue, May 21 | 9pm Tue, May 28 |
Head TA Review | 9am Thu, May 30 | |
Final Exam | 1pm Thu, May 30 | 9pm Thu, June 06 |
Websites
• Course
Website (this page)
• Lecture notes, further reading materials, problem
sets, exams, data sets, solutions, announcements, and
class discussions will be managed via Piazza,
which is designed so that you can get quick help
from your classmates, TA(s), and instructor. Instead
of emailing questions to the teaching staff, I encourage
you to post your questions on Piazza because a) you
will get answers faster, and b) your classmates may also
benefit from seeing the answers to your questions.
• Problem sets and exams will be graded via Gradescope.
To submit your solution via Gradescope, you need to
create a single PDF (not images) that contains the whole
solution, and then upload it to Gradescope. Here are
some useful links:
scanning
on a mobile device and submitting
an assignment.
—
If you are a registered student, you will
be enrolled on Gradescope by the end of the 1st week
of classes, and you will receive a notification from
Gradescope about your enrollment. Please make sure that
the email that you use on Gradescope is your official
Caltech email.
— If you are a registered student,
but have not been enrolled on Gradescope by the end
of the 1st week of classes, please email the Head TA
as soon as possible and ask to be enrolled on Gradescope.
Your absence from Gradescope means that, according to
my records, you are not registered for the course.
— If you want just to audit the course,
that is fine: you will have access to Piazza and all course
materials there (please email me and I will enroll you
on Piazza), but you will not have access to Gradescope
and your submissions will not be graded. If you audit
the course this term, you should not register for the
course in the future.
|
Suggested Study Process
To get the
most out of IDS 157, here is my suggestion on the study
process:
• Attend
Lectures, focus on understanding the big picture of what
is going on.
• Review
Lecture Notes (ideally on the same day they are released),
make sure that everything is clear.
• If something is not clear, ask
on Piazza, and help your classmates by answering their
questions.
• After each Lecture, very briefly summarize
my notes in Your Own Notes, extract the essence.
• Start
working on each Problem Set on the same day it is released
(or as soon as possible after that).
• Aim
to finish each Problem Set and Exam at least 1 day
before it is due.
• If you get stuck with a problem, ask for hints
on Piazza (unless it is an exam problem, and then you
are screwed
;-)
|
Keep in Mind
My goal is
to help you understand and learn the material. Understanding
is a creative and time- and effort-consuming process.
If you don’t understand something, please ask
me. If you are struggling with balancing the workload,
please talk to me. If you have any concerns, please let
me know. Keep in mind that I am here to help.
|
Honor Code
You
must conform to the honor
code:
“No member of the Caltech community shall take
unfair advantage of any other member of the Caltech community.” |
|
|