Read this before you start doing any assignments!
Read this before you start doing any peer review!
All relevant delivery dates are on the schedule. In addition to the deliverables below, there are peer reviews you’ll be delivering as well.
This assignment covers using GitHub, writing in Markdown, managing a repo with GitHub and GitHub Desktop, and basic Python coding in Jupyter.
We will explore a dataset using the pandas package, push our programming toolkit further, and practice good habits.
Tables are decent for understanding data and conveying trends and relationships, but figures are often much better.
More often than not, you’ll need to combine multiple data files. Let’s try a bunch of merges, and learn how merging is related to data cleaning.
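To make the link between merging and data cleaning concrete, here is a minimal sketch of a pandas merge that reports which rows matched. The two small tables, their column names (`gvkey`, `fyear`, `sales`, `risk_mentions`), and their values are made up for illustration and are not the assignment's data.

```python
import pandas as pd

# Hypothetical toy data: firm-year accounting rows and one text variable per firm.
accounting = pd.DataFrame({
    "gvkey": [1001, 1001, 1002, 1003],
    "fyear": [2020, 2021, 2020, 2020],
    "sales": [50.0, 55.0, 20.0, 75.0],
})
text_vars = pd.DataFrame({
    "gvkey": [1001, 1002, 1004],
    "risk_mentions": [12, 3, 7],
})

merged = accounting.merge(
    text_vars,
    on="gvkey",
    how="outer",             # keep unmatched rows from both sides so we can inspect them
    validate="many_to_one",  # raise an error if gvkey is duplicated in text_vars
    indicator=True,          # add a _merge column: both / left_only / right_only
)

# Tabulate where each row came from; unmatched rows are a data-cleaning to-do list.
merge_counts = merged["_merge"].value_counts().to_dict()
print(merge_counts)
print(merged[merged["_merge"] != "both"])
```

Rows flagged `left_only` or `right_only` are usually the first place to look for mistyped keys, mismatched identifier formats, or gaps in coverage.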
We’re going to scrape some 10-K reports from EDGAR, extract and process the text, and merge some text-based variables to Compustat.
Let’s try to learn something from the text about a firm’s technology, the risks it faces, and its investment choices.
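One simple way to turn text into a variable is to count how often words from a topic list appear. The sketch below assumes hypothetical keyword lists and a made-up sample sentence; the function name `count_topic_mentions` is also just for illustration, and the assignment itself may call for a different approach.

```python
import re
from collections import Counter

# Hypothetical keyword lists; a real analysis would choose these with care.
TOPIC_TERMS = {
    "technology": ["patent", "software", "cloud"],
    "risk": ["risk", "uncertainty", "litigation"],
}

def count_topic_mentions(text, topic_terms=TOPIC_TERMS):
    """Count exact-word matches for each topic (so "risks" does not match "risk")."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return {
        topic: sum(words[term] for term in terms)
        for topic, terms in topic_terms.items()
    }

# Made-up sample text standing in for a cleaned 10-K passage.
sample = "Our cloud software faces litigation risk. Patent risk remains an uncertainty."
mentions = count_topic_mentions(sample)
print(mentions)  # {'technology': 3, 'risk': 4}
```

Counts like these can be stored one row per firm and then merged onto firm-level financial data.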
In the project, we will define a question, acquire the necessary data, explore the data (for introductory analysis and to look for data issues), refine the data, and then analyze the data using (some number of) appropriate techniques. All of this will be put into a report and presentation.
To get us ready for the project, this assignment will focus on practicing that workflow and digging into the execution and interpretation of the analytic models we covered in this portion of class.
Check out the link above.