Full instructions for the proposals

In the project repo, modify the README as a group. Use the template below.

  1. The research question should be precise (NOT VAGUE), the hypothesis clear, and the metrics well defined. Thorough mechanical

  2. The necessary data should be realistically acquirable over our shorter time frame. There are a lot of data resources on the website, including FRED, ourworldindata.org, and SEC’s EDGAR.

Acknowledgment: We are effectively answering questions 1.1-1.3 and 2.1-2.3 from DS100 in this proposal.

Research Proposal: < Title >

By X, Y, and Z

Research Question

This section should cover:

  1. What do we want to know or what problems are we trying to solve? As in ASGN 5, you should list (1) the “bigger” question/debate/problem you’re interested in, and also (2) the specific research question(s) you’ll actually try to answer.

  • The research question will be smaller in scope than the big picture question. But the answer to your specific research question should shed light on the bigger question (although it likely won’t conclusively answer it).

  2. What are our hypotheses?

  3. What are our metrics of success?

Necessary Data

This section should cover:

  1. What does the final dataset need to look like (mostly dictated by the question and the availability of data):

    • What is an observation (e.g. a firm, a firm-year, etc.)?

    • What is the sample period?

    • What are the sample conditions? (Years, plus any restrictions you anticipate, e.g. excluding or requiring some industries.)

    • What variables are absolutely necessary and what would you like to have if possible?

  2. What data do we have and what data do we need?

  3. How will we collect more data?

  4. What are the raw inputs and how will you store them (i.e. the folder structure(s) for each input type)?

  5. Speculate at a high level (not specific code!) about how you’ll transform the raw data into the final form.
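
To make items 1 and 4 above more concrete, here is a minimal sketch, not a required implementation. It assumes a firm-year observation and three hypothetical input types (`fred`, `edgar`, `manual`). The folder names, the helper functions `make_input_folders` and `duplicate_keys`, and the toy rows are all invented for illustration. It shows one possible way to lay out a folder per raw input type, and a quick check that each observation (here, each firm-year pair) appears only once in the final dataset.

```python
from collections import Counter
from pathlib import Path
import tempfile

# Hypothetical input types; replace with the sources your project actually uses.
RAW_INPUT_TYPES = ["fred", "edgar", "manual"]


def make_input_folders(root: Path) -> list[Path]:
    """Create one raw-data subfolder per input type under root/inputs."""
    folders = [root / "inputs" / name for name in RAW_INPUT_TYPES]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    return folders


def duplicate_keys(rows: list[dict], key: tuple[str, ...]) -> list[tuple]:
    """Return the key values that appear more than once in rows."""
    counts = Counter(tuple(row[k] for k in key) for row in rows)
    return [k for k, n in counts.items() if n > 1]


# Toy firm-year rows, invented purely for illustration.
rows = [
    {"firm": "A", "year": 2019, "sales": 10.0},
    {"firm": "A", "year": 2020, "sales": 12.0},
    {"firm": "B", "year": 2019, "sales": 7.5},
]

# Build the folder layout in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    created = make_input_folders(Path(tmp))
    folder_names = sorted(p.name for p in created)

# If an observation is a firm-year, (firm, year) should identify each row uniquely.
dupes = duplicate_keys(rows, key=("firm", "year"))
print(folder_names)
print(dupes)
```

If your observation is something other than a firm-year, only the `key` tuple needs to change. An empty list from `duplicate_keys` means the chosen key uniquely identifies each row.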