# 3.2.8. Exercises

Try these and compare answers with a friend. There are many ways to solve each one, so if your approaches differ, explain them to each other!

```python
import pandas as pd
import pandas_datareader as pdr # IF NECESSARY, from terminal: pip install pandas_datareader
import datetime
import numpy as np

start = datetime.datetime(2017, 1, 1) # you can specify start and end dates this way
end = datetime.datetime(2021, 1, 27)
macro_df = pdr.data.DataReader(['GDP','CPIAUCSL','UNRATE'], 'fred', start, end)
```

## 3.2.8.1. Part 1

In class, I used this dataframe to go over pandas vocabulary, and we showed how to

• access 1 variable (note: pandas calls this a “Series” object, which is a 1D object instead of a 2D object)

• access multiple vars

• access, print, and change column names

• access, print, reset, and set the index
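As a quick refresher, the operations listed above can be sketched on a small stand-in dataframe. The name `toy` and its numbers are made up for illustration (this is not real FRED data), so treat it as a reminder of the syntax rather than the exact code from class.

```python
import pandas as pd

# Toy stand-in for macro_df: made-up numbers, NOT real FRED data
toy = pd.DataFrame(
    {"GDP": [100.0, 102.0, 104.0], "UNRATE": [4.7, 4.6, 4.4]},
    index=pd.to_datetime(["2017-01-01", "2017-04-01", "2017-07-01"]),
)
toy.index.name = "DATE"

one_var = toy["GDP"]                # one variable: a 1D Series
two_vars = toy[["GDP", "UNRATE"]]   # a list of names: a 2D DataFrame

print(toy.columns)                  # access and print the column names
renamed = toy.rename(columns={"UNRATE": "unemp"})  # change a column name (returns a copy)

print(toy.index)                    # access and print the index
flat = toy.reset_index()            # the index becomes an ordinary column named "DATE"
back = flat.set_index("DATE")       # and this sets it as the index again
```

Note that `rename`, `reset_index`, and `set_index` return new objects here, so `toy` itself is unchanged unless you reassign it.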

Questions:

• Q0: Apply each of the four new golden rules for initial data exploration from the lecture.

• Q1: What is the second series above?

• Q2: What is the frequency of the series?

• Q3: What is the average ANNUAL GDP, based on the data?

```python
# do your work here
```

## 3.2.8.2. Part 2

• Q4: Download annual real GDP for 1960 to 2018 from FRED and compute the average annual percent change.

• Q5: Compute the average GDP percent change within each decade.

```python
# do your work here
```
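If you get stuck on the mechanics, here is one possible pattern for "percent change, then average by decade", shown on a made-up annual series rather than the FRED data. The names `toy`, `pct`, and `decade` are illustrative assumptions, and there are other reasonable ways to define a decade.

```python
import pandas as pd

# Made-up annual series (NOT real GDP): grows 10% a year through 1999, then 5% a year
years = list(range(1989, 2010))
values = [100.0]
for y in years[1:]:
    growth = 0.10 if y < 2000 else 0.05
    values.append(values[-1] * (1 + growth))
toy = pd.Series(values, index=years, name="level")

# Percent change from the prior year; the first year has no prior value, so drop it
pct = toy.pct_change().dropna() * 100

overall = pct.mean()                     # average annual percent change over all years

# Integer division maps each year to its decade: 1995 -> 1990, 2003 -> 2000
decade = (pct.index // 10) * 10
by_decade = pct.groupby(decade).mean()   # average percent change within each decade
print(by_decade)
```

By construction, the decade averages here should come out at roughly 10 and 5, which makes it easy to check that the grouping does what you expect before trying it on real data.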

## 3.2.8.3. Part 3

First, I’ll do the work to load January data on unemployment, the Case-Shiller housing index, and median household income in three states (CA/MI/PA).

Tip

Run this block yourself, line-by-line, and part-by-part to figure out what I’m doing.

For example, just run the first three lines to download the data, then run

```python
macro_data.resample('Y')
```

Try other arguments inside resample to see what works (and what it does) and what doesn’t work.
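As a rough illustration of what `resample` hands back, here is a sketch on a made-up monthly series. It uses the `'YS'` (year start) alias as one of the "other arguments" to try, and the names `toy` and `by_year` are assumptions for this example only.

```python
import pandas as pd

# Made-up monthly data (values 1..24) covering 2017 and 2018
idx = pd.date_range("2017-01-01", periods=24, freq="MS")
toy = pd.DataFrame({"x": range(1, 25)}, index=idx)

by_year = toy.resample("YS")    # a Resampler object: the yearly groups are defined, nothing is computed yet
print(type(by_year))

first_obs = by_year.first()     # first observation within each year
year_mean = by_year.mean()      # average over each year

first_obs.index = first_obs.index.year   # swap the timestamps for plain years
print(first_obs)
print(year_mean)
```

The point is that `resample` on its own only sets up the groups; you need to follow it with an aggregation such as `.first()` or `.mean()` to get a dataframe back.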

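```python
# LOAD DATA AND CONVERT TO ANNUAL

start = 1990 # pandas datareader can infer these are years
end = 2018
macro_data = pdr.data.DataReader(['CAUR','MIUR','PAUR', # unemployment rate in CA, MI, PA (FRED series IDs assumed here)
                                  'LXXRSA','DEXRSA','WDXRSA', # Case-Shiller index in LA, Detroit, DC (no PA available!)
                                  'MEHOINUSCAA672N','MEHOINUSMIA672N','MEHOINUSPAA672N'], # median household income in CA, MI, PA
                                 'fred', start, end)
macro_data = macro_data.resample('Y').first() # gets the first observation for each variable in a given year (recent pandas versions prefer 'YE')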

# CLEAN UP THE FORMATTING SOMEWHAT

macro_data.index = macro_data.index.year
print("\n\n DATA BEFORE FORMATTING: \n\n")
print(macro_data[:20]) # see how the data looks now? ugly variable names, but it's an annual dataset at least
macro_data.columns = pd.MultiIndex.from_tuples([
    ('Unemployment','CA'),('Unemployment','MI'),('Unemployment','PA'),
    ('HouseIdx','CA'),('HouseIdx','MI'),('HouseIdx','PA'),
    ('MedIncome','CA'),('MedIncome','MI'),('MedIncome','PA')
    ])
print("\n\n DATA AFTER FORMATTING: \n\n")
print(macro_data[:20]) # this is a dataset that is "wide", and now
                       # the column variable names have 2 levels - var name,
                       # and unit/state that variable applies to
```
• Q6: For each decade and state, report the average annual CHANGE (level, not percent) in unemployment.

• Q7: For each decade and state, report the average annual PERCENT CHANGE in house prices and household income.

```python
# do your work here
```
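If the two-level column names are new to you, the sketch below shows a few ways to pull pieces out of a wide dataframe like `macro_data`. The dataframe `wide` and its numbers are made up for illustration, and it only has two variables and two states to keep things short.

```python
import pandas as pd

# Made-up numbers with the same two-level column layout as macro_data
wide = pd.DataFrame(
    [[100.0, 80.0, 5.0, 7.0],
     [110.0, 84.0, 6.0, 9.0],
     [121.0, 88.2, 4.0, 8.0]],
    index=[2000, 2001, 2002],
    columns=pd.MultiIndex.from_tuples([
        ("HouseIdx", "CA"), ("HouseIdx", "MI"),
        ("Unemployment", "CA"), ("Unemployment", "MI"),
    ]),
)

unemp = wide["Unemployment"]           # one variable, all states; columns are now just CA and MI
ca = wide.xs("CA", axis=1, level=1)    # one state, all variables

unemp_change = unemp.diff()                       # level change from the prior year
house_pct = wide["HouseIdx"].pct_change() * 100   # percent change from the prior year
print(unemp_change)
print(house_pct)
```

Selecting on the top level first means the usual single-level tools (`diff`, `pct_change`, `groupby`) apply to every state's column at once.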