{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Compounding returns\n",
    "\n",
    "```{warning}\n",
    "**Do NOT use `.mean()` to cumulate or compound returns!!!!!** \n",
    "```\n",
    "\n",
    "## The math\n",
    "\n",
    "Let $r_t = (P_t+D_t)/P_{t-1}$ be the simple return for a single period with dividends included. E.g. $r_t=0.05$.\n",
    "\n",
    "Let $R_t = (1+r_t)$ be the **gross** (thus, the capital \"R\") return for a single period. E.g. $R_t=1.05$.\n",
    "\n",
    "The **gross return** for a number of periods is \n",
    "\n",
    "$$\n",
    "R[0,T] = \\prod_{t=1}^T R_t\n",
    "$$\n",
    "\n",
    "and the **simple return** for the same set of periods is simply the above minus 1: \n",
    "$$\n",
    "r[0,T] = \\left(\\prod_{t=1}^T R_t\\right) - 1 \n",
    "$$\n",
    "\n",
    "We can compute this in pandas pretty easily:\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The code\n",
    "\n",
    "Assume you have simple periodic (little $r_t$) returns in a variable called \"ret\" in `df` for various assets and you want to compound them over a longer period of time (e.g. monthly).\n",
    "\n",
    "To adapt the code below to your use\n",
    "1. replace \"df\" with the name of your dataframe\n",
    "1. replace \"asset\" with the name of the variable(s) identifying the assets whose returns you want to compound\n",
    "1. replace \"timeperiod\" with the name of the variable(s) containing the measurement of time **that you want to compound to**, for example, a year variable\n",
    "\n",
    "Here are two ways to accomplish the compounding:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(df\n",
    "   # compute gross returns for each asset\n",
    "   .assign(R = 1+df['ret'])\n",
    "   # for each portfolio and time period...\n",
    "   .groupby(['asset','timeperiod'])\n",
    "   # get the gross returns, and cumulate by taking the product\n",
    "   ['R'].prod()\n",
    "   # subtract one to get back to simple returns\n",
    "   -1\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(df\n",
    " # for each asset and time period\n",
    " .groupby([\"asset\", \"timeperiod\"])\n",
    " # agg() does the function inside for each group (asset-timeperiod combo)\n",
    " # 1+x['ret'] gets gross return (R) for each observation in the data\n",
    " # (1+x{'ret']).prod() takes the product of all R for that group\n",
    " # and then subtract one\n",
    " .agg(ret=lambda x: ((1 + x[\"ret\"]).prod()) - 1)\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Optional: Log returns\n",
    "\n",
    "Just as an FYI, there is another way to compound returns you might hear about in other contexts: using \"log returns\".\n",
    "\n",
    "A \"log return\" is $ln(1+r)$ or $ln(R)$, using the natural log. It's the _continuously compounded_ rate of return. If you earned a simple return of $r=10\\%$, one month, the continuously compounded rate of return over that month is $ln(1+0.1)=9.5\\%$. \n",
    "\n",
    "Of course, no one speaks in terms of continuously compounding returns, but log returns has its uses. \n",
    "\n",
    "```{note}\n",
    "1. You can **not** add the simple rates of returns across time to get the total compounded return: $r_{[0,1]} +r_{[1,2]} \\neq r_{[0,2]}$\n",
    "2. But you **can** add log returns across time to get the correctly compounded total log return: $ln(1+r_{[0,1]}) +ln(1+r_{[1,2]}) = ln(1+ r_{[0,2]})$\n",
    "\n",
    "Illustration: If returns in January ($r_{[0,1]}$) are 10%, and also 10% in February ($r_{[1,2]}$), your total return for both months ($r_{[0,2]}$) is $(1+r_{[0,1]})(1+r_{[0,2]})=21\\%$. \n",
    "\n",
    "Additionally,  $ln(1.1) +ln(1.1) = ln(1.21)=0.1906$.\n",
    "\n",
    "```\n",
    "\n",
    "This second observation makes it easy to compound returns in large datasets another way than what I showed above. Taking advantage of the fact that $e(ln(x))=x$ (in step 2 below) and $e^a*e^b*e^c=e^{a+b+c}$ (in step 3 below), we can rewrite $R[0,T]$ as:\n",
    "\n",
    "$$ R[0,T] = \\prod_{t=1}^T R_t = \\prod_{t=1}^T ( e(ln( R_t)) ) =e(\\sum_{t=1}^T ln(R_t))$$\n",
    "\n",
    "Thus, to compound returns, you can sum the log returns. Then to convert the compounded log return back to a simple return, just exponentiate and subtract 1.\n",
    "\n",
    "The code looks like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(df\n",
    " .assign(logR = np.log(1+df.ret))\n",
    " # for each asset and time period\n",
    " .groupby([\"asset\", \"timeperiod\"])\n",
    " # sum log returns\n",
    " ['logR'].sum()\n",
    " # exponentiate and subtract 1\n",
    " .assign(ret = lambda x: np.exp(x['logR'])-1\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```{warning}\n",
    "You can add log returns for **one** asset _over time_, but you can't combine log returns across assets at the same time. \n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}