3.2.4. Temporary vs. Permanent Methods
Note
This page is especially useful in the context of developing code. While I’m figuring out a step, I rarely save the output to my original data - I use temporary methods and print the output. Once I know it’s right, I make the changes permanent and proceed.
3.2.4.1. Temporary Methods
When you call a method on an object (e.g. a DataFrame) in Python with <object>.<method>(<args>), most pandas methods leave the original object alone and return a new, modified object, as you can see here:
import pandas as pd
# define a df
df = pd.DataFrame({'height':[72,60,68],'gender':['M','F','M'],'weight':[175,110,150]})
# call method on df and print - df.assign yields the modified object!
df.assign(feet=df['height']//12)
| | height | gender | weight | feet |
|---|---|---|---|---|
| 0 | 72 | M | 175 | 6 |
| 1 | 60 | F | 110 | 5 |
| 2 | 68 | M | 150 | 5 |
This is useful if you want to alter the variable temporarily (e.g. for a graph, or to just print it out, like I literally just did!).
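Here is a minimal sketch of that "temporary" pattern in a slightly longer chain. The grouping step and the name mean_weight_by_feet are my own illustration, not part of the example above: the feet column exists only inside the chain, and nothing is assigned back to df.

```python
import pandas as pd

df = pd.DataFrame({'height': [72, 60, 68],
                   'gender': ['M', 'F', 'M'],
                   'weight': [175, 110, 150]})

# Add a temporary column, then summarize it right away.
# Nothing is assigned back to df, so df itself is untouched.
mean_weight_by_feet = (
    df.assign(feet=df['height'] // 12)
      .groupby('feet')['weight']
      .mean()
)

print(mean_weight_by_feet)   # average weight for each value of feet
print(df.columns.tolist())   # ['height', 'gender', 'weight']
```

The chain hands each intermediate result straight to the next method, so the extra column never needs a permanent home.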
Warning
But the object in memory wasn’t changed by the code above when I used df.<method>. See, here is the df in memory, and it wasn’t changed:
print(df) # see, the object has no feet! this is the original obj!
height gender weight
0 72 M 175
1 60 F 110
2 68 M 150
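If you want to check this yourself rather than eyeball the printout, one possible sketch is to store the returned object under a new name and compare the two. The name df_with_feet is hypothetical, picked just for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'height': [72, 60, 68],
                   'gender': ['M', 'F', 'M'],
                   'weight': [175, 110, 150]})

# Keep the returned object under a separate (hypothetical) name
df_with_feet = df.assign(feet=df['height'] // 12)

print(df_with_feet is df)               # False: assign returned a different object
print('feet' in df_with_feet.columns)   # True: the new object has the column
print('feet' in df.columns)             # False: the original does not
```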
3.2.4.2. Permanent changes
Warning
If you want to change the object permanently, you have two options: [1]
# option 1: explicitly define the df as the prior df after the method was called
# here, that means to add "df = " before the df.method
df = df.assign(feet1=df['height']//12)
# option 2: define a new feature of the df
# here, "df['newcolumnname'] = " (some operation)
df['feet2']=df['height']//12
print(df) # both of these added to obj in memory
height gender weight feet1 feet2
0 72 M 175 6 6
1 60 F 110 5 5
2 68 M 150 5 5
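Both options leave df with the new column, but they get there differently, and the difference shows up if a second name points at the same DataFrame. The sketch below uses made-up names (original, alias, alias2) purely to illustrate this: option 1 points the name at a new object, while option 2 modifies the existing object.

```python
import pandas as pd

original = pd.DataFrame({'height': [72, 60, 68]})
alias = original   # a second name for the very same object

# Option 1: the name `original` now refers to the new object that assign returned
original = original.assign(feet1=original['height'] // 12)
print('feet1' in original.columns)   # True
print('feet1' in alias.columns)      # False: alias still names the old object

# Option 2: column assignment changes the existing object itself
alias2 = original
original['feet2'] = original['height'] // 12
print('feet2' in alias2.columns)     # True: both names see the change
```

In a typical notebook workflow with a single df this rarely matters, but it is worth knowing when you pass a DataFrame into a function.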
[1] You can also do some pandas operations “in place”, without explicitly writing df = at the start of the line. However, I discourage this for reasons I won’t belabor here.
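For the curious, here is a small sketch of what an "in place" call looks like, using sort_values as an arbitrary example. It shows one commonly cited pitfall, which may or may not be among the reasons alluded to above: with inplace=True the method returns None, so saving its result loses the data. The name result is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'height': [72, 60, 68], 'weight': [175, 110, 150]})

# inplace=True modifies df directly and returns None
result = df.sort_values('weight', inplace=True)

print(result)                  # None
print(df['weight'].tolist())   # [110, 150, 175]
```

Writing df = df.sort_values('weight', inplace=True) would therefore replace df with None.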