1.7. Digging into Py(thon)
This page is long but important. It’s structured as a walkthrough - you should run the code on your computer as you read it.
Below, I cover a curated set of topics to get us going in the class.
It is not meant to be the end-all-be-all of python basics. There are very good resources around the web you should and will consult during the semester which contain more in-depth info on topics. https://www.pythoncheatsheet.org/ is a good one.
1.7.1. In addition to this page: Tutorials
You can’t learn programmatic material during class sessions, try though I might to make it possible. You can only learn through practice. You should be checking out tutorials and lessons online in your free time.
I suggest that you pick something under the python dropdown here or on the bootcamp page. If you’re new to writing code, the bootcamp classes are best. If you’re coming to python from elsewhere, the Whirlwind Tour of Python is right-sized.
As you work through the tutorial of your choice: write code in JupyterLab, run it, and save the files on your computer. If you get stuck, just start a new file.
1.7.2. Walkthrough of Python Essentials
Alright, let’s get going.
Below, we cover 12 essential topics you’ll need to understand.
Research strongly indicates that active learning is the most effective way to learn new skills. That’s why I linked to the tutorials above.
This chapter works as a reference page you can return to, but - especially if you are new to Python! - I want you to try to run as much code from this page in your own notebook file so you can add your own notes about python and write and try additional things with code.
To the extent possible, I want you to get comfortable typing commands yourself rather than copy-pasting. This is slightly more painful in the beginning, but much better payoff in the long run.
That said, when you’re short on time, you can copy code blocks on this website to your own notebook files to try to run them two ways:
Quick and dirty: By clicking the “copy” symbol in the upper-right corner of code blocks on the website.
Better option: This video shows you how to download the textbook and lecture slides to your computer, and then you can easily drag and drop both sections of code and markdown from those into your notes.
After copy-pasting, the way to “supercharge” that into active learning is to run the code, and then change something in the code, rerun it, and see what changes.
# Note: these cells aren't showing the answers on the website on purpose!
# YOU: TYPE ALL OF THESE OUT ON YOUR OTHER PARTICIPATION SHEET... YOU CAN OMIT THE COMMENTS IF YOU WANT
# YOU: TRY VARIATIONS TOO...
print(2+3) # addition
print(2-3) # subtraction
print(2/3) # division - in Python 3, division of integers (a data type) inherently returns floats (a data type)
print(type(2), type(2/3)) # see?
print(2//3, type(2//3)) # floor division returns an integer.
# FOR YOU to try: use this to tell me how many full hours are in 7643 minutes?

print(2%3) # mod operator
print(2*3) # multiplication
print(2**3) # 2 to the power of three
print(2^3) # ^ is NOT the power operator!!! it is a 'bit' operator - you don't need to know this for now
int(2+3*(4+15)/3) # 1. PEMDAS applies
                  # 2. If the last command in a cell returns an *object*, jupyter auto prints it w/o needing print()
                  # 3. this should be a float (21.0), but you can convert a float to an int with the int() function
Parentheses - Grouping and Calling
As the above example shows, parentheses are for grouping ((4+15)/3 forces addition before division) and calling a function (e.g. print() means the print function is called).
Logic and comparisons
The comparison operators are == (equal), != (not equal), > (greater than), >= (greater than or equal to), < (less than), and <= (less than or equal to). Each of these prompts Python to evaluate the truth of the comparison and return True or False. True and False are booleans, meaning True is equal to 1, and False is equal to 0.
# YOU: TYPE ALL OF THESE OUT ON YOUR OTHER PARTICIPATION SHEET... YOU CAN OMIT THE COMMENTS IF YOU WANT
# YOU: TRY VARIATIONS TOO...
print(3>3) # 3 is not greater than 3, so this evaluates to...
# YOU: try 2 of the 3 other comparison operators
print(True == 1)
print(type(True), int(True), type(False), int(False)) # print() can print a sequence of objects
The logic operators are and, or, and not. They evaluate a sequence of statements and return a True or False boolean.
What does or mean? In common parlance, or usually means “Do you want A or do you want B? (pick one)”. Mathematically, or works like a dad joke - You: “Dad, are we rich or poor?” Dad: “Yes”. That is, A or B is true if A is true, if B is true, or if both are true.
# YOU: TYPE ALL OF THESE OUT ON YOUR OTHER PARTICIPATION SHEET... YOU CAN OMIT THE COMMENTS IF YOU WANT
a = True # you assign variables by writing: VariableName = Thing.
b = False
print(a and b) # if both sides of *and* are true, the whole thing is
print(a or b) # if either side of *or* is true, the whole thing is
print(a and not b) # *not* negates what is after it
print(not a or not b) # "not b" is true, so the whole thing is true
The membership operators in and not in check whether the left object is or is not in the object on the right side.
# try these... what do you get?
a=3
b=[1,2,3]
print(a in b)
print(a not in b)
print(b in a)
print(b not in a)
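Membership tests work on more than lists. Here is a small sketch, using example data I made up (tickers and prices are not from the class materials), that tries in on a list, a string, and a dictionary. Note that for a dictionary, in looks at the keys rather than the values.

```python
# Hypothetical example data, chosen only for illustration
tickers = ['AAPL', 'MSFT', 'TSLA']
prices = {'AAPL': 150.0, 'MSFT': 300.0}

in_list = 'MSFT' in tickers             # is 'MSFT' an element of the list?
in_string = 'SF' in 'MSFT'              # for strings, "in" looks for substrings
in_dict_keys = 'AAPL' in prices         # for dicts, "in" checks the keys
in_dict_values = 150.0 in prices        # values are not checked...
in_values = 150.0 in prices.values()    # ...unless you ask for them explicitly

print(in_list, in_string, in_dict_keys, in_dict_values, in_values)
```

Try changing the left-hand side of each test and predict the answer before you run it.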
The identity operators is and is not check whether the left side and the right side are the same object. is and == are NOT the same!!! Here is an example borrowed from G4G.
list1 = []
list2 = []
list3=list1
print(list1 == list2)
print(list1 is list2)
print(list1 is list3)
Parentheses: You can (and certainly will at some point need to) check for the truth of statements involving many variables, and complex logic requests. You can dictate the order Python evaluates statements. So, for example,
if (Poor and TaxRateAtOrBelowNegative10) or (MiddleClass and TaxRateAtOrBelow5) or (Rich and TaxRateBelow15):
    start_audit()
will audit rich filers if their tax rate is below 15%, but will only audit poor filers if their tax rate is at or below -10%.
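To make the audit rule above something you can run, here is a minimal sketch. The function should_audit and its arguments are hypothetical names I made up, and tax_rate is assumed to be measured in percent. Each pair of parentheses groups one "type of filer AND their cutoff" test, and the or operators combine the three groups.

```python
def should_audit(poor, middle_class, rich, tax_rate):
    """Hypothetical helper: True if the filer meets any of the three audit rules."""
    return ((poor and tax_rate <= -10)
            or (middle_class and tax_rate <= 5)
            or (rich and tax_rate < 15))

rich_at_12 = should_audit(poor=False, middle_class=False, rich=True, tax_rate=12)
poor_at_12 = should_audit(poor=True, middle_class=False, rich=False, tax_rate=12)

print(rich_at_12)   # a rich filer paying 12% is below the 15% cutoff
print(poor_at_12)   # a poor filer paying 12% is nowhere near -10%
```

Try removing the parentheses inside the return line and see whether the answers change. (and is evaluated before or, so here they may not, but the parentheses make the intent much easier to read.)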
# a few silly examples
print((3>3) == False) # 3 is not greater than 3, so this evaluates to...
print(3>3 == False)
print((3>3) != True)
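The second print in the silly examples may surprise you. Python "chains" comparisons, so a > b == c is read as (a > b) and (b == c), not as (a > b) == c. A quick sketch you can use to check this yourself (the variable names are mine):

```python
chained = 3 > 3 == False                 # read as: (3 > 3) and (3 == False)
spelled_out = (3 > 3) and (3 == False)   # the same thing written out by hand
grouped = (3 > 3) == False               # parentheses force the comparison you meant

print(chained, spelled_out, grouped)
```

When in doubt, add parentheses. They cost nothing and remove the guesswork.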
Variables are pointers
I’ll simply provide the following warning: Unless you read and understand the link above, any time you write x=y, you might be creating a secret bug in your code that will cause potentially enormous errors!
x = [1, 2, 3]
print(type(x))
y = x
print(y)
x.append(4) # change x, not y
print(y) # y was changed as well... Why? Read the page above!
<class 'list'>
[1, 2, 3]
[1, 2, 3, 4]
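If you want a second list that does not change when the first one does, one common approach is to make a copy instead of writing y = x. This is a minimal sketch, with variable names I made up. Note that .copy() is a "shallow" copy, so a list that contains other lists would need more care.

```python
x = [1, 2, 3]
alias = x              # alias points at the SAME list as x
snapshot = x.copy()    # snapshot is a NEW list with the same contents

x.append(4)

print(alias)           # changed along with x
print(snapshot)        # unchanged
print(alias is x, snapshot is x)
```

The is operator from the logic section is a handy way to check whether two names point at the same object.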
Everything is an object
Referring again to Whirlwind of Python,
In object-oriented programming languages like Python, an object is an entity that contains data along with associated metadata and/or functionality. In Python everything is an object, which means every entity has some metadata (called attributes) and associated functionality (called methods). These attributes and methods are accessed via the dot syntax.
object.method(<arguments here>) will call the function method that belongs to object, and the function uses whatever arguments you pass it.
Above, the object x has the type attribute of list, and lists have a “method” called append.
In the stock prices program we show during the lectures, we imported a package: import pandas_datareader as pdr. Now, the “package” pandas_datareader is actually an “object” (which we call pdr for convenience). That object - like any object - has “method” functions. In that code, for example, I called pdr.get_data_yahoo(stocks) to download stock prices.
Seriously, EVERYTHING is an object.
Lists are objects (duh)
Attributes and methods of objects are themselves objects. Put type(x.append) at the end of the code block above.
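Here is a short sketch you can run to convince yourself that "everything is an object". It only uses built-in Python. The name my_append is something I made up to show that a method can be stored in a variable like any other object.

```python
x = [1, 2, 3]

print(type(x))             # the list itself is an object
print(type(x.append))      # so is its append method
print(type(len))           # so is a built-in function
print(type(3))             # so is a plain integer...
print((3).bit_length())    # ...which has methods of its own

my_append = x.append       # because a method is an object, you can give it a name
my_append(4)               # calling it still changes x
print(x)
```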
Common object types
bool and int were covered above.
None. See here.
float. The other (besides int) main type for numbers.
Beware of comparing floating point numbers! Below is an example, and see here for the explanation.
print(1.0+2.0 == 3.0)
print(0.1+0.2 == 0.3) # FALSE?!
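One common way around this problem is to ask whether two floats are "close enough" instead of exactly equal. The sketch below uses math.isclose from Python's standard library. The tolerance of 1e-9 in the hand-written version is an arbitrary choice of mine.

```python
import math

total = 0.1 + 0.2

print(total)                       # not exactly 0.3
print(total == 0.3)                # so the exact comparison fails
print(math.isclose(total, 0.3))    # comparing within a small tolerance succeeds
print(abs(total - 0.3) < 1e-9)     # the same idea written by hand
```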
str. There are built-in functions that work on strings directly. Let’s look at a few:
a='some string' # a = "some string" is the same.
# some functions work on strings directly
print(len(a))
# string types also have many functions as methods
print(a.upper())
#YOU: type a.<tab> in your notebook, and jupyter will open a list of possible functions!
Built-in data structures
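Python has four built-in data structures: list, tuple, dict, and set. Beginners typically rely on lists extensively, but as you progress, you will find that all four are extremely useful, because their unique traits solve different needs.
list, e.g. [1, 2, 3]: Ordered collection
tuple, e.g. (1, 2, 3): Immutable ordered collection
dict, e.g. {'a': 1, 'b': 2}: (key,value) mapping (keys keep their insertion order in Python 3.7+)
set, e.g. {1, 2, 3}: Unordered collection of unique values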
You should absolutely read this and as you do, try the examples, and throw them into your growing personal cheat sheet.
You need to know
how to define/create an object of each type (the examples above)
access elements within each type
modify elements within each type
add or remove elements from each type
when a set is useful
when a dictionary is useful (as opposed to a list)
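As a starting point for your cheat sheet, here is a minimal sketch that touches each item on that list. All of the data is made up for illustration. A dictionary shines when you want to look something up by name rather than by position, and a set shines when you only care about which unique values exist.

```python
# Hypothetical example data, chosen only for illustration
stocks = ['AAPL', 'MSFT']             # list: ordered and changeable
stocks[0] = 'GOOG'                    # modify an element by position
stocks.append('TSLA')                 # add an element
stocks.remove('MSFT')                 # remove an element by value

point = (3, 4)                        # tuple: ordered, but cannot be changed
first_coordinate = point[0]           # access works like a list

capitals = {'Ohio': 'Columbus'}       # dict: look things up by key
capitals['Michigan'] = 'Lansing'      # add (or overwrite) a key
del capitals['Ohio']                  # remove a key

seen = {'AAPL', 'AAPL', 'MSFT'}       # set: duplicates collapse automatically
seen.add('TSLA')                      # add an element
seen.discard('MSFT')                  # remove an element

print(stocks, first_coordinate, capitals, sorted(seen))
```

Try point[0] = 10 to see what "immutable" means in practice.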
There are LOTS of functions in python that work on the common object types and the data structures we just introduced.
First, let me illustrate the use of append, extend, and + for lists:
L=[8, 5, 6, 3, 7] # use brackets to define a list
L.extend([5]) # extend concatenates
L.extend([3,4]) # concatenation works the same with more elements
L = L + [13,14] # + concatenates
L.append(7) # append adds its entire argument to the list as a new element.
L.append([7]) # 7 is an int, so it goes in as an int, but [7] is a *list*, so append puts a list as the element
L.append([8,9]) # see, the last element is [8,9]
L
[8, 5, 6, 3, 7, 5, 3, 4, 13, 14, 7, [7], [8, 9]]
Now, let’s all define this vector: L=[8, 5, 6, 3, 7].
Exercises: Write code that does the following:
Returns the length.
Returns the largest element.
Returns the smallest element.
Returns the total of the vector.
Returns the first element. See this awesome answer to learn about “slicing” lists in Python. If that link is dead: https://stackoverflow.com/questions/509211/understanding-slice-notation?rq=1
Returns the last element.
Returns the first 2 elements.
Returns the last 2 elements.
Returns the odd-numbered elements (i.e. [8,6,7]).
I’d suggest putting what you just learned about how python indexes an object and how to slice a list into your personal cheat sheet until you have it memorized thoroughly.
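If slice notation is new to you, here is a small sketch of the rules on a different list (letters is a made-up example, not the exercise vector, so the exercises are still yours to solve). The pattern is start:stop:step, and any of the three can be left out.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']   # hypothetical list for illustration

second = letters[1]          # indexing starts at 0, so this is the SECOND element
next_to_last = letters[-2]   # negative indexes count from the end
middle = letters[1:4]        # start:stop, and the stop position is excluded
tail = letters[3:]           # omit stop to go to the end
every_other = letters[1::2]  # start:stop:step, here every second element from position 1

print(second, next_to_last, middle, tail, every_other)
```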
For loops
Python loops are very intuitive. You need to know one thing first:
PYTHON AND INDENTATION
In python, indentations at the beginning of lines are not “up to the user”. Usually, indentations indicate a “block” of code that is run as a unit inside a statement such as an if, a for loop, or a function. So this code:
# If the condition is true, python will do anything that is in the
# block of code belonging to the if statement. But 7 < 5 is false,
# so python won't do anything that "belongs" to this "if"
if 7 < 5:
    print('I will not print.')
    print('Nor will I')
… will not do the same thing as this code:
if 7 < 5:
    print('I will not print.')
print('But I will!')
…because in the second version, the second print line isn’t indented and therefore isn’t part of the “if” statement
Conversely, how you use whitespace within a line is up to you. Both of these lines of code accomplish the same thing:
print(   a)
print(a)
for <name> in <iterable object>:          # you must use the colon!
    <do some stuff>                       # you must indent
    <do some more stuff if you want>      # you must indent
    <do even more stuff>                  # you must indent
The iterable object can be anything Python can iterate through, e.g. a list. (But not just lists!)
Note: When I write anything inside <>, you should drop the “<” and “>” symbols when you fill that out.
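To show what "anything Python can iterate through" means, here is a short sketch that runs the same for syntax over a list, a string, a range, and a dictionary. The dictionary ages is made-up data.

```python
for number in [8, 5, 6]:          # a list: one element at a time
    print(number)

for letter in 'abc':              # a string: one character at a time
    print(letter)

for i in range(3):                # range(3) yields 0, 1, 2
    print(i)

ages = {'Ann': 30, 'Bob': 25}     # hypothetical dictionary
for name in ages:                 # looping over a dict gives you its keys
    print(name, ages[name])
```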
A very good tip: Use good variable names!
Variable names should be something that communicates the content of the variable! Good names make code easy to read and fix bugs. Bad names set you up for pain and suffering.
Examples of bad variable names:
Examples of good variable names (used in for loops so you can see a good habit):
If you are looping over letters, each object might be called a letter:
for letter in letters
for stock in stocks
for state in states
for state in states:
    capitol=stateCapitals[state]
    print(capitol)
    print(capitol.upper())
    # you can use as many lines as you need, just keep indenting
    # the indents are 4 spaces, or more commonly, a <tab>
print(states) # <-- the for loop ends when you write a line of
              # code (not a comment!) that is unindented
So, for each state, Python will start the indented block of code and run each line within the code block in sequence. So if the list of states is ['Alabama', 'Alaska', 'Arizona', ...], Python will…
Grab the first element in the list we are looping over and put that thing in a variable called “state”. So, set state = ‘Alabama’.
Set capitol = ‘Montgomery’
Execute the two print lines, which print the capitol and then the capitol in upper case. The next two lines are just comments, so nothing happens.
At the end of the indented block of code, python will check if there is another element in the states vector, and if so, repeat the indented block. There is another state after Alabama, so…
Set state = ‘Alaska’
Set capitol = ‘Juneau’
Set state = ‘Wyoming’
Set capitol = ‘Cheyenne’
Is there another state? No? Ok! The for-loop is complete! Python will exit the indented code block and proceed. The next line of code that is not indented is print(states) and so that’s the next thing it will do.
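The walkthrough above can be run end to end with a small stand-in for the data. This sketch assumes a hand-typed dictionary with only three states (the full class data is not shown on this page), and it collects the upper-case capitols in a list called shouted, a name I made up, so they can be reused after the loop.

```python
# A small stand-in for the full data: only three states, typed by hand
stateCapitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Wyoming': 'Cheyenne'}
states = list(stateCapitals)      # the dictionary's keys, in the order they were entered

shouted = []                      # collect results so they can be reused later
for state in states:
    capitol = stateCapitals[state]
    print(capitol)
    shouted.append(capitol.upper())

print(shouted)                    # unindented, so this runs once, after the loop ends
```

Add a print(state) line inside the loop to watch the variable change on every pass.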
If, elif, else
if <condition 1>:        # you must use the colon!
    <do some stuff if condition is true>        # remember to indent things inside the if
elif <condition 2>:      # "elif" as in "Else-If"
    <do stuff if condition 1 is false but condition 2 is true>
else:
    <if neither condition 1 nor 2 are true, do this>
You can include zero or as many elif code blocks as you want
You can omit the else block entirely - you can just do an “if” block if that’s all you need
Whatever is in <condition> must evaluate to True or False, or 1 or 0 (1 is equivalent to True, 0 is False)
See the “Logic and comparisons” section above on how Python evaluates conditions
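Here is the template above filled in with a concrete, made-up example. The income cutoffs are arbitrary numbers I picked only to illustrate the syntax. Python checks the conditions from top to bottom and runs the first block whose condition is True.

```python
labels = []
for income in [20000, 50000, 250000]:   # hypothetical incomes
    if income < 30000:
        label = 'low'
    elif income < 100000:               # only checked if the first condition was False
        label = 'middle'
    else:                               # runs when none of the conditions above were True
        label = 'high'
    labels.append(label)

print(labels)
```

Try swapping the order of the if and elif conditions and see which incomes get mislabeled.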
For loops, if-elif-else, and while are “flow controls”: They control the order in which the code is executed. You can “nest” many levels of flow-control. Here is an example of putting an if statement inside a for loop:
for state in states:
    capitol=stateCapitals[state]
    print(capitol)
    print(capitol.upper())
    if state == "Ohio":
        print("Michigan is better than Ohio")
while <condition is True>:
    <do some stuff>
counter = 0
while counter < 7:
    print(counter)
    counter += 1 # "+=" is short for "add to myself".
                 # Here, it's an abbreviation for: counter = counter + 1
0
1
2
3
4
5
6
I have one important comment about while loops: Every time through the loop, there must be a chance for the condition to become False. If not, your code will loop forever!
We won’t use while loops in this class. But if you ever write one, and it is stuck in an infinite loop, you can interrupt the kernel by pressing the i key twice (i, i) while the notebook is in command mode. Or click the “Terminals and Kernels” tab in the left sidebar and click “shutdown” next to your code’s filename.
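If you do write a while loop, one defensive habit is to add a cap on the number of passes, so that a mistake in the condition cannot loop forever. This is only a sketch: max_steps is a hypothetical safety cap I made up, and the halving is just an example of something that shrinks on every pass.

```python
value = 100.0
steps = 0
max_steps = 1000                  # hypothetical safety cap

while value >= 1 and steps < max_steps:
    value = value / 2             # value shrinks every pass, so the condition can become False
    steps += 1                    # and the cap guarantees an exit even if it did not

print(steps, value)
```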
Writing your own functions
Writing your own functions is important for improving the clarity of your code because it
separates different strands of logic
allows you to reuse code
prevents copy/paste errors
The syntax of functions
To write a function,
write def,
name the function,
inside parentheses, list the parameters a user might pass as inputs into the function
then, all the code of the function is indented and begins on the next line
use return <object> at the end of the function to output the object the function makes (which could be any type of object!)
def <nameOfYourChoice>(<you can specify arguments the function takes, or none>):
    # and then write your indented code block that is the function.
Any object(s) you want can be given as inputs! You can give as inputs a variable, a list, a dictionary, even a function. Remember, in python, everything is an object.
You choose the name an object will be referred to as within the function.
In the example below, the function calls the first number you give it “x” and then checks if x<0. So, even though zz=-100, when we run f(zz), the function acts like we told it x=-100.
Functions can get “positional” arguments or keyword arguments. Positional arguments are based on the order in which you provide them to the function. Keyword arguments can be specified by including their name, without needing to be in the right spot in the order of arguments (like f(1,b=3) in the example below).
Once the code executes a line starting with return, the function will end and output whatever is on that line.
Students new to functions usually want to end them with something like print(answer). Don’t! End functions with return, like return answer. If you use print, you can’t use the function’s output for anything else after it finishes. If you use return, you can save the output and use it (even if it is just to print it).
Any object(s) you want can be returned as outputs! The output can be a list, set, function, dictionary, string. It can be a dictionary with lists inside it, or a list with dictionaries inside it. Go wild if you want! (But only while practicing your skills. In practice, don’t be complex for the sake of it!)
Functions can print things while executing. This can be useful while debugging a function and testing it. Because of that, people new to python often use “print” at the end of functions instead of return.
If you want to reuse the output at any point later in your code, use return instead of print!
def silly_me(): # a simple function for illustration, just outputs a string
    return "this is a string"

a = silly_me() # now "a" is the string "this is a string"
print(a) # and you can print the output without putting print in the function
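To see the difference between print and return side by side, here is a small sketch with two hypothetical functions (add_and_print and add_and_return are names I made up). A function that ends with print hands back None, so there is nothing to reuse afterwards.

```python
def add_and_print(a, b):
    print(a + b)          # shows the answer, but hands nothing back

def add_and_return(a, b):
    return a + b          # hands the answer back to whoever called the function

printed = add_and_print(2, 3)     # displays the sum, but printed is None
returned = add_and_return(2, 3)   # displays nothing, but returned holds the sum

print(printed)
print(returned * 10)              # the returned value can be reused
```

Try printed * 10 and read the error message. That error is what "you can’t use the function’s output" looks like in practice.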
Code that is poorly documented won’t be used. By you, by you in the future, or by others. So you should document it! You do this by adding a “docstring” (one or more lines of text inside triple quotes) immediately after the first line of the function, as the example below shows.
The docstring can be accessed by users via help(<FunctionName>) the same as any other function. In fact, this is how help is written in all Python functions we’ve used!
Example: The function below shows off positional and keyword arguments, how to write a multiline “docstring”, how the program ends once a return is executed, outputting a list, and setting default values for inputs.
def f(x, a=1, b=1):
    '''
    Returns a list. Element 0 is a+bx, element 1 is 2.
    The first argument you give goes to x, the second to a, the third to b.
    If you do not provide a or b, they default to the value 1.
    '''
    if x < 0:
        return "WHOA THIS IS NEGATIVE"
    return [a + b * x, 2] # you can return any object(s) you want! this is a list, for example

zz = -100
print(f(zz)) # Inside the function, it takes the first arg
             # and calls it "x". So here x is -100 inside.
             # because x<0, the func returns WHOA
             # and never gets to a+b*x
print(f(2,2,2))
print(f(1)) # uses the default value of a and b
print(f(1,b=3)) # uses the default value of a
help(f) # the docstring is useful!
WHOA THIS IS NEGATIVE
[6, 2]
[2, 2]
[4, 2]
Help on function f in module __main__:

f(x, a=1, b=1)
    The first argument you give goes to x, the second to a, the third to b.
    If you do not provide a or b, they default to the value 1.
# this won't work! python requires you to use the keyword arguments AFTER the positional arguments
print(f(b=3,1))
  File "C:\Users\DONSLA~1\AppData\Local\Temp/ipykernel_6932/3182615729.py", line 2
    print(f(b=3,1))
                 ^
SyntaxError: positional argument follows keyword argument
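For contrast, here are three ways of making that call that Python does accept. The sketch redefines f with the same body as the example above (minus the docstring) so that it runs on its own.

```python
def f(x, a=1, b=1):
    # same logic as the example above, without the docstring
    if x < 0:
        return "WHOA THIS IS NEGATIVE"
    return [a + b * x, 2]

positional_then_keyword = f(1, b=3)   # positional arguments first, then keywords
all_keywords = f(x=1, b=3)            # every argument given by name
keywords_any_order = f(b=3, x=1)      # when everything is named, order does not matter

print(positional_then_keyword, all_keywords, keywords_any_order)
```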
Advanced: Scope
I want you to be generally aware of the concept of “global” and “local” scope. Generally, python objects are available only within the region they are defined and subregions therein. Put differently, objects are available downstream, but not upstream.
Here is an example:
x=1
def silly_func():
    xyz = 14
    print(x) # variables defined OUTSIDE AND BEFORE a function are visible INSIDE the func
silly_func()
print(xyz) # variables defined INSIDE a function are NOT visible OUTSIDE the func
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
C:\Users\DONSLA~1\AppData\Local\Temp/ipykernel_6932/3825219627.py in <module>
----> 1 print(xyz) # variables defined INSIDE a function are NOT visible OUTSIDE the func

NameError: name 'xyz' is not defined
Here is a second example to show that changing the downstream variable inside the function won’t change the variable’s value outside the function!
x = 1
def silly_func():
    x=2
    return x
print(silly_func())
print(x) # changing the downstream variable inside the function didn't change the upstream version
1.7.3. Clear output and rerun from the start!
I can NOT emphasize this enough: The point of code is to make things reproducible. So code must run from beginning to end and produce the same thing every time.
The nature of developing code is that you’ll run some lines of code, then write more code, then go back and change something above (and run that part again), and then go back down and keep writing and running new code. When you’re done, your code will be broken!
A golden rule
Always look to see if the first executed code block is “[1]” and that all the subsequent code blocks are numbered consecutively. Click on this link to see an example.
If the code you’re looking at doesn’t meet those two rules, I click “Run” > “Restart Kernel and Run All Cells”.
This applies to your own code! Restart and run from scratch regularly.
1.7.4. Stuck on syntax issues for a function?
See the tips on the Jupyter Lab page.
Some people prefer to learn through games. That’s how I learned python! I built solvers for Sudoku and the Cracker Barrel golf tee game… Both taught me a LOT about programming in python, problem solving strategies, and data structures. You could try to figure out your Wordle guessing strategy (you can use this list of words). Or, visit Edabit, which has a bunch of games. If you log in, you can search for python challenges that take from 1 minute to … longer… For example, the Museum of Dull Things. If you find any games illuminating, please let me know via the class discussions repo!
The subsection on scope isn’t crucial but is useful to students sometimes.