1.7. Digging into Py(thon)¶
Important
This page is long, but important. It’s structured as a walkthrough - you should run the code on your computer as you read it.
It is not meant to be the end-all-be-all of python basics. There are very good resources around the web you should and will consult during the semester which contain more in-depth info on topics. https://www.pythoncheatsheet.org/ is a good one. What’s below is a curated set of topics to get us going in the class.
1.7.1. Tutorials¶
You can’t learn programmatic material during class sessions, try though I might to make it possible. You can only learn through practice. You should be checking out tutorials and lessons online in your free time.
Two great options:
Codecademy is great. You can probably blast through the key lessons before a free trial expires (currently 7 days).
Go through #3 to #14 of A Whirlwind Tour of Python.
As you follow either of those, I would put the code you write inside the /learning_python/ (or similar) folder inside your Class Notes repo.
You can call the file(s) whatever you want. (If you want a suggestion for a filename, “Cheatsheet”, “Whirlwind Cheatsheet”, or “Codecademy Cheatsheet” make sense.)
Our resources page has a python cheatsheet you can download.
Do you prefer to learn through games?
(That’s how I learned python! I built solvers for Sudoku and the Cracker Barrel golf tee game… Both taught me a LOT about programming in python, problem solving strategies, and data structures.)
Assignment 1 in 2022 includes an optional challenge: Optimizing your Wordle guessing strategy.
Edabit has a bunch of games. If you log in, you can search for python challenges that take from 1 minute to … longer… For example, the Museum of Dull Things. If you find any games illuminating, please let me know via the class discussions repo!
Alright, let’s get going. Hopefully digging into Py goes better than Chrissy Teigen’s experience:
1.7.2. Python essentials¶
Below, we cover 12 essential topics[1] you’ll need to understand.
Warning
Research strongly indicates that active learning is the most effective way to learn new skills. That’s why I linked to tutorials above.
This chapter works as a reference page you can return to, but - especially if you are new to Python! - I want you to try to run as much code from this page in your own notebook file so you can add your own notes about python and write and try additional things with code.
To the extent possible, I want you to get comfortable typing commands yourself rather than copy-pasting. This is slightly more painful in the beginning, but much better payoff in the long-run.
Tip
That said, when you’re short on time, you can copy code blocks on this website to your own notebook files to try to run them two ways:
Quick and dirty: By clicking the “copy” symbol in the upper-right corner of code blocks on the website.
Better option: This video shows you how to download the textbook and lecture slides to your computer, and then you can easily drag and drop both sections of code and markdown from those into your notes.
After copy-pasting, the way to “supercharge” that into active learning is to run the code, and then change something in the code, rerun it, and see what changes.
1.7.2.1. Comments¶
In python code blocks, the “#” character tells python to ignore the rest of the line.
Leaving GOOD comments in the code is important! Good, smart code tries to reduce the need for comments by being so obvious that it is “self-documenting” (I’ll explain why later).
But for now… you should err on the side of adding MORE comments. Why? If you need to take a break and come back at a later point, comments will quickly bring you up to speed if you forgot why you put in a particular line of code. Getting in the habit of using comments is smart.
You’ll become more discerning about comments as you progress. Two articles about how and when to use comments: this link and this link
1.7.2.2. Arithmetic¶
# Note: these cells aren't showing the answers on the website on purpose!
# YOU: TYPE ALL OF THESE OUT ON YOUR OTHER PARTICIPATION SHEET... YOU CAN OMIT THE COMMENTS IF YOU WANT
# YOU: TRY VARIATIONS TOO...
print(2+3) # addition
print(2-3) # subtraction
print(2/3) # division - in Python 3, division of integers (a data type) inherently returns floats (a data type)
print(type(2), type(2/3)) # see?
print(2//3, type(2//3)) # floor division returns an integer.
# FOR YOU to try: use this to tell me how many full hours are in 7643 minutes?
print(2%3) # mod operator
print(2*3) # multiplication
print(2**3) # 2 to the power of three
print(2^3) # ^ is NOT the power operator!!! it is a 'bit' operator - you don't need to know this for now
int(2+3*(4+15)/3) # 1. PEMDAS applies
# 2. If the last command in a cell returns an *object*, jupyter auto prints it w/o needing print()
# 3. this should be a float (21.0), but you can convert a float to an int with the int() function
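To make the warning about ^ concrete, here is a small sketch that puts the two operators side by side. The variable names are mine, chosen only for illustration.

```python
# ** raises to a power; ^ is bitwise XOR (it compares the binary digits).
power_result = 2 ** 3   # 2 * 2 * 2
xor_result = 2 ^ 3      # binary 10 XOR binary 11 = binary 01

print(power_result)  # 8
print(xor_result)    # 1
```

If you ever get a strangely small number from what you thought was an exponent, check whether you typed ^ instead of **.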
1.7.2.3. Parentheses - Grouping and Calling¶
As the above example shows, parentheses are for grouping ((4+15)/3 forces addition before division) and calling a function (e.g. print() means the print function is called on the inputs inside the parentheses).
1.7.2.4. Logic and comparisons¶
The comparison operators are == (equals), != (not equal), > (greater than), >= (equal or greater than), < (less than), and <= (equal or less than). Each of these prompts Python to evaluate the truth of the comparison and return True or False.
True and False are booleans, meaning True is equal to 1, and False is equal to 0.
# YOU: TYPE ALL OF THESE OUT ON YOUR OTHER PARTICIPATION SHEET... YOU CAN OMIT THE COMMENTS IF YOU WANT
# YOU: TRY VARIATIONS TOO...
print(3>3) # 3 is not greater than 3, so this evaluates to...
# YOU: try 2 of the 3 other comparison operators
print(True == 1)
print(type(True), int(True), type(False), int(False)) # print() can print a sequence of objects
The logic operators are and, or, and not. They evaluate a sequence of statements and return a true or false boolean.
What does or mean? In common parlance, or usually means “Do you want A or do you want B? (pick one)”. Mathematically, or works like a dad joke - You: “Dad, are we rich or poor?” Dad: “Yes”.
# YOU: TYPE ALL OF THESE OUT ON YOUR OTHER PARTICIPATION SHEET... YOU CAN OMIT THE COMMENTS IF YOU WANT
a = True # you assign variables by writing: VariableName = Thing.
b = False
print(a and b) # if both sides of *and* are true, the whole thing is
print(a or b) # if either side of *or* is true, the whole thing is
print(a and not b) # *not* negates what is after it
print(not a or not b) # "not b" is true, so the whole thing is true
The membership operators in and not in check whether the left object is or is not in the object on the right side.
# try these... what do you get?
a=3
b=[1,2,3]
print(a in b)
print(a not in b)
print(b in a)
print(b not in a)
The identity operators is and is not check whether the left side and the right side are the same object.
WARNING: is and == are NOT the same!!! Here is an example borrowed from G4G.
list1 = []
list2 = []
list3=list1
print(list1 == list2)
print(list1 is list2)
print(list1 is list3)
Parentheses: You can (and certainly will at some point need to) check for the truth of statements involving many variables, and complex logic requests. You can dictate the order Python evaluates statements. So, for example,
if (Poor and TaxRateAtOrBelowNegative10) or (MiddleClass and TaxRateAtOrBelow5) or (Rich and TaxRateBelow15):
    start_audit()
will audit rich filers if they have less than a 15% tax rate, but will only audit poor tax filers if they had a negative tax rate.
# a few silly examples
print((3>3) == False) # 3 is not greater than 3, so this evaluates to...
print(3>3 == False)
print((3>3) != True)
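If the silly examples surprised you: Python "chains" comparison operators, so 3>3 == False is read as (3>3) and (3 == False), not as (3>3) == False. The sketch below spells that out; the variable names are only for illustration.

```python
chained = 3 > 3 == False                 # read as: (3 > 3) and (3 == False)
spelled_out = (3 > 3) and (3 == False)   # the same thing, written out
grouped = (3 > 3) == False               # parentheses force 3 > 3 to be evaluated first

print(chained, spelled_out, grouped)     # False False True
```

This is why parentheses are worth the extra keystrokes whenever you combine comparisons.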
1.7.2.5. Variables are pointers¶
I’ll simply provide the following warning: Unless you read and understand the link above, any time you write x=y, you might be creating a secret bug in your code that will cause potentially enormous errors!
To illustrate:
x = [1, 2, 3]
print(type(x))
y = x
print(y)
x.append(4) # change x, not y
print(y) # y was changed as well... Why? Read the page above!
<class 'list'>
[1, 2, 3]
[1, 2, 3, 4]
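One common way to avoid this surprise is to make a copy of the list instead of a second name for it. This is a minimal sketch using the list method .copy(); the names y_alias and y_copy are mine.

```python
x = [1, 2, 3]
y_alias = x          # y_alias points at the SAME list as x
y_copy = x.copy()    # y_copy is a new, separate list with the same contents

x.append(4)          # change x

print(y_alias)                     # [1, 2, 3, 4]
print(y_copy)                      # [1, 2, 3]
print(y_alias is x, y_copy is x)   # True False
```

Note that .copy() is a "shallow" copy: if the list contains other lists, those inner lists are still shared. For nested structures, copy.deepcopy() from the standard library makes a fully independent copy.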
1.7.2.6. Everything is an object¶
Referring again to Whirlwind of Python,
In object-oriented programming languages like Python, an object is an entity that contains data along with associated metadata and/or functionality. In Python everything is an object, which means every entity has some metadata (called attributes) and associated functionality (called methods). These attributes and methods are accessed via the dot syntax.
So, object.method(<arguments here>) will call the function method from/on object, and the function uses whatever arguments you pass it.
Examples:
Above, the object x has the type attribute of list, and lists have a “method” called append.
In the stock prices program we show during the lectures, we imported a package: import pandas_datareader as pdr. Now, the “package” pandas_datareader is actually an “object” (which we call pdr for convenience). That object - like any object - has “method” functions. In that code, for example, I called pdr.get_data_yahoo(stocks) to download stock prices.
Seriously, EVERYTHING is an object.
Lists are objects (duh)
Attributes and methods of objects are themselves objects. Put type(x.append) at the end of the code block above.
Files
1.7.2.7. Common object types¶
Boolean and int were covered above.
None. See here.
float. The other (besides int) main type for numbers.
Warning
Beware of comparing floating point numbers! Below is an example, and see here for the explanation.
print(1.0+2.0 == 3.0)
print(0.1+0.2 == 0.3) # FALSE?!
True
False
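If you do need to compare floats, one option is to check whether they are "close enough" rather than exactly equal. Here is a small sketch using math.isclose from the standard library, with its default tolerance.

```python
import math

total = 0.1 + 0.2
print(total)                      # 0.30000000000000004
print(total == 0.3)               # False
print(math.isclose(total, 0.3))   # True
```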
str. There are built-in functions that work on strings directly. Let’s look at a few:
a='some string' # a = "some string" is the same.
# some functions work on strings directly
print(len(a))
# string types also have many functions as methods
print(a.upper())
#YOU: type a.<tab> in your notebook, and jupyter will open a list of possible functions!
1.7.2.8. Built in data structures¶
Python has list, tuple, dict, and set. Beginners typically rely on lists extensively, but as you progress, you will find that all four are extremely useful, because their unique traits solve different needs.
Type Name | Example | Description
---|---|---
list | [1, 2, 3] | Ordered collection
tuple | (1, 2, 3) | Immutable ordered collection
dict | {'a':1, 'b':2, 'c':3} | Unordered (key,value) mapping
set | {1, 2, 3} | Unordered collection of unique values
Important
You should absolutely read this and as you do, try the examples, and throw them into your growing personal cheat sheet.
You need to know
how to define/create an object of each type (the examples above)
access elements within each type
modify elements within each type
add or remove elements from each type
when a set is useful, when a dictionary is useful (as opposed to a list)
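To give you a starting point for that checklist, here is a rough sketch that creates, accesses, modifies, and removes elements for each type. The data (tickers, a point, one state capital) is made up for illustration.

```python
# list: ordered, changeable
stocks = ['AAPL', 'MSFT']
stocks.append('GM')        # add to the end
stocks[0] = 'TSLA'         # modify by position
stocks.remove('MSFT')      # remove by value

# tuple: ordered, cannot be changed after creation
point = (3, 4)
first_coordinate = point[0]

# dict: look up a value by its key
capitals = {'Ohio': 'Columbus'}
capitals['Alaska'] = 'Juneau'   # add (or overwrite) a key
del capitals['Ohio']            # remove a key

# set: keeps only unique values
tickers = {'GM', 'GM', 'F'}     # the duplicate 'GM' is dropped
tickers.add('F')                # already present, so nothing changes

print(stocks)            # ['TSLA', 'GM']
print(first_coordinate)  # 3
print(capitals)          # {'Alaska': 'Juneau'}
print(len(tickers))      # 2
```

A set is handy when you only care about "which unique things are here?", and a dictionary is handy when you want to look something up by name rather than by position.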
There are LOTS of functions in python that work on the common object types and the data structures we just introduced.
First, let me illustrate the use of .extend() vs .append() vs + for lists:
L=[8, 5, 6, 3, 7] # use brackets to define a list
L.extend([5]) # extend concatenates
L.extend([3,4]) # extend works the same way with more elements
L = L + [13,14] # + concatenates
L.append(7) # append adds its entire argument to the list as a new element.
L.append([6]) # 7 is an int, so it goes in as an int, but [6] is a *list*, so append puts a list as the element
L.append([8,9]) # see, the last element is [8,9]
L
[8, 5, 6, 3, 7, 5, 3, 4, 13, 14, 7, [6], [8, 9]]
Now, let’s all define this vector: L=[8, 5, 6, 3, 7].
Exercises: Write code that does the following:
Returns the length.
Returns the largest element.
Returns the smallest element.
Returns the total of the vector.
Returns the first element. See this awesome answer to learn about “slicing” lists in Python. If that link is dead: https://stackoverflow.com/questions/509211/understanding-slice-notation?rq=1
Returns the last element.
Returns the first 2 elements.
Returns the last 2 elements.
Returns the odd numbered elements (i.e. [8,6,7]).
Tip
I’d suggest putting what you just learned about how python indexes an object and how to slice a list into your personal cheat sheet until you have it memorized thoroughly.
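As a possible seed for that cheat sheet, here is a short sketch of indexing and slicing on a different list (so the exercises above are still yours to solve). Remember that Python starts counting at 0, and a slice [start:stop] includes start but excludes stop.

```python
letters = ['a', 'b', 'c', 'd', 'e']

print(letters[0])     # a                (first element)
print(letters[-1])    # e                (negative indexes count from the end)
print(letters[1:3])   # ['b', 'c']       (positions 1 and 2, not 3)
print(letters[:2])    # ['a', 'b']       (from the start up to position 2)
print(letters[-2:])   # ['d', 'e']       (from the second-to-last to the end)
print(letters[::2])   # ['a', 'c', 'e']  (every second element)
```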
1.7.2.9. For loops¶
Python loops are very intuitive. You need to know one thing first:
PYTHON AND INDENTATION
In python, indentations at the beginning of lines are not “up to the user”. Usually, indentations indicate a “block” of code that is run as a unit inside a for, if, elif, else, or def command.
This code:
# If the condition is true, python will do anything that is in the
# block of code belonging to the if statement. But 7 < 5 is false,
# so python won't do anything that "belongs" to this "if"
if 7 < 5:
    print('I will not print.')
    print('Nor will I')
… will not do the same thing as this code:
if 7 < 5:
    print('I will not print.')
print('But I will!')
…because in the second version the second print line isn’t indented and therefore isn’t part of the “if” statement
Conversely, how you use whitespace within a line is up to you. Both of these lines of code accomplish the same thing:
print( a)
print(a)
Syntax:
for <name> in <iterable object>: # you must use the colon!
    <do some stuff> # you must indent
    <do some more stuff if you want> # you must indent
    <do even more stuff> # you must indent
Comments:
The iterable object can be anything Python can iterate through, e.g. a list. (But not just lists!)
Note: When I write anything inside <>, you should drop the “<” and “>” symbols when you fill that out.
A very good tip: Use good variable names!
Variable names should be something that communicates the content of the variable! Good names make code easy to read and fix bugs. Bad names set you up for pain and suffering.
Examples of bad variable names: x, y, z, vector, myvector
Examples of good variable names (used in for loops so you can see a good habit):
If you are looping over letters, each object might be called a letter: for letter in letters
for stock in stocks
for state in states
Example:
for state in states:
    capitol=stateCapitals[state]
    print(capitol)
    print(capitol.upper())
    # you can use as many lines as you need, just keep indenting
    # the indents are 4 spaces, or more commonly, a <tab>
print(states) # <-- the for loop ends when you write a line of
              # code (not a comment!) that is unindented
So, for each state, Python will start the indented block of code and run each line within the code block in sequence. So if the list of states is [Alabama, Alaska, Arizona,...], Python will…
Grab the first element in the list we are looping over and put that thing in a variable called “state”. So, set state = ‘Alabama’.
Set capitol = ‘Montgomery’
Print ‘Montgomery’
Print ‘MONTGOMERY’
Execute the next two lines of code, but they are just comments so nothing happens.
At the end of the indented block of code, python will check if there is another element in the states vector, and if so, repeat the indented block. There is another state after Alabama, so…
Set state = ‘Alaska’
Set capitol = ‘Juneau’
Print ‘Juneau’
Print ‘JUNEAU’
…
Set state = ‘Wyoming’
Set capitol = ‘Cheyenne’
Print ‘Cheyenne’
Print ‘CHEYENNE’
Is there another state? No? Ok! The for-loop is complete! Python will exit the indented code block and proceed. The next line of code that is not indented is print(states) and so that’s the next thing it will do.
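If you want to run the walkthrough yourself, here is a small self-contained sketch. It assumes stateCapitals is a dictionary mapping each state to its capital and, to keep it short, includes only three states; the printed list is something I added so you can inspect what happened after the loop.

```python
# Assumed data for illustration: only three states, stored in a dictionary
stateCapitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Wyoming': 'Cheyenne'}
states = list(stateCapitals)   # a list of the dictionary's keys

printed = []                   # a record of what the loop printed in upper case
for state in states:
    capitol = stateCapitals[state]
    print(capitol)
    print(capitol.upper())
    printed.append(capitol.upper())

print(states)                  # runs once, after the loop is finished
```

Try adding a state to the dictionary, or moving print(states) inside the loop by indenting it, and see what changes.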
1.7.2.10. If, elif, else¶
Syntax:
if <condition 1>: # you must use the colon!
    <do some stuff if condition is true> # remember to indent things inside the if
elif <condition 2>: # "elif" as in "Else-If"
    <do stuff if condition 1 is false but condition 2 is true>
else:
    <if neither condition 1 nor 2 are true, do this>
Comments:
You can include zero or as many elif code blocks as you want
You can omit the else block entirely - you can just do an “if” block if that’s all you need
Whatever is in <condition> must evaluate to True or False, or 1 or 0 (1 is equivalent to True, 0 is False)
See the “Logic and comparisons” section above on how Python evaluates conditions
Tip
For loops, if-elif-else, and while are “flow controls”: They control the order in which the code is executed. You can “nest” many levels of flow-control. Here is an example of putting an if statement inside a for loop:
for state in states:
    capitol=stateCapitals[state]
    print(capitol)
    print(capitol.upper())
    if state == "Ohio":
        print("Michigan is better than Ohio")
1.7.2.11. While¶
Syntax:
while <condition is True>:
    <do some stuff>
For example:
counter = 0
while counter < 7:
    print(counter)
    counter += 1 # "+=" is short for "add to myself".
                 # Here, it's an abbreviation for: counter = counter + 1
0
1
2
3
4
5
6
I have one important comment about while loops: Every time through the loop, there must be a chance for the condition to become False. If not, your code will loop forever!
We won’t use while loops in this class. But if you ever write one, and it is stuck in an infinite loop, you can stop the kernel by typing i, i. Or click the “Terminals and Kernels” tab in the left sidebar and “shutdown” next to your code’s filename.
1.7.2.12. Writing your own functions¶
Writing your own functions is important for improving the clarity of your code because it
separates different strands of logic
allows you to reuse code
prevents copy/paste errors
The syntax of functions
To write a function, write
“def”,
name the function,
inside parentheses, list the parameters a user might pass as inputs into the function
a colon
then, all the code of the function is indented and begins on the next line
use return <object> at the end of the function to output the object the function makes (which could be any type of object!)
def <nameOfYourChoice>(<you can specify arguments the function takes, or none>):
    # and then write your indented code block that is the function.
On inputs:
Any object(s) you want can be given as inputs! You can give as inputs a variable, a list, a dictionary, even a function. Remember, in python, everything is an object.
You choose the name an object will be referred to as within the function.
In the example below, the function calls the first number you give it “x” and then checks if x<0
So, even though zz=-100, when we run f(zz), the function acts like we told it x=-100.
Functions can get “positional” arguments or keyword arguments. Positional arguments are based on the order in which you provide them to the function. Keyword arguments can be specified by including their name, without needing to be in the right spot in the order of arguments (like b=3 below).
On outputs:
Once the code executes a line starting with return, the function will end and output whatever is on that line.
Students new to functions usually want to end them with something like print(answer). Don’t! End functions with return, like return answer. If you use print, you can’t use the function’s output for anything else after it finishes. If you use return, you can save the output and use it (even if it’s just to print it).
Any object(s) you want can be returned as outputs! The output can be a list, set, function, dictionary, string. It can be a dictionary with lists inside it, or a list with dictionaries inside it. Go wild if you want! (But only while practicing your skills. In practice, don’t be complex for the sake of it!)
Warning
Functions can print things while executing. This can be useful while debugging a function and testing it. Because of that, people new to python often use “print” at the end of functions instead of return.
However, print DOES NOT EQUAL return!
If you want to reuse the output at any point later in your code, use return instead of print!
def silly_me():
    # a simple function for illustration, just outputs a string
    return "this is a string"
a = silly_me() # now "a" is the string "this is a string"
print(a) # and you can print the output without putting print in the function
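To see the difference in action, here is a small sketch with two made-up functions: one prints its answer and one returns it. A function that finishes without a return statement hands back None.

```python
def prints_answer():
    print(42)       # shows 42 on screen, but the function returns None

def returns_answer():
    return 42       # hands 42 back to whoever called the function

printed_result = prints_answer()
returned_result = returns_answer()

print(printed_result)        # None
print(returned_result + 1)   # 43
```

Trying printed_result + 1 would raise a TypeError, because there is no number saved in printed_result to add to.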
On documentation:
Code that is poorly documented won’t be used. By you, by you in the future, or by others. So you should document it! You do this by adding line(s) immediately after the first line, as the example below shows.
The docstring can be accessed by users via <functionName>? or help(<FunctionName>) the same as any other function. In fact, this is how help is written in all Python functions we’ve used!
Example: The function below shows off positional and keyword arguments, how to write a multiline “docstring”, how the program ends once a return is executed, outputting a list, and setting default values for inputs.
def f(x, a=1, b=1):
    '''
    Returns a list. Element 0 is a+bx, element 1 is 2.
    The first argument you give goes to x, the second to a, the third to b.
    If you do not provide a or b, they default to the value 1.
    '''
    if x < 0:
        return "WHOA THIS IS NEGATIVE"
    return [a + b * x, 2] # you can return any object(s) you want! this is a list, for example
zz = -100
print(f(zz)) # Inside the function, it takes the first arg
             # and calls it "x". So here x is -100 inside.
             # because x<0, the func returns WHOA
             # and never gets to a+b*x
print(f(2,2,2))
print(f(1)) # uses the default value of a and b
print(f(1,b=3)) # uses the default value of a
help(f) # the docstring is useful!
WHOA THIS IS NEGATIVE
[6, 2]
[2, 2]
[4, 2]
Help on function f in module __main__:
f(x, a=1, b=1)
The first argument you give goes to x, the second to a, the third to b.
If you do not provide a or b, they default to the value 1.
# this won't work! python requires you to use the keyword arguments AFTER the positional arguments
print(f(b=3,1))
File "C:\Users\DONSLA~1\AppData\Local\Temp/ipykernel_6932/3182615729.py", line 2
print(f(b=3,1))
^
SyntaxError: positional argument follows keyword argument
1.7.2.13. Advanced: Scope¶
I want you to be generally aware of the concept of “global” and “local” scope. Generally, python objects are available only within the region they are defined and subregions therein. Put differently, objects are available downstream, but not upstream.
Here is an example:
x=1
def silly_func():
    xyz = 14
    print(x) # variables defined OUTSIDE AND BEFORE a function are visible INSIDE the func
silly_func()
1
print(xyz) # variables defined INSIDE a function are NOT visible OUTSIDE the func
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
C:\Users\DONSLA~1\AppData\Local\Temp/ipykernel_6932/3825219627.py in <module>
----> 1 print(xyz) # variables defined INSIDE a function are NOT visible OUTSIDE the func
NameError: name 'xyz' is not defined
Here is a second example to show that changing the downstream variable inside the function won’t change the variable’s value outside the function!
x = 1
def silly_func():
    x=2
    return x
print(silly_func())
print(x) # changing the downstream variable inside the function didn't change the upstream version
2
1
1.7.3. Clear output and rerun from the start!¶
Warning
I can NOT emphasize this enough: The point of code is to make things reproducible. So code must run from beginning to end and produce the same thing every time.
The nature of developing code is that you’ll run some lines of code, then write more code, then go back and change something above (and run that part again), and then go back down and keep writing and running new code. When you’re done, your code will be broken!
A golden rule
Always look to see if the first executed code block is “[1]” and that all the subsequent code blocks are numbered consecutively. Click on this link to see an example.
If the code you’re looking at doesn’t meet those two rules, I click “Run” > “Restart Kernel and Run All Cells”.
This applies to your own code! Restart and run from scratch regularly.
1.7.4. Stuck on syntax issues for a function?¶
See the tips in the Jupyter Lab page.
[1] The subsection on scope isn’t crucial but is useful to students sometimes.