1.7. Digging into Py(thon)

Important

This page is long but important. It’s structured as a walkthrough - you should run the code on your computer as you read it.

Below, I cover a curated set of topics to get us going in the class.

It is not meant to be the end-all-be-all of python basics. There are very good resources around the web you should and will consult during the semester which contain more in-depth info on topics. https://www.pythoncheatsheet.org/ is a good one.

1.7.1. In addition to this page: Tutorials

You can’t learn programmatic material during class sessions, try though I might to make it possible. You can only learn through practice. You should be checking out tutorials and lessons online in your free time.

I suggest that you pick something under the python dropdown here or on the bootcamp page.1 If you’re new to writing code, the bootcamp classes are best. If you’re coming to python from elsewhere, the Whirlwind Tour of Python is right-sized.

As you work through the tutorial of your choice: write code in JupyterLab, run it, and save the files on your computer. If you get stuck, just start a new file.

1.7.2. Walkthrough of Python Essentials

Alright, let’s get going.

Below, we cover 12 essential topics you’ll need to understand.2

Warning

Research strongly indicates that active learning is the most effective way to learn new skills. That’s why I linked to the tutorials above.

This chapter works as a reference page you can return to, but - especially if you are new to Python! - I want you to run as much code from this page as you can in your own notebook file, so you can add your own notes about python and write and try additional things with code.

To the extent possible, I want you to get comfortable typing commands yourself rather than copy-pasting. This is slightly more painful in the beginning, but has a much better payoff in the long run.

Tip

That said, when you’re short on time, you can copy code blocks on this website to your own notebook files to try to run them two ways:

Quick and dirty: By clicking the “copy” symbol in the upper-right corner of code blocks on the website.
Better option: This video shows you how to download the textbook and lecture slides to your computer, and then you can easily drag and drop both sections of code and markdown from those into your notes.

After copy-pasting, the way to “supercharge” that into active learning is to run the code, and then change something in the code, rerun it, and see what changes.

1.7.2.1. Comments

In python code blocks, the “#” character tells python to ignore the rest of the line.

Leaving GOOD comments in the code is important! Good, smart programmers try to reduce the need for comments by writing code so clear that it is “self-documenting” (I’ll explain why later).

But for now… you should err on the side of adding MORE comments. Why? If you need to take a break and come back at a later point, comments will help to quickly bring you up to speed if you forgot why you put in a particular line of code. Getting in the habit of using comments is smart.

You’ll become more discerning about comments as you progress. Two articles about how and when to use comments: this link and this link

1.7.2.2. Arithmetic

# Note: these cells aren't showing the answers on the website on purpose! 

# YOU: TYPE ALL OF THESE OUT ON YOUR OTHER PARTICIPATION SHEET... YOU CAN OMIT THE COMMENTS IF YOU WANT
# YOU: TRY VARIATIONS TOO...

print(2+3) # addition
print(2-3) # subtraction
print(2/3) # division - in Python 3, division of integers (a data type) inherently returns floats (a data type)
print(type(2), type(2/3)) # see?
print(2//3, type(2//3)) # floor division returns an integer. 
# FOR YOU to try: use this to tell me how many full hours are in 7643 minutes?

print(2%3) # mod operator
print(2*3) # multiplication
print(2**3) # 2 to the power of three
print(2^3) # ^ is NOT the power operator!!! it is a 'bit' operator - you don't need to know this for now

int(2+3*(4+15)/3) # 1. PEMDAS applies 
                  # 2. If the last command in a cell returns an *object*, jupyter auto prints it w/o needing print()
                  # 3. this should be a float (21.0), but you can convert a float to an int with the int() function
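
Floor division and the mod operator are often used as a pair: one gives the whole part, the other gives what is left over. Here is a minimal sketch using a made-up number of seconds (total_seconds is just an illustrative name, not something from the class files):

```python
# Hypothetical example: split 200 seconds into whole minutes and leftover seconds
total_seconds = 200

minutes = total_seconds // 60     # floor division keeps the whole minutes
seconds = total_seconds % 60      # mod keeps the remainder

print(minutes, seconds)           # 3 20

# divmod() is a built-in function that returns both results at once, as a tuple
print(divmod(total_seconds, 60))  # (3, 20)
```

Change total_seconds, rerun, and check that minutes * 60 + seconds always gets you back to the number you started with.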

1.7.2.3. Parentheses - Grouping and Calling

As the above example shows, parentheses are for grouping ((4+15)/3 forces addition before division) and calling a function (e.g. print() means the print function is called on the inputs inside the parentheses).

1.7.2.4. Logic and comparisons

The comparison operators are == (equals), != (not equal), > (greater than), >= (equal or greater than), < (less than), and <= (equal or less than). Each of these prompts Python to evaluate the truth of the comparison and return True or False.

True and False are booleans. Under the hood, Python treats booleans as a special kind of integer, so True is equal to 1 and False is equal to 0.

# YOU: TYPE ALL OF THESE OUT ON YOUR OTHER PARTICIPATION SHEET... YOU CAN OMIT THE COMMENTS IF YOU WANT
# YOU: TRY VARIATIONS TOO...

print(3>3)            # 3 is not greater than 3, so this evaluates to...
                      # YOU: try a few of the other comparison operators
print(True == 1)      
print(type(True), int(True), type(False), int(False)) # print() can print a sequence of objects 

The logic operators are and, or, and not. They combine statements that are each true or false and tell you whether the combination is true or false.

What does or mean? In common parlance, or usually means “Do you want A or do you want B? (pick one)”. Mathematically, or is inclusive: it is true if either side is true, including when both are. It works like a dad joke - You: “Dad, are we rich or poor?” Dad: “Yes”.

# YOU: TYPE ALL OF THESE OUT ON YOUR OTHER PARTICIPATION SHEET... YOU CAN OMIT THE COMMENTS IF YOU WANT

a = True              # you assign variables by writing: VariableName = Thing. 
b = False
print(a and b)        # if both sides of *and* are true, the whole thing is
print(a or b)         # if either side of *or* is true, the whole thing is
print(a and not b)    # *not* negates what is after it
print(not a or not b) # "not b" is true, so the whole thing is true

The membership operators in and not in check whether the left object is or is not in the object on the right side.

# try these... what do you get?
a=3
b=[1,2,3]
print(a in b)
print(a not in b)
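print(b in a)     # error! a is an int, and python can't look "inside" an int (TypeError)
print(b not in a) # same TypeError (and it never runs, because the line above stops the cell)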

The identity operators is and is not check whether the left side and the right side are the same object.

WARNING: is and == are NOT the same!!! Here is an example borrowed from G4G.

list1 = [] 
list2 = [] 
list3=list1 
print(list1 == list2)
print(list1 is list2)
print(list1 is list3)

Parentheses: You can (and certainly will at some point need to) check for the truth of statements involving many variables, and complex logic requests. You can dictate the order Python evaluates statements. So, for example,

if (Poor and TaxRateAtOrBelowNegative10) or (MiddleClass and TaxRateAtOrBelow5) or (Rich and TaxRateBelow15):
    start_audit()

will audit rich filers if they have less than a 15% tax rate, but will only audit poor tax filers if their tax rate is -10% or lower.

# a few silly examples

print((3>3) == False) # 3 is not greater than 3, so (3>3) is False, and False == False evaluates to... 
print(3>3 == False)   # without parentheses, python reads this as: (3>3) and (3 == False)
print((3>3) != True)
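
Why do the first two silly examples differ? Python “chains” comparisons: a > b == c is read as (a > b) and (b == c), not as (a > b) == c. A short sketch that spells this out (the age variable is a made-up value for illustration):

```python
chained = 3 > 3 == False                # python reads this as the next line
spelled_out = (3 > 3) and (3 == False)
grouped = (3 > 3) == False              # parentheses force the grouping you meant

print(chained)       # False
print(spelled_out)   # False
print(grouped)       # True

# Chaining is handy for range checks
age = 30             # hypothetical value
print(18 <= age < 65)  # True
```

When in doubt, add parentheses. They cost nothing and make your intent obvious.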

1.7.2.5. Variables are pointers

Read this page!

I’ll simply provide the following warning: Unless you read and understand the link above, any time you write x=y, you might be creating a secret bug in your code that will cause potentially enormous errors!

To illustrate:

x = [1, 2, 3]
print(type(x))
y = x
print(y)
x.append(4) # change x, not y
print(y) # y was changed as well... Why? Read the page above!

<class 'list'>
[1, 2, 3]
[1, 2, 3, 4]
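
If you want a separate list rather than a second name for the same list, one common approach is to make a copy. This is a sketch, not the only way to do it (same_list and new_list are illustrative names):

```python
x = [1, 2, 3]
same_list = x         # another name for the same list
new_list = x.copy()   # a new list with the same contents (list(x) or x[:] also work)

x.append(4)

print(same_list)      # [1, 2, 3, 4]  (changed along with x)
print(new_list)       # [1, 2, 3]     (unaffected)
```

Note that .copy() is a “shallow” copy: if the list contains other lists, those inner lists are still shared between the original and the copy.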

1.7.2.6. Everything is an object

Referring again to Whirlwind of Python,

In object-oriented programming languages like Python, an object is an entity that contains data along with associated metadata and/or functionality. In Python everything is an object, which means every entity has some metadata (called attributes) and associated functionality (called methods). These attributes and methods are accessed via the dot syntax.

So, object.method(<arguments here>) will call the function method from/on object, and the function uses whatever arguments you pass it.

Examples:

Above, the object x has the type attribute of list, and lists have a “method” called append.
In the stock prices program we show during the lectures, we imported a package: import pandas_datareader as pdr. Now, the “package” pandas_datareader is actually an “object” (which we call pdr for convenience). That object - like any object - has “method” functions. In that code, for example, I called pdr.get_data_yahoo(stocks) to download stock prices.
Seriously, EVERYTHING is an object.
- Lists are objects (duh)
- Attributes and methods of objects are themselves objects. Put type(x.append) at the end of the code block above.
- Files
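
To make “everything is an object” concrete, here is a small sketch using only built-in functions. The list called tools is a made-up example showing that functions can be stored and passed around like any other object:

```python
x = [1, 2, 3]
print(type(x))          # <class 'list'>
print(type(x.append))   # methods are objects too
print(type(len))        # so are built-in functions
print(type(type(x)))    # and so are types themselves

# Because functions are objects, you can put them in a list and loop over them
tools = [len, max, min]
for tool in tools:
    print(tool.__name__, tool(x))   # len 3, then max 3, then min 1
```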

1.7.2.7. Common object types

Boolean and int were covered above.

None. See here.

float. The other (besides int) main type for numbers.

Warning

Beware of comparing floating point numbers! Below is an example, and see here for the explanation.

print(1.0+2.0 == 3.0)
print(0.1+0.2 == 0.3) # FALSE?!

True
False
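
So how should you compare floats? One common approach is to check whether two numbers are “close enough” instead of exactly equal. A minimal sketch using math.isclose from the standard library (the 1e-9 tolerance in the last line is an arbitrary choice for illustration):

```python
import math

total = 0.1 + 0.2
print(total)                        # 0.30000000000000004
print(total == 0.3)                 # False
print(math.isclose(total, 0.3))     # True
print(abs(total - 0.3) < 1e-9)      # True: the same idea, written by hand
```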

str. There are built-in functions that work on strings directly. Let’s look at a few:

a='some string' # a = "some string" is the same. 

# some functions work on strings directly
print(len(a)) 

# string types also have many functions as methods
print(a.upper())
#YOU: type a.<tab> in your notebook, and jupyter will open a list of possible functions!
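
A few more string methods you will likely run into, as a quick sketch. The ticker variable is a made-up example of messy input:

```python
ticker = '  aapl '                # hypothetical messy input
clean = ticker.strip().upper()    # methods can be chained, left to right
print(clean)                      # AAPL

sentence = 'some string'
print(sentence.replace('some', 'a'))  # a string
print(sentence.split())               # ['some', 'string']
print(sentence[0], sentence[-1])      # s g
print('string' in sentence)           # True
print(sentence)                       # some string
```

The last line is the important one: string methods return a new string and leave the original untouched. To keep a change, save it to a variable, as with clean above.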

1.7.2.8. Built-in data structures

Python has list, tuple, dict, and set. Beginners typically rely on lists extensively, but as you progress, you will find that all four are extremely useful, because their unique traits solve different needs.

Type Name	Example	Description
`list`	`[1, 2, 3]`	Ordered collection
`tuple`	`(1, 2, 3)`	Immutable ordered collection
`dict`	`{'a':1, 'b':2, 'c':3}`	(key,value) mapping (keeps insertion order in Python 3.7+)
`set`	`{1, 2, 3}`	Unordered collection of unique values

Important

You should absolutely read this and as you do, try the examples, and throw them into your growing personal cheat sheet.

You need to know

how to define/create an object of each type (the examples above)
access elements within each type
modify elements within each type
add or remove elements from each type
when a set is useful
when a dictionary is useful (as opposed to a list)
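
Here is a compact sketch touching each item on that checklist. All of the data (prices, capitals, tickers) is made up for illustration, and it is no substitute for the reading above:

```python
# list: ordered and changeable
prices = [10.5, 11.0, 9.8]
prices[0] = 10.0          # modify an element by position
prices.append(12.1)       # add an element to the end
prices.remove(9.8)        # remove an element by value

# tuple: ordered, but cannot be changed after it is created
point = (3, 4)
x_coord = point[0]        # accessing works like a list

# dict: look things up by key instead of by position
capitals = {'Ohio': 'Columbus', 'Texas': 'Austin'}
capitals['Maine'] = 'Augusta'   # add (or overwrite) a key
del capitals['Texas']           # remove a key
print(capitals['Ohio'])         # Columbus

# set: keeps only unique values
tickers = ['AAPL', 'MSFT', 'AAPL']
unique_tickers = set(tickers)
print(len(unique_tickers))      # 2
print(prices)                   # [10.0, 11.0, 12.1]
```

A rough rule of thumb: reach for a dict when you want to look something up by name, and a set when you only care about which distinct values exist.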

There are LOTS of functions in python that work on the common object types and the data structures we just introduced.

First, let me illustrate the use of .extend() vs .append() vs + for lists:

L=[8, 5, 6, 3, 7] # use brackets to define a list
L.extend([5])    # extend concatenates
L.extend([3,4])  # concatenation works the same with more elements
L = L + [13,14]  # + concatenates
L.append(7)      # append adds its entire argument to the list as a new element. 
L.append([6])    # 7 is an int, so it goes in as an int, but [6] is a *list*, so append puts a list as the element
L.append([8,9])  # see, the last element is [8,9]
L

[8, 5, 6, 3, 7, 5, 3, 4, 13, 14, 7, [6], [8, 9]]

Now, let’s all define this vector: L=[8, 5, 6, 3, 7].

Exercises: Write code that does the following:

Returns the length.
Returns the largest element.
Returns the smallest element.
Returns the total of the vector.
Returns the first element. See this awesome answer to learn about “slicing” lists in Python. If that link is dead: https://stackoverflow.com/questions/509211/understanding-slice-notation?rq=1
Returns the last element.
Returns the first 2 elements.
Returns the last 2 elements.
Returns the odd-numbered elements (i.e. [8,6,7]).
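
Before attempting those, here is a quick sketch of indexing and slicing on a different, made-up list, so the exercises are still yours to solve. The pattern is list[start:stop:step], where start is included and stop is not:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']   # hypothetical list, not the exercise vector

print(letters[1])     # b  (indexing starts at 0, so this is the second element)
print(letters[-2])    # e  (negative indexes count from the end)
print(letters[1:4])   # ['b', 'c', 'd']
print(letters[:3])    # ['a', 'b', 'c']  (leaving out start means "from the beginning")
print(letters[3:])    # ['d', 'e', 'f']  (leaving out stop means "to the end")
print(letters[1::2])  # ['b', 'd', 'f']  (every second element, starting at index 1)
```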

Tip

I’d suggest putting what you just learned about how python indexes an object and how to slice a list into your personal cheat sheet until you have it memorized thoroughly.

1.7.2.9. For loops

Python loops are very intuitive. You need to know one thing first:

PYTHON AND INDENTATION

In python, indentations at the beginning of lines are not “up to the user”. Usually, indentations indicate a “block” of code that is run as a unit inside a for, if, elif, else, or def command.

This code:

# If the condition is true, python will do anything that is in the 
# block of code belonging to the if statement. But 7 < 5 is false,
# so python won't do anything that "belongs" to this "if"
if 7 < 5: 
    print('I will not print.') 
    print('Nor will I') 

… will not do the same thing as this code:

if 7 < 5:     
    print('I will not print.') 
print('But I will!') 

…because in the second version, the second print line isn’t indented and therefore isn’t part of the “if” statement

Conversely, how you use whitespace within a line is up to you. Both of these lines of code accomplish the same thing:

print(      a)

print(a)

Syntax:

for <name> in <iterable object>:     # you must use the colon!
    <do some stuff>                  # you must indent 
    <do some more stuff if you want> # you must indent 
    <do even more stuff>             # you must indent 

Comments:

The iterable object can be anything Python can iterate through, e.g. a list. (But not just lists!)
Note: When I write anything inside <>, you should drop the “<” and “>” symbols when you fill that out.

A very good tip: Use good variable names!

Variable names should be something that communicates the content of the variable! Good names make code easy to read and fix bugs. Bad names set you up for pain and suffering.

Examples of bad variable names: x, y, z, vector, myvector
Examples of good variable names (used in for loops so you can see a good habit):
- If you are looping over letters, each object might be called a letter: for letter in letters
- for stock in stocks
- for state in states

Example:

for state in states:
    capitol=stateCapitals[state]
    print(capitol)
    print(capitol.upper())
    # you can use as many lines as you need, just keep indenting
    # the indents are 4 spaces, or more commonly, a <tab>
    
print(states) # <-- the for loop ends when you write a line of 
              # code (not a comment!) that is unindented 

So, for each state, Python will start the indented block of code and run each line within the code block in sequence. So if the list of states is [Alabama, Alaska, Arizona,...], Python will…

Grab the first element in the list we are looping over and put that thing in a variable called “state”. So, set state = ‘Alabama’.
Set capitol = ‘Montgomery’
Print ‘Montgomery’
Print ‘MONTGOMERY’
Execute the next two lines of code, but they are just comments so nothing happens.
At the end of the indented block of code, python will check if there is another element in the states vector, and if so, repeat the indented block. There is another state after Alabama, so…
Set state = ‘Alaska’
Set capitol = ‘Juneau’
Print ‘Juneau’
Print ‘JUNEAU’
…
Set state = ‘Wyoming’
Set capitol = ‘Cheyenne’
Print ‘Cheyenne’
Print ‘CHEYENNE’
Is there another state? No? Ok! The for-loop is complete! Python will exit the indented code block and proceed. The next line of code that is not indented is print(states) and so that’s the next thing it will do.
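
The example above won't run on its own because states and stateCapitals were never defined. Here is a runnable sketch with a tiny, made-up version of that data (three states instead of fifty). The printed list is an extra I added so you can inspect what the loop did:

```python
# Hypothetical data: the text never defines states or stateCapitals
stateCapitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Wyoming': 'Cheyenne'}
states = list(stateCapitals)   # the dictionary's keys, in the order they were added

printed = []                   # optional: keep a record of what the loop printed
for state in states:
    capitol = stateCapitals[state]
    print(capitol)
    print(capitol.upper())
    printed.append(capitol.upper())

print(states)                  # runs once, after the loop is finished
```

Try adding a fourth state to the dictionary and rerunning. The loop body doesn't need to change at all.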

1.7.2.10. If, elif, else

Syntax:

if <condition 1>:                         # you must use the colon!
    <do some stuff if condition is true>  # remember to indent things inside the if
elif <condition 2>:                       # "elif" as in "Else-If"
    <do stuff if condition 1 is false but condition 2 is true>
else:
    <if neither condition 1 nor 2 are true, do this>

Comments:

You can include zero or as many elif code blocks as you want
You can omit the else block entirely - you can just do an “if” block if that’s all you need
Whatever is in <condition> should evaluate to True or False. Numbers work too: 0 counts as False, and any other number (like 1) counts as True
See the “Logic and comparisons” section above on how Python evaluates conditions
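
A filled-in version of that syntax, as a minimal sketch. The tax_rate value and the cutoffs are made up for illustration:

```python
tax_rate = 0.12      # hypothetical value; change it and rerun

if tax_rate < 0:
    bracket = 'negative'
elif tax_rate < 0.15:        # only checked if the first condition was False
    bracket = 'low'
else:                        # runs only if none of the conditions above were True
    bracket = 'high'

print(bracket)       # low
```

Python checks the conditions from top to bottom and runs only the first block whose condition is True, so the order of the conditions matters.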

Tip

For loops, if-elif-else, and while are “flow controls”: They control the order in which the code is executed. You can “nest” many levels of flow-control. Here is an example of putting an if statement inside a for loop:

for state in states:
    capitol=stateCapitals[state]
    print(capitol)
    print(capitol.upper())
    if state == "Ohio":
        print("Michigan is better than Ohio")

1.7.2.11. While

Syntax:

while <condition is True>:
    <do some stuff>

For example:

counter = 0
while counter < 7:
    print(counter)
    counter += 1 # "+=" is short for "add to myself". 
                 # Here, it's an abbreviation for: counter = counter + 1

I have one important comment about while loops: Every time through the loop, there must be a chance for the condition to become False. If not, your code will loop forever!
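
One defensive habit, sketched below, is to add a second condition that caps the number of trips through the loop. The starting balance, the 5% growth rate, and the cap of 1000 are all made-up numbers for illustration:

```python
balance = 100.0      # hypothetical starting balance
years = 0
max_years = 1000     # safety cap so the loop cannot run forever

while balance < 200 and years < max_years:
    balance = balance * 1.05   # grow 5% per year (an assumed rate)
    years += 1

print(years)         # 15
```

If the growth rate were 0, balance would never reach 200, and the years < max_years condition is what would eventually stop the loop.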

We won’t use while loops in this class. But if you ever write one and it gets stuck in an infinite loop, you can interrupt the kernel by pressing the i key twice (with the cell in command mode, not while you are typing inside it). Or click the “Terminals and Kernels” tab in the left sidebar and click “shutdown” next to your code’s filename.

1.7.2.12. Writing your own functions

Writing your own functions is important for improving the clarity of your code because it

separates different strands of logic
allows you to reuse code
prevents copy/paste errors

The syntax of functions

To write a function, write

“def”,
name the function,
inside parentheses, list the parameters a user might pass as inputs into the function
a colon
then, all the code of the function is indented and begins on the next line
use return <object> at the end of the function to output the object the function makes (which could be any type of object!)

def <nameOfYourChoice>(<you can specify arguments the function takes, or none>):
    # and then write your indented code block that is the function. 

On inputs:

Any object(s) you want can be given as inputs! You can give as inputs a variable, a list, a dictionary, even a function. Remember, in python, everything is an object.
You choose the name an object will be referred to as within the function.
- In the example below, the function calls the first number you give it “x” and then checks if x<0
- So, even though zz=-100, when we run f(zz), the function acts like we told it x=-100.
Functions can get “positional” arguments or keyword arguments. Positional arguments are based on the order in which you provide them to the function. Keyword arguments can be specified by including their name, without needing to be in the right spot in the order of arguments (like b=3 below).

On outputs:

Once the code executes a line starting with return, the function will end and output whatever is on that line.
Students new to functions usually want to end them with something like print(answer). Don’t! End functions with return, like return answer. If you use print, you can’t use the function’s output for anything else after it finishes. If you use return, you can save the output and use it (even if it is just to print it).
Any object(s) you want can be returned as outputs! The output can be a list, set, function, dictionary, string. It can be a dictionary with lists inside it, or a list with dictionaries inside it. Go wild if you want! (But only while practicing your skills. In practice, don’t be complex for the sake of it!)

Warning

Functions can print things while executing. This can be useful while debugging a function and testing it. Because of that, people new to python often use “print” at the end of functions instead of return.

However, print DOES NOT EQUAL return!

If you want to reuse the output at any point later in your code, use return instead of print!

def silly_me():
    # a simple function for illustration, just outputs a string
    return "this is a string"
    
a = silly_me() # now "a" is the string "this is a string"
print(a)       # and you can print the output without putting print in the function
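
To see the difference between print and return side by side, here is a small sketch with two made-up functions that differ only in their last line:

```python
def add_and_print(a, b):
    print(a + b)       # shows the answer, but the function returns None

def add_and_return(a, b):
    return a + b       # hands the answer back to whoever called the function

printed = add_and_print(2, 3)     # prints 5 while running
returned = add_and_return(2, 3)   # prints nothing

print(printed)         # None: there is nothing to reuse
print(returned * 10)   # 50: the output can be used in later code
```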

On documentation:

Code that is poorly documented won’t be used. By you, by you in the future, or by others. So you should document it! You do this by adding line(s) immediately after the first line, as the example below shows.
The docstring can be accessed by users via <functionName>? or help(<FunctionName>) the same as any other function. In fact, this is how help is written in all Python functions we’ve used!

Example: The function below shows off positional and keyword arguments, how to write a multiline “docstring”, how the function ends once a return is executed, outputting a list, and setting default values for inputs.

def f(x, a=1, b=1):
    '''
    Returns a list. Element 0 is a+bx, element 1 is 2. 
    
    The first argument you give goes to x, the second to a, the third to b.
    If you do not provide a or b, they default to the value 1.
    '''
    if x < 0:
        return "WHOA THIS IS NEGATIVE"
    return [a + b * x, 2] # you can return any object(s) you want! this is a list, for example

zz = -100         
print(f(zz))      # Inside the function, it takes the first arg
                  # and calls it "x". So here x is -100 inside.
                  # because x<0, the func returns WHOA
                  # and never gets to a+b*x
print(f(2,2,2))   
print(f(1))       # uses the default value of a and b
print(f(1,b=3))   # uses the default value of a
help(f)           # the docstring is useful!

WHOA THIS IS NEGATIVE
[6, 2]
[2, 2]
[4, 2]
Help on function f in module __main__:

f(x, a=1, b=1)
    Returns a list. Element 0 is a+bx, element 1 is 2.

    The first argument you give goes to x, the second to a, the third to b.
    If you do not provide a or b, they default to the value 1.

# this won't work! python requires you to use the keyword arguments AFTER the positional arguments
print(f(b=3,1)) 

  File "C:\Users\DONSLA~1\AppData\Local\Temp/ipykernel_6932/3182615729.py", line 2
    print(f(b=3,1))
                ^
SyntaxError: positional argument follows keyword argument

1.7.2.13. Advanced: Scope

I want you to be generally aware of the concept of “global” and “local” scope. Generally, python objects are available only within the region they are defined and subregions therein. Put differently, objects are available downstream, but not upstream.

Here is an example:

x=1
def silly_func():
    xyz = 14
    print(x) # variables defined OUTSIDE AND BEFORE a function are visible INSIDE the func
    
silly_func() 

print(xyz)   # variables defined INSIDE a function are NOT visible OUTSIDE the func

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
C:\Users\DONSLA~1\AppData\Local\Temp/ipykernel_6932/3825219627.py in <module>
----> 1 print(xyz)   # variables defined INSIDE a function are NOT visible OUTSIDE the func

NameError: name 'xyz' is not defined

Here is a second example to show that changing the downstream variable inside the function won’t change the variable’s value outside the function!

x = 1
def silly_func():
    x=2
    return x
print(silly_func())
print(x)               # changing the downstream variable inside the function didn't change the upstream version

2
1
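
So how do you get a function to update a variable that lives outside of it? One common pattern, sketched here with a made-up function called add_one, is to return the new value and reassign it on the outside:

```python
x = 1

def add_one(value):
    value = value + 1   # only changes the local name "value"
    return value

add_one(x)              # the returned value is thrown away...
print(x)                # 1

x = add_one(x)          # ...unless you save it
print(x)                # 2
```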

1.7.3. Clear output and rerun from the start!

Warning

I can NOT emphasize this enough: The point of code is to make things reproducible. So code must run from beginning to end and produce the same thing every time.

The nature of developing code is that you’ll run some lines of code, then write more code, then go back and change something above (and run that part again), and then go back down and keep writing and running new code. By the time you’re done, the cells have been run out of order, and the code will often break (or give different results) when it is run from top to bottom!

A golden rule

Always look to see if the first executed code block is “[1]” and that all the subsequent code blocks are numbered consecutively. Click on this link to see an example.
If the code you’re looking at doesn’t meet those two rules, click “Run” > “Restart Kernel and Run All Cells”.
This applies to your own code! Restart and run from scratch regularly.

1.7.4. Stuck on syntax issues for a function?

See the tips on the Jupyter Lab page.

1: Some people prefer to learn through games. That’s how I learned python! I built solvers for Sudoku and the Cracker Barrel golf tee game… Both taught me a LOT about programming in python, problem solving strategies, and data structures. You could try to figure out your Wordle guessing strategy (you can use this list of words). Or, visit Edabit, which has a bunch of games. If you log in, you can search for python challenges that take from 1 minute to … longer… For example, the Museum of Dull Things. If you find any games illuminating, please let me know via the class discussions repo!
2: The subsection on scope isn’t crucial but is useful to students sometimes.

LeDataSciFi-2023
