Need help? Resources and more
Are you stuck?
We all get stuck sometimes. Here is the sequence of steps you might follow, although the exact things you do obviously will depend on the task:
Help functions and documentation.
Search Google/Stack Overflow. Stack Overflow in particular is where the programming community asks and answers questions.
Ask your peers by posting an issue in the classmates team
Post your problem as a question on Stack Overflow.
The TA’s email and/or office hours.
My office hours or email. I’ve listed myself last because, although I’m happy to help (I am!), I can’t actually debug for 60+ students.
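For the first step in the list above, here is a minimal sketch of Python's built-in help tools. The choice of `str.split` is just an illustration; any function, method, or module works the same way.

```python
import inspect

# help() prints the signature and docstring for any object.
help(str.split)

# dir() lists the attributes and methods available on an object.
# Names starting with "_" are filtered out to keep the list readable.
string_methods = [name for name in dir(str) if not name.startswith("_")]
print(string_methods[:5])

# inspect.getdoc() returns the docstring as a plain string,
# which is handy when you want to search it or print only part of it.
doc = inspect.getdoc(str.split)
print(doc.splitlines()[0])
```

In a Jupyter notebook, typing `str.split?` in a cell shows similar information.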
Asking questions (steps 3-5 above) is an important part of this class! So I’m borrowing the 15-minute rule:
Warning
Once you’ve spent 15 minutes attempting to troubleshoot a problem, you must ask for help!
Which naturally prompts the question of…
How to ask for help (in a way that gets the best answers)
This section applies especially to Stack Overflow and our discussion repo, and following it will make office hours more efficient too. Coders expect that you have already tried the obvious (the documentation and previous Stack Overflow threads) and that you then follow these tips, so the community can answer effectively and efficiently.
Writing an effective question is something of an art. The idea is to make it as easy as possible for someone to answer.
Oh, goodie! 1
The act of writing an effective question and/or minimum working example (MWE) will often cause you to answer your own question!
So, here are the elements of a good question:
# 1: Introduce the problem with an informative title
Be specific with your title. It should be brief, but also informative so that when others are looking at the Issues page (and they have a similar error and/or solution), they can easily find it.
Bad title: “I need help!”
Good title: “Getting a ‘file not found error’ when importing scotus.csv”
# 2: Summarize the problem
Introduce the problem you are having. Include what task you are trying to perform, pertinent error messages, and any solutions you’ve already attempted. This helps us narrow down and troubleshoot your problem.
# 3: Include a reproducible example
Including a minimal, complete, and verifiable example of the code you are using greatly helps us resolve your problem. You don’t need to copy all the code from your program into the comment, but include enough code that we can run it successfully (on our computer) until the point at which the error occurs.
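As a rough illustration of what "minimal, complete, and verifiable" can look like, here is a sketch of a reproducible example. The data, column names, and the error are all hypothetical, made up for this illustration. The point is that the snippet builds its own tiny dataset, so anyone can run it without your files, and it stops at the step that fails.

```python
import csv
import io

# Tiny inline stand-in for the real file, so anyone can run this as-is.
# (Hypothetical data; the column names are made up for illustration.)
sample = io.StringIO("case,year,votes\nA v. B,2001,5\nC v. D,2003,n/a\n")

rows = list(csv.DictReader(sample))

# The step that fails: converting the "votes" column to integers.
try:
    votes = [int(row["votes"]) for row in rows]
except ValueError as err:
    # Copy the exact error message into your question.
    error_message = str(err)
    print("ValueError:", error_message)
```

When you post, drop the `try`/`except` and include the full error traceback instead; it is used here only so the sketch runs to completion.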
# 4: Post your solution
Once you have solved the problem (either by yourself or with the help of an instructor/classmate), post the solution. This lets us know that you have fixed the issue AND, if anyone else encounters a similar error, they can refer to your solution to fix their problem.
# 5: Acknowledgments for this section
A few more tricks to deal with getting stuck
Stop coding! Take a quick break and clear your head.
Get out a piece of paper and map out/outline how to solve the problem using plain words (pseudocode).
Add print statements everywhere (old school).
Use debugging tools.
Clear your head more substantively - go for a run, take a nap, get groceries. Returning to the code with a fresh eye will solve problems more often than you would believe.
Sleep! Coding tired is a sure way to make mistakes.
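The print-statement and debugging-tool tricks above can be sketched as follows. The function `average_price` and its inputs are hypothetical, chosen only to show where the prints and the debugger hook would go.

```python
def average_price(prices):
    """Return the mean of a list of prices (hypothetical example function)."""
    total = 0
    for i, price in enumerate(prices):
        total += price
        # "Old school" debugging: print intermediate values as you go.
        print(f"step {i}: price={price}, running total={total}")
    # To step through interactively instead, uncomment the next line;
    # breakpoint() drops you into Python's built-in debugger (pdb).
    # breakpoint()
    return total / len(prices)


result = average_price([10, 20, 60])
print("average:", result)
```

Once the bug is found, remove the print statements (or the `breakpoint()` call) so they don't clutter your output.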
Resources - tutorials and data
Note
Anything that is bolded/underlined below is also considered essential.
If you have favorite resources, or have found any especially helpful, please let me know.
THE MOST ESSENTIAL RESOURCES
Help: Google, Stack Overflow, GitHub help, JupyterLab documentation, Python help
Cheat sheets to bookmark/print! Better yet, download these to your Notes repo, and put them in the “Codebook” folder therein!
Included in this folder: python basics, jupyter notebook, importing data, numpy, pandas, seaborn, and scikit-learn
Python
Essential: A Whirlwind Tour of Python
Essential: datacamp.com has many self-guided lessons
Lessons 3 and 5 of the official documentation
This has to be the best list of Python resources on the internet.
Data Science
Visualization
Essential: Kaggle’s Data viz tutorial is excellent. It has reproducible code and data, using python.
Essential: An Economist’s Guide to Visualizing Data is excellent as well.
Essential: Data Visualization: A practical introduction, by Kieran Healy, discusses the “whys” of visualization in an especially smart way. The walkthroughs are in R, not Python, however.
Github, Git, and Version control
Getting started on GitHub and a tweet-length description of how a project flows
https://try.github.io/ is awesome if you ever want or need to use Git (just plain “Git”). It has a cheatsheet and some very nice visual walkthroughs, including the “branching” link.
The most thorough yet simple walkthrough of Git and GitHub use on the web. It applies to Python use for the most part.
Data/ML
Scikit-learn (a Python package) can load some built-in datasets, including data on Boston real estate, wine, and a larger California housing dataset
Essential: Pandas can read in a LOT of useful data! Data providers include: Federal Reserve (“FRED”), Ken French, NASDAQ, OECD, Quandl, TSP, World Bank, and more!
ML competitions with serious prizes at drivendata.org
This comp was interesting. You could start trying to analyze it here. This has a good example of the process you might follow. After you’re done, you can see the winner’s code and discussion of the winning approach
Essential: kaggle.com has ML competitions, some FAQs, tutorials, and data
Real estate data, a tutorial exploring that data, and a pass at a model
Philly-based data would be fun. Here is real estate, one option for data, seems ok, N=805
Predict box office for movies. VaultML claims they can do this by reading the screenplays and using textual analysis tools
UC Irvine has a data repo; some of its datasets are available via the scikit-learn package
Predicting where the wine is from (wine/location) is an easy starter challenge
More good sources: data.gov, data.census.gov, data.world, and https://ourworldindata.org, which is incredible and also has many repos on GitHub, including one that imports data via Python, …
Books
Range, by David Epstein, is a very interesting book generally, and it touches on prediction skill too
Superforecasters. Here is a decent free summary
1. I hope you’re ready for a lot of cheesy writing and bad meme humor this semester.