2.2. THE GOLDEN RULES
Following these GOLDEN RULES will save you hours and hours of time.
Tip
The rules below are really golden. Consider copying this page into your class notes repo, somewhere you can easily refer back to it.
| Category | Rule |
|---|---|
| 0. PLAN BEFORE YOU CODE | A. “Pseudo code” is writing out the broad steps in plain language. I often (almost always for complicated tasks) do this on paper, then translate it to code as an outline (in the code’s comments). |
| | B. Break the problem into chunks/smaller problems. Large chunks should be in different script files. Within an individual file that is doing one thing, try to break it down. See rules 3.A and 5.B below too. |
| 1. Automation | A. Automate everything that can be automated. Don’t do point-and-click analysis! |
| | B. If the project involves running multiple code files in order, write a single script that executes all code for the project from beginning to end, like this one. |
| 2. Version control | A. Store code and data under version control. |
| | C. Before checking the directory back in, clear all outputs, delete temp files, and then run the whole directory to make sure the outputs reproduce! (Check: Did it work right?) |
| 3. Directories | A. Separate folders/directories and files by function |
| | B. Put input files into an input folder and outputs into a different folder |
| | A + B = your folders and files will be largely self-documenting |
| | C. Make directories portable: they should run on any computer, or if you move them to another place on your computer. See the next rule. |
| 4. Data | A. Store cleaned data in tables with unique, non-missing “keys” |
| | B. Keep data normalized as far into your code pipeline as you can |
| | C. Data cleaning and exploratory data analysis golden rules are here. |
| 5. Functions | A. Write functions to eliminate redundancy |
| | B. Write functions to improve clarity |
| | C. Otherwise, don’t write functions |
| | D. Test your functions! Use small examples where you know the right answers, and try variations to see if the function breaks in some cases. |
| 6. Documentation | A. Is good… to a point |
| | B. Don’t write documentation you will not maintain |
| | C. Code is better off when it is self-documenting |
| 7. Writing code | A. Don’t use magic numbers. Define each value once as a named variable and refer to it as needed |
| | B. Write DRY code: Don’t Repeat Yourself! See rule 5.A. |
| | C. Premature optimization (for speed) is the root of all evil. |
| | D. Use self-documenting variable and function names. |
| 8. Look at your data/objects | As discussed here and in many other places. |
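To make rule 1.B concrete, here is a minimal sketch of a "run everything" script. The script names in `SCRIPTS` are hypothetical placeholders, and `run_all` is a helper name made up for this illustration; your project would list its own files.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical script names, listed in the order they must run.
SCRIPTS = ["01_download_data.py", "02_clean_data.py", "03_analysis.py"]


def run_all(scripts, project_dir="."):
    """Run each script in order with the current Python interpreter.

    Stops at the first failure so later steps never run on stale outputs.
    """
    project_dir = Path(project_dir)
    for name in scripts:
        script = project_dir / name
        if not script.is_file():
            raise FileNotFoundError(f"Missing pipeline step: {script}")
        print(f"Running {name} ...")
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run([sys.executable, script.name], cwd=project_dir, check=True)
```

Calling `run_all(SCRIPTS)` from the project root would then rebuild every output from scratch, which is also a handy way to do the "did it reproduce?" check in rule 2.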
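The rules about separate input and output folders and portable directories might look like this in practice. This is only a sketch: the folder names `inputs` and `outputs` and the file names are assumptions for illustration, not required names.

```python
from pathlib import Path

# Hypothetical folder and file names. Paths are relative to the project root,
# so nothing here depends on one particular computer.
INPUT_DIR = Path("inputs")
OUTPUT_DIR = Path("outputs")

raw_file = INPUT_DIR / "raw_prices.csv"
clean_file = OUTPUT_DIR / "clean_prices.csv"

# An absolute path like the one below breaks as soon as the folder moves:
# raw_file = Path("C:/Users/me/Desktop/project/inputs/raw_prices.csv")
```

Because every path is built from the two folder variables, moving the project (or cloning it onto another machine) requires no edits to the code.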
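For rule 4.A, one way to check that a table's "key" is unique and never missing is sketched below in plain Python. The `check_key` helper and the firm-year data are made up for this example; with a real dataset you would likely do the same check with your data library of choice.

```python
def check_key(rows, key_cols):
    """Raise ValueError unless key_cols uniquely identify every row and are never missing."""
    seen = set()
    for i, row in enumerate(rows):
        key = tuple(row.get(col) for col in key_cols)
        if any(value is None or value == "" for value in key):
            raise ValueError(f"Row {i} has a missing key value: {key}")
        if key in seen:
            raise ValueError(f"Row {i} duplicates key {key}")
        seen.add(key)


# Hypothetical firm-year table: (ticker, year) should identify each row.
firm_years = [
    {"ticker": "AAA", "year": 2020, "sales": 10.0},
    {"ticker": "AAA", "year": 2021, "sales": 12.5},
    {"ticker": "BBB", "year": 2020, "sales": 7.1},
]
check_key(firm_years, ["ticker", "year"])  # passes silently
```

Here `ticker` alone would not be a valid key, because "AAA" appears twice; the pair `(ticker, year)` is what identifies a row.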
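Rule 5.D in action, as a small sketch: a function, a few cases where the answer is known by hand, and a variation that should break. The function `pct_change` is a made-up example, not something from this course's code.

```python
def pct_change(old, new):
    """Return the percent change from old to new, e.g. 50 -> 75 is 50.0."""
    if old == 0:
        raise ValueError("Percent change is undefined when the starting value is 0.")
    return (new - old) / abs(old) * 100


# Small cases where the right answer is known by hand.
assert pct_change(50, 75) == 50.0
assert pct_change(200, 100) == -50.0
assert pct_change(-10, -5) == 50.0  # variation: negative starting value

# Variation that should fail loudly instead of returning nonsense.
try:
    pct_change(0, 5)
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for a zero starting value")
```

If any of these checks fails, Python stops with an error right at the offending line, so a broken function gets caught before it touches real data.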
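Rule 7.A, sketched: instead of typing a bare number in several places, define it once with a descriptive name. The value 252 (trading days per year) is a common convention that is assumed here for illustration, and both function names are hypothetical.

```python
import math

# Assumed convention for this sketch: 252 trading days per year.
TRADING_DAYS_PER_YEAR = 252


def annualize_volatility(daily_vol):
    """Scale a daily return volatility to an annual one."""
    return daily_vol * math.sqrt(TRADING_DAYS_PER_YEAR)


def annualize_mean_return(daily_mean):
    """Scale an average daily return to an annual one (simple, no compounding)."""
    return daily_mean * TRADING_DAYS_PER_YEAR
```

If you later decide to use a different day count, you change one line, and the name tells a reader what the number means.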