1.10. Libraries/Packages
1.10.1. Popular, nay, essential packages
As the semester proceeds, you will surely need to learn (to some degree) the following packages. For each, you might note the most common and useful functions, and copy common “cookbook” uses of the packages which you can paste into new programs. (E.g. how to open a csv file.)
Note: Like many programmers, I do not commit many functions from many packages to memory. We simply know what can be done, and when we need something, we search (tab completion/Google/Stack Overflow) for the command or recipe that does it.
Built-in packages:
os
sys
itertools
re
datetime
csv
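As an example of the kind of "cookbook" recipe worth saving, here is a minimal sketch of writing and then reading a csv file with the built-in csv and os packages. The file name, column names, and values are made up for illustration, and the file goes into a temporary folder so the snippet does not depend on anything already on your computer.

```python
import csv
import os
import tempfile

# Hypothetical sample data, saved in a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "prices.csv")

rows_out = [
    {"ticker": "AAA", "price": "10.5"},
    {"ticker": "BBB", "price": "20.0"},
]

# The csv documentation recommends newline="" when opening csv files.
with open(path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["ticker", "price"])
    writer.writeheader()
    writer.writerows(rows_out)

# Read the file back: each row is a dict keyed by the column names.
with open(path, newline="") as f:
    rows_in = list(csv.DictReader(f))

# Values come back as strings, so convert them when you need numbers.
print(rows_in[0]["ticker"], float(rows_in[0]["price"]))
```

Later in the semester you will usually read csv files with pandas instead, but the built-in version is handy for small jobs.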
Data science packages (Anaconda installs these for you!). The aliases here aren't strictly needed, but by convention virtually everyone uses the shorter names:
pandas as pd (pd is a short alias for pandas)
seaborn as sns
matplotlib as mpl
statsmodels.api as sm
matplotlib.pyplot as plt
numpy as np
sklearn
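An alias is nothing more than a second, shorter name for the same module. The sketch below shows the mechanism with the built-in datetime package so that it runs without any extra installs. The alias dt and the date used are illustrative choices only. The data science packages follow the same pattern, e.g. import pandas as pd.

```python
# Without an alias, the full module name prefixes every use.
import datetime

full = datetime.date(2024, 1, 15)

# With an alias, the same module is reachable under a shorter name.
import datetime as dt

short = dt.date(2024, 1, 15)

# Both names point at the same module, so the results are identical.
print(dt is datetime)
print(short == full)
```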
Web crawling
requests, requests_html, urllib
time and tqdm
beautifulsoup4 (imported as bs4)
html5lib
selenium
1.10.2. Installing libraries
The Anaconda distribution we installed also included most of the key data science Python libraries/packages we will use throughout the semester. If you need to install a new package to add functionality to Python, e.g. seaborn (which you already have!), you can:
Open Anaconda Prompt (Windows), Terminal (Mac), or a code cell in Jupyter Lab.
Run conda install seaborn to install seaborn.
If conda install doesn't work for a package, you can try to pip install it instead, e.g. pip install seaborn.
Some packages can't be pip installed, but hopefully you won't need to deal with that this semester, so I'm going to skip discussion of such package installs.
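Before installing anything, it can help to check whether Python can already find the package. Here is a minimal sketch using the built-in importlib package. The helper name is_installed and the list of packages checked are my own choices for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can find a top-level package by this name."""
    return importlib.util.find_spec(package_name) is not None


# Use the import name, which can differ from the install name
# (e.g. you install scikit-learn but import sklearn).
for name in ["pandas", "seaborn", "numpy", "sklearn"]:
    status = "installed" if is_installed(name) else "missing"
    print(name, status)
```

Anything reported as missing is a candidate for conda install (or pip install) as described above.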