1.1. Motivation

1.1.1. Or: Should I take this class? (A: YES!)

Employers are starving for talented students who can analyze large datasets, and they are willing to pay for it:


Python is #2 among languages in terms of salary:


Within “data science” itself (a broader category than finance jobs with data science), Python has the most jobs:


Among the common alternative data science languages, Python is the most popular on Stack Overflow (SO):


Coding skills are valuable to your resume because computing power can make you more productive at tasks. You’ll see an example of this on our first day in class. This is why knowing how to code will open more interview doors and possibly lead to improved job offers (and change the trajectory of your career).

Still, most students in this class do not have a comparative advantage in the labor market in terms of their computer science skills relative to students in CS programs. That’s fine! Most students in this class don’t want pure CS jobs.


Your comparative advantage is probably the ability to apply your business acumen to problems where coding plays a role in analyzing potential business decisions or setting up a production pipeline!

This is VERY valuable!

A saw can be dangerous, useless, or useful - which it is depends on the context you use it in.

ML techniques, computing power, and big data can be dangerous, useless, or useful - which they are depends on the context you use them in. In particular, what are the relevant economics you’re facing? A great coder might ignore the economics, but your previous classes will help you evaluate the economics!


Adding this class to your Lehigh business education gives you a set of complementary skills: The ability to use high-power datasets and analysis within a framework for problem-solving.

1.1.2. How to create value

One way:

  1. Identify a problem: A business need, a market with frictions.

  2. Define your project: A clearly specified question with metrics for success and an idea of impact. Always keep the big picture, and economic context, in mind!

  3. Work collaboratively on the problem: Interesting problems are big, and require teamwork. Sir Edmund Hillary does not summit Everest without Tenzing Norgay.

  4. Acquire data and clean it: Age-old wisdom tells us that if the input is crap, the output will be… Thus, time spent on cleaning data is often more valuable than time spent on modeling.

  5. Explore the data.

  6. Analyze the data using appropriate modeling tools. Students get excited about modeling techniques, but this is <25% of the work on most projects.

  7. Deliver the project conclusions to higher-ups in the form of clear business recommendations. Writing should always be geared to the audience, and managers typically want bottom lines, whereas technical leads need more technical justification.


Understanding and practicing each of these steps is valuable, even if you never code again after this class!

You’ve applied these steps in previous classes. (For example, valuing a potential merger.) The goal of this class is to expand the set of problems you can work on!
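To make steps 4 through 7 above a bit more concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the store data is made up, the column names (`ad_spend`, `sales`) and function names are hypothetical, and a one-variable least-squares line stands in for whatever "appropriate modeling tool" a real project would call for.

```python
import csv
import io
import statistics

# Hypothetical raw data (made up for illustration): ad spend and sales,
# both in $ thousands, for five stores. Store C is missing its ad spend.
RAW_CSV = """store,ad_spend,sales
A,10,120
B,20,190
C,,150
D,30,260
E,40,330
"""


def load_rows(text):
    """Step 4 (acquire): read CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Step 4 (clean): drop rows with missing values, convert text to numbers."""
    cleaned = []
    for row in rows:
        if row["ad_spend"] and row["sales"]:
            cleaned.append((float(row["ad_spend"]), float(row["sales"])))
    return cleaned


def explore(pairs):
    """Step 5 (explore): a few summary statistics."""
    spend = [x for x, _ in pairs]
    sales = [y for _, y in pairs]
    return {
        "n": len(pairs),
        "mean_spend": statistics.mean(spend),
        "mean_sales": statistics.mean(sales),
    }


def fit_line(pairs):
    """Step 6 (analyze): least-squares slope and intercept for sales on spend."""
    x_bar = statistics.mean(x for x, _ in pairs)
    y_bar = statistics.mean(y for _, y in pairs)
    covariation = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    variation = sum((x - x_bar) ** 2 for x, _ in pairs)
    slope = covariation / variation
    intercept = y_bar - slope * x_bar
    return slope, intercept


rows = load_rows(RAW_CSV)
clean = clean_rows(rows)
summary = explore(clean)
slope, intercept = fit_line(clean)

# Step 7 (deliver): lead with the bottom line, in plain language.
recommendation = (
    f"Across {summary['n']} stores with complete data, each extra $1k of ad "
    f"spend is associated with roughly ${slope:.1f}k in additional sales."
)
print(recommendation)
```

Notice how little of the sketch is "modeling": most of the lines load, clean, and summarize the data, and the last step translates a number into a sentence a manager can act on. The association it reports is not evidence of causation, which is exactly where the economic context from steps 1 and 2 comes in.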

1.1.3. From here to there…

I’ve designed this class with the hope that you’ll be prepared and able to execute each of those steps.

By the end of the semester, your resume can, if you choose, include your (burgeoning) proficiency with Python, GitHub, machine learning (ML) tools, web scraping, and data viz, in addition to describing your exploits on GitHub and the final course project.

So, your journey this semester is hopefully something like: [1]


1.1.4. Our partnership

Your half of the bargain:

  1. You will have to work outside of class quite a bit.

  2. You will seek help when you need it. Resources are available: me, the TA, the resources section of the website, and your classmates.

    • When you have questions, ask! Falling behind is costly, and asking a question is cheap. In fact, community participation (which includes asking questions) is rewarded explicitly in grading.

    • If you’re confused or having computer issues, someone else surely is too.

My half of the bargain: I will work just as hard as you throughout the semester to improve this new class. I’m very available for help. When something doesn’t work out, I’ll try to improve it.

1.1.5. Ambitious but feasible

This class is ambitious! You will need to learn skills from computer science, statistics, and econometrics just so that we have the toolkit needed to begin analysis. I’m aiming to make each of those components accessible (e.g. we won’t prove any theorems, and I’m boiling down programming to essentials). Still, that menu of skills is not easy to acquire (that’s why employers pay $$$ for it!).

If you’ve never programmed:

  • Students with no programming background have succeeded every semester.

  • I swear, youngens these days have it so much easier! [2] Seriously, getting Python up and running has never been quicker, and we will have some working code soon!

  • You will be frustrated at times. This is natural! No programmer exists who has not cursed their computer to the depths of hell.

    • This is completely true: Half the time, it’s a silly typo on line 42 of your code. Like, you literally misspelled “regression” as “regresion”.

    • Corollary: A lot of programming takes place after dark, under the influence of coffee and Red Bull. This is why you misspelled “regresion”. Try to program at times when you have a clearer mind :)

  • Overcoming those frustrating issues feels soooooo good. You’ll feel a sense of accomplishment. Fight for that!

  • Your classmates are in it too, and they can, and surely will, help.

  • Programming is not about memorizing the annoying details of 30 functions. I won’t test you on memorization.
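For a taste of the “regresion” kind of typo mentioned above, here is a minimal sketch of what Python does when a name is misspelled. The variable name and its value are made up for illustration; the point is only that the error message tells you which name Python could not find.

```python
# The name we meant to use (hypothetical, for illustration only).
regression = "fit the model"

try:
    # Typo: one "s" is missing, so Python has never heard of this name.
    print(regresion)
except NameError as err:
    message = str(err)
    print(f"Python says: {message}")
```

Reading the error message carefully (it quotes the misspelled name back to you) is usually the fastest way out of this kind of frustration.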


[1] I guess that makes me your old assistant on the journey…