Statement on Generative AI tools

I use GenAI every day in my work. It is a great copilot and writes code well. This is especially true for smaller pieces of code where the inputs are well defined and desired outputs/goals are clear. I generally (but not always) view it as (1) an improvement on the typical documentation available for code packages and Q&A on StackOverflow, (2) a kind of auto-complete, and (3) a “coauthor” to get feedback from.

However, on the kinds of coding I do in my typical day-to-day research, suggestions from state-of-the-art GenAI models only serve as a starting point and usually do not work immediately (or appear to “work” but produce errors when checked closely). Many times I have found fatal flaws in “working” code that would have falsely altered my conclusions. Validating that code produced by GPT does exactly what I need is slow but important. Doing this well requires a strong understanding of the task at hand and of the data science concepts we will cover in the course.
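
One way to do this kind of validation is to run the code on a tiny input whose answer you can work out by hand. The sketch below is a made-up illustration, not course material: the function names, the data, and the use of -99 as a missing-value code are all assumptions.

```python
from statistics import mean

# Hypothetical survey responses; -99 is an assumed code for "missing".
responses = [4, 5, -99, 3]

def naive_average(values):
    """Runs without error, but treats the missing code as real data."""
    return mean(values)

def checked_average(values, missing_code=-99):
    """Drops the assumed missing code before averaging."""
    valid = [v for v in values if v != missing_code]
    return mean(valid)

# Hand check: the three valid responses average to (4 + 5 + 3) / 3 = 4.
print(naive_average(responses))    # runs fine, but is not 4
print(checked_average(responses))  # agrees with the hand check
```

Both functions run without complaint; only the comparison against the hand-computed answer reveals that the first one is wrong.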

On issues of fact, my baseline assumption is that I have to verify everything. As with checking code, this process is slow but important. Fact-checking GPT outputs is easier when you have institutional knowledge and expertise. Thus, my experience and knowledge base are necessary for GenAI to be useful to me in coding, writing, and research.

With that in mind, your long-term goal should be to gain sufficient skills and knowledge that you can supervise, instruct, validate, and edit the suggestions of GenAI.

So as you learn to code, it’s important that you use GenAI in a way that does not sabotage your long-term skill development. There is no substitute for attempting exercises on your own and discovering the answer on your own. Using GPT on exercises can create a mirage of learning; these tools can hinder your learning if you misuse them to bypass critical thinking tasks that you need to engage in. As beginners in your field(s) of study, you are not (yet) able to identify gaps, biases, or outright misinformation in AI output.

GPT cheating policy

Do not pose any full exercise questions to any GenAI. Using GPT solutions to full exercise prompts in this course is considered cheating for academic purposes.

  • Citing GPT does not matter for this policy - the issue is not plagiarism, but falsified learning outcomes.

  • Enforcement of this is mostly via incentives: If the answer is correct, you won’t have learned the concept well. The gap in your skills will grow as the course proceeds. If the answer is incorrect, you likely won’t be able to figure out why it failed and will receive a poor grade on the assignment.

  • However, I will report any violations of this policy that I am made aware of.

Suggestions for fair and smart use

  1. You can ask it to discuss concepts iteratively through a conversation (much like you would have a conversation with a peer, TA, or instructor). Ways to do this that facilitate learning:

    • “Explain this to me”: Copy in code from the website for more details and discussion.

    • To clarify and explain a coding concept after you have already looked through our website: “What is a for loop in Python and how can I use it to ____?”

    • To help you work through the logic of something you do not understand (“Why does this merge [copy code] work for this specific problem? How would changing the parameters cause a problem in my analysis?”)

    • Give it a problem description and your proposed algorithm, and “talk” through the potential issues.

  2. The most common reason students are tempted to use GPT excessively on exercises: “I’m at a point in the problem, and don’t know what to do next.” This means that either:

    • (A) you have not figured out the conceptual thing you want to do next (e.g. that you need to merge in a variable from another dataset), or

    • (B) you do not know which function or steps will achieve the conceptual thing (e.g. what function will merge in the variable).

    • In those situations, start first by using our website. It contains discussions, examples, code, or references to virtually everything we do in the course. The site has a search bar. Next, look up documentation from the official website of whichever package you think might be used to solve your next step (Python, Pandas, Seaborn, and Statsmodels). These official documentation websites contain examples and functions you might use. Learning to use online resources like this is a valuable skill.

  3. The other main reason students use GenAI is that they are stuck on an error.

    • Allowed with caveat: Copying and pasting your bad code and the error output into ChatGPT. This often helps solve the issue quickly because error output is often messy and hard for students to parse. (But: Simply reading the error message yourself nearly always points to the issue faster, and often to the answer.)

    • Caveat: If you copy code into ChatGPT that contains the exercise prompt or enough information about it, you might be given the answer. This subverts the goal of learning and, if you copy in enough info, can constitute cheating.

    • Please read the help page on the website, which describes

      • The 15-minute rule - ask for help after you have been stuck for 15 minutes

      • What to do when you are stuck

      • How to ask for help
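
As a rough illustration of the point in item 3 about reading the error yourself, the sketch below triggers a common error and pulls out the last line of the traceback, which names the exception type and message. The dictionary, its values, and the function name are hypothetical and chosen only for this example.

```python
import traceback

# Hypothetical record: the key "gdp" exists, but "GDP" does not.
country = {"name": "Canada", "gdp": 2.1}

def get_gdp(record):
    return record["GDP"]  # wrong capitalization, raises KeyError

try:
    get_gdp(country)
except KeyError as err:
    # The last line of a traceback gives the exception type and message.
    last_line = traceback.format_exc().strip().splitlines()[-1]
    print(last_line)
    missing_key = err.args[0]  # the key Python could not find
```

Here the final line of the error output already says which key was not found, so comparing it with the keys that do exist is usually enough to spot the capitalization mistake.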

This page borrows in a few places from the syllabi of Grusha Prasad at Colgate University and Kirsten Helmer at University of Massachusetts Amherst.