
Overview

This course is intended to help you learn how to analyze large datasets correctly and efficiently. We will discuss methods for analyzing big data and how to get research done more efficiently using basic scientific computing skills. The course combines limited lecture time with extensive hands-on work. We will cover data management, statistical methods, task automation, and how to make data analysis clear and reproducible. No prior programming experience is required. Although most of the data will be biological or environmental, the material will be transferable to other fields.

Course Goals

At the end of this course, students should be able to…

  • understand methods to analyze large datasets reproducibly.
  • effectively manage and analyze data.
  • apply the tools learned in class to research.

Student Learning Outcomes

At the end of this course, students should be able to…

  • write a script to visualize a dataset.
  • select appropriate statistical methods to produce unbiased results for a dataset.
  • document code so that it is readable to someone else.
  • produce a completely replicable analysis associated with a research project for a dataset of any size.
  • produce a research paper directly incorporating results of analyses from code.

*Bold items refer to goals for students in BIO 539.