Welcome

Welcome to Statistics and Data Science

In this course you will learn how we draw inferences and conclusions about the world from samples of data. Alongside the theory, this course is taught using Python – a popular general-purpose programming language used for a wide range of applications, including research.

Why do I need to know Statistics?

Throughout your studies – and in your everyday life – you will constantly encounter data. News headlines, health advice, research articles, political debates, and advertisements are filled with numbers which supposedly support certain claims. Being able to evaluate the legitimacy of the evidence presented to you is an important skill even beyond the exam. Statistics allows us to:

  • Separate signal from noise in complex information

  • Judge whether claims are trustworthy

  • Systematically test hypotheses

  • Turn raw data into meaningful knowledge

Why should I learn statistics through programming?

Learning statistics through programming presents a great opportunity. Programming gives you a clear, hands-on way to see how statistical ideas work. Here’s why we think this approach is worthwhile:

  • It makes the process transparent, as you will see how each calculation is done, rather than just getting results.

  • It gives you more flexibility, as you will be able to apply the same skills to any data set that you encounter.

  • It is practical. Python is used in many different fields, so having a good handle on these topics will give you a basis for potential future careers even beyond research.
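To illustrate the point about transparency, here is a minimal sketch that computes a mean step by step and then compares it with the one-line result from Python's standard library. The variable names and the sample values are made up for illustration and are not course data.

```python
import statistics

# Hypothetical sample (illustrative values only, not course data)
heights_cm = [162.0, 175.5, 168.0, 181.5, 170.0]

# Step by step: add up the values, then divide by how many there are
total = sum(heights_cm)
n = len(heights_cm)
mean_by_hand = total / n

# The same result in one line, using the standard library
mean_from_library = statistics.mean(heights_cm)

print(mean_by_hand, mean_from_library)
```

Writing out the intermediate steps shows exactly where the final number comes from, which is harder to see when a tool only reports the result.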

Think of learning to program like learning a new language. It may feel unfamiliar at first, but the more you use it, the more “fluent” you become. Try not to be hard on yourself in the beginning!

DataCamp

As preparation for the course and support throughout, we use the online coding environment DataCamp. DataCamp is a great way to learn coding as it is interactive: it takes you through tasks step by step, offers you hints, and points out where you have made errors. There are also video tutorials on key topics.

I strongly encourage you to use DataCamp for the following:

  • preparatory work (see notes in Chapter 1)

  • support with coding syntax throughout the course

  • extension work: DataCamp has data science courses that will take you beyond the scope of this course. DataCamp offers 350+ courses by expert instructors on topics such as importing data, data visualization, and machine learning. They’re constantly expanding their curriculum to keep up with the latest technology trends and to provide the best learning experience for all skill levels.

A big shout out to DataCamp for providing free access for students on this course, thanks!