2.10. Extra Practice
This section is meant to help you practise the core skills you developed in the previous exercises. These exercises are optional and are only here to give you a little extra practice if you want it.
2.10.1. Set up Python Libraries
As usual, you will need to run this code block to import the relevant Python libraries.
# Set-up Python libraries - you need to run this but you don't need to change it
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
import pandas as pd
import seaborn as sns
sns.set_theme(style='white')
import statsmodels.api as sm
import statsmodels.formula.api as smf
import warnings
warnings.simplefilter('ignore', category=FutureWarning)
2.10.2. Import a dataset to work with
Here we will read in a dataset that covers a wide range of variables related to Taylor Swift’s discography. Each row of the dataset represents a song, and the columns include both musical features (derived from Spotify’s audio analysis) and metadata such as the song title, album, release date, and popularity score. Here are some key variables, but feel free to explore the dataset further for more information:
- name: Title of the song
- album: Name of the album the song appears on
- release_date: Date the song was released
- popularity: Spotify popularity score (0–100)
- duration_ms: Length of the song in milliseconds
- danceability: How suitable the track is for dancing (0–1)
- energy: Intensity and activity level of the track (0–1)
- acousticness: Degree of acoustic sound (0–1)
- valence: Positivity or happiness of the musical content (0–1)
- tempo: Estimated tempo in beats per minute (BPM)
- loudness: Overall loudness of the track in decibels (dB)
taytay = pd.read_csv("https://raw.githubusercontent.com/SageBoettcher/StatsCourseBook_2026/main/data/taylor_swift_spotify.csv")
display(taytay)
| | Unnamed: 0 | name | album | release_date | track_number | id | uri | acousticness | danceability | energy | instrumentalness | liveness | loudness | speechiness | tempo | valence | popularity | duration_ms |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Fortnight (feat. Post Malone) | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 1 | 6dODwocEuGzHAavXqTbwHv | spotify:track:6dODwocEuGzHAavXqTbwHv | 0.50200 | 0.504 | 0.386 | 0.000015 | 0.0961 | -10.976 | 0.0308 | 192.004 | 0.281 | 82 | 228965 |
| 1 | 1 | The Tortured Poets Department | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 2 | 4PdLaGZubp4lghChqp8erB | spotify:track:4PdLaGZubp4lghChqp8erB | 0.04830 | 0.604 | 0.428 | 0.000000 | 0.1260 | -8.441 | 0.0255 | 110.259 | 0.292 | 79 | 293048 |
| 2 | 2 | My Boy Only Breaks His Favorite Toys | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 3 | 7uGYWMwRy24dm7RUDDhUlD | spotify:track:7uGYWMwRy24dm7RUDDhUlD | 0.13700 | 0.596 | 0.563 | 0.000000 | 0.3020 | -7.362 | 0.0269 | 97.073 | 0.481 | 80 | 203801 |
| 3 | 3 | Down Bad | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 4 | 1kbEbBdEgQdQeLXCJh28pJ | spotify:track:1kbEbBdEgQdQeLXCJh28pJ | 0.56000 | 0.541 | 0.366 | 0.000001 | 0.0946 | -10.412 | 0.0748 | 159.707 | 0.168 | 82 | 261228 |
| 4 | 4 | So Long, London | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 5 | 7wAkQFShJ27V8362MqevQr | spotify:track:7wAkQFShJ27V8362MqevQr | 0.73000 | 0.423 | 0.533 | 0.002640 | 0.0816 | -11.388 | 0.3220 | 160.218 | 0.248 | 80 | 262974 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 577 | 577 | Our Song | Taylor Swift (Deluxe Edition) | 2006-10-24 | 11 | 1j6gmK6u4WNI33lMZ8dC1s | spotify:track:1j6gmK6u4WNI33lMZ8dC1s | 0.11100 | 0.668 | 0.672 | 0.000000 | 0.3290 | -4.931 | 0.0303 | 89.011 | 0.539 | 64 | 201106 |
| 578 | 578 | I'm Only Me When I'm With You | Taylor Swift (Deluxe Edition) | 2006-10-24 | 12 | 7CzxXgQXurKZCyHz9ufbo1 | spotify:track:7CzxXgQXurKZCyHz9ufbo1 | 0.00452 | 0.563 | 0.934 | 0.000807 | 0.1030 | -3.629 | 0.0646 | 143.964 | 0.518 | 56 | 213053 |
| 579 | 579 | Invisible | Taylor Swift (Deluxe Edition) | 2006-10-24 | 13 | 1k3PzDNjg38cWqOvL4M9vq | spotify:track:1k3PzDNjg38cWqOvL4M9vq | 0.63700 | 0.612 | 0.394 | 0.000000 | 0.1470 | -5.723 | 0.0243 | 96.001 | 0.233 | 54 | 203226 |
| 580 | 580 | A Perfectly Good Heart | Taylor Swift (Deluxe Edition) | 2006-10-24 | 14 | 0YgHuReCSPwTXYny7isLja | spotify:track:0YgHuReCSPwTXYny7isLja | 0.00349 | 0.483 | 0.751 | 0.000000 | 0.1280 | -5.726 | 0.0365 | 156.092 | 0.268 | 53 | 220146 |
| 581 | 581 | Teardrops on My Guitar - Pop Version | Taylor Swift (Deluxe Edition) | 2006-10-24 | 15 | 1hxLyjC9D9Jpw6EAPKqWv4 | spotify:track:1hxLyjC9D9Jpw6EAPKqWv4 | 0.04020 | 0.459 | 0.753 | 0.000000 | 0.0863 | -3.827 | 0.0537 | 199.997 | 0.483 | 55 | 179066 |
582 rows × 18 columns
2.10.3. Part 1: Distributions
Let’s have an initial peek at the dataset:
a. Have a look at the distribution of the variable popularity. Can you find any songs that you might suspect to be “outliers”?
# Your code here
Your text here
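One possible starting point for part a is sketched below. It uses only the popularity scores of the ten rows displayed in the table above, not the full dataset, and it applies the 1.5 × IQR rule of thumb, which is an assumption here rather than the method the exercise requires. A histogram such as sns.histplot(data=taytay, x='popularity') is the more visual route.

```python
import pandas as pd

# Popularity scores of the ten rows displayed above (a small sample, not the full dataset)
popularity = pd.Series([82, 79, 80, 82, 80, 64, 56, 54, 53, 55])

# Rule of thumb (an assumption, not the only definition of an outlier):
# flag values lying more than 1.5 * IQR beyond the quartiles
q1, q3 = popularity.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = popularity[(popularity < lower) | (popularity > upper)]
print(f"Fences: {lower} to {upper}")
print(outliers)
```

On the full dataset you would replace the hand-typed Series with taytay['popularity'] and then look up the flagged rows to see which songs they are.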
b. Have a look at the distribution of song duration (the variable duration_ms). Does the distribution look skewed?
# Your code here
Your text here
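As a rough numerical check to go alongside a plot, you could compare the mean and the median and look at the sample skewness. The sketch below does this for the ten durations displayed in the table above only, so the numbers will differ from those for the full dataset. Converting to minutes is just a convenience for reading the values.

```python
import pandas as pd

# duration_ms for the ten rows displayed above (a small sample, not the full dataset)
duration_ms = pd.Series([228965, 293048, 203801, 261228, 262974,
                         201106, 213053, 203226, 220146, 179066])

# Milliseconds are hard to read, so convert to minutes
duration_min = duration_ms / 60_000

mean_min = duration_min.mean()
median_min = duration_min.median()
skewness = duration_min.skew()  # a positive value suggests a longer right-hand tail

print(f"mean = {mean_min:.2f} min, median = {median_min:.2f} min, skew = {skewness:.2f}")
```

A mean that sits above the median, together with a positive skew value, points towards a right-skewed distribution, but a histogram is still the best way to judge the shape.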
c. Have a look at the distribution of the variable danceability.
# Your code here
d. Let’s add a new variable that classifies each song as either a dance song or not.
To do this, we will check whether each song’s danceability is above the median (True) or not (False). Think! What percentage of songs should be in each group?
# True if the song's danceability is strictly above the median, otherwise False
taytay['dancey_song'] = taytay.danceability > taytay.danceability.median()
taytay
| | Unnamed: 0 | name | album | release_date | track_number | id | uri | acousticness | danceability | energy | instrumentalness | liveness | loudness | speechiness | tempo | valence | popularity | duration_ms | dancey_song |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Fortnight (feat. Post Malone) | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 1 | 6dODwocEuGzHAavXqTbwHv | spotify:track:6dODwocEuGzHAavXqTbwHv | 0.50200 | 0.504 | 0.386 | 0.000015 | 0.0961 | -10.976 | 0.0308 | 192.004 | 0.281 | 82 | 228965 | False |
| 1 | 1 | The Tortured Poets Department | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 2 | 4PdLaGZubp4lghChqp8erB | spotify:track:4PdLaGZubp4lghChqp8erB | 0.04830 | 0.604 | 0.428 | 0.000000 | 0.1260 | -8.441 | 0.0255 | 110.259 | 0.292 | 79 | 293048 | True |
| 2 | 2 | My Boy Only Breaks His Favorite Toys | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 3 | 7uGYWMwRy24dm7RUDDhUlD | spotify:track:7uGYWMwRy24dm7RUDDhUlD | 0.13700 | 0.596 | 0.563 | 0.000000 | 0.3020 | -7.362 | 0.0269 | 97.073 | 0.481 | 80 | 203801 | True |
| 3 | 3 | Down Bad | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 4 | 1kbEbBdEgQdQeLXCJh28pJ | spotify:track:1kbEbBdEgQdQeLXCJh28pJ | 0.56000 | 0.541 | 0.366 | 0.000001 | 0.0946 | -10.412 | 0.0748 | 159.707 | 0.168 | 82 | 261228 | False |
| 4 | 4 | So Long, London | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 5 | 7wAkQFShJ27V8362MqevQr | spotify:track:7wAkQFShJ27V8362MqevQr | 0.73000 | 0.423 | 0.533 | 0.002640 | 0.0816 | -11.388 | 0.3220 | 160.218 | 0.248 | 80 | 262974 | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 577 | 577 | Our Song | Taylor Swift (Deluxe Edition) | 2006-10-24 | 11 | 1j6gmK6u4WNI33lMZ8dC1s | spotify:track:1j6gmK6u4WNI33lMZ8dC1s | 0.11100 | 0.668 | 0.672 | 0.000000 | 0.3290 | -4.931 | 0.0303 | 89.011 | 0.539 | 64 | 201106 | True |
| 578 | 578 | I'm Only Me When I'm With You | Taylor Swift (Deluxe Edition) | 2006-10-24 | 12 | 7CzxXgQXurKZCyHz9ufbo1 | spotify:track:7CzxXgQXurKZCyHz9ufbo1 | 0.00452 | 0.563 | 0.934 | 0.000807 | 0.1030 | -3.629 | 0.0646 | 143.964 | 0.518 | 56 | 213053 | False |
| 579 | 579 | Invisible | Taylor Swift (Deluxe Edition) | 2006-10-24 | 13 | 1k3PzDNjg38cWqOvL4M9vq | spotify:track:1k3PzDNjg38cWqOvL4M9vq | 0.63700 | 0.612 | 0.394 | 0.000000 | 0.1470 | -5.723 | 0.0243 | 96.001 | 0.233 | 54 | 203226 | True |
| 580 | 580 | A Perfectly Good Heart | Taylor Swift (Deluxe Edition) | 2006-10-24 | 14 | 0YgHuReCSPwTXYny7isLja | spotify:track:0YgHuReCSPwTXYny7isLja | 0.00349 | 0.483 | 0.751 | 0.000000 | 0.1280 | -5.726 | 0.0365 | 156.092 | 0.268 | 53 | 220146 | False |
| 581 | 581 | Teardrops on My Guitar - Pop Version | Taylor Swift (Deluxe Edition) | 2006-10-24 | 15 | 1hxLyjC9D9Jpw6EAPKqWv4 | spotify:track:1hxLyjC9D9Jpw6EAPKqWv4 | 0.04020 | 0.459 | 0.753 | 0.000000 | 0.0863 | -3.827 | 0.0537 | 199.997 | 0.483 | 55 | 179066 | False |
582 rows × 19 columns
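To check your answer to the “Think!” question, you can count the proportion of songs in each group. The sketch below uses only the ten danceability values displayed above, so its median is not the same as the median of the full dataset. The second part uses made-up values (labelled as hypothetical) to show what happens when several songs are tied at the median.

```python
import pandas as pd

# Danceability for the ten rows displayed above (a small sample, not the full dataset)
danceability = pd.Series([0.504, 0.604, 0.596, 0.541, 0.423,
                          0.668, 0.563, 0.612, 0.483, 0.459])

dancey = danceability > danceability.median()
shares = dancey.value_counts(normalize=True)  # proportion of True and of False
print(shares)

# Hypothetical values: three songs tied exactly at the median
tied = pd.Series([0.5, 0.5, 0.5, 0.7])
tied_share = (tied > tied.median()).mean()  # the mean of a boolean Series is the share of True
print(tied_share)
```

With no ties the split is half and half, but because the comparison is a strict “greater than”, any songs that sit exactly on the median are counted as not dancey, so the True group can end up smaller than 50%.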
2.10.4. Part 2: Counts
a. How many songs does Taylor have on each of her albums?
Hint: if you are having trouble reading the x-axis, try using the command plt.xticks(rotation=90).
# Your code here
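One way to get the counts as a table is value_counts(). The sketch below builds a tiny stand-in dataframe from the album names of the ten rows displayed earlier, so the counts are those of the sample and not of the full dataset. For a plot, sns.countplot(data=taytay, x='album') followed by the plt.xticks(rotation=90) hint would be one option.

```python
import pandas as pd

# Album names of the ten rows displayed earlier (a small sample, not the full dataset)
sample = pd.DataFrame({
    "album": ["THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY"] * 5
             + ["Taylor Swift (Deluxe Edition)"] * 5
})

# Number of rows (songs) for each distinct album name
songs_per_album = sample["album"].value_counts()
print(songs_per_album)
```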
b. Does Taylor usually balance the number of dancey songs on her albums?
# Your code here
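Because dancey_song is True or False, its mean within each album is the proportion of dancey songs on that album. The sketch below shows the idea on the ten rows displayed earlier, using the dancey_song values from that table; it is one possible approach, and a grouped count plot would be an equally reasonable one.

```python
import pandas as pd

# album and dancey_song for the ten rows displayed earlier (a small sample)
sample = pd.DataFrame({
    "album": ["THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY"] * 5
             + ["Taylor Swift (Deluxe Edition)"] * 5,
    "dancey_song": [False, True, True, False, False,
                    True, False, True, False, False],
})

# Proportion of dancey songs per album (mean of True/False values)
dancey_share = sample.groupby("album")["dancey_song"].mean()
print(dancey_share)
```

An album with a share close to 0.5 is balanced in the sense of the question; shares near 0 or 1 indicate an album that leans strongly one way.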
2.10.5. Part 3: Comparing Across Variables
a. Do Taylor’s dancey songs tend to be more popular?
# Your code here
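A simple numerical comparison is to summarise popularity separately for the two groups. The sketch below does this for the ten rows displayed earlier only; the group labels "dancey" and "not dancey" are names introduced here for readability and are not part of the dataset.

```python
import pandas as pd

# popularity and dancey_song for the ten rows displayed earlier (a small sample)
sample = pd.DataFrame({
    "popularity": [82, 79, 80, 82, 80, 64, 56, 54, 53, 55],
    "dancey_song": [False, True, True, False, False,
                    True, False, True, False, False],
})

# Readable group labels (hypothetical names, not columns in the original data)
sample["group"] = sample["dancey_song"].map({True: "dancey", False: "not dancey"})

# Mean and median popularity within each group
by_group = sample.groupby("group")["popularity"].agg(["mean", "median"])
print(by_group)
```

A plot that shows the whole distribution in each group, for example a boxplot of popularity split by dancey_song, would tell you more than two summary numbers.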
b. What is Taylor’s most popular album?
Does this change if you use mean, median, or max song popularity?
# Your code here
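To compare the three summary statistics side by side, you could group by album and compute all of them in one go. The sketch below uses the ten rows displayed earlier, so the "winner" it reports applies to that sample only.

```python
import pandas as pd

# album and popularity for the ten rows displayed earlier (a small sample)
sample = pd.DataFrame({
    "album": ["THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY"] * 5
             + ["Taylor Swift (Deluxe Edition)"] * 5,
    "popularity": [82, 79, 80, 82, 80, 64, 56, 54, 53, 55],
})

# One row per album, one column per summary statistic
summary = sample.groupby("album")["popularity"].agg(["mean", "median", "max"])
print(summary)

# For each statistic, the album with the highest value
top_album = summary.idxmax()
print(top_album)
```

If the three entries of top_album disagree on the full dataset, that is a sign that the answer depends on how you define "most popular".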
c. Do Taylor’s most popular tracks tend to appear in a certain position on an album?
# Your code here
d. Is there a relationship between valence and popularity?
# Your code here
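Alongside a scatterplot of the two variables, a correlation coefficient gives a single-number summary of a linear relationship. The sketch below computes Pearson’s r for the ten rows displayed earlier; with so few songs the value says very little, so treat it as a template for the full dataset and not as a result.

```python
import pandas as pd

# valence and popularity for the ten rows displayed earlier (a small sample)
sample = pd.DataFrame({
    "valence": [0.281, 0.292, 0.481, 0.168, 0.248,
                0.539, 0.518, 0.233, 0.268, 0.483],
    "popularity": [82, 79, 80, 82, 80, 64, 56, 54, 53, 55],
})

# Pearson correlation (the default method of Series.corr)
r = sample["valence"].corr(sample["popularity"])
print(f"Pearson r = {r:.2f}")
```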
2.10.6. Part 4: Timeseries
Let’s start by adding a new variable called release_year to our dataset, based on release_date.
taytay['release_year']=taytay['release_date'].str[:4].astype(int)
taytay
| | Unnamed: 0 | name | album | release_date | track_number | id | uri | acousticness | danceability | energy | instrumentalness | liveness | loudness | speechiness | tempo | valence | popularity | duration_ms | dancey_song | release_year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Fortnight (feat. Post Malone) | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 1 | 6dODwocEuGzHAavXqTbwHv | spotify:track:6dODwocEuGzHAavXqTbwHv | 0.50200 | 0.504 | 0.386 | 0.000015 | 0.0961 | -10.976 | 0.0308 | 192.004 | 0.281 | 82 | 228965 | False | 2024 |
| 1 | 1 | The Tortured Poets Department | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 2 | 4PdLaGZubp4lghChqp8erB | spotify:track:4PdLaGZubp4lghChqp8erB | 0.04830 | 0.604 | 0.428 | 0.000000 | 0.1260 | -8.441 | 0.0255 | 110.259 | 0.292 | 79 | 293048 | True | 2024 |
| 2 | 2 | My Boy Only Breaks His Favorite Toys | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 3 | 7uGYWMwRy24dm7RUDDhUlD | spotify:track:7uGYWMwRy24dm7RUDDhUlD | 0.13700 | 0.596 | 0.563 | 0.000000 | 0.3020 | -7.362 | 0.0269 | 97.073 | 0.481 | 80 | 203801 | True | 2024 |
| 3 | 3 | Down Bad | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 4 | 1kbEbBdEgQdQeLXCJh28pJ | spotify:track:1kbEbBdEgQdQeLXCJh28pJ | 0.56000 | 0.541 | 0.366 | 0.000001 | 0.0946 | -10.412 | 0.0748 | 159.707 | 0.168 | 82 | 261228 | False | 2024 |
| 4 | 4 | So Long, London | THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY | 2024-04-19 | 5 | 7wAkQFShJ27V8362MqevQr | spotify:track:7wAkQFShJ27V8362MqevQr | 0.73000 | 0.423 | 0.533 | 0.002640 | 0.0816 | -11.388 | 0.3220 | 160.218 | 0.248 | 80 | 262974 | False | 2024 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 577 | 577 | Our Song | Taylor Swift (Deluxe Edition) | 2006-10-24 | 11 | 1j6gmK6u4WNI33lMZ8dC1s | spotify:track:1j6gmK6u4WNI33lMZ8dC1s | 0.11100 | 0.668 | 0.672 | 0.000000 | 0.3290 | -4.931 | 0.0303 | 89.011 | 0.539 | 64 | 201106 | True | 2006 |
| 578 | 578 | I'm Only Me When I'm With You | Taylor Swift (Deluxe Edition) | 2006-10-24 | 12 | 7CzxXgQXurKZCyHz9ufbo1 | spotify:track:7CzxXgQXurKZCyHz9ufbo1 | 0.00452 | 0.563 | 0.934 | 0.000807 | 0.1030 | -3.629 | 0.0646 | 143.964 | 0.518 | 56 | 213053 | False | 2006 |
| 579 | 579 | Invisible | Taylor Swift (Deluxe Edition) | 2006-10-24 | 13 | 1k3PzDNjg38cWqOvL4M9vq | spotify:track:1k3PzDNjg38cWqOvL4M9vq | 0.63700 | 0.612 | 0.394 | 0.000000 | 0.1470 | -5.723 | 0.0243 | 96.001 | 0.233 | 54 | 203226 | True | 2006 |
| 580 | 580 | A Perfectly Good Heart | Taylor Swift (Deluxe Edition) | 2006-10-24 | 14 | 0YgHuReCSPwTXYny7isLja | spotify:track:0YgHuReCSPwTXYny7isLja | 0.00349 | 0.483 | 0.751 | 0.000000 | 0.1280 | -5.726 | 0.0365 | 156.092 | 0.268 | 53 | 220146 | False | 2006 |
| 581 | 581 | Teardrops on My Guitar - Pop Version | Taylor Swift (Deluxe Edition) | 2006-10-24 | 15 | 1hxLyjC9D9Jpw6EAPKqWv4 | spotify:track:1hxLyjC9D9Jpw6EAPKqWv4 | 0.04020 | 0.459 | 0.753 | 0.000000 | 0.0863 | -3.827 | 0.0537 | 199.997 | 0.483 | 55 | 179066 | False | 2006 |
582 rows × 20 columns
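The code above takes the first four characters of each date string and converts them to an integer, which works because every date is written as YYYY-MM-DD. An alternative, sketched below on the two release dates visible in the table, is to parse the strings as dates with pd.to_datetime and read off the year; this is offered as a comparison, not as a correction.

```python
import pandas as pd

# The two release dates visible in the table above
dates = pd.Series(["2024-04-19", "2006-10-24"])

# Approach used above: slice the first four characters, then convert to int
year_from_slice = dates.str[:4].astype(int)

# Alternative: parse as dates first, then extract the year
year_from_datetime = pd.to_datetime(dates).dt.year

print(year_from_slice.tolist(), year_from_datetime.tolist())
```

Parsing the dates has the side benefit that month and day become available in the same way (.dt.month, .dt.day) if you want them later.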
a. Now let’s track how Taylor’s songs have changed over the years. Start by tracking the popularity of her songs across the years.
#Your code here
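One way to track a variable over time is to average it within each release year. The sketch below does this for popularity, valence, and danceability on the ten rows displayed earlier, which cover only 2006 and 2024. On the full dataset, sns.lineplot(data=taytay, x='release_year', y='popularity') would plot the yearly mean directly, since lineplot averages over repeated x values by default.

```python
import pandas as pd

# Ten rows displayed earlier (a small sample covering two release years only)
sample = pd.DataFrame({
    "release_year": [2024] * 5 + [2006] * 5,
    "popularity": [82, 79, 80, 82, 80, 64, 56, 54, 53, 55],
    "valence": [0.281, 0.292, 0.481, 0.168, 0.248,
                0.539, 0.518, 0.233, 0.268, 0.483],
    "danceability": [0.504, 0.604, 0.596, 0.541, 0.423,
                     0.668, 0.563, 0.612, 0.483, 0.459],
})

# Mean of each variable within each release year, sorted by year
yearly_means = sample.groupby("release_year")[["popularity", "valence", "danceability"]].mean()
print(yearly_means)
```

The same table covers parts b and c: each column can be plotted against the index to see how that variable changes from year to year.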
b. Okay, now do the same for valence.
#Your code here
c. And how about danceability?
#Your code here
2.10.7. Part 5: Open Exploration
Are there any other interesting variables we should consider?