# Extra Practice

These exercises are meant to help you practise the same core skills you developed in the previous exercises. Completing them is **optional**; they are only here to provide a little extra practice if you want it.


### Set up Python Libraries

As usual, you will need to run this code block to import the relevant Python libraries.

In [1]:
# Set-up Python libraries - you need to run this but you don't need to change it
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
import pandas as pd
import seaborn as sns
sns.set_theme(style='white')
import statsmodels.api as sm
import statsmodels.formula.api as smf
import warnings 
warnings.simplefilter('ignore', category=FutureWarning)

### Import a dataset to work with

Here we will read in a dataset which covers a wide range of variables related to Taylor Swift's discography. Each row of the dataset represents a song, and the columns include both *musical features* (derived from Spotify’s audio analysis) and *metadata* such as the song title, album, release date, and popularity score. Here are some key variables, but feel free to explore the dataset further for more information:


* `name`: Title of the song
* `album`: Name of the album the song appears on
* `release_date`: Date the song was released
* `popularity`: Spotify popularity score (0–100)
* `duration_ms`: Length of the song in milliseconds
* `danceability`: How suitable the track is for dancing (0–1)
* `energy`: Intensity and activity level of the track (0–1)
* `acousticness`: Degree of acoustic sound (0–1)
* `valence`: Positivity or happiness of the musical content (0–1)
* `tempo`: Estimated tempo in beats per minute (BPM)
* `loudness`: Overall loudness of the track in decibels (dB)

In [55]:
taytay = pd.read_csv("https://raw.githubusercontent.com/SageBoettcher/StatsCourseBook_2026/main/data/taylor_swift_spotify.csv")
display(taytay)

Unnamed: 0.1,Unnamed: 0,name,album,release_date,track_number,id,uri,acousticness,danceability,energy,instrumentalness,liveness,loudness,speechiness,tempo,valence,popularity,duration_ms
0,0,Fortnight (feat. Post Malone),THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,1,6dODwocEuGzHAavXqTbwHv,spotify:track:6dODwocEuGzHAavXqTbwHv,0.50200,0.504,0.386,0.000015,0.0961,-10.976,0.0308,192.004,0.281,82,228965
1,1,The Tortured Poets Department,THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,2,4PdLaGZubp4lghChqp8erB,spotify:track:4PdLaGZubp4lghChqp8erB,0.04830,0.604,0.428,0.000000,0.1260,-8.441,0.0255,110.259,0.292,79,293048
2,2,My Boy Only Breaks His Favorite Toys,THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,3,7uGYWMwRy24dm7RUDDhUlD,spotify:track:7uGYWMwRy24dm7RUDDhUlD,0.13700,0.596,0.563,0.000000,0.3020,-7.362,0.0269,97.073,0.481,80,203801
3,3,Down Bad,THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,4,1kbEbBdEgQdQeLXCJh28pJ,spotify:track:1kbEbBdEgQdQeLXCJh28pJ,0.56000,0.541,0.366,0.000001,0.0946,-10.412,0.0748,159.707,0.168,82,261228
4,4,"So Long, London",THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,5,7wAkQFShJ27V8362MqevQr,spotify:track:7wAkQFShJ27V8362MqevQr,0.73000,0.423,0.533,0.002640,0.0816,-11.388,0.3220,160.218,0.248,80,262974
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
577,577,Our Song,Taylor Swift (Deluxe Edition),2006-10-24,11,1j6gmK6u4WNI33lMZ8dC1s,spotify:track:1j6gmK6u4WNI33lMZ8dC1s,0.11100,0.668,0.672,0.000000,0.3290,-4.931,0.0303,89.011,0.539,64,201106
578,578,I'm Only Me When I'm With You,Taylor Swift (Deluxe Edition),2006-10-24,12,7CzxXgQXurKZCyHz9ufbo1,spotify:track:7CzxXgQXurKZCyHz9ufbo1,0.00452,0.563,0.934,0.000807,0.1030,-3.629,0.0646,143.964,0.518,56,213053
579,579,Invisible,Taylor Swift (Deluxe Edition),2006-10-24,13,1k3PzDNjg38cWqOvL4M9vq,spotify:track:1k3PzDNjg38cWqOvL4M9vq,0.63700,0.612,0.394,0.000000,0.1470,-5.723,0.0243,96.001,0.233,54,203226
580,580,A Perfectly Good Heart,Taylor Swift (Deluxe Edition),2006-10-24,14,0YgHuReCSPwTXYny7isLja,spotify:track:0YgHuReCSPwTXYny7isLja,0.00349,0.483,0.751,0.000000,0.1280,-5.726,0.0365,156.092,0.268,53,220146


## Part 1: Distributions

Let's have an initial peek at the dataset:

**a. Have a look at the distribution of the variable <tt>popularity</tt>. Can you find any songs that you might suspect to be "outliers"?**

In [13]:
# Your code here

*Your text here*

**b. Have a look at the distribution of the variable <tt>duration_ms</tt>. Does the distribution look skewed?**

In [None]:
# Your code here

*Your text here*

**c. Have a look at the distribution of the variable <tt>danceability</tt>.**

In [23]:
# Your code here

**d. Let's add a new variable which will classify each song as either a dance song or not.**

*To do this, we will check whether each song is above or below the median danceability. Think! What percentage of songs should be in each group?*

In [74]:
taytay['dancey_song']=taytay.danceability>taytay.danceability.median()
taytay

Unnamed: 0.1,Unnamed: 0,name,album,release_date,track_number,id,uri,acousticness,danceability,energy,instrumentalness,liveness,loudness,speechiness,tempo,valence,popularity,duration_ms,release_year,dancey_song
0,0,Fortnight (feat. Post Malone),THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,1,6dODwocEuGzHAavXqTbwHv,spotify:track:6dODwocEuGzHAavXqTbwHv,0.50200,0.504,0.386,0.000015,0.0961,-10.976,0.0308,192.004,0.281,82,228965,2024,False
1,1,The Tortured Poets Department,THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,2,4PdLaGZubp4lghChqp8erB,spotify:track:4PdLaGZubp4lghChqp8erB,0.04830,0.604,0.428,0.000000,0.1260,-8.441,0.0255,110.259,0.292,79,293048,2024,True
2,2,My Boy Only Breaks His Favorite Toys,THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,3,7uGYWMwRy24dm7RUDDhUlD,spotify:track:7uGYWMwRy24dm7RUDDhUlD,0.13700,0.596,0.563,0.000000,0.3020,-7.362,0.0269,97.073,0.481,80,203801,2024,True
3,3,Down Bad,THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,4,1kbEbBdEgQdQeLXCJh28pJ,spotify:track:1kbEbBdEgQdQeLXCJh28pJ,0.56000,0.541,0.366,0.000001,0.0946,-10.412,0.0748,159.707,0.168,82,261228,2024,False
4,4,"So Long, London",THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,5,7wAkQFShJ27V8362MqevQr,spotify:track:7wAkQFShJ27V8362MqevQr,0.73000,0.423,0.533,0.002640,0.0816,-11.388,0.3220,160.218,0.248,80,262974,2024,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
577,577,Our Song,Taylor Swift (Deluxe Edition),2006-10-24,11,1j6gmK6u4WNI33lMZ8dC1s,spotify:track:1j6gmK6u4WNI33lMZ8dC1s,0.11100,0.668,0.672,0.000000,0.3290,-4.931,0.0303,89.011,0.539,64,201106,2006,True
578,578,I'm Only Me When I'm With You,Taylor Swift (Deluxe Edition),2006-10-24,12,7CzxXgQXurKZCyHz9ufbo1,spotify:track:7CzxXgQXurKZCyHz9ufbo1,0.00452,0.563,0.934,0.000807,0.1030,-3.629,0.0646,143.964,0.518,56,213053,2006,False
579,579,Invisible,Taylor Swift (Deluxe Edition),2006-10-24,13,1k3PzDNjg38cWqOvL4M9vq,spotify:track:1k3PzDNjg38cWqOvL4M9vq,0.63700,0.612,0.394,0.000000,0.1470,-5.723,0.0243,96.001,0.233,54,203226,2006,True
580,580,A Perfectly Good Heart,Taylor Swift (Deluxe Edition),2006-10-24,14,0YgHuReCSPwTXYny7isLja,spotify:track:0YgHuReCSPwTXYny7isLja,0.00349,0.483,0.751,0.000000,0.1280,-5.726,0.0365,156.092,0.268,53,220146,2006,False
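
A median split like the one above is often expected to give two equal halves, but the comparison uses a strict `>`, so any songs tied exactly at the median land in the `False` group. The following is a minimal sketch on made-up numbers (the `toy` DataFrame and its `score` column are hypothetical, not part of the Spotify data) showing how ties can shift the proportions:

```python
import pandas as pd

# Hypothetical toy scores (not from the Spotify dataset); three values tie at 0.5
toy = pd.DataFrame({"score": [0.2, 0.4, 0.5, 0.5, 0.5, 0.7, 0.9]})

# Same rule as the cell above: True only when strictly above the median
toy["above_median"] = toy["score"] > toy["score"].median()

# The mean of a boolean column is the proportion of True values
share_above = toy["above_median"].mean()
print(toy)
print("Share above the median:", share_above)
```

Running the same kind of check on `taytay['dancey_song']` is one way to see how close the real split is to 50/50.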


## Part 2: Counts
**a. How many songs does Taylor have on each of her albums?**

*Hint: if you are having trouble reading the x-axis, try using the command `plt.xticks(rotation=90)`.*
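
To see what the rotation does before applying it to the album plot, here is a small sketch on made-up categories (the `toy` DataFrame and its labels are hypothetical). It uses a plain matplotlib bar chart; the same `plt.xticks` call should also work after a seaborn plot, since seaborn draws onto matplotlib axes.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy categories with long labels (not from the Spotify dataset)
toy = pd.DataFrame({"category": ["A rather long label", "Another long label",
                                 "A rather long label", "Yet another long label"]})

# Count the rows in each category and draw a bar chart of the counts
counts = toy["category"].value_counts()
plt.bar(counts.index, counts.values)

# Rotate the x-axis tick labels by 90 degrees so that they do not overlap
plt.xticks(rotation=90)
plt.tight_layout()
```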

In [None]:
# Your code here

**b. Does Taylor usually balance the number of dancey songs on her albums?**


In [None]:
# Your code here

## Part 3: Comparing Across Variables
**a. Do Taylor's dancey songs tend to be more popular?**


In [None]:
# Your code here

**b. What is Taylor's most popular album?**
* Does this change if you use mean, median, or max song popularity?

In [48]:
# Your code here

**c. Does Taylor tend to place her most popular tracks in a certain position on an album?**

In [52]:
# Your code here

**d. Is there a relationship between valence and popularity?**

In [57]:
# Your code here

## Part 4: Time Series

**Let's start by adding a new variable to our dataset called `release_year`, based on the `release_date`.**

In [71]:
taytay['release_year']=taytay['release_date'].str[:4].astype(int)
taytay

Unnamed: 0.1,Unnamed: 0,name,album,release_date,track_number,id,uri,acousticness,danceability,energy,instrumentalness,liveness,loudness,speechiness,tempo,valence,popularity,duration_ms,release_year
0,0,Fortnight (feat. Post Malone),THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,1,6dODwocEuGzHAavXqTbwHv,spotify:track:6dODwocEuGzHAavXqTbwHv,0.50200,0.504,0.386,0.000015,0.0961,-10.976,0.0308,192.004,0.281,82,228965,2024
1,1,The Tortured Poets Department,THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,2,4PdLaGZubp4lghChqp8erB,spotify:track:4PdLaGZubp4lghChqp8erB,0.04830,0.604,0.428,0.000000,0.1260,-8.441,0.0255,110.259,0.292,79,293048,2024
2,2,My Boy Only Breaks His Favorite Toys,THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,3,7uGYWMwRy24dm7RUDDhUlD,spotify:track:7uGYWMwRy24dm7RUDDhUlD,0.13700,0.596,0.563,0.000000,0.3020,-7.362,0.0269,97.073,0.481,80,203801,2024
3,3,Down Bad,THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,4,1kbEbBdEgQdQeLXCJh28pJ,spotify:track:1kbEbBdEgQdQeLXCJh28pJ,0.56000,0.541,0.366,0.000001,0.0946,-10.412,0.0748,159.707,0.168,82,261228,2024
4,4,"So Long, London",THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY,2024-04-19,5,7wAkQFShJ27V8362MqevQr,spotify:track:7wAkQFShJ27V8362MqevQr,0.73000,0.423,0.533,0.002640,0.0816,-11.388,0.3220,160.218,0.248,80,262974,2024
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
577,577,Our Song,Taylor Swift (Deluxe Edition),2006-10-24,11,1j6gmK6u4WNI33lMZ8dC1s,spotify:track:1j6gmK6u4WNI33lMZ8dC1s,0.11100,0.668,0.672,0.000000,0.3290,-4.931,0.0303,89.011,0.539,64,201106,2006
578,578,I'm Only Me When I'm With You,Taylor Swift (Deluxe Edition),2006-10-24,12,7CzxXgQXurKZCyHz9ufbo1,spotify:track:7CzxXgQXurKZCyHz9ufbo1,0.00452,0.563,0.934,0.000807,0.1030,-3.629,0.0646,143.964,0.518,56,213053,2006
579,579,Invisible,Taylor Swift (Deluxe Edition),2006-10-24,13,1k3PzDNjg38cWqOvL4M9vq,spotify:track:1k3PzDNjg38cWqOvL4M9vq,0.63700,0.612,0.394,0.000000,0.1470,-5.723,0.0243,96.001,0.233,54,203226,2006
580,580,A Perfectly Good Heart,Taylor Swift (Deluxe Edition),2006-10-24,14,0YgHuReCSPwTXYny7isLja,spotify:track:0YgHuReCSPwTXYny7isLja,0.00349,0.483,0.751,0.000000,0.1280,-5.726,0.0365,156.092,0.268,53,220146,2006
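
The cell above takes the first four characters of each date string, which works because the dates are stored as text in `YYYY-MM-DD` format. As a sketch of an alternative, the dates can be parsed with `pd.to_datetime` and the year read from the `.dt` accessor. The `toy` DataFrame and the column names `year_from_slice` and `year_from_datetime` are made up for illustration:

```python
import pandas as pd

# Hypothetical toy dates in the same YYYY-MM-DD text format as release_date
toy = pd.DataFrame({"release_date": ["2006-10-24", "2014-10-27", "2024-04-19"]})

# Approach used in the cell above: slice the first four characters, convert to int
toy["year_from_slice"] = toy["release_date"].str[:4].astype(int)

# Alternative: parse the text as dates, then read off the year
toy["year_from_datetime"] = pd.to_datetime(toy["release_date"]).dt.year

print(toy)
```

Both columns should hold the same years here; the datetime route may be preferable if you later want months or weekdays as well.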


**a. Now let's track how Taylor's songs have changed over the years. Start by tracking the popularity of her songs across the years.**

In [None]:
# Your code here

**b. Okay, now do the same for valence.**

In [None]:
# Your code here

**c. And how about danceability?**

In [None]:
# Your code here

## Part 5: Open Exploration

Are there any other interesting variables in the dataset that we should consider?
