# Does a song with a long title have a longer duration? A PySpark lesson


Following the previous post on basic ML with PySpark, this tutorial shows how to run day-to-day data analytics. We explore the FMA (Free Music Archive) dataset to find out whether the length of a song's title correlates with its duration.

I think this is a nice example of how to chain a few PySpark data transformations into a complete data analysis.
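To make the question concrete, here is a minimal plain-Python sketch of the quantity being measured. It assumes that "title length" means the number of characters and that "correlation" means the Pearson coefficient; the post itself does not pin either down. The `tracks` rows and the `pearson` helper are hypothetical and are not taken from the FMA dataset or the notebook.

```python
from math import sqrt

# Hypothetical toy rows (title, duration in seconds); NOT real FMA data.
tracks = [
    ("Intro", 62),
    ("Night Drive", 201),
    ("A Very Long Song Title About Nothing", 187),
    ("Blue", 240),
    ("Untitled Track Number Seven", 154),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalized covariance and variances; the 1/n factors cancel out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Assumption: title length is measured in characters.
title_lengths = [len(title) for title, _ in tracks]
durations = [duration for _, duration in tracks]

r = pearson(title_lengths, durations)
print(f"correlation between title length and duration: {r:.3f}")
```

A value near 0 would suggest no linear relationship, while values near 1 or -1 would suggest a strong one. On a Spark DataFrame, the same idea would typically be expressed with `pyspark.sql.functions.length` to derive the title-length column and `DataFrame.stat.corr` to compute the coefficient.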

The corresponding Jupyter Notebook is available here. If you want to clone the repo:

```
git clone https://github.com/juanmanuel-tirado/pyspark-tutorial
```

Additionally, you can see the rendered version below.

Thanks for reading.
