# Does a song with a long title have a longer duration? A PySpark lesson

Following the previous post on basic ML with PySpark, this tutorial shows how to run day-to-day data analytics. We explore the FMA (Free Music Archive) dataset to find out whether the length of a song's title is correlated with the song's duration.

I think this is a nice example of how to combine a few PySpark data transformations into a complete data analysis.

The corresponding Jupyter Notebook is available here. If you want a local copy, clone the repo:

```
git clone https://github.com/juanmanuel-tirado/pyspark-tutorial
```

Additionally, you can see the rendered version below.

Thanks for reading!

