# PySpark MLlib tutorial

If you’re familiar with Spark, you probably know that it offers a Python API named PySpark, which lets developers use the existing Spark libraries from Python.

I have released a tutorial on the Spark machine learning library (MLlib). It is not intended to explain ML theory, although it does include some. Rather, it is a collection of examples showing how to manipulate data structures to feed the algorithms implemented by this library.

The corresponding Jupyter Notebook is available here. If you want to clone the repo:

```
git clone https://github.com/juanmanuel-tirado/pyspark-tutorial
```

Additionally, you can see the rendered version below.
