Recently I read an article titled "Train sklearn 100x faster", which is about an open-source Python module named sk-dist. The module implements a "distributed scikit-learn" by using Spark to extend scikit-learn's built-in parallelisation of meta-estimators such as pipeline.Pipeline, model_selection.GridSearchCV, feature_selection.SelectFromModel and ensemble.BaggingClassifier. It was 1 AM. Wise-men and…