1, About ML packages
We wrote the DS-based Lasso and a set of Lasso-based sparse multi-task learning methods. They are supposed to be released within the next two months, together with the paper.
2, Distributed clustering
Off the top of my head, there are about three ways to do clustering across multiple geo-distributed matrices:
a, Bagging of local models (meta-analysis)
b, Distributed integrative matrix-factorization + local Clustering
c, Distributed Clustering
It would be great if you could include all three in your package.
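To make option (c) a bit more concrete, here is a minimal sketch in Python, assuming plain k-means in which each site only sends per-cluster sums and counts to a coordinator. This is a generic textbook scheme and not an existing DataSHIELD function. The names `local_stats` and `distributed_kmeans` are made up for illustration.

```python
import numpy as np

def local_stats(X, centers):
    """Run at one site: assign rows to the nearest center and
    return only per-cluster sums and counts (no raw rows)."""
    X = np.asarray(X, dtype=float)
    # Squared Euclidean distance from every row to every center.
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    k = centers.shape[0]
    sums = np.zeros_like(centers, dtype=float)
    counts = np.zeros(k, dtype=int)
    for j in range(k):
        mask = labels == j
        sums[j] = X[mask].sum(axis=0)
        counts[j] = mask.sum()
    return sums, counts

def distributed_kmeans(sites, centers, n_iter=20):
    """Coordinator: aggregate the site summaries and update the centers."""
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        stats = [local_stats(X, centers) for X in sites]
        sums = sum(s for s, _ in stats)
        counts = sum(c for _, c in stats)
        nonempty = counts > 0  # leave empty clusters where they are
        centers[nonempty] = sums[nonempty] / counts[nonempty][:, None]
    return centers
```

Here `sites` is a list with one matrix per site (same columns everywhere) and `centers` is the initial guess. In a real deployment the counts would probably need a minimum-size check before leaving a site, since very small clusters can reveal individual rows.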
We have done the second one: we implemented the distributed integrative matrix factorization, which extracts the common component of multiple matrices into a single matrix. If you like, you can use the result of our method as the input to the clustering. This is very simple to implement.
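Just to illustrate the shape of such a pipeline, here is a rough Python sketch. It is not our actual factorization: as a stand-in it takes the top eigenvectors of the summed Gram matrices, which is a much simpler way to get one shared matrix from several sites. It assumes every site holds different rows but the same p columns. The names `local_gram` and `shared_component` are hypothetical.

```python
import numpy as np

def local_gram(X):
    """Run at one site: return only the p x p Gram matrix X^T X,
    never the raw rows."""
    X = np.asarray(X, dtype=float)
    return X.T @ X

def shared_component(grams, rank):
    """Coordinator: top-`rank` eigenvectors of the summed Gram matrices.
    Stand-in for a shared factor matrix; the result has shape (p, rank)."""
    G = sum(grams)
    vals, vecs = np.linalg.eigh(G)            # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:rank]     # indices of the largest ones
    return vecs[:, order]
```

The returned p x rank matrix plays the role of the shared matrix and can be passed to any clustering routine. Note that the column signs are arbitrary, and that a Gram matrix from a site with very few rows may itself be disclosive, so this is only a sketch of the data flow.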
DataSHIELD already provides quite a few mechanisms to protect privacy (see their paper), but most are not specific to machine learning. For distributed integrative matrix factorization, we only output an incomplete model, which I think (but am not sure) is robust to model inversion attacks. @all, more information, especially on attacks against the incomplete model, would be appreciated. For example, in a 2-server environment, only one shared matrix out of five matrices is returned as the output. This shared matrix is enough for the subsequent clustering. Reconstructing the original matrices from it is not possible. But the inversion attack… no idea.