Theoretical limitations to statistical modelling in Federated Analysis

Hi,

I wonder what the theoretical limitations are, e.g. which types of statistical models or types of data cannot be (re-)formulated or modelled as a federated analysis? Could you share references supporting these limitations (or their absence)?

Best, Wilmar

Hi,

It depends on the type of federated analysis you want to use. The approaches currently in DataSHIELD can in theory be extended to all score-based statistical methods, e.g. GLMs. They cannot easily be used for non-parametric approaches.
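
To make the score-based idea concrete, here is a minimal sketch in plain R (not the actual DataSHIELD implementation; names such as `site_contrib` and `federated_logit` are made up for illustration) of a federated logistic regression. Each site returns only a p-dimensional score vector and a p x p information matrix, which the coordinating server sums and uses for a Fisher-scoring update, so no individual-level records leave a site.

```r
## Per-site aggregates for one Fisher-scoring (IRLS) iteration of a
## logistic regression: only a score vector and an information matrix
## are shared, never rows of data.
site_contrib <- function(X, y, beta) {
  eta <- X %*% beta
  mu  <- 1 / (1 + exp(-eta))              # fitted probabilities
  w   <- as.vector(mu * (1 - mu))         # IRLS weights
  list(score = t(X) %*% (y - mu),         # p x 1 score vector
       info  = t(X) %*% (X * w))          # p x p information matrix
}

## Coordinator: sum the per-site aggregates and update beta.
federated_logit <- function(sites, p, n_iter = 25) {
  beta <- rep(0, p)
  for (i in seq_len(n_iter)) {
    contribs  <- lapply(sites, function(s) site_contrib(s$X, s$y, beta))
    score_sum <- Reduce(`+`, lapply(contribs, `[[`, "score"))
    info_sum  <- Reduce(`+`, lapply(contribs, `[[`, "info"))
    beta <- beta + solve(info_sum, score_sum)   # Fisher-scoring step
  }
  beta
}

## Toy data split across two "sites"
set.seed(1)
make_site <- function(n) {
  X <- cbind(1, rnorm(n))
  y <- rbinom(n, 1, plogis(X %*% c(-0.5, 1)))
  list(X = X, y = y)
}
federated_logit(list(make_site(200), make_site(300)), p = 2)
```

Because the summed scores and information matrices are exactly the pooled quantities, the estimates should agree with a glm() fit on the combined data.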

If you use differential privacy, e.g. adding a noise term to the data, or secure multi-party computation, essentially everything is implementable. The drawback is that the signal-to-noise ratio gets worse or the computational effort becomes very heavy.
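
As a toy illustration of the signal-to-noise trade-off, here is a sketch of output perturbation with the Laplace mechanism (one common way to apply differential privacy; `rlaplace` and `dp_count` are made-up helper names). The smaller the privacy budget epsilon, the more noise is added to the released count and the worse the signal-to-noise ratio becomes.

```r
## Draw from a Laplace distribution by inverse-CDF sampling.
rlaplace <- function(n, scale) {
  u <- runif(n, -0.5, 0.5)
  -scale * sign(u) * log(1 - 2 * abs(u))
}

## Release a count with Laplace noise calibrated to its sensitivity
## (adding or removing one record changes a count by at most 1).
dp_count <- function(x, condition, epsilon) {
  true_count  <- sum(condition(x))
  sensitivity <- 1
  true_count + rlaplace(1, scale = sensitivity / epsilon)
}

set.seed(42)
x <- rnorm(1000)
sum(x > 1)                                      # exact count
dp_count(x, function(v) v > 1, epsilon = 0.1)   # noisy, privacy-preserving count
```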

Best, Daniela

From a machine learning perspective, a method can be distributed if its iterative update can be decoupled across the data holders. You might also want to look into “homomorphic encryption”, which is directly related to your question.
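
To show what a decoupled iterative update can look like, here is a minimal sketch, assuming plain gradient descent for linear regression (the names `site_gradient` and `federated_gd` are hypothetical): each site computes a gradient on its own data, and only the size-weighted gradients are combined centrally. Homomorphic encryption could additionally be used so the coordinator only ever sees the encrypted sum of the gradients, but that part is not shown here.

```r
## Local gradient of the least-squares loss on one site's data.
site_gradient <- function(X, y, beta) {
  t(X) %*% (X %*% beta - y) / nrow(X)
}

## Coordinator: average the site gradients (weighted by site size)
## and take a gradient-descent step.
federated_gd <- function(sites, p, lr = 0.1, n_iter = 500) {
  beta    <- rep(0, p)
  n_total <- sum(sapply(sites, function(s) nrow(s$X)))
  for (i in seq_len(n_iter)) {
    grads <- lapply(sites, function(s)
      site_gradient(s$X, s$y, beta) * nrow(s$X) / n_total)
    beta <- beta - lr * Reduce(`+`, grads)
  }
  beta
}

## Toy usage with two sites; the estimate approaches c(2, -1).
set.seed(2)
make_lm_site <- function(n) {
  X <- cbind(1, rnorm(n))
  list(X = X, y = as.vector(X %*% c(2, -1) + rnorm(n, sd = 0.1)))
}
federated_gd(list(make_lm_site(100), make_lm_site(400)), p = 2)
```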

Hank

Hi all,

In principle, these models can all be implemented in a privacy-preserving federated analysis. We need to adapt our implementations to prevent disclosure of data through inference and other attacks, and we also need to adapt them to work with large datasets. For example, my work is becoming interesting with non-parametric statistics. The algorithm I use is quite efficient with small datasets, i.e. up to about a third of the maximum size of a data frame in R. However, with larger datasets the scalability suffers, and this is taking up quite a lot of my thinking time at the moment. Other federated systems have the same issues with machine learning, even though they do not apply privacy-preserving computations.
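
As an example of the kind of adaptation I mean, here is a simplified sketch of a site-side disclosure check (not DataSHIELD's actual control; `safe_mean` and the threshold of 5 are just for illustration): an aggregate is only released when it is computed from at least a minimum number of records.

```r
## Refuse to return an aggregate that is based on too few records,
## since very small groups can be disclosive.
safe_mean <- function(x, min_count = 5) {
  if (length(x) < min_count) {
    stop("Request blocked: result would be based on too few records")
  }
  mean(x)
}

safe_mean(rnorm(100))    # released
# safe_mean(rnorm(3))    # refused by the site
```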

Patricia