LifeCycle wishlist

Not yet confirmed

Within LifeCycle we have made an inventory of what is needed to do the actual analysis. The list is prioritised by what is most needed. It will evolve and be amended over time as LifeCycle members become more familiar with R and DataSHIELD.

Data management

| Priority | Package | Function | Information |
|---|---|---|---|
| 1 | R (3.6.x) | reshape | Already in the betatest, but needs amendments to work with non-existing rows (https://www.rdocumentation.org/packages/stats/versions/3.6.0/topics/reshape) |
| 2 | R (3.6.x) | merge | Already in the betatest ;-D (https://www.rdocumentation.org/packages/base/versions/3.6.0/topics/merge) |
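
To illustrate what "non-existing rows" means for the reshape request: in long-format cohort data a participant may have no row at all for some visit, and a long-to-wide reshape then has to fill that cell with a missing value rather than fail. The sketch below is plain Python, not R or DataSHIELD code; the function name `long_to_wide`, the column names and the toy data are all hypothetical and only show the expected behaviour.

```python
def long_to_wide(rows, id_key, time_key, value_key):
    """Pivot long-format records to one record per id.

    Time points that have no row for a given id are filled with None,
    which plays the role of NA in R.
    """
    # Collect every time point seen anywhere, so all ids get the same columns.
    times = sorted({r[time_key] for r in rows})
    wide = {}
    for r in rows:
        rec = wide.setdefault(
            r[id_key],
            {id_key: r[id_key], **{f"{value_key}.{t}": None for t in times}},
        )
        rec[f"{value_key}.{r[time_key]}"] = r[value_key]
    return [wide[k] for k in sorted(wide)]


# Hypothetical toy data: child 2 has no row at all for visit 2.
long_rows = [
    {"id": 1, "visit": 1, "weight": 3.4},
    {"id": 1, "visit": 2, "weight": 7.9},
    {"id": 2, "visit": 1, "weight": 3.1},
]

wide_rows = long_to_wide(long_rows, "id", "visit", "weight")
```

Here the record for child 2 keeps a `weight.2` column holding `None`, so every participant ends up with the same set of columns.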

Analysis

| Priority | Package | Function | Information |
|---|---|---|---|
| 1 | ? | GLMM | The Tom Bishop way or the Elinor way? |
| 2 | mediation | | https://cran.r-project.org/web/packages/mediation/vignettes/mediation.pdf |
| | mediation | mediate | |
| | mediation | plot.mediate | |
| | mediation | summary.mediate | |
| | mediation | medsens | |
| | mediation | plot.medsens | |
| | mediation | summary.medsens | |
| | mediation | mediations | |
| | mediation | plot.mediations | |
| | mediation | summary.mediations | |
| | mediation | multimed | |
| | mediation | plot.multimed | |
| | mediation | summary.multimed | |
| 3 | medflex | | https://cran.r-project.org/web/packages/medflex/vignettes/medflex.pdf |
| | medflex | expData | |
| | medflex | neImpute | |
| | medflex | neModel | |
| | medflex | neLht | |
| | medflex | neEffdecomp | |
| | medflex | neWeight | |
| 4 | metafor | | https://cran.r-project.org/web/packages/metafor/metafor |
| | metafor | rma.uni | |
| | metafor | forest | |
| 5 | lme4 | | https://cran.r-project.org/web/packages/lme4/lme4.pdf |
| | lme4 | glmer | |

The data management tools are really needed to do the first analyses and should preferably be in the September release. For the analysis tools, the LifeCycle WP leaders need to decide which packages should be developed first. The intention is to determine this at the next WP leaders meeting, around mid-June.

Hello,

I'd like to recall that, as shown at the DataSHIELD workshop in Newcastle last November, a simulation-based approach to approximating GLMM in a federated fashion, as opposed to the distributed regression approach of Elinor for instance, is also already available.

Also see example (line 560) from submitted documentation now under revision.

The software could be made more DataSHIELD user-friendly, but it already works once loaded as an external package on the central-analysis computer.

Please consider whether referencing such external functionality in the next release would be useful (I can write some documentation for it). Note that this simulation-based approach can also be readily used for frailty Cox regression and Nelson-Aalen estimation in DataSHIELD (examples to come).

Feel free to contact me for any inquiry.

Hi,

Before I do any more work on the ‘Tom Bishop GLMM’ (i.e. study-level meta-analysis of separate GLMMs), I would like to understand a bit more. I have looked at the GitHub links, but was wondering if there is a paper I could look at to get a bit more context.

Thanks

Tom

Hi Tom,

I think you mean a proposal to do actual research with GLMM. Angela wants to use GLMM in her study on pets and asthma. Shall I ask her to give you more context on how she wants to use the function in DataSHIELD?

Kind regards,

Sido

Hi,

Thank you so much for this information. I will feed it back to the LifeCycle members who requested the methods. I will come back to you on this post.

Kind regards,

Sido

Hi Sido,

Sorry, I should have been more specific. I was actually asking @bono for more information about his way of doing GLMM, as that might give LifeCycle a solution now, rather than waiting for me to do the study-level meta-analysis (SLMA) version.

But it would also be helpful to get more information about the planned analyses, in case we need to press on with the SLMA version.

Apologies for any confusion,

Tom
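
For readers unfamiliar with the term: study-level meta-analysis means each study fits its own model locally, and only the coefficient estimates and their standard errors are combined centrally. Below is a minimal sketch of that combining step, assuming simple fixed-effect inverse-variance weighting (an actual SLMA implementation may well pool differently, for example with random effects). It is plain Python rather than DataSHIELD code, and the function name and all numbers are hypothetical.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted (fixed-effect) pooled estimate and its SE."""
    # Each study is weighted by the inverse of its squared standard error.
    weights = [1.0 / se**2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical per-study coefficients (e.g. log odds ratios) and standard errors.
study_estimates = [0.30, 0.10, 0.20]
study_std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = pool_fixed_effect(study_estimates, study_std_errors)
```

Only the per-study summaries cross study boundaries in this scheme, which is why it is attractive when individual-level data cannot be shared.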

Actually, while I am here, we would also like to have Cox regression via DataSHIELD, which is mentioned in @bono's post. Perhaps this is already possible?

That is on the wish list for RECAP too - I think the group at INESC TEC may have it in their sights…

Best wishes,

 -- Andrei

Hi guys,

Yes, it is already possible to approximate (frailty) Cox regression with the simulation-based method. Unfortunately we are a bit behind with documentation and paper submission. We have a submitted paper on a binomial GLMM example, whose results I showed at the last DS workshop. I attach the abstract below.

@tombishop: I'm available for any inquiry. I can even meet you on Skype next week for a more direct exchange, if this does not break any forum rule.

Abstract

Title: Recovery of IPD inferences from key IPD summaries only: Applications within DataShield

Currently just some Individual Participant Data (IPD) statistical models, like standard GLM regressions, can be performed within DataShield while protecting privacy. For example, reproduction of an IPD random-effects regression from anonymous IPD summaries can be already challenging here. Our goal is to extend the inferential capabilities within DataShield. To this end we propose a method to reconstruct unavailable original IPD and IPD inferences from empirical IPD marginal moments and correlation matrix only. The approach is rather generic and is based on a copula inversion technique where IPD marginal and dependence modeling is guided by the above summaries. Through practical examples we show our method can well recover fixed and latent effect estimates of an IPD multi-variate Logistic regression, suggesting new applications within DataShield are possible.

Just an update on this:

DM:
1 & 2. Reshape and Merge are currently in dsBetaTest; these will be merged into the DS5 release (Sept).

Analysis

  1. A GLMM version will be made available in the DataSHIELD testing repository (dsAlpha) post-September 2019.
  2. lme4 may be released at the same time into dsAlpha.

@demetris.avraam is going to take a look at @bono's code to see if it can be adapted for DataSHIELD. He will keep @tombishop in the loop on this.

@paul.burton says the Metafor package can be installed on the client and used as is? Not sure about the other two packages.

Hello,

@demetris.avraam the code can definitely be made more DS-friendly, for instance by including calls to automatically retrieve the needed IPD summaries. However, note that the code only needs to sit on the central analysis computer and hence is perfectly functional right now. Feel free to contact me for any inquiry.

Best

I agree that the metafor package does not require any DataSHIELD development because you run it on the client side. Also, I think lme4 (which is for mixed models) is covered by the requirement for GLMM.

Hi everyone,

Thanks @bono, I will have a look at your code and we can discuss it in more detail at some point. At the moment, I’m trying to develop a ds.glmm function based on Elinor’s approach.

About Cox regression: @Gijs and Frank, from the Netherlands Comprehensive Cancer Organisation, have developed an algorithm for Cox Proportional Hazards in R for their infrastructure (https://github.com/IKNL/dcoxph) and we believe that their algorithm can also be implemented in DataSHIELD. You can see more details in this paper: https://www.ncbi.nlm.nih.gov/pubmed/26159465. Also, @Gijs will give a talk at the DataSHIELD Workshop.

Hello all, we have the Cox PH algorithm implemented, but it runs on our Docker-based infrastructure. I hope that we can discuss how to bring this algorithm to DataSHIELD during the September workshop. @bono, if you need more info (or if you can’t wait), please reach out!

Is there disclosure protection built into the function, or would that need to be addressed if bringing it into DataSHIELD?

No … would be eager to learn how you manage that