Question regarding ds.quantileMean

Hi,

We encountered a problem when using ds.quantileMean(). We have three cohorts with sizes of about 2000, 2000, and 15000, respectively. When we run ds.quantileMean() with and without type="split", we get strangely high values for the combined method: the combined 75% quantile is larger than the 95% quantile of each single cohort. The mean values seem to be right, and the combined mean is lower than the combined 5% quantile.

We have no access to the individual level data (which is why we are using DS :wink: ), but we tried simulating data. Here, the combined values are close to the large cohort (as expected).

We have a lot of 0 values in the two smaller cohorts (in one cohort the 5%, 10%, and 25% quantiles are 0; in the other the 5% and 10% quantiles are 0), but in the large cohort the 5% quantile is >0. Can this be the source of our problem?

Best wishes, Daniela

Hi Daniela,

There was a tiny bug in ds.quantileMean() in the way the function dealt with missing values. The next version of DS will include the corrected version. In the meantime, since the issue affects only the client-side function, you can run the script from the following link https://github.com/datashield/dsBaseClient/blob/master/R/ds.quantileMean.R in your client and then use the function as usual. The difference from the released version is an addition in line 84 of the code. I expect that this will give you the correct results, but let me know if you get anything unexpected.

Many thanks, Demetris

Hi Demetris,

thank you very much! We will try this and get back to you if it doesn’t work!

Best wishes, Daniela

Hi @daniela.zoeller. I forgot to mention in my previous reply that for 'combined' analysis the function returns the weighted average of the per-cohort quantiles (which is at the moment our best approximation). What we want to develop in the near future is an encryption-decryption algorithm that will be able to rank the elements of a variable from multiple sources and then calculate its actual quantiles.
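
To illustrate what a weighted average of quantiles does compared with quantiles of the pooled data, here is a minimal sketch in base R on simulated data. It is not the DataSHIELD implementation: the distributions, the helper `zero_inflated()`, and the choice of weights (proportional to the number of non-missing values per cohort) are all assumptions made for illustration. Only the cohort sizes and the pattern of zero quantiles are taken from the thread.

```r
# Simulated stand-ins for the three cohorts (distributions are assumed).
set.seed(1)

# Hypothetical helper: with probability p_zero the value is exactly 0,
# otherwise it is a positive exponential draw.
zero_inflated <- function(n, p_zero) {
  ifelse(runif(n) < p_zero, 0, rexp(n, rate = 1))
}

cohorts <- list(
  small1 = zero_inflated(2000, 0.30),  # 5%, 10% and 25% quantiles are 0
  small2 = zero_inflated(2000, 0.15),  # 5% and 10% quantiles are 0
  large  = rexp(15000, rate = 1)       # strictly positive values
)
probs <- c(0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95)

# Per-cohort quantiles: one row per probability, one column per cohort.
per_cohort <- sapply(cohorts, quantile, probs = probs, na.rm = TRUE)

# Assumed weights: share of non-missing values contributed by each cohort.
n_valid <- sapply(cohorts, function(x) sum(!is.na(x)))
weights <- n_valid / sum(n_valid)

# "Combined" approximation: weighted average of the per-cohort quantiles.
combined_approx <- as.vector(per_cohort %*% weights)

# Reference: quantiles of the pooled data
# (not obtainable in a real DataSHIELD analysis).
pooled <- quantile(unlist(cohorts), probs = probs, na.rm = TRUE)

round(cbind(per_cohort, combined_approx, pooled), 3)
```

Whatever the weights are, as long as they are non-negative and sum to 1, the weighted average for a given quantile lies between the smallest and largest per-cohort value of that quantile. A combined 75% quantile above every cohort's 95% quantile therefore points to a computational problem rather than to the approximation itself.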

Thanks for letting us know! We will keep this in mind when we use this function.