Errors while performing ds.mean, ds.dim and more

Hello all,

I am new to DataSHIELD and my first step was to complete the tutorial from your site.

However, I have run into several problems:

  • When running ds.mean(x='D$LAB_HDL', datasources=opals) the following error occurs:

Error in matrix(unlist(ss.obj), nrow = Nstudies, byrow = TRUE)[, 1:4] : subscript out of bounds

I am new to both R and DataSHIELD, so any hint about what could be going wrong here would be of great help.

  • Running ds.dim('D') produces the following error:

Error: Command ‘dimDS(D)’ failed on ‘dstesting-100’: Error while evaluating 'dsBase::dimDS( D )'

The same error seems to occur for a few other functions, including ds.table1d(x='D$GENDER'), ds.histogram(x='D$LAB_HDL', datasources=opals) and ds.glm(formula=D$DIS_DIAB~D$PM_BMI_CONTINUOUS+D$LAB_HDL*D$GENDER, family='binomial').

Has anyone come across similar problems? I appreciate your help!

Hi Tanjascats! Thanks for joining the community!

There are several things I would like to know in order to help you:

  • Are you running on a Windows or Linux system, and did you follow the instructions for that system?
  • Are you following the instructions for DataSHIELD v5.0 or v4.0? (This can be seen in the left-hand navigation panel.)

I will be engaging with my colleagues this morning to see if they know any more!

Thank you for the fast reply.

Regarding your questions:

  • I am running on Ubuntu 18.10, and I followed the instructions for Linux users accordingly
  • I was following the instructions for DataSHIELD v5.0
  • It is worth mentioning that some functions seem to work without any problem, such as ds.quantileMean, ds.log, ds.assign, all sub-setting functions, etc.

Hi,

You say you are new to R. Do you know whether you have installed devtools at any point while using it?

If you haven’t, I would advise installing it in the following way:

First, run this command in the Ubuntu shell (Terminal):

sudo apt-get install libxml2-dev libcurl4-openssl-dev libssl-dev libgsl-dev -y

Then, in R, install the package with this command:

install.packages("devtools", dependencies=TRUE)

N.B. Expect the installation process to take upwards of 10 minutes, so be prepared to wait!

The resources here and here may be useful for understanding what devtools is for.

Once it has installed successfully, could you run the following command in R and then post the output here?

devtools::session_info()

This will allow us to check which versions you are running client-side, which will help us with troubleshooting.
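If installing devtools turns out to be a problem, the client-side versions can also be read with base R alone. Below is a minimal sketch, assuming (this is not stated in the thread) that the DataSHIELD client packages have names beginning with "ds" and that the Opal client package name begins with "opal". Adjust the pattern if your packages are named differently.

```r
# List installed packages whose names look like DataSHIELD or Opal client
# packages. The name pattern is an assumption, not an official convention.
installed <- installed.packages()
versions <- installed[, "Version"]  # named character vector, names are package names
ds_versions <- versions[grepl("^(ds|opal)", names(versions))]
print(ds_versions)
```

An empty result would suggest the client packages are installed in a library that this R session is not looking at.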

Hi,

Here is a screenshot of the full sequence of commands, with the error message at the end. The second screenshot contains the output of devtools::session_info().

Hi again,

Sorry for the delay. Has anything changed in the past day and a half?

One more thing: could you please check which version you are running on the virtual machine/server side?

To do this, log in via the browser at the address 192.168.56.100:8080/ (once you have the VM or VMs up and running) and enter the username/password combination: administrator / datashield_test& (this is specified in the wiki tutorial as well, in case you need to refer back to it).

Once you are in, follow this route:

  • On the entrance screen, click on Administration in the top right of the screen.
  • Inside Administration, click on DataSHIELD, which is a link under the “Data Analysis” subsection, at the top right-hand side of the text.
  • Inside DataSHIELD, look at the table under Packages. My server-side version is 5.2.0 (because I am running a beta VM). Which version are you running?

Hello,

I have not solved my problem yet, unfortunately. On the server-side, I am running version 4.1.0.

I appreciate your help.

Tanja

Ah right! You’ll be pleased to know, it looks like we have tracked the problem down, fingers crossed!

You can see that the server side is running a modified version 4, but on the client side, your devtools::session_info() output told us you are running version 5. This mismatch explains why many of your functions weren’t working: the client and server weren’t able to communicate commands and requests to each other.
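The comparison behind this diagnosis can be sketched in a few lines of R. This is only an illustration: the helper names are hypothetical, comparing on the major version alone is an assumption about what counts as compatible, "5.0.0" stands in for the client version (the thread only says "version 5"), and the client package name dsBaseClient is assumed rather than taken from this thread.

```r
# Extract the major version number from a version string such as "5.0.0".
major_version <- function(version) {
  as.integer(strsplit(as.character(version), ".", fixed = TRUE)[[1]][1])
}

# TRUE when client and server share the same major version.
# Hypothetical helper; matching on the major version only is an assumption.
same_major <- function(client, server) {
  major_version(client) == major_version(server)
}

same_major("5.0.0", "4.1.0")  # a v5 client against the 4.1.0 server reported above
same_major("5.0.0", "5.0.0")  # client and server on the same major version

# If the client package is installed (name assumed), compare it directly
# with the server-side version read from the Opal Packages table.
if (requireNamespace("dsBaseClient", quietly = TRUE)) {
  print(same_major(packageVersion("dsBaseClient"), "4.1.0"))
}
```

The server-side version still has to be read by hand from the Packages table in Opal; nothing here queries the server.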

The instructions to fix this are as follows. You need to delete every one of the packages in that table by clicking on the “remove” link under the “Actions” column. Next, click on the “add package” button, and a dialogue will come up:

Enter “datashield/dsBase” into the first text field, “install a specific dataSHIELD package”. Enter “5.0.0” into the second text field, “Git reference”.

Once that is on the server, try logging into the Opal server in R again (using the login data frame, datashield.login(…), etc.) and see if your functions work now!

My pleasure!

Hi Alex!

It works!

I am very happy to let you know that the incompatible versions were indeed the source of the problem and that now all the functionalities seem to work properly! Thank you very much for your time.

Lovely to hear!

Could you help me out by checking that the following graphing functions work for you:

  • ds.histogram(…
  • ds.contourPlot(…
  • ds.heatmapPlot(…

A couple of weeks ago, when I first followed the instructions, the graphs weren’t working for me, so it would be useful to know whether they work for you.

Secondly, did you end up with v4.1 packages on your server just by following the prescribed instructions? If so, that is an issue with our documentation that we will try to fix as soon as possible.

Very happy to hear it’s all working for you, good luck using it and please feel free to post any more questions!

Hi Alex,

I must admit I have yet to use these functions. I work more on the development side than on the visual representation. If you could show me tomorrow what is happening, I will see what I can do.

Best wishes,

P.

Hi Alex,

I have just been testing all the functions extensively; the visualization functions ds.histogram, ds.contourPlot and ds.heatmapPlot do not work for me either. Interestingly, I remember that the contour plot and heatmap did work with v4.1 packages on my server. Furthermore, ds.glm(…) does not work either.

Regarding the packages on my server, it just occurred to me that some things were already set up about six months ago (or more), because I am building upon someone else’s work. That possibly explains why I ended up with v4.1 on the server.

Hi Alex and Tanja,

Can you tell me what arguments you are using in the graphical functions and what kind of errors you get? The graphical functions were changed in v5, and probably something was not updated in the wiki tutorial.

Demetris,

In my first week, when I set up v5 on both ends, the graphs never worked. Lately I’ve had a pre-release v6 installed, and graphs do work on that. Stuart and I have not yet got to the bottom of what is going on with v5.

Tanjascats,

I would just like to point out that we do have extra resources available should you find them useful to get up to speed with R: an introduction to R with 4 parts https://data2knowledge.atlassian.net/wiki/spaces/DSDEV/pages/707231766/Introduction+to+R+tutorial including a couple of practice “homeworks”!

Hi Demetris,

I am using the following calls for graphical functions and getting the errors as stated below:

  • ds.histogram(x='D$LAB_HDL', datasources = opals)

Error: Command ‘histogramDS1(D$LAB_HDL,1,3,0.25)’ failed on ‘dstesting-100’: Error while evaluating 'dsBase::histogramDS1( D$LAB_HDL,1,3,0.25 )'

  • ds.contourPlot(x='D$LAB_TSC', y='D$LAB_HDL', datasources = opals) and
  • ds.heatmapPlot(x='D$LAB_TSC', y='D$LAB_HDL', datasources = opals) show the exact same error message:

Error: Command ‘rangeDS(D$LAB_TSC)’ failed on ‘dstesting-100’:

Is GLM in version 5 working properly in your experience? I am getting the following error:

  • ds.glm(formula=D$DIS_DIAB~D$PM_BMI_CONTINUOUS+D$LAB_HDL*D$GENDER, family='binomial')

Error: Command ‘glmDS1(D$DIS_DIAB ~ D$PM_BMI_CONTINUOUS + D$LAB_HDL * D$GENDER, “binomial”, NULL, NULL, NULL)’ failed on ‘dstesting-100’:

Edit: SOLVED by updating Opal. Thank you Demetris and Alex for help.

Hi Tanja,

The new versions of the graphical functions use a new feature of Opal that generates a seed for the random number generator at each server, so they require Opal 2.14 or higher to operate. The VMs in the wiki might be based on an older version of Opal.
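As a quick illustration of that requirement, here is a minimal sketch that compares an Opal version string against the 2.14 minimum. The helper name and the example version strings are hypothetical, and how you read the version off your own Opal server is not covered here.

```r
# TRUE when the given Opal version string is at least the minimum (2.14 above).
# Hypothetical helper; the version string must be supplied by hand.
opal_is_recent_enough <- function(opal_version, minimum = "2.14") {
  numeric_version(opal_version) >= numeric_version(minimum)
}

opal_is_recent_enough("2.13.4")  # an older, illustrative version
opal_is_recent_enough("2.14.0")  # an illustrative version meeting the minimum
```

Using numeric_version avoids the pitfall of comparing version strings as plain text or as decimals, where "2.9" would wrongly sort above "2.14".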

@swheater and @alexwesterberg can we update the training VMs in the Wiki?

About the glm: Can you check if it works if you put the formula inside quotation marks?
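To make the difference concrete, the sketch below shows how the two ways of writing the formula differ on the client side. It is only an illustration of the suggestion: whether quoting was actually relevant to the glm error is not confirmed in this thread, and the final call is left as a comment because it needs an active login (the opals object from the tutorial).

```r
# Unquoted: the argument evaluates to an R formula object on the client.
# Building a formula does not evaluate D, so this runs without a login.
f_obj <- D$DIS_DIAB ~ D$PM_BMI_CONTINUOUS + D$LAB_HDL * D$GENDER

# Quoted: the argument is a plain character string.
f_chr <- "D$DIS_DIAB~D$PM_BMI_CONTINUOUS+D$LAB_HDL*D$GENDER"

print(class(f_obj))
print(class(f_chr))

# With an active login (not run here), the quoted form of the call would be:
# ds.glm(formula = f_chr, family = "binomial", datasources = opals)
```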
