Installation query and test data

Dear DataSHIELD team, congratulations on your great work.

Installation query: After setting up Opal and R (both in the same VM), I installed opalr, dsBase and resourcer (through Opal) on the server side. On the client side (the analysis R environment), I installed dsBaseClient, DSI and DSOpal. Do I need to install any other packages on either the server or the client side?

Test data: I am looking for some test data (.csv and the associated data-dictionary template) as required by Opal to test my installation. Any idea where I can find some? Best regards, Bala

Bala,

Those are the basic packages required to perform federated analysis. For packages that provide additional analyses, check Community Packages.

Test datasets are available from puppet-datashield/files/testdata at master · datashield/puppet-datashield · GitHub. The test data is provided in “Opal Archive” format (.zip) for quick upload, and as CSVs with data dictionaries if you prefer.

Stuart

Hi Stuart, thank you for the input. a) I first tested my client-side installation by connecting to tables available at https://opal-demo.obiba.org/ and running some basic statistical analyses. It worked very well.

b) Then I uploaded the test data from puppet-datashield that you suggested to the in-house Opal server I set up on our cloud. In this case, DataSHIELD is able to contact the server, but I am not able to apply a function. Since I am new to R, I cannot see where I am making a mistake. Any input from DataSHIELD users would be of great help. I am pasting the output of my R session below.

Thanks, Bala

logindata <- builder$build()

conns <- datashield.login(logins = logindata, assign = TRUE, symbol = "D")

Logging into the collaborating servers
Logged in all servers [================================================================] 100% / 1s

No variables have been specified. All the variables in the table (the whole dataset) will be assigned to R!

Assigning table data…
Assigned all tables [==================================================================] 100% / 1s

Data dimensions information

ds.dim(x = 'D')

Aggregated (dimDS("D")) [==============================================================] 100% / 0s
Error: There are some DataSHIELD errors, list them with datashield.errors()

datashield.errors()

$study1
[1] "Command 'dimDS(\"D\")' failed on 'study1': No such DataSHIELD 'AGGREGATE' method with name: dimDS"

Solved it: I just figured out that I have to publish the ‘dsBase’ functions on the Opal server before the client can use them. I am leaving my previous post undeleted so that someone may benefit from it in future.
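For anyone hitting the same "No such DataSHIELD 'AGGREGATE' method" error, a quick check from the client might look like the sketch below. It assumes `conns` is the connections object returned by `datashield.login()` earlier in the session, and that the data frame returned by `datashield.methods()` has a `name` column (an assumption worth confirming with `str()` on your DSI version).

```r
library(DSI)

# Assumption: 'conns' comes from datashield.login() as in the session above.

# List the aggregate methods that each server has published.
agg_methods <- datashield.methods(conns, type = "aggregate")

# ds.dim() calls the server-side method dimDS, so it has to appear here.
# The 'name' column is assumed; inspect str(agg_methods) if this fails.
"dimDS" %in% agg_methods$name

# Show which DataSHIELD server packages (e.g. dsBase) each server reports.
datashield.pkg_status(conns)
```

If `dimDS` is missing, the package is probably installed on the R server but its methods have not been published in Opal.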

Hi,

  1. opalr does not need to be installed on the R server, because it is an Opal client package (i.e. it is used to communicate with Opal).

  2. When installing DataSHIELD packages (e.g. dsBase), do it via the page Administration > DataSHIELD (and not Administration > R) so that the package’s functions are automatically published. You can also use the opalr::dsadmin.install_package function.
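As a rough sketch of the opalr route, run from any R session that can reach Opal: the URL, user name and password below are placeholders, not values from this thread, and the account is assumed to have administrator rights. The explicit `dsadmin.publish_package()` call is included only for the case where a package was installed without its methods being published.

```r
library(opalr)

# Assumption: placeholder URL and administrator credentials.
o <- opal.login(username = "administrator",
                password = "replace-me",
                url = "https://opal.example.org")

# Install dsBase through Opal's DataSHIELD administration.
dsadmin.install_package(o, pkg = "dsBase")

# If dsBase was installed via Administration > R instead, its methods
# may still need to be published before clients can call them.
dsadmin.publish_package(o, pkg = "dsBase")

opal.logout(o)
```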

Regards
Yannick

Hi,

I have another question about the packages that Bala has listed:

Are DSI and DSOpal necessary on the client side? They are not listed under the Community Packages on your website, so I had the impression that they become redundant with dsBaseClient. I am missing an overview of which packages are absolutely necessary on the server and client side for the implementation of DataSHIELD…

Kind regards, Charlotte

Hi,

Yes, DSOpal is required to connect to a DataSHIELD server that uses Opal. When installing DSOpal, you’ll get DSI as well. DSI and DSOpal do not perform any computations; they establish the connection with the DataSHIELD servers (i.e. broadcasting analysis requests and receiving responses). dsBaseClient (and any other DataSHIELD client analysis package) in turn depends on DSI.
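To make the division of labour concrete, here is a minimal client session sketch. The server name `study1` and the symbol `D` follow the session posted earlier in this thread; the URL, credentials and table name are placeholders to replace with your own.

```r
library(DSI)           # connection layer: login, assign, aggregate
library(DSOpal)        # DSI driver for Opal servers
library(dsBaseClient)  # client-side analysis functions such as ds.dim()

# Assumption: URL, user, password and table are placeholders.
builder <- DSI::newDSLoginBuilder()
builder$append(server = "study1",
               url = "https://opal.example.org",
               user = "replace-me",
               password = "replace-me",
               table = "PROJECT.TABLE",
               driver = "OpalDriver")
logindata <- builder$build()

# DSI and DSOpal open the connection and assign the table to symbol D.
conns <- DSI::datashield.login(logins = logindata, assign = TRUE, symbol = "D")

# dsBaseClient sends the analysis request through that connection.
ds.dim(x = "D", datasources = conns)

DSI::datashield.logout(conns)
```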

Regards
Yannick

Hi Yannick,

Many thanks for your reply! So here is a summary of the minimum requirements mentioned in this thread (for posterity :wink:):

Server:

  • resourcer
  • dsBase

Client:

  • opalr
  • DSOpal (DSI already included)
  • dsBaseClient (additional analysis packages available, see Community Packages)
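For the client side, the list above might translate into the following installation sketch. It assumes DSOpal and opalr are installed from CRAN and dsBaseClient from the DataSHIELD package repository at https://cran.datashield.org; check the current installation instructions if that repository location has changed.

```r
# DSOpal brings in DSI as a dependency.
install.packages("DSOpal")

# opalr is only needed for administering Opal from R.
install.packages("opalr")

# Assumption: dsBaseClient is served from the DataSHIELD repository.
install.packages("dsBaseClient",
                 repos = c(getOption("repos"), "https://cran.datashield.org"),
                 dependencies = TRUE)
```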

Hope this helps anyone who also needed this overview.

Best, Charlotte

Hi Charlotte,

I would regard opalr, DSOpal and DSI as middleware packages of DataSHIELD, as opposed to analysis packages.

I will create a new page on the website which covers these packages (for both Opal and Armadillo).

Stuart

You should have a look at this cookbook, it will give you some useful information both for the client and server setups:

https://opaldoc.obiba.org/en/latest/cookbook/r-datashield.html

Hi Yannick,

Many thanks, I will look at it!

All the best

Gabriella