Integrating the DSI module for other backends

Hi everybody,

We are about to implement the DSI module and have stumbled upon a few questions.

  1. Is there some kind of conformance document that tells us which features we need to implement on the backend? Ideally we would like to have a test instance with use cases that represent the typical usage of DataSHIELD. Do we have a test suite that is capable of handling these use cases?

  2. Which R-types do we support in DataSHIELD?

Thanks so much for your help.

Kind regards,

Sido

Hi,

  1. Currently the only reference implementations are DSOpal (+ Opal) and DSLite. The test suite of the dsBase package might be used to verify that the behavior is as expected, but note that the integration of DSI within dsBase is a work in progress. It should be possible to switch the dsBase test suite from one DSI implementation to another. Regarding a conformance document: one of the goals of the CZI grant we submitted recently is to have a certification process (documentation and test suite) that can be used to develop a DSI implementation and verify its conformance.

  2. I don’t think there is any limitation on the R types.

Cheers
Yannick

Along the way we bumped into a few more questions:

  1. How long do commands typically last?
  2. Is there a maximum timeout defined?
  3. Is the DSOpal and dsBaseClient integration ready for production? If not, how do we use it?

Cheers,

Sido

Sido, the integration of DSOpal and dsBaseClient is currently ongoing and is planned to be released as version 6.0 of DataSHIELD. Development VirtualBox VMs have been created which contain the current Opal and dsBase, so they can be used for testing. Stuart

Hi Sido,

At the time of Bioshare some people were running overnight tasks… Since then, Opal’s strategy has been improved as follows:

  • DataSHIELD jobs are parallelized, so the time for a DS analysis is the time taken by the DS node with the longest-running job.
  • Opal manages the R sessions, so it knows whether an R session is busy and, most importantly, how long it has been idle. Every minute, Opal checks for expired R sessions and closes them: an R session is considered expired after 4h of idle time (default value, configurable). So far nobody has complained about that. This also means that when people do not clean up their sessions with datashield.logout(), it takes 4h for them to be cleaned up automatically…
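
For anyone implementing another backend, the two points above could be sketched roughly as follows. This is only an illustration in Python, not Opal's actual code: the names RSession, analysis_time, is_expired and sweep are made up for the example, and treating a busy session as never expired is an assumption on my side. Only the 4h default and the one-minute check interval come from the description above.

```python
from dataclasses import dataclass

# Values taken from the description above; both are configurable on a real server.
IDLE_TIMEOUT_S = 4 * 60 * 60   # an R session expires after 4h of idle time
SWEEP_INTERVAL_S = 60          # expired sessions are checked for every minute


@dataclass
class RSession:
    """Hypothetical record a backend could keep per R session."""
    session_id: str
    busy: bool             # True while a command is running in the session
    last_activity: float   # timestamp (seconds) of the last finished command


def analysis_time(node_durations):
    """Jobs run in parallel, so the slowest node determines the total time."""
    return max(node_durations.values())


def is_expired(session, now, timeout=IDLE_TIMEOUT_S):
    # Assumption: a busy session is not idle and therefore never expires.
    return (not session.busy) and (now - session.last_activity) > timeout


def sweep(sessions, now, timeout=IDLE_TIMEOUT_S):
    """One periodic check: return (sessions to keep, sessions to close)."""
    kept = [s for s in sessions if not is_expired(s, now, timeout)]
    closed = [s for s in sessions if is_expired(s, now, timeout)]
    return kept, closed
```

A backend would call something like sweep() once per SWEEP_INTERVAL_S with the current time and shut down whatever comes back in the second list. A session that was closed explicitly with datashield.logout() would simply no longer be in the list.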

Cheers,
Yannick