Community Progress: "Standard Configurations"

The purpose of this thread is to facilitate discussion about how to specify and manage Standard Configurations of the DataSHIELD stack: platform (Armadillo, Opal, R, …) and packages (dsBase, dsSurvival, …).

The hope is that this will help Infrastructure Managers, Data Providers, Developers and Researchers communicate about update plans and testing.

A draft “Standard Configurations” page has been added to the website: Standard Configurations.

The use of the “renv” package (Project Environments • renv) is being considered for capturing the R packages used.

Thanks for this start Stuart.

Is the intention for this page to signpost configurations that are then managed by consortia? I guess as a test case, I want to make a configuration for InterConnect featuring dsBase, dsMediation, dsSynthetic and dsSurvival. I intend to build the Docker image myself and put it on Docker Hub. Will you then assign a name to that configuration for me?

Hi Tom,

The intent is to signpost the standard configuration names, and the packages/systems that make up each configuration.

The names can be anything as long as they are unique (currently platform names start with a, b, … and profile names start with m, n, …), but it would be better if the names are meaningful to the users. Could you send me the Opal and Armadillo versions, R version, and package versions (and the names you would like)?

PS: Are you planning to use docker trust when creating docker images?

Stuart

Hi Tom,

To get the ball rolling, I have added two named standard configurations, one for Opal and one for Armadillo: “caravan_terrain” and “dolomite_terrain”. “Terrain” is the name of the profile:

  • dsBase 6.2.0
  • dsMediation 0.0.3
  • dsSynthetic 0.0.2
  • dsSurvival 2.1.0

Stuart

Hi Stuart,

Ok great. I just created an image yesterday but with:

  • dsBase 6.2.0
  • dsMediation 0.0.2
  • dsSynthetic 0.0.2
  • dsSurvival 1.1.0

So could this be Terrain v0.0.1, and then for Terrain v0.0.2 I could recreate it with

  • dsBase 6.2.0
  • dsMediation 0.0.3
  • dsSynthetic 0.0.2
  • dsSurvival 2.1.0

Or do I need another thing starting with T?

Should I call it Terrain in my DockerHub?

Also, I have not created one for Armadillo, as I simply haven’t had time to get to grips with it yet.
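As a rough illustration of the two package sets listed above, the difference between them can be computed rather than compared by eye. This is only a sketch: the labels "Terrain v0.0.1" and "Terrain v0.0.2" are the names proposed in this post, not agreed names, and diff_profiles is a hypothetical helper.

```python
# Package versions as quoted in this thread. The constant names follow the
# proposed (not agreed) labels "Terrain v0.0.1" and "Terrain v0.0.2".
TERRAIN_V0_0_1 = {
    "dsBase": "6.2.0",
    "dsMediation": "0.0.2",
    "dsSynthetic": "0.0.2",
    "dsSurvival": "1.1.0",
}
TERRAIN_V0_0_2 = {
    "dsBase": "6.2.0",
    "dsMediation": "0.0.3",
    "dsSynthetic": "0.0.2",
    "dsSurvival": "2.1.0",
}


def diff_profiles(old, new):
    """Return {package: (old_version, new_version)} for packages that differ.

    A package missing from one side is reported with None on that side.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


if __name__ == "__main__":
    for name, (before, after) in diff_profiles(TERRAIN_V0_0_1, TERRAIN_V0_0_2).items():
        print(f"{name}: {before} -> {after}")
```

Only dsMediation and dsSurvival differ between the two lists, which is what a version bump of the profile would need to record.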

Tom,

I will correct the specification of “terrain” to match reality. The plan would be to have a different name for each profile, but that is just an initial plan.

Stuart

One other point is that it used Opal 4.4.10…

I guess the question now is how do I use that “terrain” description? Should my image have that name, along with the profile?

Is it based on R 4.2 or 4.2.1, …?

Stuart

I should clarify, I am struggling to work out how the profile name links to the Docker image. At the moment I have called my Docker image surv-synth-med and it has a version tag v0.0.1. This corresponds to the “terrain” profile.

I thought that if I had new versions of the constituent packages, I could release this with a version tag v1.0.0 (for example). This would need a different profile “treehouse”.

In my docker-compose file, I would then set the ROCK-CLUSTER name to treehouse or terrain.

So somewhere we need to have a link between the profile name and the image name + version…
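One way to picture that link is a small lookup table from profile name to image name and tag. The sketch below is hypothetical: the "terrain" entry uses the image name and tag mentioned above, the "treehouse" tag is only the "for example" version from this post, and ImageRef, PROFILE_IMAGES and image_for_profile are illustrative names, not part of Opal, Rock or Docker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRef:
    """A Docker image reference split into repository name and version tag."""

    repository: str
    tag: str

    def __str__(self):
        return f"{self.repository}:{self.tag}"


# Hypothetical registry linking profile names to images.
PROFILE_IMAGES = {
    "terrain": ImageRef("surv-synth-med", "v0.0.1"),
    # Example only: no such image has been published.
    "treehouse": ImageRef("surv-synth-med", "v1.0.0"),
}


def image_for_profile(profile):
    """Look up the image for a profile, failing loudly on unknown names."""
    try:
        return PROFILE_IMAGES[profile]
    except KeyError:
        known = ", ".join(sorted(PROFILE_IMAGES))
        raise KeyError(f"unknown profile {profile!r}; known profiles: {known}") from None


if __name__ == "__main__":
    print(image_for_profile("terrain"))
```

Keeping this table in one tracked file would mean the profile name used in a docker-compose file and the image actually pulled could be checked against each other.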

Good question! I just used this:

https://hub.docker.com/layers/rock-base/datashield/rock-base/6.2-R4.2/images/sha256-2b1ae879a4387e1dac6843ea59ac1db61816ee78467c9d30d6769a2aed330b0e?context=explore

It is a good question, as it appears the version of R depends on when the Dockerfile is used to build the image: it will contain the latest R 4.X.X at the time. But the name implies this image has R 4.2.

It is 4.2.0 (checking via Opal).

But this shows it is quite tricky to get all the labeling of versions etc. correct! It would be nice to be able to see from DockerHub what is actually in the image, without the risk of human error from manual labeling. Also it would be good to link back to a commit/release in GitHub…

And it is not just about the version of R: the R package dependencies also have their own version history. Making a Docker image of an R server is then basically taking a snapshot of a whole R ecosystem. Using a renv settings file (whose changes can be tracked and tagged) for building the R server image may be a solution?

Yannick
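To make the renv idea a little more concrete, here is a minimal sketch of reading versions from a lockfile and comparing them with a declared profile. It assumes the usual renv.lock layout (JSON with a top-level "R" entry and a "Packages" map). The lockfile text is an invented example, not taken from any real image, and lockfile_versions and mismatches are hypothetical helpers.

```python
import json

# Invented example in the shape of a renv.lock file; not from a real image.
EXAMPLE_LOCKFILE = """
{
  "R": {"Version": "4.2.0"},
  "Packages": {
    "dsBase": {"Package": "dsBase", "Version": "6.2.0"},
    "dsSurvival": {"Package": "dsSurvival", "Version": "1.1.0"}
  }
}
"""


def lockfile_versions(lock_text):
    """Return (r_version, {package: version}) parsed from renv.lock text."""
    lock = json.loads(lock_text)
    r_version = lock["R"]["Version"]
    packages = {name: entry["Version"] for name, entry in lock["Packages"].items()}
    return r_version, packages


def mismatches(declared, actual):
    """Return {package: (declared_version, actual_version)} where they differ.

    A package absent from the lockfile is reported with None as its actual version.
    """
    return {
        name: (want, actual.get(name))
        for name, want in declared.items()
        if actual.get(name) != want
    }


if __name__ == "__main__":
    r_version, packages = lockfile_versions(EXAMPLE_LOCKFILE)
    declared = {"dsBase": "6.2.0", "dsSurvival": "2.1.0"}
    print("R", r_version)
    print(mismatches(declared, packages))
```

A check like this could run when the image is built, so that the published profile description comes from the lockfile instead of manual labeling.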