How to install a multi-server DataSHIELD architecture from scratch?

Dear DS community.

I would like to install a DataSHIELD cloud infrastructure from scratch on some bare-bones servers to understand how everything works. A minimal working example with 2-3 servers should do for starters …

Can you recommend a starting point (website, manual,…)?

What recommendations do you have for easily deploying the setup to other servers (e.g. infrastructure-as-code) using automated deployment tools such as Kubernetes, Chef, Puppet, Ansible, etc.?

“docker-compose” is a deployment tool that is supported by both Opal and Armadillo. See the R/DataSHIELD page of the Opal documentation and the GitHub repository molgenis/molgenis-service-armadillo (Armadillo; a DataSHIELD implementation, part of the MOLGENIS suite).
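To make the docker-compose suggestion a bit more concrete for a 2-3 server test setup, here is a rough Python sketch that renders one compose file per server from a small inventory. Everything specific in it is an assumption to check against the Opal documentation: the image names (`obiba/opal`, `datashield/rock-base`), the environment variables (`OPAL_ADMINISTRATOR_PASSWORD`, `ROCK_HOSTS`), the container ports, and the hypothetical server names `site-a` and `site-b`. It also leaves out the database that a real Opal installation would likely need for storing data.

```python
from string import Template

# Assumed layout: one Opal server plus one Rock R server per host.
# Image names, ports and environment variables are assumptions;
# verify them against the Opal documentation before use.
COMPOSE_TEMPLATE = Template("""\
services:
  opal:
    image: obiba/opal:latest
    ports:
      - "${port}:8080"
    environment:
      - OPAL_ADMINISTRATOR_PASSWORD=${admin_password}
      - ROCK_HOSTS=rock:8085
    depends_on:
      - rock
  rock:
    image: datashield/rock-base:latest
""")

# Hypothetical inventory: server name -> host port on which Opal is exposed.
SERVERS = {"site-a": 8880, "site-b": 8881}


def render_compose(port: int, admin_password: str) -> str:
    """Return the docker-compose text for a single server."""
    return COMPOSE_TEMPLATE.substitute(port=port, admin_password=admin_password)


def render_all(servers: dict, admin_password: str) -> dict:
    """Return a mapping of server name -> docker-compose text."""
    return {
        name: render_compose(port, admin_password)
        for name, port in servers.items()
    }


if __name__ == "__main__":
    # Write one compose file per server into the current directory.
    for name, text in render_all(SERVERS, "changeme").items():
        with open(f"docker-compose.{name}.yml", "w") as fh:
            fh.write(text)
```

Each generated file could then be copied to its server and started there with docker-compose. Keeping the inventory in one place is the infrastructure-as-code part; the same idea carries over to Ansible or Puppet templates.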

There are also a number of (aging) Puppet scripts in the GitHub repository datashield/datashield-infrastructure (infrastructure set-up code, examples and Puppet environments for DataSHIELD), and in the associated repositories datashield/puppet-datashield (Puppet module for DataSHIELD), datashield/puppet-mongodb (Puppet module for installing MongoDB), datashield/puppet-opal (Puppet module for installing Opal) and datashield/puppet-r (Puppet module for R and R packages).


Please have a look also at a more complete docker-compose file at:


Dear @yannick, dear @swheater, thanks. Which cloud service provider would you recommend as most compatible with the requirements of DataSHIELD, especially regarding privacy (e.g. EU-based cloud provider, GDPR compliance, etc.) and a flexible pricing model?