Install a DataSHIELD server (< 1 hour) [Tutorial]

Dear all,

I have written a tutorial [1] on how to install a Federated Analysis server in the cloud using the DataSHIELD software based on Docker technology.

I estimate that the installation takes less than one hour if you follow the instructions closely. I have documented and tested all installation steps, starting from a bare-bones Linux installation, to help less technical users avoid pitfalls.

Please let me know if you have any comments or suggestions for improvement.

Kr, Wilmar

Refs: [1] Federated Analysis with R/DataSHIELD – Installation, Data Import, Data Analysis [Tutorial] – Wilmar Igl, PhD

Wilmar,

Thank you, I will try to find time this week to review your document.

Regards,

Stuart

This would be wonderful, because I only almost know what I am doing :wink:

Wilmar,

Sorry for the delay. The document you have produced is very impressive. I would make sure the Community’s newly set up “Education Theme” is aware of your tutorial.

A few minor comments:

  • The Puppet scripts used by the DataSHIELD team for VM creation are now rather out of date. They could serve as the basis for a Puppet-based deployment, but that would require considerable effort.
  • Currently, DataSHIELD analyses can be accessed via two types of server: Opal and Armadillo.
  • If you are feeling very bold, you could try the text editors vi and ed.
  • In “Configuration of HETZNER”: 4 GB of memory could be a bit small for anything beyond initial investigations of DataSHIELD.

Thank you,

Stuart

Hi,

Thanks for this document. I have some comments too.

Configure Firewall

  • As you will use a reverse proxy to access Opal (Apache or Nginx, as recommended), configure the firewall only for ports 22 (ssh), 443 (https) and 80 (http, to be redirected to 443 by the reverse proxy).
  • Rock should only be accessed by Opal, so do not expose it!
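
The port list above could be applied roughly as follows. This is only a minimal sketch: it assumes the host firewall is ufw, which this thread does not state, and the `run` helper and the `APPLY` switch are made up for this example. By default the script just prints the commands; run it as root with `APPLY=1` to execute them.

```shell
#!/bin/sh
# Sketch: open only ssh, http and https on the host firewall (assumes ufw).
set -eu

# Print the command by default; execute it only when APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

apply_firewall() {
  run ufw default deny incoming    # block everything not listed below
  run ufw default allow outgoing
  run ufw allow 22/tcp             # ssh
  run ufw allow 80/tcp             # http, redirected to 443 by the reverse proxy
  run ufw allow 443/tcp            # https
  run ufw enable
}

apply_firewall
```

Rock is deliberately absent: it needs no firewall rule and, for the same reason, no published `ports:` entry in the compose file, since Opal can reach it over the internal Docker network.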

Install DataSHIELD as Docker containers

You do not need both the MySQL and MongoDB databases. Choose MongoDB, as it is the more versatile of the two.

Import Data

Please refer to https://opaldoc.obiba.org/en/latest/cookbook/import-data.html

Analyze Data

The opal-demo server has a DataSHIELD user; you should use it in your examples instead.

It is highly recommended to use personal access tokens to authenticate from a script, see https://opaldoc.obiba.org/en/latest/cookbook/r-datashield/authz.html

Regards
Yannick

Dear Stuart, dear Yannick,

thank you so much for your feedback! I will update the tutorial accordingly soon.

One thing I was uncertain about was whether all passwords in the Docker YAML config file should be changed to secure the server, or whether changing the Opal administrator password is enough. My reasoning was that if the ports of the other applications (e.g. MySQL) are not opened, their passwords can be kept at the default values, because attackers could not reach MySQL anyway.

If the MySQL password is changed, could this cause any trouble connecting MySQL with Opal?

What about the other passwords for MongoDB, Rock etc., which are not explicitly mentioned in the YAML file but are probably also set to their default values? Should they also be explicitly set to user-defined values in the Docker YAML file to install a secure server?

Kr, Wilmar

Dear Stuart, I have now updated the blog accordingly. Where can I find the “Education Theme” forum?

Thank you very much, Wilmar

Dear Yannick, I have now updated the blog accordingly.

Thank you so much, Wilmar

Dear @swheater, dear @yannick,

thank you so much for your feedback! I have updated the tutorial now.

Kr, Wilmar

Hi,

You should use environment variables with a .env file so that passwords do not appear in the Docker Compose YAML file. See the Docker Compose documentation on environment variables.

If you set the Rock/MySQL/MongoDB username and password in their respective containers, you need to tell Opal what they are. See the documentation of the Opal Docker env variables and the Rock Docker env variables.
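
As an illustration of the .env approach, here is a minimal sketch that writes a `.env` file with random secrets next to the compose file. The three variable names and the `gen_secret` helper are placeholders invented for this example; they are not names that Opal, Rock or the database images read, so they still have to be mapped to the real variables listed in the documentation mentioned above.

```shell
#!/bin/sh
# Sketch: generate a .env file with random secrets for Docker Compose.
# The variable names below are hypothetical placeholders for this example.
set -eu

ENV_FILE="${ENV_FILE:-.env}"

# 24 random bytes, hex-encoded: 48 characters, none special to the shell.
gen_secret() {
  od -An -N24 -tx1 /dev/urandom | tr -d ' \n'
}

umask 077   # a newly created file is readable by its owner only
{
  echo "OPAL_ADMIN_PASSWORD=$(gen_secret)"
  echo "MYSQL_APP_PASSWORD=$(gen_secret)"
  echo "ROCK_ADMIN_PASSWORD=$(gen_secret)"
} > "$ENV_FILE"
chmod 600 "$ENV_FILE"   # also tighten a file that already existed
```

Docker Compose reads `.env` from the project directory and substitutes `${...}` references in the YAML file, so a service entry can say `MYSQL_PASSWORD=${MYSQL_APP_PASSWORD}` instead of containing the secret itself. Using the same placeholder for the database container and for the matching Opal variable keeps the two sides in sync.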

Regards
Yannick