Standardisation of functional testing

I am also currently working on establishing a standard method to test current and future development using test_that.

Dear all,

We have made some excellent progress in the last week. It is worth noting that an expectation is something we are aiming to achieve; an expectation is tested against a criterion that is either met or not.

We have agreed the following:

Test outcomes

When developing any new DataSHIELD methods, the following tests need to be designed and implemented (a testthat sketch follows the list):

  • Test that answers from DataSHIELD are the same as those from R.
  • Test that answers are mathematically valid (e.g., a standard deviation never returning a negative value).
  • Test correct values under correct data.
  • Test a correct expected value for expected data.
  • Test correct arguments.
  • Test incorrect arguments.
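To make the pattern concrete, here is a minimal testthat sketch. ds.myVariance is a hypothetical client-side function, stubbed here with stats::var so the example runs stand-alone; a real test would call the DataSHIELD server instead.

```r
library(testthat)

# ds.myVariance is a hypothetical client-side function; it is stubbed with
# stats::var so this sketch runs stand-alone. A real test would call the
# DataSHIELD server and compare against a local R computation.
ds.myVariance <- function(x) stats::var(x)

local_data <- data.frame(numerical_values = c(-1.5, 0.0, 2.25, 3.75))

test_that("answers from DataSHIELD are the same as R", {
  ds_result <- ds.myVariance(local_data$numerical_values)
  r_result  <- var(local_data$numerical_values)
  expect_equal(ds_result, r_result)
})

test_that("answers are mathematically valid", {
  # a variance (and hence a standard deviation) can never be negative
  expect_gte(ds.myVariance(local_data$numerical_values), 0)
})

test_that("incorrect arguments are rejected", {
  expect_error(ds.myVariance("not a numeric vector"))
})
```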

Testing table

To help us all with our development, we would prefer to use some dummy data that handles expected and unexpected results better. For example, R has some specific data types (see http://www.diegobarneche.com/2014-12-11-ufsc/lessons/01-intro_r/data-structures.html).

For that reason, we have agreed to provide the following table. It will be referred to as table_with_values. The columns will be:

  • all_null_values
  • integer_values
  • NA_values
  • numerical_values (i.e., decimal)
  • character_values
  • logical_values (i.e., boolean values)
  • non_negative_integer_values (>=0)
  • non_negative_numerical_values (>=0)
  • negative_integer_values (<0)
  • negative_numerical_values (<0)
  • positive_integer_values (>0)
  • positive_numerical_values (>0)
  • factor_values

We will also provide an empty table. This table should have all of these fields, but no entries.
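A minimal sketch of how the two tables might be built in R; the specific values are illustrative assumptions, not the agreed contents:

```r
# Illustrative values only; the agreed contents may differ.
table_with_values <- data.frame(
  all_null_values               = rep(NA, 4),  # a data.frame column cannot hold NULL; NA stands in
  integer_values                = c(-2L, 0L, 1L, 7L),
  NA_values                     = c(NA, 1.2, NA, 3.4),
  numerical_values              = c(-1.5, 0.0, 2.25, 3.75),
  character_values              = c("a", "b", "c", "d"),
  logical_values                = c(TRUE, FALSE, TRUE, FALSE),
  non_negative_integer_values   = c(0L, 1L, 2L, 3L),
  non_negative_numerical_values = c(0.0, 0.5, 1.5, 2.5),
  negative_integer_values       = c(-4L, -3L, -2L, -1L),
  negative_numerical_values     = c(-4.5, -3.5, -2.5, -1.5),
  positive_integer_values       = c(1L, 2L, 3L, 4L),
  positive_numerical_values     = c(0.5, 1.5, 2.5, 3.5),
  factor_values                 = factor(c("low", "mid", "mid", "high"))
)

# The empty table keeps the same columns and types but has no rows.
empty_table <- table_with_values[0, ]
```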

@PatRyserWelch - sounds like you are making solid progress 🙂

Completely agree that a standard dataset to test against makes sense. Are you planning on having multiple columns of the same type with different values in the table, or would the idea be to pass in the exact same variable to a function many times?

You have made a fair point.

Which one would be best, in your opinion? I have yet to consider those …

P.

Am I right in thinking that you expect most of the tests to be based on the SURVIVAL, CNSIM and DASIM datasets, and that only tests which need to cover special situations will use this “data set with values”?

In some cases we have to use specific datasets for the tests. For example, if you want to test the ds.lexis function you can only do it using the SURVIVAL data that are in a longitudinal form. Also, the examples in the headers of the functions will be based on those datasets.

At the moment we have three types of datasets in the development (and in the training) VMs:

  • SURVIVAL: longitudinal data.
  • CNSIM: phenotypic data (simulated from the NCDS cohort); includes 3 datasets of different sizes, which contain missing values.
  • DASIM: phenotypic data (simulated from the NCDS cohort); includes 3 datasets of the same size, with no missing values.

In the future we may need some other datasets as well. For example, we may need a synthetic dataset for omic data or a synthetic dataset of nested data. I’m now developing two new functions: one for the analysis of omic data and another for generalised linear mixed models, so we will soon need to upload two new synthetic datasets to our training servers that can be used for testing those new functions.

@PatRyserWelch - my gut feeling is that it would be better to have different columns of the same type. If you used the same column as every argument to a function, then I would guess there is a risk that, e.g., rounding errors may get cancelled out.
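As a tiny illustration of that cancellation effect (the values are purely illustrative):

```r
x <- 0.1 + 0.2   # stored internally as 0.30000000000000004
y <- 0.1 + 0.2
x - y            # exactly 0: the two identical representation errors cancel
x - 0.3          # 5.551115e-17: a distinct value exposes the error
```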

Olly, that is what I have put into place: three testing tables on the Opal server. The same data are also available as CSV files in the testing environment. As a result, we can test to an accuracy of 15 decimal places. The Java environment uses 7 fewer decimal places than R.
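A hedged sketch of that precision check; the CSV file name is hypothetical, and server_value is a placeholder for the result of the actual DataSHIELD call:

```r
# Hypothetical file name; in practice this would be one of the CSV copies
# of the Opal testing tables available in the testing environment.
local_table  <- read.csv("table_with_values.csv")
local_value  <- mean(local_table$numerical_values)
server_value <- local_value  # placeholder for the server-side DataSHIELD result
stopifnot(isTRUE(all.equal(server_value, local_value, tolerance = 1e-15)))
```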