Cohort variable identifying each subject


To assess cohort effects in our analyses (e.g. regression analysis), we are thinking about introducing a categorical variable, cohort, coded as follows: 1 = cohort 1, 2 = cohort 2, 3 = cohort 3.

Before we do so, we wanted to check: is there a DataSHIELD-specific way of doing this?

Thank you! Caroline

Dear Caroline,

Yes, there is a way. You can create dummies for this variable. I guess you have a study_id variable. Attached please find a very useful script.



(Attachment Study_id script is missing)

Dear Marie,

Yay, sounds great! Could you please upload your attachment once more? I cannot find it.

Thank you! Caroline

Attached now as txt.



(Attachment Study_id script.txt is missing)

Ok the only allowed formats are images.

@Marie you can paste a script as a code block into a message and then click the <> button. This will mean that screen readers can also read the code, and it is searchable. Instructions and a video here on how to do this:

Thanks a lot! :slight_smile:

What exactly does your study_id tell you? We have one separate table for each study, and each study consists of three cohorts. I am assuming that you have one big table in which different combinations of variables make up several separate studies, right? If I have understood correctly, your study_id would correspond to something like a cohort_id in our case.

Dear Caroline,

"What exactly does your study_id tell you?" Study_id = participant’s identification number. Because this is a numeric variable, you can use it to create a column of zeroes (id - id) and a column of ones ((id - id) + 1), and from these build the dummy variables for each cohort/study centre:

            C1  C2  C3  C4  C5
source 1:    1   0   0   0   0
source 2:    0   1   0   0   0
source 3:    0   0   1   0   0
source 4:    0   0   0   1   0
source 5:    0   0   0   0   1

C1-C5 = cohorts 1-5

If I run models with 5 studies and want to adjust for cohort/study centre, I add the newly created variables, labelled “source.1”, “source.2”, etc., to the model.
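To make the (id - id) trick concrete, here is a minimal sketch in plain Python. It is not DataSHIELD code and it is not the attached script, which is not available in this thread. The function names and the participant ids are made up for illustration. The only thing taken from the description above is the idea: subtracting a numeric id from itself gives a zero for every participant, and adding 1 gives a one.

```python
# Illustration only: plain Python, not DataSHIELD. All names are hypothetical.

def zeros_like_ids(ids):
    """One zero per participant, built as id - id."""
    return [i - i for i in ids]


def ones_like_ids(ids):
    """One one per participant, built as (id - id) + 1."""
    return [(i - i) + 1 for i in ids]


def source_dummies(ids, source, n_sources):
    """Build the dummy columns source.1 .. source.n for one cohort's table.

    The column matching this cohort is all ones; every other column is all zeros.
    """
    return {
        f"source.{k}": ones_like_ids(ids) if k == source else zeros_like_ids(ids)
        for k in range(1, n_sources + 1)
    }


# Hypothetical participant ids held by cohort 2 out of 5 cohorts.
cohort2 = source_dummies([101, 102, 103], source=2, n_sources=5)
for name, column in cohort2.items():
    print(name, column)
```

Each cohort would build its own set of columns in the same way, with its own value of `source`. In a regression model that includes an intercept, one of the dummies is normally left out so that it serves as the reference category.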

Hope it helps,


Sorry, I meant to say: id = participant’s identification number.

From there you create the sources, which are equivalent to study/cohort number 1, number 2, etc.



Oh yes, what a great idea! …and thank you again!

All the best, Caroline