Creating dummy variables

Hi, I am a bit rusty on some of the new DataSHIELD functionality. What is the best method to convert a factor variable (i.e. categorical with multiple levels) into dummy binary variables?

I can see there is probably a long-hand method with ds.Boole, but I wondered whether there is a quicker way?

Thanks

Hi Tom,

You can create the dummy variables from a categorical variable using the ds.asFactor function with the argument fixed.dummy.vars set to TRUE.
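For illustration, a minimal sketch of that call. The connection object `conns`, the server-side data frame `D`, and the column `educ` are hypothetical names, and the code assumes you are already logged in to your DataSHIELD servers.

```r
library(dsBaseClient)

# Assumption: `conns` is an existing login object (e.g. from DSI::datashield.login())
# and D$educ is a categorical variable on the server side (hypothetical names).
ds.asFactor(
  input.var.name   = "D$educ",
  newobj.name      = "educ_dummies",  # server-side object that will hold the dummies
  fixed.dummy.vars = TRUE,            # return dummy variables rather than a factor
  baseline.level   = 1,               # level used as the baseline category
  datasources      = conns
)

# Check what was created on the server side
ds.class("educ_dummies", datasources = conns)
ds.colnames("educ_dummies", datasources = conns)
```

The exact column names you get back depend on the levels of the variable and the baseline you choose, so it is worth checking them with ds.colnames before using them.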

Hi Demetris, that is absolutely wonderful! Thank you!

If I do this to multiple variables, they all get column names like DV2, DV3, etc., so the names conflict. What is the easiest way to rename columns in a matrix/dataframe, please?

Tom, the only way to rename variables using the current version of dsBaseClient is to use the ds.make function to replicate the variables with a different name and then add those variables back to your dataframe and remove the variables with the DV2 and DV3 names.

An easier way is to use Tim’s “dh.renameVars” function that is included in the dsHelper package (this is a client-side function!)
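A rough sketch of both routes, assuming the dummy variables already sit in a server-side data frame `D` as columns DV2 and DV3. The names `D`, `D2`, `educ_DV2`, `educ_DV3` and `conns` are hypothetical, and the dh.renameVars argument names are the ones I believe recent dsHelper releases use, so please check ?dh.renameVars for your installed version.

```r
library(dsBaseClient)
library(dsHelper)

# Assumption: `conns` is an existing login object and D (hypothetical) is a
# server-side data frame containing the columns DV2 and DV3.

# Route 1: copy each variable under a new name with ds.make,
# then bind the copies back onto the data frame.
ds.make(toAssign = "D$DV2", newobj = "educ_DV2", datasources = conns)
ds.make(toAssign = "D$DV3", newobj = "educ_DV3", datasources = conns)
ds.dataFrame(
  x           = c("D", "educ_DV2", "educ_DV3"),
  newobj      = "D2",
  datasources = conns
)
# D2 still contains the original DV2 and DV3 columns; they need to be
# dropped in a separate step.

# Route 2: rename in place with dsHelper.
# Assumption: argument names may differ between dsHelper versions.
dh.renameVars(
  df            = "D",
  current_names = c("DV2", "DV3"),
  new_names     = c("educ_DV2", "educ_DV3"),
  conns         = conns
)
```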

Thanks again!

Actually, I just found another way, which is to use ds.matrixDimnames on the output from ds.asFactor (which is conveniently a matrix).
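Roughly what that looks like, as a sketch only. The object names `educ_dummies` and `educ_dummies_named`, the new column names, and `conns` are hypothetical. I am assuming the dimnames argument takes a list of row names and column names as in base R, and that the row-names slot can be left as NULL.

```r
library(dsBaseClient)

# Assumption: `conns` is an existing login object and educ_dummies (hypothetical)
# is the server-side matrix produced by ds.asFactor with fixed.dummy.vars = TRUE,
# with two dummy columns (e.g. DV2 and DV3).
ds.matrixDimnames(
  M1          = "educ_dummies",
  dimnames    = list(NULL, c("educ_DV2", "educ_DV3")),  # row names, column names
  newobj      = "educ_dummies_named",
  datasources = conns
)

# Confirm the new column names
ds.colnames("educ_dummies_named", datasources = conns)
```

The vector of new column names has to be the same length as the number of dummy columns in the matrix.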
