White space in table entry causing lexical error

Hello.

DataShield is running into a problem when analyzing data with white space in it.

I have a data server that I populated with tables by uploading some sample .csv files to Projects I created on it. I generated the .csv files automatically, edited them in Excel to correct some formatting issues, then uploaded them to the server using DataShield in RStudio.

Some of those tables have entries with white space in them. For example, a “patient” table has a COUNTY column, where county names are listed in full (e.g. “Suffolk County”).

When I use DataShield on the client side with RStudio, I can assign a table from the server and analyze entries with no white space as expected. For example,

> DSI::datashield.assign.table(conns = connections, symbol = "Patients", table = c("Project_1.patients", "Project_2.patients"))
> ds.table("Patients$GENDER")

works fine.

However, if I try

> ds.table("Patients$COUNTY")

I get an error. datashield.errors() outputs the following:

$server_1
[1] "[Client error: (400) Bad Request] Lexical error at line 2, column 73.  Encountered: \" \" (32), after : \"\\\"Barnstable\""

$server_2
[1] "[Client error: (400) Bad Request] Lexical error at line 2, column 73.  Encountered: \" \" (32), after : \"\\\"Barnstable\""

“Barnstable County” is one of the possible entries in the COUNTY column.

I know that DataShield must be able to handle whitespace in table values. Does anyone have any ideas about what might have happened to cause this error?

Thank you for the information. I will extend the ds.table tests to include that use case and investigate.

Stuart

Hi,

The first call made by ds.table is to the asFactorDS1 function, which returns the levels of the input variable from the serverside to the clientside. Those levels are then sent back to the serverside in the second call made by ds.table in order to create the table. If the levels include spaces, the function fails because spaces are blocked by the R parser.
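
To see in advance which columns would be affected, one option is to check a local copy of the data (for example the .csv before upload) for factor levels that contain white space. The following is only a minimal base R sketch; the `county` vector is a made-up stand-in for the real COUNTY column, and the values are illustrative.

```r
# Hypothetical local copy of the COUNTY column (illustrative values only).
county <- c("Barnstable County", "Suffolk County", "Suffolk County", "Essex")

# The distinct levels of the variable, i.e. the kind of strings that
# would be passed between serverside and clientside.
lv <- levels(factor(county))

# Flag every level that contains at least one white space character.
has_space <- grepl("[[:space:]]", lv)
problem_levels <- lv[has_space]

print(problem_levels)
```

Any column for which `problem_levels` is non-empty would be expected to hit the same lexical error in ds.table.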

In this case, the only option is to convert the character variable to numeric and then create the table.
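
A rough sketch of that workaround on the client side might look like the following. It assumes that dsBaseClient is installed, that `connections` is the login object from the original post, and that the installed dsBase version of ds.asNumeric turns non-numeric character values into integer codes (this may differ between versions). The helper name `table_via_numeric` and the object name `COUNTY_NUM` are made up for illustration.

```r
# Hypothetical helper: recode a character column to numeric on the
# serverside, then tabulate the numeric codes instead of the labels.
table_via_numeric <- function(connections,
                              column = "Patients$COUNTY",
                              newobj = "COUNTY_NUM") {
  # Create a numeric version of the column on each server
  # (assumption: text values are mapped to integer codes).
  dsBaseClient::ds.asNumeric(x.name = column,
                             newobj = newobj,
                             datasources = connections)

  # Tabulate the numeric variable; its levels contain no spaces.
  dsBaseClient::ds.table(rvar = newobj, datasources = connections)
}

# Usage, once logged in:
# table_via_numeric(connections)
```

Note that the resulting table shows numeric codes rather than county names, and if the codes are assigned separately on each server they may not line up across servers unless every server holds the same set of counties.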

On Wed, 17 Aug 2022, 04:32, Michael Chernicoff via DataSHIELD <notifications@datashield1.discoursemail.com> wrote: