Question about scalability (and possibly polling/long running jobs)

I guess this is a question for @yannick, but it might also benefit from input from @sidohaakma, @gfcg and @swheater.

It has been good to see that recent updates to Opal and the introduction of Rock have made it possible to scale up the computational resources available to users. The motivation for this is that some consortia now have many users accessing nodes at the same time. Previously this would result in R becoming overwhelmed and the server crashing. The idea of scalability is to prevent this.

My first question is, are any groups currently using this facility to make sure their computation is not overwhelmed?

The second part of this comes from a query within our consortium. They appreciate that adding computing power is one solution. However, they are not keen to go down this route, as they are worried about having to add more and more computing power (and cost) to deal with spikes in use. I was therefore wondering about a complementary solution which may already be available: delaying work when resource usage is high and there is a risk of overload, and running it once the previous work is complete. Is this the intention of the recently introduced polling options in DSI? Are there some examples of the polling in action? And will it help with the scenario I suggest?

I guess what I am proposing tends towards a more comprehensive queue management system, where jobs can be prioritised and scheduled in an optimised way.

I know we have covered scalability before, but I think I need to revisit it to address the query I have received.

I should have added that a strong motivation for a queueing approach, rather than adding hardware, is the developing countries we work with, which do not have the budget for it. Of course this still applies somewhat to richer organisations!

Hi,

There are two ways of submitting an operation (aggregate or assign) in DSI, using the async parameter. See the datashield.aggregate and datashield.assign async parameter documentation.

  • When async is false, the client waits for the operation to complete in a blocking way.
  • When async is true, the client immediately returns with a submitted R command ID, for later retrieval of the result. This allows the client to submit R commands to each DS server and have them working in parallel.

The polling settings describe how frequently the client will check for the submitted R command result. This does not affect how the server manages its R commands queue.
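For illustration only, the submit-then-poll pattern described above can be sketched as follows. The real API is R/DSI; everything here (the ToyServer class, the poll helper, the sleep parameter standing in for the polling settings) is invented for the sketch and is not DSI's actual interface:

```python
import itertools
import threading
import time

class ToyServer:
    """Toy stand-in for a DataSHIELD R server: runs commands on a
    background thread and stores results under a command ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._results = {}
        self._lock = threading.Lock()

    def submit(self, func):
        """Async submission: returns a command ID immediately."""
        cmd_id = next(self._ids)

        def run():
            value = func()
            with self._lock:
                self._results[cmd_id] = value

        threading.Thread(target=run).start()
        return cmd_id

    def result(self, cmd_id):
        """Non-blocking check: None while the command is still running."""
        with self._lock:
            return self._results.get(cmd_id)

def poll(server, cmd_id, sleep=0.05):
    """Client-side polling loop; `sleep` plays the role of the
    polling interval settings."""
    while True:
        value = server.result(cmd_id)
        if value is not None:
            return value
        time.sleep(sleep)

server = ToyServer()
# Submit two "R commands" so they run in parallel, then poll for results.
ids = [server.submit(lambda n=n: n * n) for n in (3, 4)]
print([poll(server, i) for i in ids])  # -> [9, 16]
```

The point of the sketch is that a shorter polling interval only makes the client notice completion sooner; the server-side queue is untouched.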

What you are suggesting is that Opal (which manages the R server(s)) could have an improved strategy for managing the R commands queue. For now each R command received is pushed to the R server session, regardless of the global R server status. This could definitely be improved, as the Rock R server reports its usage status (free memory, CPUs). Then when the R server has reached a critical level (like 80% or 90% of the memory used), pushing new R commands could be delayed.
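A minimal sketch of this delaying strategy, assuming some status report with a fractional memory-used figure is available (the GatedDispatcher class, the threshold value, and the memory_used field are all invented for the sketch, not Opal's or Rock's real schema):

```python
from collections import deque

MEMORY_THRESHOLD = 0.8  # e.g. delay new work above 80% memory use

class GatedDispatcher:
    """Sketch: hold incoming commands while the R server reports
    high memory use, flush them once usage drops again."""

    def __init__(self, threshold=MEMORY_THRESHOLD):
        self.threshold = threshold
        self.pending = deque()
        self.dispatched = []

    def receive(self, command, memory_used):
        """memory_used: fraction of memory in use, as an assumed
        Rock-like status report might expose it."""
        self.pending.append(command)
        self.flush(memory_used)

    def flush(self, memory_used):
        # Push queued commands only while the server is below threshold.
        while self.pending and memory_used < self.threshold:
            self.dispatched.append(self.pending.popleft())

d = GatedDispatcher()
d.receive("cmd1", memory_used=0.5)   # below threshold: dispatched
d.receive("cmd2", memory_used=0.92)  # above threshold: held back
d.flush(memory_used=0.4)             # usage dropped: queue drains
print(d.dispatched)  # -> ['cmd1', 'cmd2']
```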

Regards
Yannick

Hi,

I wonder if Opal could be changed to enforce an optional policy of a single session per user or user group.

Stuart

Sometimes datasets from different institutions are hosted on the same Opal server, so having several R sessions per user would be required. But there could be a kind of user “quota”, that is true.

We can imagine many strategies for handling R sessions and R commands to be executed in these R sessions, whether there is one or several R servers available… The priority for now is to protect the R server from being overwhelmed.

Yannick

Hi Yannick,

Yes, it sounds like the polling will be complementary to any proposed queueing mechanism.

It sounds promising that the Rock R server can report its status. I am slightly nervous that this is (a) a fairly major change, throwing up lots of questions about how the queue position/ETA should be communicated back to the user, how greedy users are managed, etc., and (b) a common problem wherever many users access an R server; for example, does RStudio Server already have a way of doing this? My experience of using shared systems with limited resources is on our HPC, which uses SLURM to queue jobs. SLURM seems to have all kinds of optimisation and management features built in.

I think it might be worth investigating what is already out there to help (perhaps this?), and at least coming up with a proposal for how this could work.

Thanks

Tom