Niranjani Prasad will present her FPO "Methods for Reinforcement Learning in Clinical Decision Support" on June 11, 2020 at 3:30pm via Zoom.

Zoom link: 
https://princeton.zoom.us/j/2211458833

The members of her committee are as follows: Examiners: Barbara Engelhardt (adviser), Sebastian Seung, and Mengdi Wang; Readers: Ryan Adams and Finale Doshi-Velez (Harvard University).

A copy of her thesis is available upon request. Please email ngotsis@cs.princeton if you would like a copy.

Everyone is invited to attend her talk. Abstract follows below.

The administration of routine interventions, from breathing support to pain management, constitutes a major part of inpatient care. Thoughtful treatment is crucial to improving patient outcomes and minimizing costs, but these interventions are often poorly understood, and clinical opinion on best protocols can vary significantly. Through a series of case studies of key critical care interventions, this thesis develops a framework for clinician-in-loop decision support. The first of these explores the weaning of patients from mechanical ventilation: patient admissions are modelled as Markov decision processes, and model-free batch reinforcement learning algorithms are employed to learn personalized regimes of sedation and ventilator support that show promise in improving outcomes when assessed against current clinical practice.
The second part of this thesis is directed towards effective reward design when formulating clinical decisions as a reinforcement learning task. In tackling the problem of redundant testing in critical care, methods for Pareto-optimal reinforcement learning are integrated with known procedural constraints to consolidate multiple, often conflicting, clinical goals and produce a flexible optimized ordering policy. The challenges here are probed further to examine how decisions by care providers, as observed in available data, can be used to restrict the possible convex combinations of objectives in the reward function to those that yield policies reflecting what we implicitly know from the data about reasonable behaviour for a task, and that allow for high-confidence off-policy evaluation. The proposed approach to reward design is demonstrated through synthetic domains as well as in planning in critical care.
The final case study considers the task of electrolyte repletion, describing how this task can be optimized using the MDP framework and analysing current clinical behaviour through the lens of reinforcement learning, before going on to outline the steps necessary to enable the adoption of these tools in current healthcare systems.