The Spanish National Healthcare System (NHS) is mostly publicly funded and provided. It is considered highly cost-efficient according to international studies based on World Health Organization (WHO) data. However, containing the growth of healthcare costs while maintaining adequate quality of care remains a largely unsolved problem. In recent years, Emergency Departments (EDs) of specialized care hospitals have faced budget restrictions, a growing number of visits and increasing clinical complexity of those visits. These circumstances require new approaches to ED management, which could benefit from decision support tools.
In this Ph.D. thesis, we propose machine learning solutions for two problems common to most EDs of specialized care hospitals: ED census forecasting and real-time prediction of the probability of inpatient admission for all triaged patients in the ED. These solutions could be used as decision support systems. Data for the development of these solutions were provided by the Ramón y Cajal University Hospital of Madrid, a large specialized care referral center with all medical specialties except Obstetrics. In 2011 and 2012 it had approximately 1,100 beds and approximately 553,000 patients assigned to its clinical area. A further topic of this Ph.D. thesis is software for the generation of logistic regression and Cox regression nomograms, since nomograms can be used as clinical decision aids and as contingency procedures in case of failure of computer-based decision support systems.
The first topic of this Ph.D. thesis is the development of models for ED census forecasting (i.e. prediction of the number of patients present in the ED at a given time). One use of ED census forecasting is nursing personnel allocation, based on national and international recommendations. We chose an 8-hour granularity for our forecasts since many resources in the ED (such as nursing personnel) are organized in 8-hour shifts. Our aim was to generate forecasts for two dependent variables: average ED census levels and maximum ED census levels. Maximum ED census forecasts within 8-hour shifts could be used for nursing personnel allocation, while average ED census forecasts within 8-hour shifts could be used for other needs (such as allocation of administrative personnel).
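The two dependent variables described above can be illustrated with a minimal sketch that aggregates hourly census readings into per-shift average and maximum values. The shift boundaries (00:00, 08:00 and 16:00), the hourly sampling, the function names and the sample readings are all assumptions made for illustration; they are not taken from the thesis or from hospital data.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SHIFT_HOURS = 8  # assumption: shifts start at 00:00, 08:00 and 16:00


def shift_key(ts):
    """Return (date, shift index 0..2) for a timestamp."""
    return ts.date(), ts.hour // SHIFT_HOURS


def census_by_shift(hourly_census):
    """Aggregate (timestamp, census) pairs into per-shift average and maximum."""
    buckets = defaultdict(list)
    for ts, count in hourly_census:
        buckets[shift_key(ts)].append(count)
    return {
        key: {"average": sum(values) / len(values), "maximum": max(values)}
        for key, values in sorted(buckets.items())
    }


# Hypothetical hourly census readings for one day (not hospital data).
start = datetime(2012, 1, 2)
readings = [(start + timedelta(hours=h), 20 + (h % 8) * 3) for h in range(24)]
summary = census_by_shift(readings)
for (day, shift), stats in summary.items():
    print(day, shift, stats)
```

Each resulting row would be one observation of the time series to forecast: the maximum for nursing allocation, the average for other resources.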
We used a generalized regression approach to time series forecasting with several machine learning algorithms: M5P, Alternating Model Trees (AMT) and Support Vector Regression (SVR). We compared these to a series of benchmarks: usual nursing staffing levels (and usual resource allocation policies), stratified average (averages stratified by the three 8-hour shifts of a day), linear regression and Seasonal Autoregressive Integrated Moving Average (SARIMA) models. Forecasts were generated for both dependent variables: average ED census levels and maximum ED census levels. Four forecast horizons were tested: 1 week, 2 weeks, 4 weeks and 8 weeks. Underestimation risks, overestimation risks and approximations to the monetary costs of resource allocation policies were defined for both average and maximum ED census forecasts. Maximum ED census forecasts were transformed into nursing personnel levels, and underestimation and overestimation risks for maximum ED census forecasts were transformed into understaffing and overstaffing risks. A custom training and evaluation scheme was used, with increasingly larger training sets and fixed-length test sets. The same scheme was also run with fixed-length training sets of 1 year and fixed-length test sets; this variant did not improve the results.
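The two evaluation schemes mentioned above (growing training sets versus fixed-length training sets, both with fixed-length test sets) can be sketched as index generators. This is only an illustration under stated assumptions: the function names are hypothetical, the window advances by exactly one test length per fold, and the series sizes are invented. The thesis's custom scheme may differ in these details.

```python
def expanding_window_splits(n_obs, initial_train, test_len):
    """Yield (train, test) index ranges with a growing train set and a fixed-length test set."""
    train_end = initial_train
    while train_end + test_len <= n_obs:
        yield range(0, train_end), range(train_end, train_end + test_len)
        train_end += test_len  # assumption: advance by one test window per fold


def sliding_window_splits(n_obs, train_len, test_len):
    """Yield (train, test) index ranges with fixed-length train and test sets."""
    train_end = train_len
    while train_end + test_len <= n_obs:
        yield range(train_end - train_len, train_end), range(train_end, train_end + test_len)
        train_end += test_len


# Hypothetical series: 10 weeks of 8-hour shifts (3 per day), 1-week test horizon.
SHIFTS_PER_WEEK = 3 * 7
n_obs = 10 * SHIFTS_PER_WEEK
expanding = list(expanding_window_splits(n_obs, 6 * SHIFTS_PER_WEEK, SHIFTS_PER_WEEK))
sliding = list(sliding_window_splits(n_obs, 6 * SHIFTS_PER_WEEK, SHIFTS_PER_WEEK))
for train, test in expanding:
    print(len(train), "train shifts ->", len(test), "test shifts")
```

In both generators every test window lies strictly after its training window, so no future observation leaks into training.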
In the case of maximum ED census forecasts, M5P was the best choice for the reduction of major and medium nursing personnel understaffing risks, for all forecast horizons. Compared to the usual staffing levels, personnel planning with M5P could reduce major understaffing (>9 nurses) risks more than 10-fold (a reduction to ~1% with M5P compared to ~13% obtained with the usual nurse staffing levels), and could reduce medium understaffing (7-9 nurses) risks approximately 3-fold (a reduction to ~3% with M5P compared to ~10% obtained with the usual staffing levels). The use of M5P also implied increases of 5% to 6.1% in estimated nursing personnel costs (compared to the usual staffing levels), which are acceptable given the large reductions in understaffing risks.
In the case of average ED census forecasts, M5P was again the best choice for the reduction of major and medium underestimation risks, for all forecast horizons. Relative risk reductions were similar to those of maximum ED census forecasts (more than 10-fold reduction in major underestimation and approximately 3-fold reduction in medium underestimation, compared to usual resource allocation policies). However, in this case, the usual resource allocation policies already had low risks of major and medium underestimation (~2% risk of major underestimation and ~3.7% risk of medium underestimation). Most importantly, for average ED census forecasts M5P led to a cost reduction of more than 15% compared to the usual resource allocation policies.
The second topic of this Ph.D. thesis is the development of models for real-time prediction of the probability of inpatient admission from the ED. Our aim in this case was the development of classifiers with adequate performance in terms of both discrimination and calibration (goodness-of-fit), relying on a small number of variables that are available in most ED settings right after triage. In our setting, the Manchester Triage System (MTS) was used. Discrimination was evaluated with the area under the ROC curve (AUROC). Calibration was evaluated with the Hosmer-Lemeshow (H-L) χ² statistic and its p-values, using 10 fixed probability intervals. We used logistic regression (LR) models, artificial neural network (ANN) models and models based on an ad hoc ensemble classifier that optimized calibration (it combined an LR model with a tree of MTS chief complaints with LogitBoost on its leaves). A custom method was used for the evaluation of models, with increasingly larger training sets and 12 consecutive test sets of approximately monthly length. This evaluation method produced the results that follow, reported with 95% confidence intervals (CIs). For LR models, average AUROC = 0.8531, 95% CI (0.8501, 0.8561); for ANN models, average AUROC = 0.8568, 95% CI (0.8531, 0.8606); and for ad hoc ensemble classifier models, average AUROC = 0.8635, 95% CI (0.8605, 0.8665). Confidence intervals of average AUROCs for LR and ad hoc ensemble classifier models did not overlap. Confidence intervals of average AUROCs for LR and ANN models overlapped slightly, although ANN models had higher AUROCs than LR models in all but one of the 12 test sets. Average H-L χ² values were, respectively, 35.15, 95% CI (32.57, 37.73) for LR models; 10.47, 95% CI (7.78, 13.17) for ANN models; and 11.4, 95% CI (9.10, 13.75) for ad hoc ensemble classifier models. Both ANN and ad hoc ensemble classifier models were better calibrated than LR models, with H-L p-values > 0.05 in 10 of the 12 experiments.
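The two evaluation measures named above can be sketched in plain Python. The sketch assumes that "10 fixed probability intervals" means ten equal-width bins over [0, 1] and that empty bins are skipped; both are assumptions, as are the function names and the toy labels and probabilities, which are not results from the thesis. Only the χ² statistic is computed, since the p-value depends on a degrees-of-freedom convention that the text does not state.

```python
def auroc(y_true, y_prob):
    """AUROC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow_chi2(y_true, y_prob, n_bins=10):
    """H-L chi-square over fixed, equal-width probability intervals (assumed binning)."""
    observed = [0.0] * n_bins  # observed admissions per bin
    expected = [0.0] * n_bins  # sum of predicted probabilities per bin
    counts = [0] * n_bins
    for y, p in zip(y_true, y_prob):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        observed[b] += y
        expected[b] += p
        counts[b] += 1
    chi2 = 0.0
    for o, e, n in zip(observed, expected, counts):
        if n == 0 or e == 0 or e == n:
            continue  # skip empty or degenerate bins
        chi2 += (o - e) ** 2 / (e * (1 - e / n))
    return chi2


# Toy example (hypothetical labels and predicted admission probabilities).
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.05, 0.15, 0.25, 0.35, 0.55, 0.65, 0.85, 0.95]
print("AUROC:", auroc(y_true, y_prob))
print("H-L chi2:", hosmer_lemeshow_chi2(y_true, y_prob))
```

A lower H-L χ² indicates closer agreement between predicted and observed admissions within each interval, which is why the ANN and ensemble values above are read as better calibration than the LR value.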
The third topic of this Ph.D. thesis is the development and evaluation of software for the generation of logistic and Cox regression nomograms. We developed two programs (nomolog and nomocox) for these purposes, based on Stata (a statistical software package widely used in biomedical research). At the time of writing, these programs are used by an international community of researchers in the fields of clinical medicine, epidemiology and biostatistics. We surveyed some of these users about their background, their user experience with nomolog and nomocox, and the ease of use and flexibility of our programs compared to those available in R (another well-known statistical software package). Most respondents were “Promoters” (score > 8 on the Net Promoter Score question), i.e. very likely to recommend the software to other researchers. All respondents (100%) who had used both our programs (nomolog and nomocox) and the nomogram generators available for R found nomocox and nomolog easier to use, with a 95% adjusted Wald CI (75.83%, 100%). A raw proportion of 81.25%, with a 95% adjusted Wald CI (54.03%, 96.36%), found our programs to be more flexible than the nomogram generators available for R.
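For readers unfamiliar with adjusted Wald intervals, the following is a minimal sketch of one common form, the Agresti-Coull interval. The abstract does not say which variant was used or how many respondents answered each question, so the function name and the counts below are hypothetical and the output is not expected to reproduce the intervals reported above.

```python
import math


def adjusted_wald_ci(successes, n, z=1.959963984540054):
    """Agresti-Coull interval: add z^2/2 successes and z^2/2 failures, then apply the Wald formula."""
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clip to [0, 1] so the interval remains a valid range of proportions.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


# Hypothetical counts, not survey data: 18 of 20 respondents, then 20 of 20.
low, high = adjusted_wald_ci(18, 20)
print(f"18/20: ({low:.4f}, {high:.4f})")
low_all, high_all = adjusted_wald_ci(20, 20)
print(f"20/20: ({low_all:.4f}, {high_all:.4f})")
```

The adjustment matters for small samples and proportions near 0% or 100%, where the plain Wald interval collapses to zero width; this is the situation of the 100% result above, whose upper bound is reported as 100%.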