Using kernel density estimation to model surgical procedure duration

Author: Gilbert Ritchie, Kevin Taaffe, Bryan Pearce
Date: 01 January 2021
DOI: http://doi.org/10.1111/itor.12561
Published date: 01 January 2021
Intl. Trans. in Op. Res. 28 (2021) 401–418
DOI: 10.1111/itor.12561
International Transactions in Operational Research
Kevin Taaffe^a,*, Bryan Pearce^a and Gilbert Ritchie^b
^a Department of Industrial Engineering, Clemson University, Clemson, SC, USA
^b Integrated Business Systems & Services, Columbia, SC, USA
E-mail: taaffe@clemson.edu [Taaffe]; bpearce@clemson.edu [Pearce]; gritchie@ibss.net [Ritchie]
Received 13 December 2016; received in revised form 13 April 2018; accepted 13 April 2018
Abstract
Estimating the length of surgical cases is an important research topic due to its significant effect on the
accuracy of the surgical schedule and operating room (OR) efficiency. Several factors can be considered
in the estimation, for example, surgeon, surgeon experience, case type, case start time, etc. Some of these
factors are correlated, and this correlation needs to be considered in the prediction model in order to have an
accurate estimation. Extensive research exists that identifies the preferred estimation methods for cases that
occur frequently. However, in practice, there are many procedure types with limited historical data, which
makes it hard to use common statistical methods (such as regression) that rely on a large number of data
points. Moreover, only point estimates are typically provided. In this research, kernel density estimation
(KDE) is implemented as an estimator for the probability distribution of surgery duration, and a comparison
against lognormal and Gaussian mixture models is reported, showing the efficiency of the KDE. In addition,
an improvement procedure for the KDE that further enables the algorithm to outperform other methods is
proposed. Based on the analysis, KDE can be recommended as an alternative estimator of surgical duration
for cases with low volume (or limited historical data).
Keywords: kernel density estimation; surgery duration; Gaussian mixture models; lognormal estimation; bandwidth
optimization
* Corresponding author.
© 2018 The Authors.
International Transactions in Operational Research © 2018 International Federation of Operational Research Societies
Published by John Wiley & Sons Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main St, Malden, MA 02148,
USA.
1. Introduction
The operating rooms (ORs) and associated support services are the primary sources of both revenue
and cost for a hospital (Healthcare Financial Management Association, 2002). Scheduled elective
surgeries comprise the majority of used OR time, and variability in the duration required to execute
these procedures results in loss of OR efficiency. This loss takes the form of either underutilization
or overutilization: idle time after shorter-than-expected surgeries, or overtime work to complete
longer-than-expected surgeries, respectively. Accurate specification of the probability
density function (pdf) for surgical duration is critical to effective surgical appointment scheduling.
It can be equally important in improving the validity of research analysis being conducted with
simulation modeling and other analytical toolsets.
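To make the link between duration variability and OR efficiency concrete, the following minimal sketch computes the average idle time (underutilization) and average overtime (overutilization) that a given scheduled slot length would incur over a set of observed durations. The function name, the slot lengths, and the sample durations are hypothetical illustrations and are not taken from the study data.

```python
from statistics import mean

def expected_idle_and_overtime(durations, scheduled):
    """Return (mean idle, mean overtime) in minutes for one scheduled slot length.

    Idle time accrues when a case finishes before the slot ends;
    overtime accrues when a case runs past the end of the slot.
    """
    idle = mean(max(scheduled - d, 0.0) for d in durations)
    overtime = mean(max(d - scheduled, 0.0) for d in durations)
    return idle, overtime

# Hypothetical durations (minutes) for a single procedure type.
sample = [95, 110, 120, 135, 150, 180, 240]

for slot in (120, 150, 180):
    idle, over = expected_idle_and_overtime(sample, slot)
    print(f"slot={slot} min: mean idle={idle:.1f}, mean overtime={over:.1f}")
```

Lengthening the slot trades overtime for idle time, and how steep that trade is depends on the whole shape of the duration distribution rather than on its mean alone.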
Although the variability in surgical duration has been analyzed statistically for more than 50
years (Rossiter and Reynolds, 1963), the relatively recent emergence of computerized data logging
systems has brought the ability to compile large and detailed historical databases of surgical cases.
Statistical modeling of these databases poses a difficult multivariate problem, as there are several
factors that could potentially play a role in influencing surgical duration, such as the procedure
type, surgeon, anesthesia level, experience of the surgical support staff, and more. Most of these
factors are discrete, of either nominal or ordinal type, with only duration and time of day factors
providing continuous-valued data. Many of these factors exhibit collinearity, indicating a strong
level of interdependency between factors and motivating careful selection of the scheme by which
the data are analyzed.
Despite rigorous research into statistical prediction and data mining models to identify factor
significance and pdf estimation methods, a large portion of the overall variability in surgical duration
remains attributed to noise or other factors that are not traditionally logged into medical data sets.
Literature within the field of data-driven surgery duration estimation focuses almost exclusively on
bringing parametric modeling methods to bear on the problem.
Another issue in predicting surgery length is the sheer number of distinct surgical case types
in practice. A typical approach in the literature is to pool all case types together
and build a single prediction model for them. In practice, however, one can argue that not all
surgical types are affected by the same set of factors, and some studies do try to
identify the factors that best predict duration for individual case types. A key
issue remains: while some routine surgical cases are performed frequently, many more
case types are not scheduled with regularity. As an example, of 1977 CPT (current
procedural terminology) codes scheduled over the course of three years, approximately 700 CPT
codes were scheduled only once (Bozorgi et al., 2015). For higher volume CPT codes, Hosseini et al.
(2015) demonstrate via case study a method for estimating surgical duration using data mining
and predictive modeling. We note that employing common statistical methods to predict surgery
duration for “low-frequency” CPT codes (roughly 35% of all codes in the above example) is very
difficult, due to the lack of historical data for model creation. In the literature, Dexter and Ledolter
(2005) use Bayesian methods to handle such low-volume surgical cases. In this paper, we
propose to use kernel density estimation (KDE) to estimate a distribution for surgical case length,
and we show the value of this approach against current methods from the literature, particularly
for low-volume case types. Moreover, most research has focused on identifying a point estimate for
comparison, while in this manuscript we offer a method for fitting a distribution to the data.
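The difference between a point estimate and a fitted distribution can be illustrated with a minimal sketch of a Gaussian KDE applied to a small sample, alongside a lognormal fit for comparison. Several choices here are assumptions made for illustration only: the Gaussian kernel, the Silverman rule-of-thumb bandwidth (this is not the bandwidth optimization proposed in this paper), the helper names, and the six hypothetical durations.

```python
import math
from statistics import NormalDist, mean, quantiles, stdev

def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth (an assumed choice, not the paper's optimized one)."""
    n = len(data)
    q1, _, q3 = quantiles(data, n=4)
    spread = min(stdev(data), (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-0.2)

def kde_pdf(x, data, h):
    """Gaussian KDE density at x: average of normal kernels centred on each observation."""
    return sum(NormalDist(d, h).pdf(x) for d in data) / len(data)

def kde_cdf(x, data, h):
    """Gaussian KDE cumulative probability at x: average of the kernel CDFs."""
    return sum(NormalDist(d, h).cdf(x) for d in data) / len(data)

def lognormal_cdf(x, data):
    """CDF of a lognormal fitted from the mean and sample std of log-durations (x > 0)."""
    logs = [math.log(d) for d in data]
    return NormalDist(mean(logs), stdev(logs)).cdf(math.log(x))

# Hypothetical durations (minutes) for a low-volume procedure code.
durations = [62, 75, 80, 91, 104, 130]
h = silverman_bandwidth(durations)
slot = 120  # hypothetical scheduled slot length in minutes

print(f"point estimate (mean): {mean(durations):.1f} min")
print(f"KDE bandwidth: {h:.1f} min")
print(f"P(duration > {slot}) under KDE:       {1 - kde_cdf(slot, durations, h):.3f}")
print(f"P(duration > {slot}) under lognormal: {1 - lognormal_cdf(slot, durations):.3f}")
```

A point estimate alone cannot answer a question such as "how likely is this case to overrun its slot?", whereas either fitted distribution can. Note that a Gaussian kernel places a small amount of probability mass on negative durations, which a practical implementation would need to address.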
KDE has other characteristics that make it suitable for estimating the
distribution of surgical time. Historically, KDE has been used to fit continuous data that resist
parametric analysis, and recent extensions allow discrete-valued data to be incorporated (Li and
Racine, 2007). Nonparametric methods assume no a priori distributional form for the data, leaving
a reduced set of assumptions with only regularity conditions and independent and identically
distributed sampling requirements to be satisfied. The tradeoff implied by this increased flexibility