Year: 2013 | Volume: 16 | Issue: 3 | Page: 167-168
A small step in the right direction
Praveen Kerala Varma
Division of Cardiac Surgery, Sree Chitra Tirunal Institute for Medical Sciences and Technology, Trivandrum, India
How to cite this article:
Varma PK. A small step in the right direction. Ann Card Anaesth 2013;16:167-168
How to cite this URL:
Varma PK. A small step in the right direction. Ann Card Anaesth [serial online] 2013 [cited 2022 Jan 25];16:167-168
Available from: https://www.annals.in/text.asp?2013/16/3/167/114236
Risk models are used for two main reasons. Firstly, a risk model allows the calculation of the risk of mortality and morbidity of a surgical procedure. This is important as it guides the clinician and the patient on the advisability of an operation by helping to weigh the risks against the benefits. Secondly, a risk model is a method of quality control. The risk-adjusted mortality rate can be used as a measure of the quality of performance of a hospital, unit, or surgeon. Risk models try to predict the outcome based on preoperative risk factors or variables. There are several such models available; however, the Society of Thoracic Surgeons (STS) risk score and the EuroSCORE are the most widely used models for risk assessment.
The most important tool in any risk assessment model is the formation of a database. Accuracy of data entry and standardization of definitions are of paramount importance in any database. Databases function as a quality control tool by identifying the performance of an individual, unit, or hospital. The effort to characterize performance is not intended to stigmatize programs with worse performance but to illuminate the differences and to use the information as a basis for quality improvement. In addition, the identification of hospitals with better than average performance provides the opportunity to learn what processes may be responsible for their success. The STS database is a voluntary registry that collects data from most cardiac centers in the United States. The data are gathered every 6 months and analyzed at Duke Clinical Research Institute, and the database currently has records of more than 3 million patients. Analysis of the database allows the STS to assess trends in cardiac surgical practice, develop risk models, and devise quality improvement measures. However, it is limited by the fact that it reflects only patients from North America and has a poor global perspective. The EuroSCORE, which was published in 1999 as an additive model and was superseded by a logistic model in 2003, [1] used preoperative variables for calculating mortality and showed good calibration across different population subsets. However, because outcomes of cardiac surgery have improved all over the globe, it began to overestimate expected mortality relative to observed mortality, especially in high-risk groups. Hence, it was recalibrated in 2011 as EuroSCORE II based on 14 variables, using data collected from 154 centers across 42 countries over a 12-week period in 2010. [2] Only four Indian centers participated in providing the data.
Mortality and morbidity are the outcomes that are usually measured in any risk model. Mortality is the most widely used outcome measure as it is the least subjective of the outcome variables. Multivariate analysis is the cornerstone for assessing the outcome. The statistical technique commonly used for multivariate analysis is called regression analysis. Regression analysis builds a model based on dependency of the outcome on a set of predictor variables otherwise called risk factors. This allows risk stratification, i.e., separation of a group into different groups based on their degree of risk. When the outcome variable is a continuous one (say length of hospital stay), linear regression models are used and when it is dichotomous (like mortality), logistic regression models are used.
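As a minimal sketch of how a logistic regression model turns risk factors into a predicted mortality and then into risk strata, consider the Python example below. The intercept, coefficients, variable names, and cut-offs are hypothetical values chosen for illustration; they are not the published EuroSCORE II or STS coefficients.

```python
import math

# Hypothetical intercept and coefficients, for illustration only.
# These are NOT the published EuroSCORE II or STS values.
INTERCEPT = -5.0
COEFFICIENTS = {
    "age_over_70": 0.8,
    "diabetes_on_insulin": 0.4,
    "emergency_surgery": 1.2,
}


def predicted_mortality(risk_factors):
    """Return the predicted probability of death from a logistic model.

    risk_factors maps a factor name to its value (1 = present, 0 = absent).
    """
    linear = INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in risk_factors.items()
    )
    # Logistic function: maps the linear predictor to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


def stratify(probability, cutoffs=(0.02, 0.05)):
    """Assign a risk stratum using assumed (hypothetical) cut-offs."""
    if probability < cutoffs[0]:
        return "low"
    if probability < cutoffs[1]:
        return "intermediate"
    return "high"


baseline = predicted_mortality({})
complex_case = predicted_mortality(
    {"age_over_70": 1, "diabetes_on_insulin": 1, "emergency_surgery": 1}
)
print(stratify(baseline), stratify(complex_case))
```

Each coefficient is the change in the log-odds of death associated with that risk factor, which is why the contributions are summed before the logistic transformation is applied.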
Once a risk model is developed, it needs to be validated. In EuroSCORE II, the risk model was developed from 16,828 patients (development set) and validated on 5553 patients (validation set). The model was tested on the validation data set for calibration (by comparing the observed and predicted mortality, or goodness of fit) and for discrimination [using the area under the receiver operating characteristic (ROC) curve]. Goodness of fit of the final model was tested using the Hosmer-Lemeshow statistic. The Hosmer-Lemeshow (HL) Chi-square statistic measures the differences between the expected and observed outcomes over deciles of risk. If a model is well calibrated, the observed-to-expected (O:E) ratio should be close to 1; departures above or below are indicative of under-prediction and over-prediction, respectively. A well-calibrated model gives an HL test P value > 0.05. Discrimination means how well the model differentiates a population that had an event from the one that did not. ROC curves are typically plotted to evaluate the performance of logistic regression models. ROC curves were initially used in the Second World War by the US Navy in radar-based operations to discriminate enemy ships or planes from friendly ones (hence the name), and since then have found extensive applications in industry and medicine. The curve typically plots sensitivity on the Y axis and 1 − specificity on the X axis for a dichotomous outcome. The C statistic is equal to the area under the curve. Perfect discrimination gives a score of 1, and 0.5 denotes no discrimination (like the toss of a coin for heads or tails). A value upward of 0.75 indicates good discriminatory ability. Application of EuroSCORE II in the validation set showed a C statistic of 0.80, indicating good discrimination. Observed mortality was 3.9% compared to the expected mortality of 4.1%. When the older model was used, the expected mortality was 7.57%, thus showing over-prediction of mortality by EuroSCORE and good calibration (goodness of fit) of the EuroSCORE II model.
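The calibration and discrimination measures described above can be sketched in plain Python as follows. This is an illustrative sketch, not the code used by the EuroSCORE or STS investigators; the function names are assumptions, and the conversion of the HL statistic to a P value (a Chi-square tail probability) is left out because it needs a statistics library.

```python
def oe_ratio(observed_rate, expected_rate):
    """Observed-to-expected mortality ratio; about 1 means good calibration."""
    return observed_rate / expected_rate


def hosmer_lemeshow(outcomes, predicted, groups=10):
    """HL Chi-square statistic over equal-sized groups of increasing risk.

    outcomes: 0/1 values (1 = death); predicted: model probabilities.
    """
    pairs = sorted(zip(predicted, outcomes))
    n = len(pairs)
    chi_square = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected deaths in the group
        observed = sum(y for _, y in chunk)   # observed deaths in the group
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            chi_square += (observed - expected) ** 2 / variance
    return chi_square


def c_statistic(outcomes, predicted):
    """Area under the ROC curve, computed as the share of (death, survivor)
    pairs in which the death had the higher predicted risk; ties count half."""
    deaths = [p for p, y in zip(predicted, outcomes) if y == 1]
    survivors = [p for p, y in zip(predicted, outcomes) if y == 0]
    concordant = 0.0
    for d in deaths:
        for s in survivors:
            if d > s:
                concordant += 1.0
            elif d == s:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))


# O:E ratios from the mortality figures quoted in the text.
print(round(oe_ratio(3.9, 4.1), 2))   # EuroSCORE II: 0.95
print(round(oe_ratio(3.9, 7.57), 2))  # older EuroSCORE: 0.52
```

With the figures quoted in the text, the O:E ratio is close to 1 for EuroSCORE II and about one-half for the older model, which is the over-prediction described above.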
The STS also recognized the importance of nonfatal complications for the assessment of quality. Hence, it identified five important complications, namely stroke, renal failure, deep sternal wound infection, prolonged ventilation (> 24 h), and reoperation within 24 h. Predictive risk models were developed by multivariate logistic regression from a database of half a million patients. However, the discriminatory power (C statistic) of these models was low compared to that of the mortality model, with the reoperation model being the least reliable (C statistic 0.64). This is probably due to the difficulty in defining and tracking the incidence of complications, the overlap of many risk factors, and the fact that many risk factors for these complications (e.g., reoperation for bleeding) are not well established.
The present study of Borde et al. [3] demonstrates that, using the HL test, EuroSCORE II and the STS risk score have good calibration power (P = 0.71 and P = 0.63, respectively), indicating satisfactory model fit. However, the area under the ROC curve was 0.69 and 0.65 for EuroSCORE II and the STS risk score, respectively, indicating poor discriminatory power in the present cohort. This finding is not surprising, as analysis of this study will identify important differences in patient characteristics between the groups. The STS risk model for morbidity has low discriminatory power; hence, it is not surprising to see poor discrimination in this study. The stroke rate in the Indian population is very low in spite of the fact that, in the majority of coronary artery bypass graft (CABG) cases, the proximal anastomoses are done using a side-biting clamp rather than the single cross-clamp technique used in the Western world. This is probably a sign of low atheromatous burden in Indian patients. The number of patients in the high-risk group in this study is very small. This could also be a reason why there was poor discrimination in the high-risk group with EuroSCORE II. After the advent of EuroSCORE II, a study was published showing poor discrimination or over-prediction of mortality in high-risk groups. [4] Therefore, it may mean poor applicability of EuroSCORE II in high-risk groups. Not all risk factors can be accounted for in the construction of any risk model. For example, the biggest risk factor in CABG in the Indian scenario is poor quality of vessels, which is very common in diabetic patients. However, this risk factor is not used in risk assessment in any model.
Risk scoring systems are most applicable when the preoperative patient characteristics and treatment profiles are comparable with those on which the system originated. For this reason, any risk scoring system can be reliable only when it is validated in the local population. Even though the present study includes only a very small number of patients, it has questioned the validity of applying Western scoring systems to Indian patients. This can be substantiated or disproved only with a robust multicentric study with a sufficiently large number of patients or by the creation of a national database. The authors have taken a small step in the right direction by analyzing their patients in a meaningful way.
1. Roques F, Michel P, Goldstone AR, Nashef SA. The logistic EuroSCORE. Eur Heart J 2003;24:881-2.
2. Nashef SA, Roques F, Sharples LD, Nilsson J, Smith C, Goldstone AR, Lockowandt U. EuroSCORE II. Eur J Cardiothorac Surg 2012;41:734-44.
3. Borde D, Gandhe U, Hargave N, Pandey K, Khullar V. The application of European system for cardiac operative risk evaluation II and Society of Thoracic Surgeons risk-score for risk stratification in Indian patients undergoing cardiac surgery. Ann Card Anaesth 2013;16:163-6.
4. Grant SW, Hickey GL, Dimarakis I, Cooper G, Jenkins DP, Uppal R, et al. Performance of the EuroSCORE Models in Emergency Cardiac Surgery. Circ Cardiovasc Qual Outcomes 2013;6:178-85.