Abstract
Healthcare services provided to patients should vary depending on disease severity. However, disease severity bias, a type of selection bias, is a commonly encountered problem in administrative database studies. Herein, we selected chronic obstructive pulmonary disease (COPD), which commonly affects elderly Japanese citizens, for the development and validation of a severity classification system based on a health insurance claims database.
Patients who received COPD-related diagnostic codes in 2011 were selected from a commercially based health insurance claims database. COPD patients were randomly divided into two groups to develop and validate severity scores. A principal component analysis was used to estimate factor loadings used to weight calculations of COPD severity scores. Score validity was evaluated using a linear trend test to predict COPD treatment costs and acute exacerbation events.
Using records from 880 patients, ten variables were created: acute exacerbation events; emphysema diagnoses; laboratory test and oxygen therapy procedures; prescribed anticholinergic, inhaled corticosteroid (ICS), short-acting beta-agonist (SABA), and long-acting beta2-agonist (LABA) agents; asthma diagnoses; and patient birth years. Factor loadings from LABA and ICS prescriptions had the strongest impacts on estimated severity scores (0.50 and 0.49, respectively). Among 300 validation group patients, higher scores were associated with increasing trends in median costs and exacerbation risks.
Estimated severity scores would help to predict COPD-related medical costs and exacerbation events. For further clinical implementation, this classification system should be re-evaluated using clinical lung function information indicative of COPD severity and treatment choices.
Copyright © 2017 Konomura Keiko, et al.
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Chronic obstructive pulmonary disease (COPD) is a progressive disease characterized by chronic dyspnea, cough, and sputum production, and is mainly attributed to long-term exposure to tobacco smoke. COPD is the tenth most common cause of death in Japan, and the number of associated deaths has exhibited an increasing trend.
Health insurance claims databases, which reflect real-world clinical environments, are important research tools for drug safety monitoring, epidemiology, and health economic studies. These databases include information about provided medical services, including disease diagnoses, procedures, and prescribed medications. Under the universal health insurance coverage system in Japan, patients have equal access to all available services, which allows the databases to collect comprehensive information on patients living in Japan.
In the present study, we focused on COPD-related disease scores as a method of evaluating disease-specific costs and health service utilization. Various administrative database-focused COPD severity scores have been reported.
Results
We identified 1,784 patients with COPD diagnostic codes in 2011.
Comorbid conditions were defined by at least one International Classification of Diseases, 10th Revision (ICD-10) diagnosis code together with a record of an Anatomical Therapeutic Chemical (ATC) classification code during the 12-month period from the first COPD diagnosis. The details were as follows: asthma was defined as ICD-10 code J45 with ATC code R03; hypertension as I10 with C02, C03, C07-9, or C011; ischemic heart disease as I20-25 with C01, C07-10, or B01; and diabetes as E10-14 with A10.
PCA results (factor loadings and Cronbach's alpha) are summarized in the table below. SABA: short-acting beta-agonist; LABA: long-acting beta2-agonist.
Severity score = (age × 0.12) + (emphysema × 0.25) + (laboratory tests × 0.31) + (anticholinergics × 0.30) + (SABA × 0.21) + (LABA × 0.50) + (inhaled corticosteroids × 0.49) + (oxygen therapy × 0.23) + (acute exacerbations × 0.28) + (asthma × 0.27).
Each variable has a possible value of 0–12 except for age, which is a continuous variable. Q1–Q4 indicate the score quartiles.
Severity scores were re-calculated in the validation group (n = 300) using the factor loadings from the development step. Scores ranged from 4 to 34, and their distributions were similar between the development and validation groups.
| Variables | Development group, n | (%) | Validation group, n | (%) |
|---|---|---|---|---|
| Total COPD patients | 880 | | 300 | |
| Total COPD claims | 7,066 | | 2,379 | |
| Claims per patient | 8 | | 8 | |
| Age categories (at first COPD diagnosis) | | | | |
| Mean age (SD) | 56 | (9) | 55 | (9) |
| 40-49 years | 247 | (28) | 92 | (31) |
| 50-59 years | 300 | (34) | 114 | (38) |
| 60-69 years | 250 | (28) | 68 | (23) |
| 70 years or older | 83 | (9) | 26 | (9) |
| Sex | | | | |
| Male | 464 | (53) | 165 | (55) |
| Comorbid conditions (ICD-10) | | | | |
| Asthma (J45) | 445 | (51) | 161 | (54) |
| Hypertension (I10) | 179 | (20) | 68 | (23) |
| Ischemic heart disease (I20-25) | 96 | (11) | 27 | (9) |
| Diabetes (E10-E14) | 50 | (6) | 28 | (9) |
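| No. | Variable | Factor loading | Cronbach's alpha |
|---|---|---|---|
| 1 | Age | 0.12 | 0.61 |
| 2 | Emphysema (current) | 0.25 | 0.58 |
| 3 | Laboratory tests (number of claims) | 0.31 | 0.57 |
| 4 | Anticholinergics (number of claims) | 0.30 | 0.56 |
| 5 | SABA | 0.21 | 0.60 |
| 6 | LABA | 0.50 | 0.53 |
| 7 | Inhaled corticosteroids (number of claims) | 0.49 | 0.53 |
| 8 | Oxygen therapy (number of claims) | 0.23 | 0.59 |
| 9 | AECB | 0.28 | 0.58 |
| 10 | Asthma (number of claims) | 0.27 | 0.59 |
| | Total Cronbach's alpha | | 0.60 |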
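
As a minimal sketch of how the reported factor loadings translate into a score, the weighted sum given in the text can be written in Python. The variable names, the assumption that age is entered in years, the treatment of absent variables as 0, and the example patient are illustrative and are not taken from the study data.

```python
# Factor loadings as reported in the text for the ten variables.
LOADINGS = {
    "age": 0.12,
    "emphysema": 0.25,
    "laboratory_tests": 0.31,
    "anticholinergics": 0.30,
    "saba": 0.21,
    "laba": 0.50,
    "inhaled_corticosteroids": 0.49,
    "oxygen_therapy": 0.23,
    "acute_exacerbations": 0.28,
    "asthma": 0.27,
}


def severity_score(age_years, counts):
    """Weighted sum of age and the nine count variables.

    `counts` maps variable names to values in 0..12 (the range stated in
    the text); variables that are absent are treated as 0 (an assumption).
    """
    score = LOADINGS["age"] * age_years
    for name, weight in LOADINGS.items():
        if name == "age":
            continue
        value = counts.get(name, 0)
        if not 0 <= value <= 12:
            raise ValueError(f"{name} must be between 0 and 12, got {value}")
        score += weight * value
    return score


# Hypothetical patient, for illustration only.
example_counts = {"laba": 6, "inhaled_corticosteroids": 6, "acute_exacerbations": 2}
example_score = severity_score(60, example_counts)
print(round(example_score, 2))  # 13.7
```

Because the loadings depend on treatment patterns in the source database, the dictionary above would need to be re-estimated before the function is applied to a different data source.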
| Variables | Q1 | Q2 | Q3 | Q4 |
|---|---|---|---|---|
| Age | 49.3 | 56.8 | 58.7 | 58.7 |
| Emphysema (current) | 0.08 | 0.53 | 1.34 | 2.89 |
| Laboratory tests (number of claims) | 0.82 | 1.44 | 2.66 | 3.17 |
| Anticholinergics (number of claims) | 0.07 | 0.49 | 1.75 | 3.40 |
| SABA (number of claims) | 0.25 | 0.44 | 0.87 | 1.70 |
| LABA | 0.50 | 1.00 | 2.04 | 6.78 |
| Inhaled corticosteroids (number of claims) | 0.33 | 1.11 | 2.41 | 6.61 |
| Oxygen therapy (number of claims) | 0.01 | 0.15 | 0.26 | 0.84 |
| Acute exacerbation (number of episodes) | 0.70 | 1.19 | 2.05 | 3.63 |
| Asthma (number of claims) | 0.31 | 0.60 | 1.18 | 2.88 |
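
A per-quartile summary of this kind could be produced along the following lines. This is a sketch only: the text does not state how the quartile cut points were computed or how ties were handled, so the use of the default method of Python's `statistics.quantiles` and the rule that a score equal to a cut point falls in the lower quartile are assumptions, and the scores and claim counts shown are made up.

```python
from bisect import bisect_left
from statistics import mean, quantiles


def assign_quartiles(scores):
    """Return a quartile label (1..4, i.e. Q1..Q4) for each score."""
    cuts = quantiles(scores, n=4)  # three cut points
    # bisect_left keeps a score equal to a cut point in the lower quartile.
    return [bisect_left(cuts, s) + 1 for s in scores]


def mean_by_quartile(values, labels):
    """Mean of `values` within each quartile label that occurs."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: mean(vals) for label, vals in sorted(groups.items())}


# Hypothetical severity scores and LABA claim counts for eight patients.
toy_scores = [4, 8, 10, 12, 15, 18, 22, 34]
toy_laba_claims = [0, 1, 1, 2, 2, 3, 6, 8]

toy_labels = assign_quartiles(toy_scores)
toy_means = mean_by_quartile(toy_laba_claims, toy_labels)
print(toy_labels)  # [1, 1, 2, 2, 3, 3, 4, 4]
print(toy_means)   # {1: 0.5, 2: 1.5, 3: 2.5, 4: 7}
```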
Discussion
This study developed a COPD severity classification method using a Japanese administrative database and validated its performance. Score validity was confirmed by estimating COPD treatment costs and acute exacerbation risks, with higher scores indicating worse COPD conditions. Accordingly, this severity classification system could be used as a risk adjustment factor to control for potential confounders in administrative database studies.
Few attempts to classify COPD conditions in an administrative database have been published. Notably, Macaulay and colleagues reported that they had classified COPD patients into three severity groups according to spirometry test results and GOLD criteria in a study based on an electronic health records database linked to a health care claims database. Wu and colleagues previously developed a classification method using a claims database in the US.
In our study, we modified the method described by Wu and colleagues by adding an asthma variable and excluding three variables (hospitalization due to acute exacerbation, pulmonologist visit, and use of oral corticosteroid) to increase score reliability, for the following reasons. First, asthma is an important risk factor for COPD, and approximately 50% of our study population had an asthma diagnosis. Second, our database included long-term hospitalized patients who required no aggressive treatments. Third, not all patients received COPD services from pulmonologists; some occasionally received services from doctors in other departments. In addition, when patients received COPD services from large hospitals, codes indicative of the doctors' specialties were often missing. Last, the oral corticosteroid variable was already used to define an acute exacerbation event.
In our study, prescriptions of anticholinergic, LABA, and ICS agents were strongly associated with higher severity scores.
However, this trend contrasted with the findings of Wu et al., who reported the strongest effects for prescriptions of anticholinergic, SABA, and LABA agents. This discrepancy might be attributable to differences in clinical guideline recommendations. The GOLD criteria regarding stable COPD management recommend the use of anticholinergics or SABA for mild conditions and LABA for moderate conditions.
Our method yielded a lower, and insufficient, Cronbach's alpha than that obtained by Wu et al. (0.60 vs. 0.71). Cronbach's alpha assesses reliability among variables. A value of 1 indicates completely consistent variables, whereas a value of 0 indicates no correlation among variables. Values of 0.7-0.8 should be regarded as satisfactory [17]; however, our Cronbach's alpha failed to reach this range. Infrequently observed variables such as oxygen therapy could explain this issue: because such variables had small correlation coefficients, they would lower Cronbach's alpha. However, we did not remove these variables because they are very important indicators of COPD severity.
One advantage of this study was the ability to create severity scores without retrieving respiratory function test data from an administrative database. A multivariable analysis (e.g., logistic regression analysis) is usually used to create a score by setting a dependent variable, such as a health or economic outcome, and calculating the weights of the independent variables accordingly [12]. In the case of COPD severity, respiratory function test results and symptoms are needed to define the dependent variable. Unfortunately, these data are not available in Japanese administrative databases. However, using the PCA method, it was possible to calculate the weight (factor loading) of each independent variable without setting a specific dependent variable. This statistical technique is a way to overcome this weakness of administrative databases.
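
The two quantities discussed above, first-component weights obtained without a dependent variable and Cronbach's alpha, can be illustrated with a short standard-library sketch. It is not the authors' implementation: the use of the correlation matrix, the power-iteration solver, the function names, and the toy data are all assumptions. The squares of the ten reported loadings sum to approximately 1, which is consistent with unit-length first-component coefficients, and that is the convention assumed here.

```python
import math
from statistics import variance


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows` (patients x variables)."""
    n, p = len(rows), len(rows[0])
    cols = [[row[j] for row in rows] for j in range(p)]
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(variance(c)) for c in cols]  # each column must vary
    z = [[(x - m) / s for x in c] for c, m, s in zip(cols, means, sds)]
    return [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(p)] for i in range(p)]


def first_component_loadings(corr, iterations=500):
    """Unit-length leading eigenvector of `corr`, found by power iteration."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # An eigenvector's sign is arbitrary; orient it so the weights sum to >= 0.
    return v if sum(v) >= 0 else [-x for x in v]


def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of row totals)."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical claim counts for six patients and three variables.
toy = [[0, 1, 0], [1, 1, 2], [2, 3, 1], [4, 3, 5], [6, 5, 4], [3, 2, 2]]
loadings = first_component_loadings(correlation_matrix(toy))
alpha = cronbach_alpha(toy)
```

No dependent variable appears anywhere in the sketch; the weights come only from the correlations among the variables themselves, which is the property the text relies on.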
In addition, the data included in the JMDC database were collected through the insurance reimbursement process; therefore, information is rarely missing. Moreover, under the Japanese national health insurance program, all services provided to COPD patients should be almost fully covered. For these reasons, our classification method was developed using all records of COPD treatment provided to Japanese patients.
However, this study also had limitations common to administrative database studies. Approximately 50% of patients in our study had comorbid asthma. It is difficult to clinically distinguish COPD from asthma, and therefore these diagnoses often overlap (asthma-COPD overlap syndrome). Previous research showed that the prevalence of asthma-COPD overlap syndrome was 1.8-56.0%.
PCA often faces problems related to the low reproducibility of factor loadings as the basis of a scoring system. Reproducibility depends on treatment patterns in a database. Therefore, when using different data sources, researchers should re-calculate the factor loadings, as demonstrated in our study.
Furthermore, additional studies that apply our findings to clinical use are needed. We will compare the performance of COPD severity scores and clinical conditions using electronic health records at a large-scale hospital in Japan. The severity scores calculated from factor loadings in this study are relative values and cannot be used to describe the distribution of COPD severity (i.e., proportions of mild vs. more severe conditions). Therefore, we will set the cut-off values according to the GOLD criteria, which classify COPD conditions into four severity categories depending on the values of respiratory function tests. We will evaluate the scores using the c-statistic, positive predictive value, or negative predictive value against the electronic health records database.
These techniques have been used previously to assess model discrimination and validate severity classification methods.
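
For readers unfamiliar with these measures, the sketch below shows one way they could be computed once a reference classification is available. Everything in it is hypothetical: the binary outcome stands in for a reference label that is not yet defined (for example, a GOLD-based category), the cut-off of 12 is arbitrary, and the scores are made up.

```python
def c_statistic(scores, outcomes):
    """Share of (case, non-case) pairs in which the case has the higher score.

    Tied pairs count as 0.5. `outcomes` holds 1 for cases and 0 for non-cases.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("both outcome classes are required")
    concordant = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases for n in non_cases
    )
    return concordant / (len(cases) * len(non_cases))


def ppv_npv(scores, outcomes, cutoff):
    """Positive and negative predictive values when score >= cutoff is 'positive'."""
    tp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 0)
    tn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and y == 0)
    fn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and y == 1)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    npv = tn / (tn + fn) if tn + fn else float("nan")
    return ppv, npv


# Hypothetical scores and reference labels (1 = severe by the reference).
demo_scores = [5, 9, 12, 15, 20, 28]
demo_outcomes = [0, 0, 1, 0, 1, 1]

demo_c = c_statistic(demo_scores, demo_outcomes)               # 8 of 9 pairs
demo_ppv, demo_npv = ppv_npv(demo_scores, demo_outcomes, 12)   # 0.75, 1.0
```

The c-statistic does not depend on a cut-off, whereas the predictive values change with both the chosen cut-off and the prevalence of severe disease in the evaluation sample.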
Conclusion
In this study, a COPD severity classification method based on an administrative database in Japan was developed. This method is able to estimate COPD conditions without requiring laboratory test or clinical symptom data. For clinical implementation, we will confirm the validity of this classification system through comparison with medical information, including laboratory data. This classification method is an important step toward adjusting for potential outcome risk factors in administrative database studies.