Journal of | | Open Access Pub

Abstract

Background:

Healthcare services provided to patients should vary depending on disease severity. However, disease severity bias, a type of selection bias, is a commonly encountered problem in administrative database studies. Herein, we selected chronic obstructive pulmonary disease (COPD), which commonly affects elderly Japanese citizens, for the development and validation of a severity classification system based on a health insurance claims database.

Methods:

Patients who received COPD-related diagnostic codes in 2011 were selected from a commercial health insurance claims database. COPD patients were randomly divided into two groups to develop and validate severity scores. Principal component analysis (PCA) was used to estimate factor loadings, which served as weights in calculating COPD severity scores. Score validity was evaluated using a linear trend test to predict COPD treatment costs and acute exacerbation events.

Results:

Using records from 880 patients, ten variables were created: acute exacerbation events; emphysema diagnoses; laboratory test and oxygen therapy procedures; prescriptions of anticholinergic, inhaled corticosteroid (ICS), short-acting beta-agonist (SABA), and long-acting beta-agonist (LABA) agents; asthma diagnoses; and patient birth year. Factor loadings for LABA and ICS prescriptions had the strongest impact on the estimated severity scores (0.50 and 0.49, respectively). Among 300 validation group patients, higher scores were associated with increasing trends in median costs and exacerbation risk (p for trend < 0.05).

Conclusions:

Estimated severity scores would help to predict COPD-related medical costs and exacerbation events. For further clinical implementation, this classification system should be re-evaluated using clinical lung function information indicative of COPD severity and treatment choices.

Introduction

Chronic obstructive pulmonary disease (COPD) is a progressive disease characterized by chronic dyspnea, cough, and sputum production, and is mainly attributed to long-term exposure to tobacco smoke. COPD is the tenth-most common cause of death in Japan, and the number of associated deaths has exhibited an increasing trend 1. A previous study estimated that 5.3 million individuals aged ≥40 years were at risk of COPD in 2001 (estimated prevalence rate: 8.6%) 2. In addition, statistical surveys reported that patients with COPD accounted for expenditures totaling 151 billion yen (approximately 0.4% of total Japanese medical expenditures) in 2011 3.

Health insurance claims databases, which reflect real-world clinical environments, are important research tools for drug safety monitoring, epidemiology, and health economic studies. These databases include information about provided medical services, including disease diagnoses, procedures, and prescribed medications. Under the universal health insurance coverage system in Japan, patients have equal access to all available services, which allows such databases to collect comprehensive information on patients living in Japan 4. However, important clinical information, such as clinical test results and disease severity, is not included in these databases. Appropriate treatment is provided according to a patient’s medical needs, which are determined by the disease condition and/or severity 5. In the absence of such information, treatment effects estimated in database studies are often biased because of confounding by indication 6. Thus, when using health insurance claims, summary variables indicative of disease conditions or severity must be created from diagnostic codes and/or prescribed medications. For example, the Charlson comorbidity index was developed to predict mortality 7, and the Elixhauser comorbidity measure was developed to predict health-related outcomes 8. COPD severity is generally assessed according to the results of respiratory function tests, using the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria 9.

In the present study, we focused on COPD-related disease scores as a method of evaluating disease-specific costs and health service utilization. Various COPD severity scores designed for administrative databases have been reported 10, 11, 12. From among these, we selected a scoring system developed by Wu and colleagues that corresponded to our research purposes 10. This score, which was developed in the United States (US) for patients with COPD who experience acute exacerbation events, was used to estimate drug utilization and medical costs related to COPD severity without relying on respiratory function test data 13. However, the patients evaluated in that study might have had more severe disease than the general population of Japanese patients with COPD. Moreover, drug selections for COPD management differ between the US and Japanese clinical environments: for example, the transdermal tulobuterol patch, a long-acting β2-agonist (LABA), has been frequently used in Japan for the long-term management of stable patients with COPD 14. Therefore, we decided to adapt the severity scoring system developed by Wu and colleagues to the Japanese clinical environment and to validate this modification using a Japanese administrative database.

Results

Patient Characteristics

We identified 1,784 patients with COPD diagnostic codes in 2011 (Figure 1). The following patients were excluded: 186 who did not receive another COPD diagnostic code during the 12-month follow-up, 224 without adequate continuous follow-up data, 23 who were younger than 40 years, 156 with no evidence of COPD-related laboratory tests or prescriptions, and 15 who underwent surgery related to cancer diagnosis. The remaining 1,180 patients with COPD were included in the study and assigned randomly to a developing group (n = 880) or validating group (n = 300). Demographic characteristics of patients in each group are described and compared in Table 1. The distributions of age categories, sex, and comorbid conditions were similar between the groups. Patients in both groups received an average of approximately eight COPD claims.
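The cohort arithmetic above can be checked directly, and a random assignment of this kind can be sketched in a few lines of Python. This is only an illustrative sketch: the `split_cohort` helper, the placeholder patient identifiers, and the seed are hypothetical, and the text does not describe the randomization procedure that was actually used.

```python
import random

# Counts reported in the text for the 2011 cohort.
IDENTIFIED = 1784
EXCLUSIONS = {
    "no second COPD code within 12 months": 186,
    "inadequate continuous follow-up": 224,
    "younger than 40 years": 23,
    "no COPD-related tests or prescriptions": 156,
    "cancer-related surgery": 15,
}


def remaining_patients(identified, exclusions):
    """Subtract every exclusion count from the identified total."""
    return identified - sum(exclusions.values())


def split_cohort(patient_ids, n_development, seed=0):
    """Shuffle a copy of the IDs and cut it into two disjoint groups.

    Hypothetical helper: a simple random split without stratification.
    """
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_development], shuffled[n_development:]


included = remaining_patients(IDENTIFIED, EXCLUSIONS)
# Placeholder IDs (0, 1, 2, ...) standing in for real patient identifiers.
development, validation = split_cohort(range(included), n_development=880)
print(included, len(development), len(validation))  # prints: 1180 880 300
```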

Figure 1. Selection criteria for the study population of patients with chronic obstructive pulmonary disease (COPD).

Table 1. Demographic characteristics of the developing and validating groups

Variables	Development group		Validation group	
	n	(%)	n	(%)
Total COPD patients	880		300	
Total COPD claims	7,066		2,379	
Claims per patient	8		8	
Age categories (at first COPD diagnosis)
Mean age (SD)	56	(9)	55	(9)
40-49 years	247	(28)	92	(31)
50-59 years	300	(34)	114	(38)
60-69 years	250	(28)	68	(23)
70 years or older	83	(9)	26	(9)
Sex
Male	464	(53)	165	(55)
Comorbid conditions (ICD-10) *
Asthma (J45)	445	(51)	161	(54)
Hypertension (I10)	179	(20)	68	(23)
Ischemic heart disease (I20-25)	96	(11)	27	(9)
Diabetes (E10-E14)	50	(6)	28	(9)

* Comorbid conditions were defined as at least one International Classification of Diseases, 10th Revision (ICD-10) diagnosis code together with a record of an Anatomical Therapeutic Chemical (ATC) classification system code during the 12-month period from the first COPD diagnosis. The details were as follows: asthma was defined as ICD-10 code J45 with ATC code R03; hypertension was I10 with C02, C03, C07-9, or C011; ischemic heart disease was I20-25 with C01, C07-10, or B01; and diabetes was E10-14 with A10.

Developing the Severity Score

PCA results are summarized in Table 2. Among the ten variables, the factor loadings of LABA and ICS were the largest and thus contributed most to the COPD severity score (0.50 and 0.49, respectively). Cronbach’s alpha was 0.60. Patients were categorized into quartiles according to their scores (Q1 and Q4 contained the lowest and highest scores, respectively). The mean value of each variable was calculated for each quartile (Table 3); the means generally increased from the lowest to the highest severity score quartile.

Table 2. Factor loading estimated from a principal component analysis.

No.	Variables	Factor loading	Cronbach's alpha removing each variable
1	Age	0.12	0.61
2	Emphysema (current)	0.25	0.58
3	Laboratory tests (number of claims)	0.31	0.57
4	Anti-cholinergics (number of claims)	0.30	0.56
5	SABA* (number of claims)	0.21	0.60
6	LABA^a (number of claims)	0.50	0.53
7	Inhaled corticosteroids (number of claims)	0.49	0.53
8	Oxygen therapy (number of claims)	0.23	0.59
9	AECB* (number of episodes)	0.28	0.58
10	Asthma (number of claims)	0.27	0.59
	Total Cronbach's alpha		0.60

*SABA, short-acting beta-agonist; AECB, acute exacerbation of chronic bronchitis

^aLABA, long-acting beta2-agonist

Severity score = (age × 0.12) + (emphysema × 0.25) + (laboratory tests × 0.31) + (anticholinergics × 0.30) + (SABA × 0.21) + (LABA × 0.50) + (inhaled corticosteroids × 0.49) + (oxygen therapy × 0.23) + (acute exacerbations × 0.28) + (asthma × 0.27).

Each variable has a possible value of 0–12 except for age, which is a continuous variable.
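The formula above is a weighted sum, which the following minimal Python sketch makes explicit. The loadings are copied from Table 2; the dictionary keys and the `severity_score` function name are hypothetical labels chosen for this illustration, and the example input uses the Q1 mean values reported in Table 3 rather than any individual patient record.

```python
# Factor loadings copied from Table 2 (key names are illustrative).
LOADINGS = {
    "age": 0.12,
    "emphysema": 0.25,
    "laboratory_tests": 0.31,
    "anticholinergics": 0.30,
    "saba": 0.21,
    "laba": 0.50,
    "inhaled_corticosteroids": 0.49,
    "oxygen_therapy": 0.23,
    "acute_exacerbations": 0.28,
    "asthma": 0.27,
}


def severity_score(patient):
    """Return the weighted sum of the ten variables.

    A missing variable raises KeyError instead of silently counting as 0.
    """
    return sum(weight * patient[name] for name, weight in LOADINGS.items())


# Mean values of the lowest quartile (Q1) from Table 3, used as sample input.
q1_means = {
    "age": 49.3,
    "emphysema": 0.08,
    "laboratory_tests": 0.82,
    "anticholinergics": 0.07,
    "saba": 0.25,
    "laba": 0.50,
    "inhaled_corticosteroids": 0.33,
    "oxygen_therapy": 0.01,
    "acute_exacerbations": 0.70,
    "asthma": 0.31,
}

print(round(severity_score(q1_means), 2))
```

Because the score is linear in its inputs, applying it to a group's mean values gives that group's mean score. Age dominates the result for low-utilization patients: a 50-year-old with no claims already scores 6.0.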

Table 3. Mean values of the variables by severity score quartile in the developing group

Variables	Q1	Q2	Q3	Q4
	n=220	n=220	n=220	n=220
Age	49.3	56.8	58.7	58.7
Emphysema (current)	0.08	0.53	1.34	2.89
Laboratory tests (number of claims)	0.82	1.44	2.66	3.17
Anti-cholinergics (number of claims)	0.07	0.49	1.75	3.40
SABA* (number of claims)	0.25	0.44	0.87	1.70
LABA^a (number of claims)	0.50	1.00	2.04	6.78
Inhaled corticosteroids (number of claims)	0.33	1.11	2.41	6.61
Oxygen therapy (number of claims)	0.01	0.15	0.26	0.84
Acute exacerbation (number of episodes)	0.70	1.19	2.05	3.63
Asthma (number of claims)	0.31	0.60	1.18	2.88

*Short-acting beta-agonist

^aLong-acting beta2-agonist

Q1–Q4 indicate the score quartiles
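The quartile grouping behind Table 3 can be illustrated with a rank-based assignment. The sketch below is one plausible way to do it, not necessarily the authors' procedure: the function names, the tie-breaking rule, and the eight sample scores and LABA claim counts are all hypothetical.

```python
def assign_quartiles(scores):
    """Return a quartile label (1 to 4) for each score, based on rank.

    Ties are broken by original position (Python's sort is stable), an
    arbitrary choice made here so that group sizes stay equal.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, index in enumerate(order):
        labels[index] = rank * 4 // n + 1
    return labels


def mean_by_quartile(values, labels):
    """Average one variable within each quartile label."""
    groups = {q: [] for q in (1, 2, 3, 4)}
    for value, label in zip(values, labels):
        groups[label].append(value)
    return {q: sum(g) / len(g) for q, g in groups.items() if g}


# Hypothetical severity scores and LABA claim counts for eight patients.
scores = [12.0, 5.5, 30.1, 8.2, 19.9, 7.0, 25.3, 15.4]
laba_claims = [1, 0, 7, 1, 2, 0, 6, 3]

labels = assign_quartiles(scores)
print(labels)
print(mean_by_quartile(laba_claims, labels))
```

With 880 patients this rule yields four groups of 220, matching the group sizes in Table 3.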

Predictive Performance

Severity scores were re-calculated in the validating group (n = 300), using factor loadings from the development step. Scores ranged from 4 to 34, and their distributions were similar between the developing and validating groups, as shown in Figure 2. When severity scores were divided into three categories (mild, moderate, and severe/very severe), the median costs were 79,027 yen, 204,445 yen, and 422,463 yen, respectively, indicating an increasing trend (p for trend < 0.05, Figure 3). A similar increasing trend was observed for the risk of an acute exacerbation event (48%, 61%, and 83%, respectively; Figure 4).
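A trend across ordered severity categories can be tested in several ways, and the text does not name the specific test used. As one common option for proportions, the sketch below implements the Cochran-Armitage trend test with a normal approximation. The per-category group sizes are hypothetical (only the percentages are reported), so the resulting statistic illustrates the calculation and does not reproduce the study's p-value.

```python
import math


def cochran_armitage(events, totals, scores=None):
    """Return (z, two-sided p) for a linear trend in proportions.

    events[i] and totals[i] are the event count and group size of the
    i-th ordered category; scores default to 0, 1, 2, ...
    """
    if scores is None:
        scores = list(range(len(totals)))
    n = sum(totals)
    p_bar = sum(events) / n
    t_stat = sum(t * (r - m * p_bar) for t, r, m in zip(scores, events, totals))
    variance = p_bar * (1 - p_bar) * (
        sum(m * t * t for t, m in zip(scores, totals))
        - sum(m * t for t, m in zip(scores, totals)) ** 2 / n
    )
    z = t_stat / math.sqrt(variance)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical group sizes of 100 each; the event counts are chosen to
# match the reported risks of 48%, 61%, and 83%.
z, p = cochran_armitage(events=[48, 61, 83], totals=[100, 100, 100])
print(round(z, 2), p < 0.05)
```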

Figure 2. Distribution of severity scores in the developing and validating groups.

Figure 3. Total costs (yen) per year for chronic obstructive pulmonary disease (COPD) treatment in the validating group.

Figure 4. Probability of acute exacerbation per year in the validating group.

Discussion

Brief Statement of the Principal Findings

This study developed a COPD severity classification method using a Japanese administrative database and validated the performance of this method. Score validity was confirmed by estimating COPD treatment costs and acute exacerbation risks, with higher scores indicating worse COPD conditions. Accordingly, this severity classification system could be used as a risk adjustment factor to control for potential confounders in administrative database studies.

Comparison with Similar Studies

Few attempts to classify COPD conditions in an administrative database have been published. Notably, Macaulay and colleagues classified COPD patients into three severity groups according to spirometry test results and GOLD criteria in a study based on an electronic health records database linked to a health care claims database 12. Because their database included respiratory function test results, the authors were able to define COPD severity based on GOLD criteria. In addition, Mapel and colleagues developed a method for identifying and characterizing COPD 11. These authors stratified patients according to comorbid respiratory conditions and medical procedures but defined COPD severity using coding systems unique to the US (such as ICD-9 and CPT-4 codes), which have no counterparts in Japanese claims systems. Moreover, Eisner and colleagues created COPD severity scores that used patient survey data but did not require respiratory function tests 23. That scoring system, however, required health-related quality of life and physical disability-related information, which are rarely included in administrative databases. We therefore needed a scoring system that relied on neither respiratory function test results nor patient-reported outcomes.

Wu and colleagues previously developed a classification method using a claims database in the US 10. Their research included 2,068 patients with an acute exacerbation of chronic bronchitis due to COPD. Twelve variables were selected to calculate COPD severity scores: number of days of hospitalization due to acute exacerbation; number of claims for oxygen therapy, acute exacerbation, emphysema, spirometry test, pulmonologist visit; prescriptions of anticholinergic, oral corticosteroid, ICS, SABA, and LABA agents; and patient age. The method developed by Wu and colleagues was later used to examine the utilization and cost of medical services according to COPD severity 13, and was validated using another administrative database, although no direct comparison of respiratory function test values was performed 24.

In our study, we added an asthma variable to the method described by Wu and colleagues and excluded three of its variables (hospitalization due to acute exacerbation, pulmonologist visit, and use of oral corticosteroid) to increase score reliability, for the following reasons. First, asthma is an important risk factor for COPD, and approximately 50% of our study population had an asthma diagnosis. Second, our database included long-term hospitalized patients who required no aggressive treatment. Third, not all patients received COPD services from pulmonologists; some occasionally received services from doctors in other departments. In addition, when patients received COPD services from large hospitals, codes indicative of the doctors' specialties were often missing. Last, the oral corticosteroid variable was already used to define an acute exacerbation event.

In our study, prescriptions of anticholinergic, LABA, and ICS agents were strongly associated with higher severity scores. This finding contrasts with that of Wu et al., who reported the strongest effects for prescriptions of anticholinergic, SABA, and LABA agents. The discrepancy might be attributable to differences in clinical guideline recommendations. The GOLD criteria for stable COPD management recommend the use of anticholinergics or SABA for mild conditions and LABA for moderate conditions 4. In contrast, the Japanese guideline recommends initiating LABA for mild conditions and adding ICS for more severe conditions. Our estimated scores reflect these differences in drug treatment options.

Our method yielded a lower Cronbach's alpha than that obtained by Wu et al. (0.60 vs. 0.71). Cronbach's alpha assesses reliability (internal consistency) among variables. A value of 1 indicates completely consistent variables, whereas a value of 0 indicates no correlation among variables. Values of 0.7-0.8 are regarded as satisfactory 17; our Cronbach's alpha did not reach this range. Infrequently observed variables such as oxygen therapy could explain this result: because such variables had small correlation coefficients with the other variables, they lowered Cronbach's alpha. However, we did not remove these variables because they are important indicators of COPD severity.
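For readers unfamiliar with the statistic, Cronbach's alpha for k variables is k/(k - 1) multiplied by one minus the ratio of the summed variable variances to the variance of the row totals. The sketch below computes this raw (unstandardized) form on toy data. Whether the study used the raw or the standardized form is not stated, and the function name and data are hypothetical.

```python
import statistics


def cronbach_alpha(rows):
    """Raw Cronbach's alpha; rows holds one list of variable values per patient.

    Raises ZeroDivisionError if every patient has the same total.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two variables are required")
    item_variances = [statistics.variance(column) for column in zip(*rows)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: three variables that move in lockstep (fully consistent).
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
# Toy data: two variables with zero covariance.
uncorrelated = [[1, 1], [2, 1], [1, 2], [2, 2]]

print(round(cronbach_alpha(consistent), 6))
print(round(abs(cronbach_alpha(uncorrelated)), 6))
```

The two toy data sets reproduce the boundary cases described above: fully consistent variables give an alpha of 1, and uncorrelated variables give an alpha of 0.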

Strengths and Limitations

One advantage of this study was the ability to create severity scores from an administrative database without respiratory function test data. A score is usually created with a multivariable analysis (e.g., logistic regression analysis), which sets a dependent variable such as a health or economic outcome and calculates the weights of the independent variables accordingly 12. In the case of COPD severity, respiratory function test results and symptoms are needed to define the dependent variable. Unfortunately, these data are not available in Japanese administrative databases. However, using the PCA method, it was possible to calculate the weight (factor loading) of each independent variable without setting a specific dependent variable. This statistical technique is one way to overcome this weakness of administrative databases.
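To make the idea of outcome-free weights concrete, the sketch below extracts the first principal component of a correlation matrix by power iteration, using only the Python standard library. It is a minimal illustration under stated assumptions: the function names and toy data are hypothetical, and the study's software and exact PCA settings are not described in the text. The squared loadings in Table 2 sum to roughly 1.00, which is consistent with, though not proof of, unit-length weights like those returned here.

```python
import math
import statistics


def correlation_matrix(rows):
    """Pearson correlation matrix; rows holds one list of values per patient.

    A constant column has zero standard deviation and raises
    ZeroDivisionError.
    """
    columns = list(zip(*rows))
    n = len(rows)
    z_scores = []
    for column in columns:
        mean = statistics.fmean(column)
        sd = statistics.stdev(column)
        z_scores.append([(value - mean) / sd for value in column])
    return [
        [sum(a * b for a, b in zip(zp, zq)) / (n - 1) for zq in z_scores]
        for zp in z_scores
    ]


def first_component(matrix, iterations=500):
    """Unit-length weights of the first principal component.

    Uses power iteration from a slightly asymmetric start vector; it
    would fail only if that start were exactly orthogonal to the answer.
    """
    k = len(matrix)
    vector = [1.0 + 0.1 * i for i in range(k)]
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    if sum(vector) < 0:  # fix the arbitrary sign for readability
        vector = [-x for x in vector]
    return vector


# Toy data: the first two variables are perfectly correlated and the
# third is uncorrelated with both.
toy_rows = [[1, 2, 1], [2, 4, -1], [3, 6, -1], [4, 8, 1]]
weights = first_component(correlation_matrix(toy_rows))
print([round(w, 3) for w in weights])
```

No dependent variable appears anywhere in the calculation: the weights come only from how the variables co-vary, so the two correlated variables share the weight and the unrelated one receives almost none.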

In addition, the data included in the JMDC database were collected through the insurance reimbursement process; therefore, information is rarely missing. Moreover, under the Japanese national health insurance program, all services provided to COPD patients should be almost fully covered. For these reasons, our classification method was developed using all records of COPD treatment provided to Japanese patients.

However, this study also had limitations common to administrative database studies 10, 12. Notably, we did not consider smoking history as a risk factor because this variable was not available in our database. Even without these data, our method was capable of describing COPD severity using age and treatment-related variables.

Approximately 50% of patients in our study had comorbid asthma. It is difficult to clinically distinguish COPD from asthma, and therefore these diagnoses often overlap (asthma-COPD overlap syndrome). Previous research showed that the prevalence of asthma-COPD overlap syndrome ranged from 1.8% to 56.0% 25, 26, 27, 28. Therefore, we did not remove patients with asthma from the study population.

Implications for Research

PCA-based scoring systems often suffer from low reproducibility of factor loadings. Reproducibility depends on the treatment patterns captured in a database. Therefore, when using different data sources, researchers should re-calculate the factor loadings, as demonstrated in our study. Furthermore, additional studies that apply our findings to clinical practice are needed. We will compare COPD severity scores with clinical conditions using electronic health records at a large-scale hospital in Japan. The severity scores calculated from factor loadings in this study are relative values and cannot be used to describe the distribution of COPD severity (i.e., the proportions of mild vs. more severe conditions). Therefore, we will set cut-off values according to the GOLD criteria, which classify COPD into four severity categories depending on respiratory function test values. We will evaluate the scores using the c-statistic, positive predictive value, or negative predictive value against the electronic health records database. These techniques have been used previously to assess model discrimination and validate severity classification methods 29, 30, 31.
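The evaluation measures named above can be sketched briefly. The c-statistic is the proportion of (case, non-case) pairs in which the case has the higher score, with ties counted as one half; the predictive values follow from a chosen cut-off. The scores, reference labels, and cut-off below are hypothetical and serve only to show the calculations.

```python
def c_statistic(scores, outcomes):
    """Concordance between scores and binary outcomes (1 = case, 0 = non-case)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                concordant += 1.0
            elif case_score == other_score:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


def predictive_values(scores, outcomes, cutoff):
    """Return (PPV, NPV) when scores at or above cutoff are called positive."""
    tp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 0)
    tn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and y == 0)
    fn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and y == 1)
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("cutoff leaves one prediction group empty")
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical severity scores and reference labels (1 = severe disease
# according to a clinical reference standard).
scores = [5, 8, 12, 15, 20, 25]
reference = [0, 0, 1, 0, 1, 1]

print(round(c_statistic(scores, reference), 3))
print(predictive_values(scores, reference, cutoff=10))
```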

Conclusion

In this study, we developed a COPD severity classification method based on an administrative database in Japan. This method is able to estimate COPD conditions without requiring laboratory test or clinical symptom data. For clinical implementation, we will confirm the validity of this classification system through comparison with medical information, including laboratory data. This classification method is an important step toward adjusting for potential outcome risk factors in administrative database studies.

Journal of Aging Research And Healthcare