This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Dermatology, is properly cited. The complete bibliographic information, a link to the original publication on http://derma.jmir.org, as well as this copyright and license information must be included.
Hidradenitis suppurativa (HS) is a potentially debilitating, chronic, recurring inflammatory disease. Observational databases provide opportunities to study the epidemiology of HS.
This study’s objective was to develop phenotype algorithms for HS suitable for epidemiological studies based on a network of observational databases.
A data-driven approach was used to develop 4 HS algorithms. A literature search identified prior HS algorithms. Standardized databases from the Observational Medical Outcomes Partnership (n=9) were used to develop 2 incident and 2 prevalent HS phenotype algorithms. Two open-source diagnostic tools, CohortDiagnostics and PheValuator, were used to evaluate and generate phenotype performance metric estimates, including sensitivity, specificity, positive predictive value (PPV), and negative predictive value.
We developed 2 prevalent and 2 incident HS algorithms. Validation showed that PPV estimates were highest (mean 86%) for the prevalent HS algorithm requiring at least two HS diagnosis codes. Sensitivity estimates were highest (mean 58%) for the prevalent HS algorithm requiring at least one HS code.
This study illustrates the evaluation process and provides performance metrics for 2 incident and 2 prevalent HS algorithms across 9 observational databases. The use of a rigorous data-driven approach applied to a large number of databases provides confidence that the HS algorithms can correctly identify HS subjects.
Hidradenitis suppurativa (HS) is a chronic, recurring inflammatory disease of the skin. Clinically, subjects have nodules, draining skin tunnels (ie, sinus tracts), abscesses, and bands of severe scar formation in the intertriginous skin areas, such as the axillary, groin, perianal, perineal, and inframammary regions [
The use of real-world evidence from observational data is valuable for studying the epidemiology, clinical manifestations, and real-world experience of patients with HS. A critical step in using observational data for the study of HS is the development of accurate phenotype algorithms (PAs). A PA is the translation of the case definition of a health condition or phenotype into an executable algorithm based on clinical data elements in a database [
The objectives of this study were to develop HS PAs, evaluate their performance, and characterize the resultant HS phenotypes across a network of 9 US and non-US observational databases, using a data-driven framework.
A literature search was conducted to identify studies that describe the codes and logic used to identify HS patients in observational databases. The search identified 30 articles, which provided a set of diagnosis codes for the identification of HS across vocabularies, including the International Classification of Diseases, Ninth Revision (ICD-9); the International Classification of Diseases, Tenth Revision (ICD-10); and Read codes. Five of the 30 articles included validation metrics. Our study used the Systematized Nomenclature of Medicine (SNOMED) vocabulary to develop the codes. The vocabulary and diagnostic codes used in the published studies and the SNOMED terms are presented in
The observational databases used in this study were not created specifically to study HS. The observational data were obtained in the delivery of health care or for administrative or billing purposes in electronic format. A network of 9 observational databases (4 administrative claims databases from the United States, 1 from Japan, 1 from France, 1 from Germany, and 1 from Australia; and 1 US electronic health record [EHR] database;
Four HS PAs were developed and evaluated in subjects of all ages [
The OHDSI CohortDiagnostics tool [
Use of the PheValuator [
Computer code for PheValuator and CohortDiagnostics and the JSON files for the PAs are available on the authors’ website [
Description of databases used in the study.
Name | Years | Country | Data type | Clinical visits included | Subjects, n (millions) | Age at first observation, average (years) | Female subjects, % | Length of follow-up, median (years) |
IBM MarketScan Commercial Claims and Encounters | 2000-2021 | United States | Insurance claims | Inpatient/outpatient | 157 | 31 | 51 | 1.56 |
IBM MarketScan Multi-State Medicaid | 2006-2020 | United States | Insurance claims | Inpatient/outpatient | 31 | 23 | 56 | 1.52 |
IBM MarketScan Medicare Supplemental | 2000-2021 | United States | Insurance claims | Inpatient/outpatient | 10 | 71 | 55 | 2.46 |
Optum’s de-identified Clinformatics Data Mart Database | 2007-2021 | United States | Insurance claims | Inpatient/outpatient | 71 | 37 | 51 | 1.48 |
Optum Electronic Health Records | 2007-2021 | United States | Electronic health records | Inpatient/outpatient | 99 | 37 | 53 | 2.63 |
Japan Medical Data Center | 2000-2021 | Japan | Insurance claims | Inpatient/outpatient | 12 | 31 | 49 | 3.29 |
IQVIA Disease Analyzer–France | 2016-2021 | France | General practitioner data | Outpatient | 4 | 37 | 52 | 0.9 |
IQVIA Disease Analyzer–Germany | 2011-2021 | Germany | General practitioner data with supplemental data from participating specialists | Outpatient | 31 | 43 | 56 | 0.5 |
IQVIA Australian Longitudinal Patient Data | 1996-2020 | Australia | General practitioner data | Outpatient | 5 | 37 | 22^a | 0.5 |
^a 59% of subjects did not have a designated sex in this study.
Schematics of phenotype algorithms for hidradenitis suppurativa (HS).
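The core distinction between the 1x and 2x algorithms (at least one versus at least two HS diagnosis codes) can be illustrated with a minimal Python sketch. The record layout, the function name `hs_cohort`, and the use of source ICD codes are assumptions made for illustration; the actual PAs are defined as JSON cohort definitions over SNOMED concepts and may include observation-period, age, or timing criteria that are not modeled here.

```python
from collections import defaultdict
from datetime import date

# ICD codes for HS that are named in this article; everything else in this
# sketch (record layout, function name) is hypothetical.
HS_CODES = {"L73.2", "705.83"}


def hs_cohort(records, min_codes):
    """Return {subject_id: index_date} for subjects with at least
    `min_codes` HS diagnosis records. The index date is the earliest
    HS record. Assumption: every record counts, even on the same day."""
    hs_dates = defaultdict(list)
    for subject_id, dx_date, code in records:
        if code in HS_CODES:
            hs_dates[subject_id].append(dx_date)
    return {
        subject_id: min(dates)
        for subject_id, dates in hs_dates.items()
        if len(dates) >= min_codes
    }


# Invented example records: (subject_id, diagnosis_date, code)
records = [
    (1, date(2018, 3, 1), "L73.2"),
    (1, date(2018, 9, 15), "L73.2"),
    (2, date(2019, 1, 10), "705.83"),
    (3, date(2019, 5, 2), "J45.909"),  # unrelated code, never qualifies
]

cohort_1x = hs_cohort(records, min_codes=1)  # subjects 1 and 2 qualify
cohort_2x = hs_cohort(records, min_codes=2)  # only subject 1 qualifies
```

Requiring a second code removes subject 2 from the cohort, which is the mechanism by which a 2x definition would be expected to gain PPV at the cost of sensitivity.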
The use of the IBM and Clinformatics databases was reviewed by the New England Institutional Review Board and was determined to be exempt from broad institutional review board approval, as this project did not involve human subject research. Patient consent for publication was not required. All patients in the databases were deidentified, and the identities of data contributors were removed.
We examined cohort characteristics of the PAs. These characteristics may be viewed interactively online [
A comparison of standardized differences between the incident 1x and the incident 2x cohorts for 3 data sets across 5 different time frames is shown in
We examined the incident 2x algorithm for subject characteristics across the databases. We identified a higher proportion of female subjects with HS compared to male subjects. The largest disproportionality was in the MDCD database, in which 82% of the subjects were female. The JMDC database had the lowest disproportionality by sex, with 45% female subjects. An outpatient visit was the most common type of clinical visit for the first diagnosis of HS. Less than 5% of first diagnoses were made during an emergency room visit, with the exception of the MDCD database, for which the proportion was 10%. Examination of the index codes or diagnosis codes that allowed subjects into cohorts showed that the most prevalent code was the diagnosis code of “hidradenitis suppurativa” (SNOMED code 4241223; ICD-10 L73.2) in all databases except the CCAE database, in which the most prevalent code was a diagnosis code of “hidradenitis” (SNOMED code 434119; ICD-9 705.83).
Graphical depiction of the overlap in subjects between the 2 incidence cohorts and the 2 prevalence cohorts. CCAE: IBM MarketScan Commercial Claims and Encounters; Clinformatics DOD: Optum’s de-identified Clinformatics Data Mart Database; IALPD: IQVIA Australian Longitudinal Patient Data; IDAF: IQVIA Disease Analyzer–France; IDAG: IQVIA Disease Analyzer–Germany; JMDC: Japan Medical Data Center; MDCD: IBM MarketScan Multi-State Medicaid; MDCR: IBM MarketScan Medicare Supplemental; Optum EHR: Optum Electronic Health Records.
Comparison of the proportion of subjects in the incident 1x cohort and the incident 2x cohort for 3 selected data sets with different demographic characteristics. Points closer to the diagonal indicate similar proportions between the comparators; points farther from the diagonal indicate more disparate proportions. CCAE: IBM MarketScan Commercial Claims and Encounters; Clinformatics DOD: Optum’s de-identified Clinformatics Data Mart Database; MDCD: IBM MarketScan Multi-State Medicaid.
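For readers unfamiliar with standardized differences, the following Python sketch shows one widely used formula for comparing a binary characteristic between two cohorts. It is an illustration under stated assumptions: the function name and the example proportions are hypothetical, and the exact computation used by CohortDiagnostics is not confirmed by this article.

```python
import math


def standardized_difference(p1, p2):
    """Standardized difference between two proportions, using the
    average of the two binomial variances as the pooled variance."""
    pooled_variance = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    if pooled_variance == 0:
        # Both proportions are exactly 0 or 1, so the ratio is undefined;
        # return 0 when they agree and a signed infinity when they differ.
        return 0.0 if p1 == p2 else math.copysign(math.inf, p1 - p2)
    return (p1 - p2) / math.sqrt(pooled_variance)


# Hypothetical proportions of a characteristic in two cohorts.
d = standardized_difference(0.30, 0.25)
```

Values near 0 correspond to points close to the diagonal in the scatterplot; an absolute value above roughly 0.1 is a common, though not universal, threshold for flagging a meaningful imbalance.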
Incidence rates for HS (for the incident 2x algorithm) from 2015 to 2020 differed between databases. The MDCD database had the highest rate, at 23 per 100,000 person-years. The rates in the CCAE, Clinformatics DOD, and Optum EHR databases were approximately 12 per 100,000 person-years. Rates in the IDAG, IDAF, JMDC, and IBM MarketScan Medicare Supplemental (MDCR) databases were 1 per 100,000 person-years. The rate in the IALPD database was undetectable, likely due to the small sample size. Incidence rates peaked in the 20- to 29-year-old age group. In the MDCD and IDAG databases, incidence rates in the 30- to 39-year-old age group were higher than in the older age groups but similar to those in the 20- to 29-year-old age group. Incidence rates in female subjects were generally higher than in male subjects and were highest in the MDCD database, at 24 per 100,000 person-years, followed by 11 per 100,000 person-years in the CCAE, Clinformatics DOD, and Optum EHR databases and 1 per 100,000 person-years in the IDAF database. In the MDCR database, the rate in female subjects was equal to the rate in male subjects, at 2 per 100,000 person-years.
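The unit used above (cases per 100,000 person-years) follows from a simple calculation, sketched here in Python. The function name and the counts are hypothetical; the counts were chosen only so that the result equals the 23 per 100,000 person-years reported for the MDCD database and are not the study's actual numerators or denominators.

```python
def incidence_rate_per_100k(new_cases, person_years):
    """Incidence rate expressed per 100,000 person-years at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases / person_years * 100_000


# Invented counts for illustration only.
rate = incidence_rate_per_100k(new_cases=230, person_years=1_000_000)
```

Because the denominator is time at risk rather than the number of subjects, databases with short median follow-up contribute fewer person-years per subject.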
Performance characteristics for the HS phenotypes assessed using the PheValuator method are presented in
Performance characteristics of the hidradenitis suppurativa phenotypes based on the PheValuator methodology.
Phenotype algorithm/database | Sensitivity (95% CI) | PPV^a (95% CI) | Specificity (95% CI) | NPV^b (95% CI) |
Incident 1x | | | | |
IBM MarketScan Commercial Database | 0.380 (0.367-0.393) | 0.599 (0.582-0.615) | 0.999 (0.999-0.999) | 0.998 (0.998-0.998) |
Optum’s de-identified Clinformatics Data Mart Database | 0.369 (0.358-0.380) | 0.603 (0.589-0.617) | 0.999 (0.999-0.999) | 0.998 (0.997-0.998) |
IBM MarketScan Multi-State Medicaid Database | 0.311 (0.306-0.317) | 0.676 (0.668-0.685) | 0.998 (0.998-0.998) | 0.990 (0.990-0.990) |
IBM MarketScan Medicare Supplemental Database | 0.298 (0.277-0.319) | 0.444 (0.417-0.472) | 1.000 (1.000-1.000) | 0.999 (0.999-0.999) |
Optum’s de-identified Electronic Health Record dataset | 0.279 (0.269-0.289) | 0.777 (0.761-0.793) | 1.000 (1.000-1.000) | 0.997 (0.997-0.997) |
Incident 2x | | | | |
IBM MarketScan Commercial Database | 0.151 (0.142-0.161) | 0.890 (0.868-0.909) | 1.000 (1.000-1.000) | 0.998 (0.998-0.998) |
Optum’s de-identified Clinformatics Data Mart Database | 0.133 (0.126-0.141) | 0.882 (0.862-0.900) | 1.000 (1.000-1.000) | 0.997 (0.996-0.997) |
IBM MarketScan Multi-State Medicaid Database | 0.115 (0.112-0.119) | 0.874 (0.862-0.885) | 1.000 (1.000-1.000) | 0.987 (0.987-0.987) |
IBM MarketScan Medicare Supplemental Database | 0.109 (0.095-0.123) | 0.830 (0.778-0.874) | 1.000 (1.000-1.000) | 0.999 (0.999-0.999) |
Optum de-identified Electronic Health Record dataset | 0.109 (0.102-0.116) | 0.948 (0.931-0.962) | 1.000 (1.000-1.000) | 0.997 (0.996-0.997) |
Prevalent 1x | | | | |
IBM MarketScan Commercial Database | 0.541 (0.531-0.551) | 0.649 (0.639-0.660) | 0.999 (0.999-0.999) | 0.998 (0.998-0.998) |
Optum’s de-identified Clinformatics Data Mart Database | 0.666 (0.655-0.677) | 0.602 (0.591-0.613) | 0.998 (0.998-0.998) | 0.999 (0.999-0.999) |
IBM MarketScan Multi-State Medicaid Database | 0.664 (0.658-0.670) | 0.628 (0.621-0.634) | 0.995 (0.995-0.995) | 0.996 (0.996-0.996) |
IBM MarketScan Medicare Supplemental Database | 0.442 (0.422-0.462) | 0.355 (0.338-0.373) | 0.999 (0.999-0.999) | 0.999 (0.999-0.999) |
Optum de-identified Electronic Health Record dataset | 0.632 (0.618-0.647) | 0.754 (0.739-0.768) | 1.000 (1.000-1.000) | 0.999 (0.999-0.999) |
Prevalent 2x | | | | |
IBM MarketScan Commercial Database | 0.296 (0.285-0.307) | 0.874 (0.860-0.887) | 1.000 (1.000-1.000) | 0.997 (0.997-0.998) |
Optum’s de-identified Clinformatics Data Mart Database | 0.233 (0.220-0.246) | 0.937 (0.920-0.951) | 1.000 (1.000-1.000) | 0.998 (0.998-0.998) |
IBM MarketScan Multi-State Medicaid Database | 0.219 (0.203-0.236) | 0.732 (0.699-0.764) | 1.000 (1.000-1.000) | 0.999 (0.999-0.999) |
IBM MarketScan Medicare Supplemental Database | 0.288 (0.282-0.294) | 0.859 (0.851-0.867) | 0.999 (0.999-0.999) | 0.992 (0.992-0.992) |
Optum de-identified Electronic Health Record dataset | 0.231 (0.222-0.239) | 0.912 (0.900-0.923) | 1.000 (1.000-1.000) | 0.996 (0.996-0.996) |
^a PPV: positive predictive value.
^b NPV: negative predictive value.
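The summary means quoted in the abstract can be approximately reproduced from the point estimates in the table above. The Python sketch below averages the sensitivity values in the third block of rows and the PPV values in the fourth block, which appear to correspond to the prevalent algorithms requiring at least one and at least two HS codes, respectively. That mapping and the variable names are assumptions; the published means may have been computed from unrounded estimates.

```python
from statistics import mean

# Point estimates copied from the table, in database order
# (CCAE, Clinformatics, MDCD, MDCR, Optum EHR).
prevalent_one_code_sensitivity = [0.541, 0.666, 0.664, 0.442, 0.632]
prevalent_two_code_ppv = [0.874, 0.937, 0.732, 0.859, 0.912]

mean_sensitivity = mean(prevalent_one_code_sensitivity)
mean_ppv = mean(prevalent_two_code_ppv)
```

The results are about 0.589 and 0.863, close to the mean sensitivity of 58% and mean PPV of 86% stated in the abstract.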
This study sought to develop and determine the accuracy of 4 HS PAs. The 4 PAs included 2 for incidence and 2 for prevalence, with one in each group favoring sensitivity and the other favoring specificity. Use of the PheValuator method allowed for estimation of sensitivity, specificity, PPV, and negative predictive value (NPV) without manual chart review. While both the incident and prevalent PAs were useful for the exploration of HS in observational databases, the PAs requiring just a single HS diagnosis code had higher sensitivity and lower specificity than the PAs requiring 2 codes. Thus, the choice of algorithm depends on the research question being explored. For example, a more sensitive algorithm would be applicable for safety studies, in which the PA is used to determine HS outcomes and missed identification of possible cases is problematic, whereas a PA with higher specificity would be useful for treatment comparison studies, in which the goal is to ensure that all subjects exposed to a treatment have a high probability of having HS.
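The four performance metrics discussed above have standard definitions in terms of a 2x2 classification table, sketched below in Python. The function name and the counts are hypothetical. PheValuator estimates these quantities without manual chart review, so this sketch, which uses plain counts of true and false positives and negatives, illustrates the definitions only and not that tool's estimation method.

```python
def performance_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy metrics from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true cases the algorithm finds
        "specificity": tn / (tn + fp),  # noncases the algorithm excludes
        "ppv": tp / (tp + fp),          # flagged subjects who are true cases
        "npv": tn / (tn + fn),          # unflagged subjects who are noncases
    }


# Invented counts for a rare condition: 100 true cases in 100,000 subjects.
metrics = performance_metrics(tp=30, fp=5, fn=70, tn=99_895)
```

With a rare condition such as the one in this invented example, specificity and NPV stay very close to 1 even when many cases are missed, so sensitivity and PPV are the metrics that differentiate algorithms most clearly.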
A few studies have included validation metrics for HS algorithms for observational databases [
Strengths of our study include the use of a rigorous, data-driven approach for generating and evaluating the HS phenotypes across a data network of 9 databases spanning the United States and other countries. Network-based phenotype evaluations greatly strengthen the knowledge base for a given algorithm, because they allow the assessment of the consistency of findings across data types, geographic locations, and time periods. Concordant trends across databases increase confidence that the observations are the effect of the PA itself rather than an artifact of a particular data source. The PAs were analyzed using multiple approaches, providing ancillary verification of decisions made in determining the cohort logic. Our study includes several study artifacts, including JSON files for the PAs, computer code, and results for all the analyzed PAs, providing transparency in our interpretation of the results.
There were also several limitations to our study. We used administrative data sets primarily maintained for insurance billing, which are well-known to have significant deficits, including coding inaccuracies [
This study developed and evaluated 4 HS PAs using a rigorous, data-driven approach and generated phenotype performance metrics including sensitivity, specificity, PPV, and NPV. Based on the analyses, we recommend that PAs requiring a single HS diagnosis code be used in studies requiring high sensitivity, while studies requiring high specificity should use PAs requiring 2 HS diagnosis codes. These algorithms will enable researchers to use large observational databases to research HS, which has a high burden of disease. There is a need for better evidence, as there are currently clinical knowledge gaps for HS that observational data are well suited to address.
Diagnostic codes.
CCAE: IBM MarketScan Commercial Claims and Encounters
continuous enrollment
Clinformatics DOD: Optum’s de-identified Clinformatics Data Mart Database
EHR: electronic health record
HS: hidradenitis suppurativa
IALPD: IQVIA Australian Longitudinal Patient Data
ICD-9: International Classification of Diseases, Ninth Revision
ICD-10: International Classification of Diseases, Tenth Revision
IDAF: IQVIA Disease Analyzer–France
IDAG: IQVIA Disease Analyzer–Germany
JMDC: Japan Medical Data Center
MDCD: IBM MarketScan Multi-State Medicaid
MDCR: IBM MarketScan Medicare Supplemental
NPV: negative predictive value
OHDSI: Observational Health Data Sciences and Informatics
OMOP: Observational Medical Outcomes Partnership
PA: phenotype algorithm
PPV: positive predictive value
SMD: standardized mean difference
SNOMED: Systematized Nomenclature of Medicine
Manuscript review was provided by Anna Sheahan, PhD. All authors contributed to all aspects of the study (study design and execution, data analysis and interpretation, and writing of the manuscript). This research was funded by Janssen Research and Development, LLC. The data source for this study was a retrospective claims database and thus there are no patient or public contributors.
The data used for this study are proprietary and only available through a licensing data-use agreement process. This process ensures that confidentiality of the data contributors is maintained and that the data are used appropriately. The MarketScan Research Database can be licensed by researchers.
All authors are employees of Janssen Research and Development, LLC, and may own stock or stock options. The work performed for this study was part of their employment.