Transparent, Reproducible, and Open Science Practices of Published Literature in Dermatology Journals: Cross-Sectional Analysis

Background: Reproducible research is a foundational component for scientific advancements, yet little is known regarding the extent of reproducible research within the dermatology literature.
Objective: This study aimed to determine the quality and transparency of the literature in dermatology journals by evaluating for the presence of 8 indicators of reproducible and transparent research practices.
Methods: Using a cross-sectional study design, we conducted an advanced search of publications in dermatology journals from the National Library of Medicine catalog. Our search included articles published between January 1, 2014, and December 31, 2018. After generating a list of eligible dermatology publications, we searched for full-text PDF versions using Open Access Button, Google Scholar, and PubMed. Publications were analyzed for 8 indicators of reproducibility and transparency (availability of materials, data, analysis scripts, protocol, preregistration, conflict of interest statement, funding statement, and open access) using a pilot-tested Google Form.
Results: After exclusion, 127 studies with empirical data were included in our analysis. Certain indicators were more poorly reported than others. We found that most publications (113, 88.9%) did not provide unmodified, raw data used to make computations, 124 (97.6%) failed to make the complete protocol available, and 126 (99.2%) did not include step-by-step analysis scripts.
Conclusions: The studies in our sample of dermatology journal publications do not appear to include sufficient detail to be accurately and successfully reproduced in their entirety. Solutions to increase the quality, reproducibility, and transparency of dermatology research are warranted. More robust reporting of key methodological details, open data sharing, and stricter journal standards for author disclosure of study materials might help improve the climate of reproducible research in dermatology.
(JMIR Dermatol 2019;2(1):e16078) doi: 10.2196/16078


Introduction
Scientific research is currently facing a reproducibility crisis, with estimates suggesting that 50% to 90% of research is irreproducible [1-3]. Supporting the notion of this crisis, the Reproducibility Project: Cancer Biology experienced failure of 32 of 50 replication attempts, in part owing to insufficient reporting of information necessary to reproduce the original study [4]. One study included in this large-scale project was conducted by Baker and Dolgin [5].
Aiming to better understand the causes of melanoma, the authors conducted whole-genome sequencing of 25 human telomerase reverse transcriptase-immortalized metastatic melanoma cells and reported that 6 different PREX2 gene mutations are common to melanoma cells. They additionally asserted that PREX2 mutations can increase the rate of tumor incidence compared with controls [5]. However, attempts to replicate these findings failed. In one such attempt, Berger et al [6] obtained samples of human skin cells used in the original study and assiduously copied the study's experimental conditions. They found that the median tumor-free survival was only 1 week, whereas the original study found that 70% of mice remained tumor-free at 9 weeks. These results ultimately made it impossible to determine whether PREX2 mutations influenced the rate of tumor incidence compared with control.
Reproducible research is a foundational component for scientific advancement [7]; however, many published works lack essential reproducibility-related elements, such as openly shared data files, materials, and protocols [8,9]. Equally problematic is the low rate at which trials are prospectively registered before study commencement. For example, Nankervis et al [10] found that only 5% of eczema randomized controlled trials (RCTs) were preregistered, registered correctly, and registered with enough accessible information to assess whether the primary outcome aligned with the original registration. Preregistration can protect against selective outcome reporting bias and aid in reducing the prevalence of spurious and misleading results [11-13]. In addition, the dissemination of raw datasets from clinical research through Web-based repositories allows complex issues to be reanalyzed for confirmation or refutation by replication studies [14]. Furthermore, data sharing allows for further clarification through open discussion and helps to legitimize the quality and integrity of research outcomes [15,16]. Clinical trials are now required to include a data sharing plan in the trial registration as a condition to be considered for publication in journals that are members of the International Committee of Medical Journal Editors [17]. Journals following this policy in dermatology include JAMA Dermatology, Dermatology, American Journal of Clinical Dermatology, and Journal of Surgical Dermatology, among others. Optimizing statistical practices, as well as using methods that promote reproducibility and transparency, could ultimately increase reproducibility within the dermatology literature. As questionable findings or false leads impede scientific advancement, researchers and physicians must advocate for efficient scientific methods that bolster reproducible research [18,19].
As little is known about the extent of reproducible literature within dermatology journals, further investigation is warranted. We therefore explored the current state of reproducibility-related research practices in a random sample of publications from the field of dermatology. Our study examined specific indicators of reproducibility and transparency, building upon similar studies, to provide baseline data for subsequent investigations [8,9,20].

Overview
This cross-sectional analysis evaluating indicators of reproducibility and transparency was based on the methodology of Hardwicke et al [8], with slight modifications. To promote transparency and clarity of our research, all protocols, data, and appropriate materials are available on Open Science Framework [21]. This analysis did not include human subjects and was not subject to institutional review board oversight [22]. This investigation was reported using the guidelines for conducting meta-research as detailed by Murad and Wang [23] and, when necessary, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines [24]. Our primary objective was to evaluate for the presence of specific indicators of reproducibility and transparency in the published dermatology literature.

Journal and Publication Selection
On June 6, 2019, one author (DT) searched the National Library of Medicine (NLM) catalog for journals in the field of dermatology using the subject terms tag "Dermatology [ST]." To be included, journals had to be (1) MEDLINE indexed and (2) published in the English language. One investigator (DT) used the electronic ISSN to extract the list of journals. The same journal search string of ISSNs was then used in PubMed on June 7, 2019, to collect all publications published between January 1, 2014, and December 31, 2018. A random sample of 300 publications was selected for our analysis using Excel's random number function. Our search string and the complete list of publications returned from our search are available for reference [25].
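A random draw of this kind can also be made reproducible outside a spreadsheet by fixing the seed of a pseudorandom generator. The sketch below is illustrative only: the study itself used Excel's random number function, and the function name `draw_sample`, the seed value, and the placeholder record identifiers are assumptions introduced here, not part of the published protocol.

```python
import random

def draw_sample(record_ids, k, seed):
    """Draw k records without replacement; the same seed always yields the same sample."""
    rng = random.Random(seed)  # private generator, leaves the global random state untouched
    return rng.sample(list(record_ids), k)

# Hypothetical stand-ins for the 46,615 PubMed records returned by the search.
population = [f"record-{i}" for i in range(1, 46616)]

# The seed is an arbitrary illustrative value; reporting it lets others redraw the same sample.
sample = draw_sample(population, k=300, seed=2019)
```

Reporting the seed alongside the search string would allow a reader to regenerate the identical list of 300 publications from the full search results.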

Data Extraction
Before data extraction, 2 investigators (MA and AN) completed training (conducted by DT) to ensure reliability between investigators. This training session (which was recorded and is available for reference [26]) involved reviewing study objectives, study design, study protocol, and the data extraction form. After completion of training, MA and AN extracted data from the 300 randomly sampled publications in a blinded and independent manner. Data extraction began on June 10, 2019, and concluded on June 30, 2019. Investigators held a final consensus meeting to resolve any discrepancies. DT was available for adjudication, if necessary. Publications were separated into 2 categories: (1) those that contained empirical data and (2) those that lacked empirical data. Our dataset is available on a Web-based repository [27].

Specific Indicators of Reproducibility and Transparency
A pilot-tested Google Form similar to that created by Hardwicke et al [8] was used for data extraction. This form prompted investigators to identify the presence of prespecified indicators considered necessary to reproduce a study [28]. Information extracted from each publication varied according to the study design. Studies with empirical data were assessed for the following indicators: materials availability, data availability, analysis scripts, protocol, preregistration, conflict of interest (COI) statement, funding statement, and open access. Nonempirical studies were only assessed for the presence of 3 indicators: COI, funding statement, and open access. Furthermore, despite case reports and case series often providing empirical data, previous studies have demonstrated that key methodological information needed to reproduce these study types is commonly absent or is insufficient [9]. Thus, we decided to omit these study types from certain assessments. Table 1 details the 8 queried indicators of reproducibility and transparency, their importance, and a description of study designs included in each analysis.

Table 1. Indicators of reproducibility and transparency.

Indicator: Materials available
Studies included: Empirical studies (a)
Importance: Having access to all materials (eg, stimuli, survey instruments, and computer code/software used for data collection or running experiments) increases the feasibility by which researchers are able to replicate a study using identical methodology

Indicator: Raw data
Studies included: Empirical studies (b)
Importance: Sharing of data in their unaltered, digital form facilitates validation of study outcomes and helps prevent forms of bias, such as selective outcome reporting

Indicator: Analysis scripts available
Studies included: Empirical studies (b)
Importance: Having access to well-documented, step-by-step instructions detailing data preparation and analysis can help to increase the clarity of data interpretation. In addition, thorough analysis scripts can help limit inadvertent computations and misrepresentation of study findings in replication studies

Indicator: Protocol available
Studies included: Empirical studies (b)
Importance: To completely and accurately reproduce a study, the full protocol must be available in its entirety. Slight alterations to the original study protocol have the potential to influence study outcomes, thereby hindering reproducibility

Indicator: Preregistration
Importance: Publications restricted behind a paywall contribute to the irreproducible environment of biomedical research. One way to circumvent this obstacle is through study preregistration. Making available study methods, hypotheses, and analysis scripts could potentially help increase the transparency of biomedical research while simultaneously mitigating reporting bias, data dredging, and p-hacking

Indicator: Open access
Studies included: All studies included in random sample (d)
Importance: Open access increases the availability of pertinent information for study reproduction. Failing to make available complete records of the study's protocol, data, and analyses hinders a comprehensive evaluation of the given study

(a) Empirical studies refers to studies with empirical data including clinical trial, cohort, case control, chart review, and cross-sectional; even though case studies and case series often include empirical data, this category excludes these study types owing to the inherent difficulty surrounding their reproduction, as discussed by Wallach et al [9]. Meta-analyses and commentaries were also excluded from this analysis as materials are not typically included (n=114).
(b) Empirical studies (clinical trial, cohort, case control, secondary analysis, chart review, commentary [with data analysis], and cross-sectional) excluding case reports and case series. Meta-analyses were included in this analysis (n=127).
(c) All empirical and nonempirical studies were included in this analysis (n=280).
(d) All publications included in random sample were included in this analysis (n=300).

Assessing Open Access
We employed a systematic process to determine the public's ability to access full text PDF versions of publications included in our sample. First, a search using the publication's title, digital object identifier, and/or PubMed ID on Open Access Button [29] was performed. If this search yielded no return, investigators then performed this same search process using Google Scholar and PubMed. Publications were determined to be inaccessible and paywall restricted if a full text version was unobtainable.
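The decision rule above can be written down as a short fallback cascade. The sketch below is only an illustration of the classification logic; in the study these searches were performed manually, and the lookup functions shown are hypothetical placeholders, not real client code for Open Access Button, Google Scholar, or PubMed.

```python
from typing import Callable, Iterable, Optional

# A lookup takes a title, DOI, or PubMed ID and returns a full-text location, or None.
Lookup = Callable[[str], Optional[str]]

def classify_access(identifier: str, lookups: Iterable[Lookup]) -> str:
    """Try each source in order; the first hit marks the publication as publicly available."""
    for lookup in lookups:
        if lookup(identifier):
            return "open access"
    return "paywall restricted"

# Hypothetical stand-ins for the three manual searches (assumed behavior, for illustration).
def open_access_button(identifier: str) -> Optional[str]:
    return None

def google_scholar(identifier: str) -> Optional[str]:
    return "https://example.org/fulltext.pdf" if identifier == "10.1000/demo" else None

def pubmed(identifier: str) -> Optional[str]:
    return None

sources = [open_access_button, google_scholar, pubmed]
found = classify_access("10.1000/demo", sources)
missing = classify_access("10.1000/other", sources)
```

Encoding the search order explicitly makes clear that a publication is labeled paywall restricted only after every source has failed to return a full text.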

Attempts of Replication and Citation in Research Synthesis
To evaluate whether a publication with empirical data was cited in a systematic review and/or meta-analysis, we used Web of Science [30], following previous studies [8,9,20]. We determined the citing publications to be either a replication study or a meta-analysis or systematic review by individually screening the title, abstract, or the full text when necessary.

Statistical Analysis
We presented outcomes as percentages with associated 95% CIs, calculated using the Wilson binomial proportion confidence interval method. Descriptive statistics, medians, and upper and lower quartiles were reported using functions available in Microsoft Excel.
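For readers who wish to recompute the intervals from the reported counts, the Wilson score interval has a closed form. The sketch below is a minimal illustration, assuming the standard Wilson formula without continuity correction and a two-sided 95% level; the function name `wilson_interval` is introduced here, and the authors' own calculation may have differed in detail.

```python
import math

def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion (no continuity correction).

    z defaults to the standard normal quantile for a two-sided 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Example using a count reported in this study: 3 of 127 publications
# provided a protocol availability statement.
low, high = wilson_interval(3, 127)
print(f"{3 / 127:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Unlike the simple normal approximation, the Wilson interval stays within 0 and 1 and behaves sensibly for proportions near zero, which matters for indicators observed in only a handful of publications.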

Results
Our search of the NLM catalog returned 100 dermatology journals. In all, 46 of these journals met the inclusion criteria and accounted for 46,615 publications from 2014 to 2018. Data were extracted from a random sample of 300 publications. A total of 280 were deemed eligible and accessible, whereas the remaining 20 were inaccessible (Figure 1).

Sample Characteristics
Our final analysis of 280 dermatology publications included 127 publications (45.4%) with empirical data from reproducible study designs and 153 publications (54.6%) that lacked empirical data or were inherently difficult to reproduce. The median 5-year journal impact factor was 2.719. Journal impact factors were inaccessible for 21 publications. Tables 2 and 3 provide additional characteristics for our sample of dermatology publications.

Eight Indicators of Reproducibility and Transparency
Among the 280 eligible publications, 201 (71.8%) were publicly available, whereas the remaining 79 (28.2%) were only available through a paywall. We classified the 20 publications for which full text PDF versions were unattainable as being paywall restricted. Thus, a total of 99 publications (of 300; 33.0%) were classified as being unavailable to the public.
Only 23 publications (out of 114, 20.2%) provided a statement indicating that additional materials were available. Only 3 publications (out of 127, 2.4%) provided a protocol availability statement. All 3 of these statements provided a valid link to a Web-based protocol. Almost all publications lacked data availability statements. A total of 14 publications (out of 127, 11.0%) included data availability statements; however, only 11 of these data statements were linked to supplemental data files. Of the 11 accessible supplemental data files, only 3 provided access to complete and unmodified raw datasets. In addition, only 1 publication (out of 127, 0.8%) provided an analysis script or code. Our analysis revealed only 3 publications (of 127, 2.4%) were prospectively registered.
A total of 233 publications (out of 280, 83.2%) provided a COI statement. Of these 280 publications, 30 (10.7%) indicated that 1 or more authors had a COI, and 203 (72.5%) declared the author(s) did not have a COI. The remaining 47 publications (out of 280, 16.8%) failed to provide a COI statement. Furthermore, 155 (out of 280, 55.4%) publications reported a funding source, whereas 125 (44.6%) publications did not receive external funding.
Finally, 23 publications (out of 114, 20.2%) included in our analysis were cited in a subsequent data synthesis or review paper (Table 4). No publication included in our analysis was cited in a replication study.

Principal Findings
Our findings suggest that the current climate of dermatology research does not encourage reproducible and transparent research practices. Few studies provided access to datasets, analysis scripts, or complete study protocols. These findings are congruent with previous reports that studies often fail to promote transparent and reproducible research practices [9], and they align with a study published in Nature that found that 90% of more than 1500 researchers agreed that biomedical science is facing a significant reproducibility crisis [1]. This environment of poor research practice is problematic for clinicians and researchers who might seek to validate or reproduce a study in its entirety. As scientists and clinicians continue to make medical advances, studies must be readily reproducible to ensure proper validation of results and to allow for sustained progression in clinical practice. In the following text, we describe 2 practices in the field of dermatology (study protocols and preregistration) that were commonly omitted by researchers. We follow with actionable recommendations for research funders, journals, and researchers that, if implemented successfully, might help improve the climate of reproducible research in published dermatology literature.
Most studies included in our sample did not provide additional materials or complete study protocols. Precisely outlining methodology is essential for study reproducibility [31], whether this information is provided within the publication or in supplementary materials [32]. The Journal of the American Academy of Dermatology's (JAAD) instructions to authors state, "submissions of research articles should be accompanied by a supplementary document that includes the protocol and statistical analysis plan; this should be labeled 'For editor/reviewer reference only' and is not for publication" (emphasis ours) [33]. The British Journal of Dermatology (BJD) author guidelines state, "The editorial team has found that providing the study protocol facilitates acceptance of the paper if it is available. Therefore, the BJD encourages submission of the protocol at the time of manuscript submission, with the protocol identified as a 'Supplementary file for review.' Submission of the trial protocol is also strongly encouraged for industry-sponsored trials" [34]. JAMA Dermatology guidance states, "authors of manuscripts reporting clinical trials must submit trial protocols (including the complete statistical analysis plan) along with their manuscripts… and that if the manuscript is accepted, the protocol and statistical analysis plan will be published as a supplement" [35]. The wide variability in guidance provided by these 3 prominent dermatology journals, which ranges from nonpublication of study protocols by JAAD to protocol publication upon article acceptance by JAMA Dermatology, suggests differing views toward implementing reproducible research practices within the field. BJD does not require protocol submission but simply encourages it. As journals are the final arbiters of studies that move on to publication, they have a high degree of influence on the climate of reproducibility and transparency in dermatology research. We highly recommend that dermatology journals adopt stronger requirements for submitting authors to promote greater transparency and reproducibility.
According to the Food and Drug Administration Amendments Act, established in 2007, all applicable RCTs must be registered before participant enrollment [22]. Although the number of preregistered RCTs has increased, other study designs have not shown as much improvement. Boccia et al found that only 1109 cancer observational studies were registered on ClinicalTrials.gov across an 11-year period [36]. In addition, systematic reviews have a preregistration platform, the International Prospective Register of Systematic Reviews (PROSPERO), which has increased in usage exponentially since its inception in 2011 [37]. These study designs are preregistered solely at the authors' discretion, with few journals or funders having concrete guidance on the subject. Of the 3 journals discussed above, only BJD mentions registering systematic reviews, stating that authors are required to preregister on PROSPERO [34]. Transparent research practices such as prospective registration can help mitigate unethical research practices by providing access to date-stamped protocol details and informing the public about current clinical trials being performed [38]. For example, P-hacking (using different statistical analyses until a nonsignificant finding is found to be significant) [39] and HARKing (forming study hypothesis after results have been calculated) [40] might be avoided if investigators disclose the expected statistical analyses that will be used throughout the study before its commencement. It should be noted that HARKing can be beneficial to the scientific process by generating important discoveries during post hoc analyses [41-43]. In addition, previous studies have shown that reviewers often encourage authors to add hypotheses post hoc as part of the peer review process [44]. However, the crossover into research misconduct occurs when authors contend that these post hoc hypotheses were part of the original study design, thereby potentially decreasing the confidence of statistically significant outcomes [45].

Future Recommendations
Changes to the landscape of dermatology research are warranted; however, the optimal framework for doing so is unclear. Here, we offer recommendations for research stakeholders (including funding agencies, journals, and researchers) that may help increase the quality of reproducible research practices in dermatology, if implemented successfully.
With respect to funding, some foundations and governmental agencies have established measures to promote reproducibility and transparency of research for which they provide funding. A nonexhaustive list of these funders includes the National Institutes of Health (NIH), the National Science Foundation, the Wellcome Trust, and the Bill and Melinda Gates Foundation. As one example, the Gates Foundation, which funds approximately 2000 to 2500 research articles per year totaling US $5 billion [46], has established an open access policy requiring that all research data and manuscripts resulting from its funds be promptly and broadly disseminated [47]. To further its goals for widespread dissemination, the foundation has launched its own open access journal, Gates Open Research. Currently, research funded by the foundation is not eligible for publication in some of the world's most renowned journals, such as Nature, Science, Proceedings of the National Academy of Sciences, and New England Journal of Medicine owing to these funding restrictions [48]. The NIH has established the Rigor and Reproducibility Initiative, embedding requirements that submitted grant applications outline strategies for more reproducible research [49]. Strategies such as these are the first steps toward adoption of more transparent and reproducible research practices.
For journals, we recommend consideration of adopting stricter standards on the disclosure of study materials, raw datasets, protocols, and analysis scripts. Journals should consider requiring that authors share all study materials on public repositories, such as Open Science Framework. With essential study materials publicly available, outcomes may be reproduced and validated with greater ease. A recent survey found that open access to study data increased the public's trust and confidence in research outcomes [50]. Depositing all study materials and data before publication may increase the public's faith and confidence in the literature published in journals with such requirements.
Finally, for researchers, we believe a need exists to train and equip principal investigators to adopt more reproducible and transparent research practices. This goal may be best accomplished through continuing education, academic conferences, webinars, and journal clubs. A need also exists to train and equip the next generation of scientists. Given the apprenticeship nature of many biomedical laboratories, principal investigators should take the lead in fostering such cultures within their laboratories and instilling such practices in mentees. Courses on open science are being developed across the country, many posted on the Open Science Framework [51]. The National Institute of General Medical Sciences has posted several Web-based training modules to increase the overall rigor and reproducibility of medical research [52]. As these courses continue to expand at universities and with funders, continued development and uptake of such training may help reverse the scant nature of reproducibility and transparency of research in the dermatology literature.

Strengths and Limitations
Our study has many strengths, but some limitations are present. Regarding strengths, all materials, protocols, analysis plans, and raw data from our study are publicly available on Open Science Framework. In addition, we implemented numerous measures to ensure the reliability of study outcomes by (1) using a blinded, double data extraction technique, the gold standard for meta-research practices [53], and (2) providing thorough training of each investigator to ensure reliability of results between investigators. Regarding limitations, data extraction was limited to the content of the full-text PDFs and available supplemental materials for each publication. Additional materials may be attainable by contacting the corresponding author. Furthermore, this study focused specifically on publications in dermatology journals. Thus, the results from this study may not be generalizable to other subjects or years of publication. For the aforementioned reasons, our findings should be interpreted as a lower-bound estimate of the reproducibility of publications in dermatology journals.
In conclusion, the rate of disclosure of study materials, data, protocols, and analysis scripts of sampled dermatology publications is unacceptably low. Without implementing and adhering to more robust reporting standards and open science practices, reproducibility-related factors of dermatologic research may remain poor.