Storing, managing, standardizing, and publishing the vast amounts of data produced by biomedical research is critical to the mission of NIAMS and NIH. Research on arthritis, autoimmune, musculoskeletal, and skin diseases generates large and varied datasets that offer broad scope for data science and its applications.
This webpage helps the NIAMS investigator community identify data science and data sharing resources. Its contents are revised periodically with timely announcements and links to relevant online resources.
Table of Contents:
- General Policy, Guidance, and Strategy
- NIH Policy for Data Management and Sharing
- Data Management and Sharing Plan Development
- Data Sharing and Access
- NIH Genomic Data Sharing (GDS) Policy
- Additional Resources / References
- External References
General Policy, Guidance, and Strategy
NIAMS’ data science strategy, policy, and guidance are closely aligned with core NIH data science guidance. See the NIH Strategic Plan for Data Science, which provides a comprehensive strategy with goals as well as implementation and evaluation guidance.
The FAIR Guiding Principles for scientific data management and stewardship define four key properties of good scientific data management: Findability, Accessibility, Interoperability, and Reusability. Also see the TRUST Principles for digital repositories.
NIH Policy for Data Management and Sharing
The NIH Policy for Data Management and Sharing (DMS Policy) was issued to promote the management and sharing of scientific data generated from NIH-funded or conducted research. The NIH DMS Policy applies to applications and proposals submitted to NIH on or after January 25, 2023.
Detailed information is available on the NIH Scientific Data Sharing website.
Other NIH resources relevant to the DMS Policy can be found on the following sites:
- 2023 DMS Policy FAQ
- Getting Ready for NIH’s DMS Policy
- NOT-OD-21-013: Final NIH Policy for Data Management and Sharing
Data Management and Sharing Plan Development
Under the DMS Policy, NIH-sponsored investigators will:
- Prospectively plan for managing and sharing scientific data
- Submit a DMS plan with their grant applications
- Comply with the approved plan
- Detailed Guidance and Sample Plans
- Supplemental DRAFT Guidance: Elements of an NIH Data Management and Sharing Plan
For illustrative purposes only:
- A template used by the NIH Intramural Research Program (an example based on The DMP Tool, a service of the University of California).
- Tips for Writing a Data Management and Sharing Plan (an example from NICHD).
Data Sharing and Access
Data repository considerations:
- To maximize sharing of scientific data, investigators should use an established data repository that is most appropriate for the discipline and the type of data generated by their research project.
- While NIH does not endorse or require sharing data in any particular repository, some initiatives and funding opportunities may have their own requirements.
- Although NIH encourages sharing scientific data, healthcare data often includes electronic health records and other identifiable information about patients that requires additional protections.
- NIH Guidance for Selecting a Data Repository
- NIH-supported Repositories for Sharing Scientific Data
- NIH Generalist Repository List
- Informed Consent for Secondary Research and Biospecimens: Points to Consider and Sample Language for Future Use and/or Sharing developed by the NIH Office of Science Policy
- Consent Templates and Guidance that Address Storage, Sharing, and Future Research Using Your Specimens and Data developed by the NIH Office of Intramural Research, Office of Human Subjects Research Protections
- Best Practices for Sharing Research Software Frequently Asked Questions developed by the NIH Office of Data Science Strategy
The Arthritis and Autoimmune and Related Diseases Knowledge (ARK) Portal
The ARK Portal is a virtual resource that accumulates, organizes, and links core datasets generated by research teams focused on arthritis, autoimmune, skin, and related diseases. Directed by NIAMS and developed and maintained by Sage Bionetworks, the ARK Portal will house a broad and diverse portfolio of datasets, including those from the Accelerating Medicines Partnership® Rheumatoid Arthritis and Systemic Lupus Erythematosus (AMP® RA/SLE) program. Access to the datasets on the portal is free to the public. Some datasets will require users to register and agree to a data-use agreement.
Other Examples of Repositories and Data Management Systems
NIH Library of Medicine – Open Domain-Specific Data Sharing Repositories is a listing of NIH-supported domain-specific data repositories that make data accessible for reuse and are open for both submitting and accessing data.
The Common Fund Data Ecosystem (CFDE) is developing an online portal that will allow researchers to access and work across multiple Common Fund (CF) program data sets within a digital cloud environment.
NHLBI’s BioData Catalyst is a shared virtual space where scientists can access and work with the digital objects of biomedical research, such as data and software.
The Musculoskeletal Knowledge Portal enables browsing, searching, and analysis of human genetic and genomic information linked to musculoskeletal traits and diseases.
The Immunology Database and Analysis Portal (ImmPort) provides an open access platform for research data sharing in support of the NIH mission to share data with the public.
The HEAL Data Ecosystem is part of the Helping to End Addiction Long-term® Initiative, an aggressive trans-agency effort to speed scientific solutions to stem the evolving national opioid public health crisis.
Storage/STRIDES – The NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative allows NIH to explore the use of cloud environments to streamline NIH data use by partnering with commercial providers. To enroll in this initiative, visit the NIH STRIDES enrollment page.
CDE – Common Data Elements: A Common Data Element (CDE) is a standardized, precisely defined question, paired with a set of allowable responses, used systematically across different sites, studies, or clinical trials to ensure consistent data collection.
- Visit the NIH Common Data Elements (CDE) Repository to view NIH-endorsed collections of CDEs that meet established criteria.
- Visit NHLBI-CONNECTS to view COVID-19 Therapeutic Trial Common Data Elements.
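As a rough illustration of the CDE definition above, the following Python sketch models a CDE as one precisely worded question paired with a fixed set of allowable responses, and flags collected answers that fall outside that set. The class name, identifier, question wording, and response values are all hypothetical and are not taken from the NIH CDE Repository; real CDEs carry far more metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    """Hypothetical, simplified model of a CDE: one question plus its permitted answers."""
    identifier: str
    question: str
    allowed_responses: frozenset

    def is_valid(self, response: str) -> bool:
        # Exact, case-sensitive match against the permitted values.
        return response in self.allowed_responses


# Illustrative element only; the identifier, wording, and values are made up.
smoking_status = CommonDataElement(
    identifier="EXAMPLE-0001",
    question="What is the participant's current smoking status?",
    allowed_responses=frozenset({"Never", "Former", "Current", "Unknown"}),
)


def invalid_responses(cde, responses):
    """Return the responses that fall outside the CDE's allowed set, in input order."""
    return [r for r in responses if not cde.is_valid(r)]


# Answers collected at one site; "former" differs in case from the permitted "Former".
site_a = ["Never", "Current", "former"]
print(invalid_responses(smoking_status, site_a))
```

Checking every site's answers against the same permitted values is what makes data from different studies directly comparable; the strict, case-sensitive match here is a design choice of this sketch, not a stated NIH requirement.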
Research and Analysis Tools
NIAMS’ Biodata Mining and Discovery Section is developing data science and bioinformatics approaches in support of NIAMS goals.
NIH Genomic Data Sharing (GDS) Policy
NIH expects broad sharing of the genomic data generated by the research it funds. For more detailed information, see the NIH Genomic Data Sharing Policy webpage. For examples of genomic data that fall under the NIH GDS Policy, see the NIH About Genomic Data Sharing webpage.
NIAMS extends these genomic data sharing expectations to the following cases, regardless of sample size:
- Use of any type of single cell sequencing technique (e.g., single cell RNA sequencing);
- Any type of microbiome sequencing;
- Rare diseases (a good guide to what is considered a rare disease can be found on the NCATS Genetic and Rare Diseases Information Center webpage);
- Understudied human populations;
- Any dataset, evaluated by NIAMS on a case-by-case basis, that might be of special interest or importance to the broader research community and has the potential to increase our knowledge of how to “enhance health, lengthen life, and reduce illness and disability.”
- Is my project subject to the NIH GDS Policy? View NIAMS' decision tree flowchart (NIAMS specific)
- Developing Genomic Data Sharing Plans
- Completing an Institutional Certification Form
- Submitting Genomic Data
- Genomic Data Submission and Release Expectations
- NIH Guidance on Consent for Future Research Use and Broad Sharing of Human Genomic and Phenotypic Data Subject to the NIH Genomic Data Sharing Policy
For questions about Institutional Certifications, GDS Plans, or dbGaP registrations, please contact the NIAMS Genomic Data Sharing mailbox at: NIAMS_GDS@mail.nih.gov
Additional Resources / References
- NIH Scientific Data Sharing
- Data Sharing Policy Trainings and Events
- Final NIH Policy for Data Management and Sharing (Notice No. NOT-OD-21-013)
- NIH Office of Data Science Strategy
- National Library of Medicine's Training on Biomedical Informatics, Data Science, and Data Management
- National Institute of Environmental Health Sciences’ Resources for Scientists provides a list of databases and galleries as resources to scientists.
- Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD)
- NICHD Datasets & Research Resources includes tissue banks and repositories, datasets and databases, model organisms, genome and DNA sequences, and resource libraries.
- NICHD Data and Specimen Hub (DASH) is a centralized resource that allows researchers to share and access de-identified data from studies funded by NICHD. DASH also serves as a portal for requesting biospecimens from selected DASH studies.
- National Cancer Institute (NCI)
- NCI Bioinformatics, Big Data, and Cancer uses advanced computing, mathematics, and different technological platforms to physically store, manage, analyze, and understand data.
- The NCI Cancer Research Data Commons (CRDC) is a data science infrastructure that connects cancer research data collections with analytical tools.
- National Institute of General Medical Sciences (NIGMS)
- NIH Research and Training Funding Opportunities
- NIAMS Extramural Program, Grants and Funding
External References
- FAIR Guiding Principles
- The TRUST Principles for Digital Repositories
- Webinar: Implementation of New NIH Data Management and Sharing Policy