New and Noteworthy

Arthritis and Autoimmune and Related Diseases Knowledge (ARK) Portal

The Arthritis and Autoimmune and Related Diseases Knowledge (ARK) Portal is a virtual resource that accumulates, organizes, and links core datasets generated by research teams focused on arthritis, autoimmune, skin, and related diseases.


Storing, managing, standardizing, and publishing the vast amounts of data produced by biomedical research is critical to the mission of NIAMS and NIH. Research on arthritis, autoimmune, musculoskeletal, and skin diseases generates large and varied datasets, which offer broad scope for data science and its applications.

This webpage helps the NIAMS investigator community identify data science and data sharing resources. Its contents will be revised periodically with timely announcements and links to relevant online resources.

General Policy, Guidance, and Strategy

NIAMS’ data science-related strategy, policy, and guidance are closely aligned with the core NIH data science guidance. See the NIH Strategic Plan for Data Science, which provides a comprehensive strategy with goals as well as implementation and evaluation guidance.

The FAIR Guiding Principles for scientific data management and stewardship define four key properties of good scientific data management: Findability, Accessibility, Interoperability, and Reusability. Also see the TRUST Principles for digital repositories.

NIH Policy for Data Management and Sharing

The NIH Policy for Data Management and Sharing (DMS Policy) was issued to promote the management and sharing of scientific data generated from NIH-funded or conducted research. The NIH DMS Policy applies to applications and proposals submitted to NIH on or after January 25, 2023.

Detailed information is available on the NIH Scientific Data Sharing website.

Other NIH resources relevant to the DMS Policy can be found on the following sites:

Data Management and Sharing Plan Development

Under the DMS Policy, NIH-sponsored investigators will:

  • Prospectively plan for the management and sharing of scientific data
  • Submit a DMS plan with their grant applications
  • Comply with the approved plan

Useful resources:

For illustrative purposes only:

Data Sharing and Access

Data repository considerations:

  • To maximize sharing of scientific data, investigators should use the established data repository most appropriate for the discipline and the type of data their research project generates.
  • While NIH does not endorse or require sharing data in any particular repository, some initiatives and funding opportunities may have their own requirements.
  • Although NIH encourages sharing scientific data, healthcare data often include electronic health records and other identifiable patient information that require additional protections.

Useful resources:

The Arthritis and Autoimmune and Related Diseases Knowledge (ARK) Portal

The ARK Portal is a virtual resource that accumulates, organizes, and links core datasets generated by research teams focused on arthritis, autoimmune, skin, and related diseases. Directed by NIAMS and developed and maintained by Sage Bionetworks, the ARK Portal will house a broad and diverse portfolio of datasets, including those from the Accelerating Medicines Partnership® Rheumatoid Arthritis and Systemic Lupus Erythematosus (AMP® RA/SLE). Access to the datasets on the portal is free to the public. Some datasets will require users to register and agree to a data-use agreement.

Other Examples of Repositories and Data Management Systems

NIH National Library of Medicine – Open Domain-Specific Data Sharing Repositories is a listing of NIH-supported domain-specific data repositories that make data accessible for reuse and are open for both submitting and accessing data.

NIH Office of Science Policy – Data Repositories Established as NIH Trusted Partners.

The Common Fund Data Ecosystem (CFDE) is developing an online portal that will allow researchers to access and work across datasets from multiple Common Fund (CF) programs within a digital cloud environment.

NHLBI’s BioData Catalyst is a shared virtual space where scientists can access and work with the digital objects of biomedical research, such as data and software.

The Musculoskeletal Knowledge Portal enables browsing, searching, and analysis of human genetic and genomic information linked to musculoskeletal traits and diseases.

The Immunology Database and Analysis Portal (ImmPort) provides an open access platform for research data sharing in support of the NIH mission to share data with the public.

The HEAL Data Ecosystem is part of the Helping to End Addiction Long-term® Initiative, an aggressive trans-agency effort to speed scientific solutions to stem the evolving national opioid public health crisis.

Storage/STRIDES – The NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative allows NIH to explore the use of cloud environments to streamline NIH data use by partnering with commercial providers. To enroll in this initiative, visit the NIH STRIDES enrollment page.

Data Standardization

CDE – Common Data Elements: A Common Data Element (CDE) is a standardized, precisely defined question, paired with a set of allowable responses, used systematically across different sites, studies, or clinical trials to ensure consistent data collection. 
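
As a rough illustration of the definition above, a CDE can be modeled as a question bundled with its set of allowable responses, so that answers collected at different sites can be checked against the same list. The sketch below is a minimal example under stated assumptions: the class, the function, the example question, and its permissible values are all hypothetical and are not taken from any NIH CDE repository.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class CommonDataElement:
    """A standardized question paired with its set of allowable responses."""
    name: str                            # hypothetical identifier, not an official CDE ID
    question: str                        # the precisely defined question text
    permissible_values: FrozenSet[str]   # the allowable responses

    def is_valid(self, response: str) -> bool:
        # A response is valid only if it exactly matches a permissible value.
        return response in self.permissible_values


# Hypothetical CDE invented for illustration only.
MORNING_STIFFNESS = CommonDataElement(
    name="morning_stiffness_duration",
    question="How long does your morning joint stiffness usually last?",
    permissible_values=frozenset({"none", "<30 min", "30-60 min", ">60 min"}),
)


def find_invalid_responses(
    cde: CommonDataElement, responses: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return (site, response) pairs whose response is not permissible."""
    return [(site, r) for site, r in responses.items() if not cde.is_valid(r)]


# Responses to the same CDE collected at three hypothetical sites.
site_responses = {"site_a": "<30 min", "site_b": "about an hour", "site_c": "none"}
invalid = find_invalid_responses(MORNING_STIFFNESS, site_responses)
print(invalid)  # [('site_b', 'about an hour')]
```

Because every site answers the same question from the same fixed list, free-text variants such as "about an hour" are flagged at collection time instead of surfacing later when datasets are combined.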

Research and Analysis Tools

NIAMS’ Biodata Mining and Discovery Section is developing data science and bioinformatics approaches in support of NIAMS goals.

NIH Genomic Data Sharing (GDS) Policy

NIH expects broad sharing of the genomic data generated from the research that it funds. For more detailed information, see the NIH Genomic Data Sharing Policy webpage. For examples of genomic data that fall under the NIH GDS Policy, see the NIH About Genomic Data Sharing webpage.

NIAMS extends these genomic data sharing expectations to the following cases, regardless of sample size:

  • Any type of single-cell sequencing technique (e.g., single-cell RNA sequencing);
  • Any type of microbiome sequencing;
  • Rare diseases (a good guide to what is considered a rare disease can be found on the NCATS Genetic and Rare Diseases Information Center webpage);
  • Understudied human populations;
  • Any dataset that might be of special interest or importance to the broader research community and has the potential to increase our knowledge of how to “enhance health, lengthen life, and reduce illness and disability.” NIAMS will evaluate such datasets on a case-by-case basis.
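
For investigators who want a quick self-check while planning, the first four cases above can be sketched as a simple screening function. This is only an illustrative sketch, not an official NIAMS tool: the field names and labels are hypothetical, and the fifth case (case-by-case evaluation by NIAMS) is deliberately left out because it cannot be reduced to a yes/no flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetDescription:
    """Self-reported properties of a planned dataset (hypothetical field names)."""
    uses_single_cell_sequencing: bool = False
    uses_microbiome_sequencing: bool = False
    studies_rare_disease: bool = False
    studies_understudied_population: bool = False


def niams_gds_cases_matched(ds: DatasetDescription) -> List[str]:
    """List which of the four flag-like cases a dataset matches.

    Sample size is intentionally not an input: the cases apply regardless
    of sample size. An empty result does not mean sharing is not expected;
    the general NIH GDS Policy and NIAMS case-by-case review still apply.
    """
    cases = [
        (ds.uses_single_cell_sequencing, "single-cell sequencing"),
        (ds.uses_microbiome_sequencing, "microbiome sequencing"),
        (ds.studies_rare_disease, "rare disease"),
        (ds.studies_understudied_population, "understudied human population"),
    ]
    return [label for matched, label in cases if matched]


# Example: a small single-cell RNA sequencing study of a rare disease.
planned = DatasetDescription(uses_single_cell_sequencing=True, studies_rare_disease=True)
matched = niams_gds_cases_matched(planned)
print(matched)  # ['single-cell sequencing', 'rare disease']
```

A non-empty result suggests the dataset likely falls under the extended expectations; final determinations rest with NIAMS program staff.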

Useful Resources:

For questions about Institutional Certifications, GDS Plans, or dbGaP registrations, please contact the NIAMS Genomic Data Sharing mailbox at:
