23 May 2023
by Soulaine Theocharides – Software developer (Naturalis Biodiversity Center)
In the realm of biodiversity and geoscience research, the digitization of natural history collections has revolutionized the way we study and understand our planet’s rich biological and geological heritage. However, as the volume of digitized collections and other valuable data resources continues to grow, ensuring their long-term accessibility and interoperability becomes increasingly challenging. This is where Persistent Identifiers (PIDs) come into play, providing a robust and reliable solution for managing and referencing these digital assets. FAIR Digital Object Profiles (FDO Profiles) further standardize and structure PID records, facilitating both interoperability between FAIR Digital Objects (FDOs) and sophisticated machine actionability.
In this blog post, we will explore the significance of PIDs in biodiversity research, highlight the role of FDO Profiles in shaping the PID architecture of DiSSCo, and outline the design process of our first FDO Profile, now available for public feedback.
The Importance of Persistent Identifiers
PIDs assign globally unique identifiers to objects, providing a stable reference even if the objects are relocated or undergo changes. By ensuring persistence, PIDs eliminate broken links and guarantee the accessibility of digital resources, regardless of any future changes in storage infrastructure or hosting platforms.
When a specimen record is ingested into DiSSCo, it is assigned a PID. Through robust PID infrastructure, researchers, machines, and the wider community will be able to easily locate and cite specific Digital Specimens or related resources, enhancing transparency and reproducibility in research.
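The indirection that keeps a PID stable when an object moves can be pictured with a small sketch. It is illustrative only: the `PidRegistry` class, the example identifier, and the example.org URLs are all hypothetical and do not describe DiSSCo's actual PID infrastructure.

```python
class PidRegistry:
    """Toy in-memory registry: maps a PID to the object's current location."""

    def __init__(self) -> None:
        self._locations: dict[str, str] = {}

    def register(self, pid: str, location: str) -> None:
        # A PID is assigned once; registering it twice would break uniqueness.
        if pid in self._locations:
            raise ValueError(f"PID already registered: {pid}")
        self._locations[pid] = location

    def relocate(self, pid: str, new_location: str) -> None:
        # Only the stored location changes; the PID itself never does.
        if pid not in self._locations:
            raise KeyError(f"Unknown PID: {pid}")
        self._locations[pid] = new_location

    def resolve(self, pid: str) -> str:
        # Anyone citing the PID is pointed at the current location.
        return self._locations[pid]


registry = PidRegistry()
pid = "example-prefix/ABC-123"  # hypothetical identifier
registry.register(pid, "https://old-host.example.org/specimen/1")
registry.relocate(pid, "https://new-host.example.org/specimen/1")
current_location = registry.resolve(pid)
```

A citation made before the move still works afterwards, because it refers to the PID rather than to either hosting URL.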
What is an FDO Profile?
In addition to the location of the referenced object, FDO Records contain structured metadata that describes the attributes and characteristics of the resource associated with the PID. This metadata may include information such as title, creator, date, identifiers for related objects, access rights, and more, and it allows machines to make decisions about the Digital Object without needing to resolve the PID. The FDO Record is similar to the PID record idea proposed by RDA [1], but the term FDO Record is used to “highlight that there could be possible [implementations] of FDO without explicitly relying on the attributes stored in a PID record” [2].
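As a rough illustration of a machine acting on record metadata alone, the sketch below decides whether an object is worth retrieving without ever fetching it. The attribute names (`digitalObjectType`, `pidStatus`, `accessRights`), their values, and the `should_fetch` helper are assumptions made for this example, not the attributes defined in the DiSSCo profile.

```python
# Hypothetical FDO Record: attribute names and values are illustrative only.
record = {
    "pid": "example-prefix/ABC-123",
    "digitalObjectType": "DigitalSpecimen",
    "issueDate": "2023-05-23",
    "pidStatus": "ACTIVE",
    "accessRights": "open",
}


def should_fetch(record: dict, wanted_type: str) -> bool:
    """Decide from the record's metadata alone whether to retrieve the object."""
    return (
        record.get("digitalObjectType") == wanted_type
        and record.get("pidStatus") == "ACTIVE"
        and record.get("accessRights") == "open"
    )


fetch_specimen = should_fetch(record, "DigitalSpecimen")
fetch_media = should_fetch(record, "MediaObject")
```

Here the machine would retrieve the object when looking for Digital Specimens and skip it when looking for media objects, based only on what the record states.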
Different Types of Digital Objects have different FDO Record metadata requirements, so the actions a machine can take on a PID record are defined by the object Type. FDO Profiles standardize which FDO Record attributes should be associated with each Type of object. Within DiSSCo alone, we expect to assign PIDs to a diverse array of object Types, including media objects, annotations, and of course, Digital Specimens. Each of these object Types will have its own FDO Profile; currently, the FDO Profile for Digital Specimens is available for public feedback.
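One way to picture a profile is as a list of attributes that a record of a given Type is expected to carry. The sketch below is a minimal illustration under that assumption; the `PROFILES` table, the attribute names, and the `missing_attributes` helper are hypothetical and are not taken from the RFC.

```python
# Hypothetical profiles: each Type maps to the attributes its record must carry.
COMMON_ATTRIBUTES = {"pid", "digitalObjectType", "issueDate", "pidStatus"}

PROFILES = {
    "DigitalSpecimen": COMMON_ATTRIBUTES | {"specimenHost", "materialSampleType"},
    "MediaObject": COMMON_ATTRIBUTES | {"mediaFormat"},  # invented attribute
}


def missing_attributes(record: dict) -> list[str]:
    """Return the profile attributes that the record lacks, sorted by name."""
    object_type = record.get("digitalObjectType")
    if object_type not in PROFILES:
        raise ValueError(f"No profile for type: {object_type!r}")
    return sorted(PROFILES[object_type] - set(record))


incomplete_record = {
    "pid": "example-prefix/ABC-123",
    "digitalObjectType": "DigitalSpecimen",
    "issueDate": "2023-05-23",
    "pidStatus": "ACTIVE",
    "specimenHost": "Example Museum",
}
gaps = missing_attributes(incomplete_record)
```

Because the check is driven by the record's Type, the same function serves every profile added to the table.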
FDO Profiles and DiSSCo
Recognizing the need for consistency and harmonization, we built our FDO profiles around elements that can be reused for various Digital Object types within DiSSCo. Attributes like issue date or PID status are applicable not only to Digital Specimens but also to media objects, annotations, and other resource types. This consistency promotes interoperability and simplifies metadata management efforts.
We then added attributes specific to Digital Specimens, such as specimen host and material sample type. This approach strikes a balance between representing metadata in a standardized way across biodiversity research resources and accommodating the unique characteristics of Digital Specimens.
Sample FDO Record. 100-series indexes (blue) are reserved for administration; 01-099 indexes (orange) are applicable to all objects within DiSSCo; and 200-series indexes (teal) are specific to Digital Specimens.
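The index ranges in the caption can be expressed as a small lookup. This is a sketch only: it assumes that "100-series" means indexes 100 to 199 and "200-series" means 200 to 299, and the `index_group` function and its group labels are hypothetical.

```python
def index_group(index: int) -> str:
    """Classify an FDO Record attribute index by the range it falls in."""
    if 1 <= index <= 99:
        return "common"  # applicable to all objects within DiSSCo
    if 100 <= index <= 199:  # assumed bounds of the 100-series
        return "administration"
    if 200 <= index <= 299:  # assumed bounds of the 200-series
        return "digital-specimen"
    raise ValueError(f"Index outside the ranges described here: {index}")


groups = [index_group(i) for i in (1, 99, 100, 200)]
```

Other object Types would presumably need their own ranges; the source does not say which, so the function rejects anything outside the three ranges named in the caption.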
Next Steps
We are actively working on designing additional FDO profiles for different Digital Object Types within DiSSCo. These profiles will further enhance the interoperability and standardization of biodiversity research resources, ensuring that the benefits of PIDs and FDO profiles extend beyond Digital Specimens to encompass various data types and formats.
We also welcome feedback from the community on the current FDO Profile for Digital Specimens via an RFC Document, available here. The RFC (Request for Comments) process facilitates conversation between DiSSCo and community members, provides an opportunity to receive feedback on the DiSSCo development process, and defines how decision making works. More information on the RFC process itself can be found here.
Conclusion
Persistent Identifiers and FDO profiles are crucial components of the DiSSCo infrastructure. By providing stable and globally unique identifiers, PIDs ensure the accessibility, citation, and long-term preservation of Digital Objects. Meanwhile, FDO profiles offer a means to describe the attributes of these PID records in a standardized and interoperable manner. As digitization efforts expand, leveraging PIDs and FDO Profiles will continue to play a pivotal role in unlocking the vast potential of biodiversity research, enabling collaboration, discovery, and machine actionability.
[1] RDA PID KI. 2019. RDA Recommendation on PID Kernel Information. Research Data Alliance. [Online]. DOI: https://doi.org/10.15497/RDA00031
[2] S. Islam, “FAIR digital objects, persistent identifiers and machine actionability,” FAIR Connect. [Online]. DOI: https://doi.org/10.3233/FC-230001