F3
FAIR Guiding Principle F3:
metadata clearly and explicitly include the identifier of the data it describes
Interpretation of F3
Principle F3 implies that the resource (data, metadata, software or any other) has metadata that is separate from the resource it describes but is nonetheless persistently linked to it via a GUPRI (Globally Unique, Persistent and Resolvable Identifier), linking the metadata explicitly to the resource and vice versa, as described in the FDOF specifications. Here we explicitly emphasize that implementation choice as crucial for a FAIR-by-design approach. The F3 principle states that any description of a digital resource must clearly and explicitly contain the identifier of the resource being described. For instance, the description of a computational workflow should explicitly contain the identifier of that workflow in a manner that is unambiguous (well qualified, see Principle I3). This is especially important where the resource and its metadata are stored independently but are nonetheless persistently linked, which the GO FAIR Foundation assumes to be the case.
The purpose of this principle is twofold. First, it is perhaps trivial to say that a descriptor should explicitly state which resource it describes. There is, however, a second, less obvious reason. Many digital objects (such as workflows, as mentioned above) have well-defined structures that may disallow the addition of new fields, including fields that could point to the metadata about that resource. Therefore, the only consistent way for both humans and machines to discover the metadata of a resource is to search for the identifier of that resource. By requiring that a metadata descriptor contain the identifier of the thing being described, that identifier can be used as the search term to discover its metadata record. However, it should be clear that in many cases the identifier itself is not a regular search term. In fact, the GO FAIR Foundation considers it good practice to avoid semantic meaning in GUPRIs, because such meaning is prone to change.
That is why rich metadata are already covered by Principle F2 of the guiding principles. When FAIR principle F3 states that the identifier of the object should be explicitly and clearly included in the object's metadata, our interpretation assumes that "explicit" refers to the mere presence of the resource's identifier in the content of the metadata record, while "clear" refers to having this identifier directly and unambiguously related to the metadata record by means of a known predicate. In previous experiments examining common usage, we identified over 20 different ways in which stakeholders declare which resource is being described by a given metadata record. This variety makes it very hard for humans and machines, given a metadata record, to identify which object the record describes.
This interpretation of F3 is based on 'FAIR Principles: Interpretations and Implementation Considerations'. Jacobsen et al., Data Intelligence 2020; 2 (1-2): 10–29. doi: https://doi.org/10.1162/dint_r_00024
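The distinction between "explicit" and "clear" described above, and the use of a resource identifier as the search term for its metadata, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the flat dictionary representation of a metadata record, the choice of `foaf:primaryTopic` and `schema:mainEntity` as the "known" predicates, the `example.org` identifier, and all function names. It is not a GO FAIR Foundation or FDOF implementation.

```python
from typing import Dict, List, Optional

# Assumption: a metadata record is simplified to a flat mapping of
# predicate IRI -> value. Real records are usually richer (e.g. RDF graphs).
Record = Dict[str, str]

# Assumption: the predicates a community has agreed on for stating
# "this record describes that resource". The selection is illustrative.
KNOWN_PREDICATES = {
    "http://xmlns.com/foaf/0.1/primaryTopic",
    "https://schema.org/mainEntity",
}

# Hypothetical opaque identifier (no semantic meaning in the identifier itself).
WORKFLOW_ID = "https://example.org/id/7f3a9c"


def is_explicit(record: Record, resource_id: str) -> bool:
    """'Explicit': the identifier is present somewhere in the record."""
    return resource_id in record.values()


def described_resource(record: Record) -> Optional[str]:
    """Return the identifier linked through a known predicate, if any."""
    for predicate in sorted(KNOWN_PREDICATES):
        if predicate in record:
            return record[predicate]
    return None


def is_clear(record: Record, resource_id: str) -> bool:
    """'Clear': the identifier is linked via a known predicate."""
    return described_resource(record) == resource_id


def find_metadata(records: List[Record], resource_id: str) -> List[Record]:
    """Use the resource identifier as the search term for its metadata."""
    return [r for r in records if is_clear(r, resource_id)]


# Explicit and clear: the identifier is attached with a known predicate.
clear_record: Record = {
    "http://purl.org/dc/terms/title": "Example workflow description",
    "https://schema.org/mainEntity": WORKFLOW_ID,
}

# Explicit but not clear: the identifier appears, but only through a
# generic predicate that does not say what the record describes.
vague_record: Record = {
    "http://purl.org/dc/terms/title": "Another description",
    "http://www.w3.org/2000/01/rdf-schema#seeAlso": WORKFLOW_ID,
}

matches = find_metadata([clear_record, vague_record], WORKFLOW_ID)
```

In this sketch both records mention the workflow identifier, but only the first can be reliably matched to the workflow by a machine, because the lookup relies on an agreed predicate rather than on guessing among many possible conventions.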
References
- Machine-Centric Science, podcast on F3 by Donny Winston: https://open.spotify.com/episode/0WeoKjy8sUN6acOf4vxYyG