Issue 496: Types for p2 has type
Posted by Martin on 25/6/2020
I'd like to raise a new issue: how to make recommendations for the types we recommend in scope notes. Should we create a terminology file in SKOS, which could be incorporated into whatever vocabulary is in use?
At the 48th CIDOC CRM and 41st FRBR CRM SIG meeting (virtual), the SIG reviewed MD's HW and raised the question of making recommendations that specify the minimal vocabulary to be used in each case.
Finally, the SIG decided to assign HW to SIG members: interact with Linked Art and other communities using the CRM, in order to formulate the minimal requirements for restricting the appropriate types. HW assigned to GB: communicate with RS / Linked Art and point them to what has been decided so far. TV & MD will contribute too.
Post by Rob (8 June 2021)
I think my part of the homework for #496 is to describe the Linked Art requirements, process and decisions.
First - Linked Art is conceived of as an application profile for art-related descriptions that uses CRM as its core ontology. It selects as minimal as possible a subset of the classes and relationships needed to fulfil the use cases. It draws mostly from CRM base, with a few select terms from sci and dig. There is also a Linked Art extension that defines a small number of terms that aren't available in any other extension (but typically align with the direction that soc is taking). You can see Linked Art's documentation here: https://linked.art/
We also need to select vocabulary to use with P2_has_type and rely heavily on the Getty AAT thesaurus. We divide the vocabulary into three conditional, disjoint buckets:
- Terms that MUST be used for the description to be able to be understood.
- Terms that SHOULD be used for the description to be easily interoperable across institutions
- Terms that MAY be used, as assistance to the community rather than requiring them to look them up independently
We try to keep the MUST bucket as small as possible, and based on cross-domain and universal use cases. Examples include:
- Primary Name (A classification on an appellation that it is the "main" name of the entity) vs Display Name (classification on appellation that it is the human readable representation of an entity like a TimeSpan)
- Activity Classifications: We need to distinguish Provenance, Publishing, Promise and Exhibitions as having particular recommended structures.
- Meta types: We don't require any particular types for even things like Painting, but we do require types on those types so we know what sort of thing they are. For example, there is an "object type" which is required on the object's type. Meta types include object type, nationality, culture, gender, statement type, color, shape. Example:
E22 (the painting) p2_has_type E55 (painting) . <-- painting is recommended
E55 (painting) p2_has_type <aat:300435443> (type of work) . <-- type of work is required
Now we can slot anything in to the "painting" slot and know that it's the type of the work rather than some other classification... like shape or color.
Thus we also require aat:300191751 for permanent transfers of custody or location, and aat:300221270 for temporary transfers of custody or location, per the recent decision to not add has_permanent_custodian to manage it at the property level.
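The meta type pattern above can be sketched in a few lines of Python. This is only an illustration, not Linked Art's actual serialization: triples are plain tuples, the `ex:` identifiers and the helper `types_classified_as` are hypothetical, and only aat:300435443 ("type of work") comes from the post itself.

```python
# Triples are modelled as plain (subject, predicate, object) tuples.
AAT = "http://vocab.getty.edu/aat/"
P2_HAS_TYPE = "P2_has_type"
TYPE_OF_WORK = AAT + "300435443"  # "type of work", as cited in the post

# Hypothetical identifiers, invented for illustration only.
PAINTING_OBJECT = "ex:object/1"
PAINTING_TYPE = "ex:type/painting"
SHAPE_TYPE = "ex:type/rectangular"
SHAPE_META_TYPE = "ex:meta/shape"

TRIPLES = {
    (PAINTING_OBJECT, P2_HAS_TYPE, PAINTING_TYPE),  # recommended type
    (PAINTING_OBJECT, P2_HAS_TYPE, SHAPE_TYPE),     # some other classification
    (PAINTING_TYPE, P2_HAS_TYPE, TYPE_OF_WORK),     # required meta type
    (SHAPE_TYPE, P2_HAS_TYPE, SHAPE_META_TYPE),
}


def types_classified_as(triples, subject, meta_type):
    """Return the subject's types that themselves carry the given meta type."""
    direct_types = sorted(
        o for s, p, o in triples if s == subject and p == P2_HAS_TYPE
    )
    return [t for t in direct_types if (t, P2_HAS_TYPE, meta_type) in triples]


print(types_classified_as(TRIPLES, PAINTING_OBJECT, TYPE_OF_WORK))
# prints ['ex:type/painting']
```

A consumer does not need to know the term "painting" in advance: it only has to recognise the required meta type to tell the type of the work apart from its shape or color.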
The SHOULD bucket is on the order of 100 terms for common requirements: terms whose absence would reduce the ability to easily compare across institutions' datasets, rather than terms whose absence would make the data almost useless. These are things like the common types of statement about an entity, and the common types of Place, Group, or Object. Also the types of comparable structures like Dimension, Appellation and Identifier. Then the common Measurement Units, Currencies, Languages. We use AAT for all of these.
The MAY bucket is just things that we've found ourselves looking up and want to make it easier for others to find.
Hope that helps,
Post by Franco Niccolucci (8 June 2021)
Dealing with vocabularies, we noticed (in ARIADNE) that named time periods may be ambiguous, as the same name may refer to different time spans depending on the location. This is a well-known fact, first evidenced in the ARENA project with an interesting comparative diagram covering several EU countries. It is most evident in archaeology, where e.g. "Iron Age" has a different meaning in Ireland and in Italy. I like to make a joke about this, telling the story of a time traveller who in the year 50 AD travelled from the Roman Age back to the Iron Age, while he simply went from Roman Gaul (then in the Roman Age) to Ireland, which was never invaded by the Romans and at the time was still in its Iron Age. I think that this may also be relevant to Art: for example, a "Renaissance painting" is dated to rather different time periods according to its provenance. The solution we found to the issue is PeriodO (https://perio.do/en/), a gazetteer of periods which may assign different time spans to the same name according to location. If this is of interest, I can provide further details on how we successfully managed the issue.
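The idea of a period gazetteer keyed by both name and location can be sketched as follows. This is a minimal illustration and does not reflect PeriodO's actual data model or API: the `PeriodDefinition` structure, the helper functions and, in particular, the year ranges are placeholders chosen only to reproduce the time-traveller story.

```python
from typing import List, NamedTuple, Optional


class PeriodDefinition(NamedTuple):
    label: str       # period name as used in the sources
    place: str       # region in which this definition applies
    start_year: int  # negative values denote years BC (simplified)
    end_year: int    # inclusive


# Placeholder values for illustration only; NOT taken from PeriodO.
DEFINITIONS = [
    PeriodDefinition("Iron Age", "Gaul", -800, -51),
    PeriodDefinition("Roman Age", "Gaul", -50, 476),
    PeriodDefinition("Iron Age", "Ireland", -500, 400),
]


def definitions_for(label: str) -> List[PeriodDefinition]:
    """All location-specific definitions that share one period name."""
    return [d for d in DEFINITIONS if d.label == label]


def period_at(place: str, year: int) -> Optional[str]:
    """Name of the period in force at a given place and year, if known."""
    for d in DEFINITIONS:
        if d.place == place and d.start_year <= year <= d.end_year:
            return d.label
    return None


# The time traveller of 50 AD changes period without changing year.
print(period_at("Gaul", 50), "->", period_at("Ireland", 50))
```

The point of the design is that the period name alone is never the key: a time span is only resolved once the name is paired with a location.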
Post by Thanasis Velios (11 June 2021)
To follow up with this, and with the usual apologies for potentially misunderstanding the objective of the issue, I have done a quick scan of the CRM document to identify where these recommendations for types are done. Some are rather implicit but may be worth considering:
* E4: type of period
* E10: type of transfer of custody
* E15: type of identifier assignment
* E34: type of alphabet
* E56: type of language
* E57: type of material
* E58: type of unit
* E90 / P3.1: type of encoding
All the best,
In the 50th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9; 43rd FRBR – CIDOC CRM Harmonization meeting, TV gave an outline of the overall issue: the recommendations for type restrictions on classes that appear in a number of scope notes are not collected in a single resource. The SIG needs to
- propose the type restrictions recommended for CRM
- propose the criteria to identify suitable thesauri that can be recommended in the CIDOC CRM document
- determine the functional role of a minimal vocabulary
HW: MD & PR [TV to proofread]
- Start a new issue regarding the content of the minimal vocabularies required for type-restriction in the CRM.
Post by Martin (5 October 2021)
Issue 496 ended with homework by me, Pat Riva and Thanasis Velios, about the functionality of a minimal CRM vocabulary.
Here is our proposal:
The Functional role of a Minimal Vocabulary
...to be used together with the CIDOC-CRM
The policy of the CRM is to restrict classes to those that appear as specific domains or ranges of CRM properties, because those properties structure the knowledge base and frequently appear hard-coded in the control-software, i.e., data entry, storage and access tools. Therefore they are of much higher priority for system interoperability than the classes without properties, which we model as instances of E55 Type, i.e. as data, as usual in conceptual modelling of databases since their conception.
Nevertheless, in certain cases the CRM makes important and non-obvious ontological distinctions of specialization of CRM classes without assigning specific properties to them. These may differentiate and specialize even substance and identity criteria in a way that has a bearing on the use of properties, as in the case of E10 Transfer of Custody: The kind of transfer of custody, i.e., either field collection, transfer from one keeper to another or loss, can be specified by E55 Type, and consequently the property associating the donor or the receiver will not be used.
These distinctions normally appear in the scope notes with a hint about the need for respective vocabularies. They further appear in examples. Finally, a series of classes have been deprecated because they did not need specific properties, but backwards compatibility would require that they be turned into clearly recommended instances of E55 Type.
Over the past 30 years, attempts to harmonize and integrate vocabularies in the cultural heritage (CH) domain have largely failed. Rather, some vocabularies play a more important role than others, but specialized needs are too abundant to allow for a systematic integration, and volatile vocabularies are an important tool of research in all sciences and humanities.
Therefore, the CRM-SIG will recommend in a document separate from the CIDOC CRM definition only those terms that are regarded to be important for the above mentioned ontological distinctions, and unambiguous enough to be fixed as standard. These may be linked or integrated as broader or narrower terms into vocabularies of the user's choice, in a way compatible with the meaning of the classes of the CRM where they will be used together.
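How a recommended term might be linked into a vocabulary of the user's choice can be sketched with SKOS-style broader links. This is an assumption-laden illustration, not part of the proposal: the `rec:` and `local:` identifiers are hypothetical, the three recommended terms merely echo the kinds of E10 Transfer of Custody mentioned above, and the broader relation is held in a plain dictionary rather than an RDF store.

```python
# Standard SKOS property that the dictionary below stands in for.
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"

# Hypothetical recommended terms, echoing the kinds of E10 named above.
RECOMMENDED = {
    "rec:field-collection",
    "rec:transfer-between-keepers",
    "rec:loss",
}

# Hypothetical local vocabulary: term -> list of its skos:broader terms.
BROADER = {
    "local:donation": ["local:change-of-keeper"],
    "local:change-of-keeper": ["rec:transfer-between-keepers"],
    "local:excavation-find": ["rec:field-collection"],
    "local:inventory-check": [],
}


def ancestors(term, broader):
    """All terms reachable from `term` by following broader links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in broader.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def recommended_terms_for(term, broader, recommended):
    """Recommended terms that `term` equals or is narrower than."""
    return sorted(({term} | ancestors(term, broader)) & recommended)


print(recommended_terms_for("local:donation", BROADER, RECOMMENDED))
# prints ['rec:transfer-between-keepers']
```

With such links in place, data typed with a local term can still be interpreted against the small fixed set of recommended terms, while the local vocabulary remains free to evolve.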
The CRM-SIG may exemplify this on the basis of the Art & Architecture Thesaurus (AAT) or the Backbone Thesaurus (BBT).
Further, CRM will recommend the use of some standard vocabularies for cases in which a good and comprehensive international practice exists, such as measurement units, country codes etc.