Issue 383: 'has content' property

Starting Date: 
2018-05-22
Working Group: 
3
Status: 
Open
Background: 

In the 41st joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 34th FRBR - CIDOC CRM Harmonization meeting, the SIG, while resolving issue 363, decided to open a new issue on defining a new property of E90 for capturing the actual content of a symbolic object. This property should be modelled on the R33 property of FRBRoo. HW to MD to formulate this property.

Lyon, May 2018

Current Proposal: 

Posted by Martin on 6/11/2018

I had sent the below as new issue, but it is indeed the answer to Issue 383.

The question is how to deal with a file that is more specific in content, such as an MS Word file, but represents the character sequence that defines the content of the respective E90. Is it "is incorporated in", or a subproperty of it?

On 9/19/2018 11:09 PM, Martin Doerr wrote:
> Here is my scope note:
>
> Pxxx has symbolic content
>
> Domain:             E90 Symbolic Object
>
> Range:                E62 String
>
> Quantification:    many to many (0,n:0,n) ??
In CRM RDFS   subproperty of: rdfs:value
>

>
> Scope note:         This property associates an instance of E90 Symbolic Object with a complete, identifying representation of its content in the form of an instance of E62 String. This property only applies to instances of E90 Symbolic Object that can be represented completely in this form. The representation may be more specific than the symbolic level defining the identity condition of the represented object. This depends on the type of the symbolic object represented. For instance, if a name has type "Modern Greek character sequence", it may be represented in a loss-free Latin transcription, while still denoting the sequence of Greek letters. As another example, if the represented object has type "English words sequence", American English or British English spelling variants may be chosen to represent the English word "colour" without defining a different symbolic object. If a name has type "European traditional name", no particular string may define its content.
>

>
> Examples:         
>
>
> * The materials description (E33) of the painting (E22) _has symbolic content_ “Oil, French Watercolors on Paper, Graphite and Ink on Canvas, with an Oak frame.”
>
> * The title (E35) of Einstein’s 1915 text (E73) _has symbolic content_ “Relativity, the Special and the General Theory”
>
> * The story of Little Red Riding Hood (E33) _has symbolic content_ “Once upon a time there lived in a certain village …”
> * The inscription (E34) on Rijksmuseum object SK-A-1601 (E22) _has symbolic content_ “B”
>
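
As a rough illustration of how the examples above could be recorded, here is a minimal Python sketch that stores them as plain statements. It is only an assumption-laden toy: `Pxxx_has_symbolic_content` stands in for the still-unnumbered property, and the subject keys and the `content_of` helper are hypothetical names invented for this sketch, not part of the CRM or of any library.

```python
from typing import List, NamedTuple


class Statement(NamedTuple):
    subject: str        # hypothetical key for an instance of E90 (or a subclass)
    subject_class: str  # CRM class of the subject, as given in the examples
    predicate: str      # placeholder name for the proposed property
    content: str        # the E62 String holding the complete content


# Placeholder: the property has no number yet in the proposal.
PROPERTY = "Pxxx_has_symbolic_content"

statements = [
    Statement("title_of_einstein_1915_text", "E35 Title", PROPERTY,
              "Relativity, the Special and the General Theory"),
    Statement("inscription_on_SK-A-1601", "E34 Inscription", PROPERTY, "B"),
]


def content_of(subject: str) -> List[str]:
    """Return every string recorded as the symbolic content of `subject`."""
    return [s.content for s in statements if s.subject == subject]


print(content_of("inscription_on_SK-A-1601"))  # ['B']
```

Returning a list rather than a single string keeps the sketch neutral on the open quantification question.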

Posted by Robert Sanderson on 6/11/2018

Thank you for pushing this forward, Martin!

 

Quantification wise, I would be in favor of 0,1 : 0,1.

 

If the structure of the set of symbols changed, then it would be a different symbolic object according to my understanding of E90:

 

>  … identifiable symbols and any aggregation of symbols … that have an objectively recognizable structure and that are documented as single units.

Similarly, if the same string was used by different Symbolic Objects, then it seems like they would actually be the same symbolic object (or you would instead use two strings with the same data).

(And in the RDF projection this makes no difference, as literal values do not have their own separate identity)
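
To make the quantification options concrete, the following Python sketch reports which quantifier a given set of (symbolic object, string) pairs would actually need. It assumes the usual CRM reading, in which the first pair bounds the number of strings per E90 instance and the second pair bounds the number of E90 instances per string. The function name and the "Untitled" title data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def observed_quantification(pairs):
    """Return the tightest of '0,1:0,1', '0,n:0,1', '0,1:0,n', '0,n:0,n'
    that the given (symbolic_object, string) pairs satisfy."""
    strings_per_object = defaultdict(set)
    objects_per_string = defaultdict(set)
    for obj, text in pairs:
        strings_per_object[obj].add(text)
        objects_per_string[text].add(obj)
    # Domain side: does any symbolic object carry more than one string?
    domain_max = "n" if any(len(s) > 1 for s in strings_per_object.values()) else "1"
    # Range side: is any string shared by more than one symbolic object?
    range_max = "n" if any(len(o) > 1 for o in objects_per_string.values()) else "1"
    return f"0,{domain_max}:0,{range_max}"


# One object, one string: compatible with 0,1:0,1.
print(observed_quantification([("accession_number", "95.PB.7")]))  # 0,1:0,1

# Hypothetical data: two distinct titles that happen to share one string.
print(observed_quantification([("title_a", "Untitled"),
                               ("title_b", "Untitled")]))  # 0,1:0,n
```

Under 0,1 : 0,1 the second data set would be disallowed, so the two titles would have to be treated as one symbolic object.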

 

For the examples, I would replace the Little Red Riding Hood example with one that is complete, to avoid confusion with the scope note requirement of being represented completely.

How about:

>  The Accession Number (E42) of the J. Paul Getty Museum’s “Abduction of Europa” (E22) _has symbolic content_ “95.PB.7”

 

And for the file question, do you mean that the symbolic object is the MS Word file, which has a representable set of (binary) symbols, or that the symbolic object is text which is incorporated within the file, but not verbatim (as the characters in the (e.g.) paragraph are likely to be represented in the file using a very different structure)?

 

Posted by Martin on 9/11/2018

Dear Robert,

On 11/6/2018 9:00 PM, Robert Sanderson wrote:

> Thank you for pushing this forward, Martin!

> Quantification wise, I would be in favor of 0,1 : 0,1.

I prefer 0,1:0,n or 0,n:0,n

> If the structure of the set of symbols changed, then it would be a different symbolic object according to my understanding of E90:

> >  … identifiable symbols and any aggregation of symbols … that have an objectively recognizable structure and that are documented as single units.
Correct. The question is what to do if we encounter different representations: for instance, one giving the text "hello world" in Latin-1 and another in ASCII, while the E90 instance is of type "Latin characters only"; or if you write my name DOERR or DÖRR, both regarded by German authorities as identical variants representing the "Umlaut" OE or Ö. Of course, in that case, having both representations would be redundant. In that case, 0,n is more tolerant.
Another opinion is that one string is enough to define the E90. Then, 0,1.
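
The DOERR/DÖRR case can be sketched in Python as two strings that reduce to one canonical form and so can stand for the same symbolic object. This is only an illustration of the equivalence described above, under the assumption that upper-casing and expanding umlauts is an acceptable canonical form; `canonical_form` and `UMLAUT_EXPANSIONS` are hypothetical names, and the rule is not a general German transliteration standard.

```python
import unicodedata

# Assumed expansion table for this sketch only.
UMLAUT_EXPANSIONS = {"Ä": "AE", "Ö": "OE", "Ü": "UE"}


def canonical_form(name: str) -> str:
    """Upper-case a name and expand umlauts, so that variant spellings
    of the same name compare equal."""
    # NFC first, so that 'O' + combining diaeresis becomes a single 'Ö'.
    composed = unicodedata.normalize("NFC", name).upper()
    return "".join(UMLAUT_EXPANSIONS.get(ch, ch) for ch in composed)


print(canonical_form("DÖRR"))                             # DOERR
print(canonical_form("DÖRR") == canonical_form("DOERR"))  # True
```

With a quantifier of 0,n on the E90 side both spellings could be recorded; with 0,1 only one of them would be kept and the other treated as redundant.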

> Similarly, if the same string was used by different Symbolic Objects, then it seems like they would actually be the same symbolic object (or you would instead use two strings with the same data).
This is a long-debated question. In most cases, this appears reasonable, but we do have cases in which the identity of the E90, seen as a message in the sense of Claude Shannon, is bound to the "sender". In discussing the sense of E35 Title, it appeared that we cannot take the identity of the Title as detached from the thing it was given to. This creates a precedent for the latter interpretation.

As a general principle, a 1:1 dependency raises the suspicion of a hidden identity. To be on the safe side, I would rather not identify the E90 with the content model.

Treating two strings with the same data as different is a (good) implementation choice of RDF, which assigns the identity to the link rather than to the string, precisely in order to distinguish where the message comes from. If two strings with the same data are regarded as different, then we actually have a 0,x:0,n model in the ontology.
>
> (And in the RDF projection this makes no difference, as literal values do not have their own separate identity)

> For the examples, I would replace the Little Red Riding Hood example with one that is complete, to avoid confusion with the scope note requirement of being represented completely.
>
> How about:


> >  The Accession Number (E42) of the J. Paul Getty Museum’s “Abduction of Europa” (E22) _has symbolic content_ “95.PB.7”
Good!

> And for the file question, do you mean that the symbolic object is the MS Word file, which has a representable set of (binary) symbols,
No
>
> or that the symbolic object is text which is incorporated within the file, but not verbatim (as the characters in the (e.g.) paragraph are likely to be represented in the file using a very different structure)?