Issue 347: Dimension and Data sets

Posted by Martin on 21/9/2017

In connection with Issue 293, I propose to consider defining the relation between Dimension and Data Set in CRMDig. We may consider basically any CSV file an instance of Dimension, i.e., a point in a multidimensional mathematical space in which each mathematical dimension has a real-world meaning in terms of an observable property. A classical example is a digital image with RGB values on a 10M 2D pixel matrix, i.e., 30 million dimensions in one measurement result, taken in one process.

Then Dimension could be declared IsA Data Set.

This can help harmonize the assignment of Dimensions following a data evaluation process with the general result of data evaluation.
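To make the reading of an image as a single point in a multidimensional space concrete, here is a minimal sketch in Python. The function names are purely illustrative and are not part of CRMdig or CIDOC CRM; the only assumption is that each pixel carries three channel values (R, G, B), each of which counts as one mathematical dimension.

```python
# Hypothetical helpers for illustration only; not CRMdig or CIDOC CRM identifiers.

def count_dimensions(width, height, channels=3):
    """Number of mathematical dimensions when the whole image is one point."""
    return width * height * channels


def flatten_image(pixels):
    """Flatten rows of (R, G, B) tuples into one coordinate vector."""
    return [value for row in pixels for pixel in row for value in pixel]


# A tiny 2 x 2 image: every channel value becomes one coordinate of the point.
tiny = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (10, 20, 30)],
]
point = flatten_image(tiny)

# 2 x 2 pixels with 3 channels each give a point with 12 coordinates.
assert len(point) == count_dimensions(2, 2)
```

Under the same assumption, a 4000 x 2500 pixel image (10 million pixels) would be a point with `count_dimensions(4000, 2500)` coordinates, all obtained in one measurement process.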



In the 39th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 32nd FRBR - CIDOC CRM Harmonization meeting, the SIG discussed Dimension and proposed revising its definition, considering that data evaluation creates an approximation of a dimension. It was also decided to propose a better model of how dimensions are related to values from measurements and from evaluation. MD and Steve were assigned to find a conservation person. Thanasis should think about this.

Heraklion, October 2017

Posted by Martin on 19/5/2018

Maybe we should relax the definition of E54 so that it represents either the approximation of a true quantity of a thing or phenomenon provided by a measurement (with all reservations about the degree to which the measurement measures what it is supposed to measure), or a derived quantity computed indirectly from observation data and comparable to reality, or a quantity produced by simulation of reality-like situations.

If we take, for instance, measuring the weight or length of an object, we know that it changes continuously, regardless of whether the change is within negligible margins or not. The indeterminacy/precision intervals given are those of the measurement and not those of the property itself. In that sense, we may abandon the idea that the Dimension is the true quantity of the thing and regard it rather as the quantity truly measured.

In the case of physical constants, such as the proton diameter (see recent literature), which is not a property of a particular but may quite well be, we may talk about average values from multiple measurements.

We can continue to include counting letters in a text, provided this is based on comparing physical copies. A monetary amount is still a trickier thing theoretically. It could be turned into paper money, but in the case of bitcoins etc., it may never be. As a measure to compare social obligations, it pertains to a reality.

In case the value is a complex structure, the "unit" can describe the structure and elementary units of subfields.

A dataset may be composed of Dimensions, or be a Dimension as a whole.

I am in favor of taking a digital image or the results of a gene activation measurement array as a Dimension.
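
The two readings above (a complex value whose "unit" describes its subfields, and a dataset that is either composed of Dimensions or is one Dimension as a whole) can be sketched as follows. This is only an illustration under stated assumptions: the class names `StructuredUnit`, `Dimension` and `DataSet` and the helper `as_whole` are hypothetical and do not correspond to any CRMdig or CIDOC CRM definition, and the parts combined by `as_whole` are assumed to be simple scalar Dimensions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


# All names here are hypothetical illustrations, not CRMdig / CIDOC CRM identifiers.

@dataclass
class StructuredUnit:
    """A 'unit' describing a complex value: one elementary unit per subfield."""
    subfields: Dict[str, str]


@dataclass
class Dimension:
    value: Union[float, Dict[str, float]]
    unit: Union[str, StructuredUnit]

    def is_consistent(self) -> bool:
        """Check that the value has the shape its unit describes."""
        if isinstance(self.unit, StructuredUnit):
            return (isinstance(self.value, dict)
                    and set(self.value) == set(self.unit.subfields))
        return isinstance(self.value, (int, float))


@dataclass
class DataSet:
    """Reading 1: a dataset composed of separate Dimensions."""
    parts: List[Dimension] = field(default_factory=list)


def as_whole(named_parts: Dict[str, Dimension]) -> Dimension:
    """Reading 2: the same data as one Dimension with a structured unit.

    Assumes every part is a scalar Dimension with a plain string unit.
    """
    values = {name: d.value for name, d in named_parts.items()}
    units = {name: d.unit for name, d in named_parts.items()}
    return Dimension(value=values, unit=StructuredUnit(subfields=units))


weight = Dimension(2.5, "kg")
length = Dimension(0.3, "m")

composed = DataSet(parts=[weight, length])
whole = as_whole({"weight": weight, "length": length})
```

In this sketch `composed` keeps two separate Dimensions, while `whole` is a single Dimension whose unit records that the subfield `weight` is in kg and `length` is in m.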


