
November 04, 2009

"Simple DC" Revisited

In a recent post I outlined a picture of Simple Dublin Core as a pattern for constructing DC description sets (a Description Set Profile), in which statements referred to one of the fifteen properties of the Dublin Core Metadata Element Set, with a literal value surrogate in which syntax encoding schemes were not permitted.

While I think this is a reasonable reflection of the informal descriptions of "Simple DC" provided in various DCMI documents, this approach does tend to give primacy to a pattern based solely on the use of literal values. It also ignores the important work done more recently by DCMI in "modernising" its vocabularies, emphasising the distinction between literal and other values, and reflecting that in the formal RDFS descriptions of its terms.

In a document presented to a DCMI Usage Board meeting at DC-2007 in Singapore, an alternative, "modern" approach to "Simple DC" was proposed by Mikael Nilsson and Tom Baker. (I don't have a current URI for the particular document, but it is part of the "meeting packet".) That proposal suggested a view of "Simple DC" as a DSP (actually, it proposed a DCAP, but I'm focussing here on the "structural constraints" component) in which the properties referenced are not the "original" fifteen properties of the DCMES, but rather the fifteen new properties added to the "DC Terms" collection as part of that modernisation exercise:

  • A description set must contain exactly one description (Description Template: Minimum occurrence constraint = 1; Maximum occurrence constraint = 1)
  • That description may be of a resource of any type (Description Template: Resource class constraint: none (default))
  • For each statement in that description, the type of value surrogate supported depends on the range of the property:
    • For the following property URIs: dcterms:title, dcterms:identifier, dcterms:date, dcterms:description (Statement Template: Property List constraint):
      • There may be no such statement; there may be many (Statement Template: Minimum occurrence constraint = 0; Maximum occurrence constraint = unbounded)
      • A literal value surrogate is required (Statement Template: Type constraint = literal)
      • Within that literal value surrogate, the use of a syntax encoding scheme URI is not permitted (Statement Template/Literal Value: Syntax Encoding Scheme Constraint = disallowed)
    • For the following property URIs: dcterms:creator, dcterms:contributor, dcterms:publisher, dcterms:type, dcterms:language, dcterms:format, dcterms:source, dcterms:relation, dcterms:subject, dcterms:coverage, dcterms:rights (Statement Template: Property List constraint):
      • There may be no such statement; there may be many (Statement Template: Minimum occurrence constraint = 0; Maximum occurrence constraint = unbounded)
      • A non-literal value surrogate is required (Statement Template: Type constraint = non-literal)
      • Within that non-literal value surrogate:
        • the use of a value URI is not permitted (Statement Template/Non-Literal Value: Value URI Constraint = disallowed)
        • the use of a vocabulary encoding scheme URI is not permitted (Statement Template/Non-Literal Value: Vocabulary Encoding Scheme Constraint = disallowed)
        • a single value string is required (Statement Template/Non-Literal Value/Value String: Minimum occurrence constraint = 1; Maximum occurrence constraint = 1) and the use of a syntax encoding scheme URI is not permitted (Statement Template/Non-Literal Value/Value String: Syntax Encoding Scheme Constraint = disallowed)
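
As a rough illustration only, the constraints listed above could be sketched as a small checker in Python. The class and function names here (LiteralValue, NonLiteralValue, Statement, check_description_set) are invented for this example and are not part of any DCMI specification or tool, and treating a property outside the fifteen as an error is my assumption about how "matching" the profile would work:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

# Property lists taken from the two statement templates described above.
LITERAL_PROPERTIES = {
    "dcterms:title", "dcterms:identifier", "dcterms:date", "dcterms:description",
}
NON_LITERAL_PROPERTIES = {
    "dcterms:creator", "dcterms:contributor", "dcterms:publisher",
    "dcterms:type", "dcterms:language", "dcterms:format", "dcterms:source",
    "dcterms:relation", "dcterms:subject", "dcterms:coverage", "dcterms:rights",
}

# Hypothetical in-memory model of value surrogates (names are illustrative).
@dataclass
class ValueString:
    text: str
    syntax_encoding_scheme: Optional[str] = None

@dataclass
class LiteralValue:
    text: str
    syntax_encoding_scheme: Optional[str] = None

@dataclass
class NonLiteralValue:
    value_uri: Optional[str] = None
    vocabulary_encoding_scheme: Optional[str] = None
    value_strings: List[ValueString] = field(default_factory=list)

@dataclass
class Statement:
    property_uri: str
    value: Union[LiteralValue, NonLiteralValue]

def check_statement(st: Statement) -> List[str]:
    """Return a list of constraint violations for one statement."""
    errors = []
    p, v = st.property_uri, st.value
    if p in LITERAL_PROPERTIES:
        if not isinstance(v, LiteralValue):
            errors.append(f"{p}: a literal value surrogate is required")
        elif v.syntax_encoding_scheme is not None:
            errors.append(f"{p}: syntax encoding scheme URI not permitted")
    elif p in NON_LITERAL_PROPERTIES:
        if not isinstance(v, NonLiteralValue):
            errors.append(f"{p}: a non-literal value surrogate is required")
        else:
            if v.value_uri is not None:
                errors.append(f"{p}: value URI not permitted")
            if v.vocabulary_encoding_scheme is not None:
                errors.append(f"{p}: vocabulary encoding scheme URI not permitted")
            if len(v.value_strings) != 1:
                errors.append(f"{p}: exactly one value string is required")
            if any(vs.syntax_encoding_scheme is not None for vs in v.value_strings):
                errors.append(f"{p}: syntax encoding scheme URI not permitted")
    else:
        # Assumption: a property outside the fifteen does not match the profile.
        errors.append(f"{p}: property is not part of this profile")
    return errors

def check_description_set(descriptions: List[List[Statement]]) -> List[str]:
    """Each description is modelled simply as a list of statements."""
    errors = []
    if len(descriptions) != 1:
        errors.append("a description set must contain exactly one description")
    for statements in descriptions:
        for st in statements:
            errors.extend(check_statement(st))
    return errors
```

Statement occurrence is 0 to unbounded for every property, so the sketch needs no per-property counting; only the single-description rule and the value surrogate constraints are checked.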

This pattern seeks to combine the simplicity of use of the "traditional" "Simple DC" approach of using only 15 properties with a recognition of the value of using literal and non-literal values as appropriate for each property. However, it is, by definition, slightly more complex than the "all literal values" pattern outlined in the earlier post, and it differs from the patterns described informally in existing DCMI documentation. I also think it would be difficult to argue that it is represented using formats like the oai_dc XML format, which of course predated the creation of the new properties by several years.

This does not have to be an either/or choice. It may well be that there is a use for both patterns, and if they are clearly named (I don't really care what they are called as long as the names are different!) and documented, there is no reason why two such DSPs should not co-exist.

Having said all that, I'd just re-emphasise that I think both of these patterns are fairly limited in the sort of functionality they can support. It seems to me the notion of "Simple DC" emerged at a time when the emphasis was still very much on the indexing and searching of textual values, and it largely ignores the Web principle of making links between resources. It would be difficult to categorise "Simple DC" - in either of the forms suggested - as a "linked data" friendly approach. I fear a lot of effort has been spent trying to build services on the basis of "Simple DC" when it may have been more appropriate to recognise the inherent limitations of that approach, and to focus instead on richer patterns designed from the outset to support more complex functions.

P.S. I know, I know, I promised a post on "Qualified DC". It's on its way....


Comments

I thought I knew enough of the lingo to get by, but... what's a "non-literal value surrogate" that is not a URI? What's the difference between that and a 'literal value surrogate'?

What's the right documentation for me to look at to understand these terms with as little pain as possible?

@Jonathan,

Those concepts are defined by the DCMI Abstract Model

http://dublincore.org/documents/abstract-model/

and used in the DSP model

http://dublincore.org/documents/dc-dsp/

A non-literal value surrogate has several components: an optional value URI, an optional vocabulary encoding scheme URI, and zero or more value strings (literals "representing" the value); whereas a literal value surrogate is composed of just the literal itself.

It might be easiest to compare the RDF graphs for the two cases in

http://dublincore.org/documents/dc-rdf/

especially the figures in 4.3 and 4.7. The literal value surrogate is just a literal object, but the non-literal value surrogate is a subgraph in its own right.
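
To make that difference concrete, here is a small sketch in Python that builds the triples for each case as plain tuples. The function names and the tuple representation are made up for illustration; the use of rdf:value for value strings and dcam:memberOf for the vocabulary encoding scheme follows my reading of the DC-RDF document, so do check it against the figures there:

```python
def literal_triples(subject, prop, text):
    """A literal value surrogate: the object of the triple is the literal."""
    return [(subject, prop, ("literal", text))]

def non_literal_triples(subject, prop, value_strings,
                        value_uri=None, ves_uri=None, blank_node="_:value"):
    """A non-literal value surrogate: the object is a node with its own triples.

    The node is the value URI if one is given, otherwise a blank node
    (the blank_node label is a placeholder chosen for this example).
    """
    node = value_uri if value_uri is not None else blank_node
    triples = [(subject, prop, node)]
    if ves_uri is not None:
        # Vocabulary encoding scheme membership.
        triples.append((node, "dcam:memberOf", ves_uri))
    for text in value_strings:
        # Each value string is a literal "representing" the value.
        triples.append((node, "rdf:value", ("literal", text)))
    return triples

simple = literal_triples("ex:doc", "dcterms:title", "A report")
richer = non_literal_triples("ex:doc", "dcterms:creator", ["A. N. Author"])
```

In the second case the property points at a node, and the string hangs off that node, which is what makes the non-literal value surrogate a subgraph rather than a single object.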
