
October 08, 2006

dctagging

I have been an enthusiastic user of the del.icio.us social bookmarking service for quite a long time (my first entry was on 30 July 2004, apparently). I have looked at other similar systems over the years but there's something about the clarity and simplicity of del.icio.us that appeals to me, and my del.icio.us collection is indispensable to me. I use del.icio.us almost every day, to add new entries and to retrieve references from my own collection, but also occasionally to browse the collections of colleagues and friends who I know share similar interests.

One of the most common criteria by which I found myself wanting to retrieve entries was by author: I wanted to find bookmarks in my collection for items that were created by Tim Berners-Lee or Roy Fielding. In my early posts to del.icio.us, I captured this using a "structured tag" approach in which I used a tag of the form "creator_FamilynameFirstname" to capture this information. So items by the two authors above were tagged with the tags "creator_Berners-LeeTim" and "creator_FieldingRoy".

Some time later, I came across Geotagging, a set of conventions for using tags to add geographical identification metadata, usually the latitude and longitude of a location associated with the described resource, so that the resource can be found using a location-based search or the data otherwise processed by location-based services. Geotagging uses a single tag, geotagged, to signal that the convention is being applied in the current set of tags, plus a set of structured tags of the form geo:xyz=nnnnnn. These serve as attribute-value pairs, where geo:xyz is the "qualified name" of an attribute defined by the Geotagging specification and nnnnnn is the attribute value provided by the person creating the entry.

So, for example, the set of tags

geotagged geo:lat=51.4989 geo:lon=-0.1786

indicates firstly that the Geotagging convention is being applied and secondly that the resource described has some association with the place with latitude 51.4989 and longitude -0.1786. (Recently Flickr implemented a system whereby users can apply geotags to their images by selecting a location on a map, rather than having to determine latitude and longitude by some other means and enter the tags by hand.)
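
To make the attribute-value reading of these tags concrete, here is a minimal Python sketch of how a consumer might interpret such a tag set. The function name parse_geotags is hypothetical, and the sketch assumes that tags arrive as a single space-separated string and that only geo:lat and geo:lon are of interest.

```python
def parse_geotags(tag_string):
    """Return (lat, lon) if the tags follow the geotagging convention, else None.

    Assumption: tags are space-separated, as in the example above.
    """
    tags = tag_string.split()
    # The marker tag signals that the convention is being applied at all.
    if "geotagged" not in tags:
        return None
    values = {}
    for tag in tags:
        if tag.startswith("geo:") and "=" in tag:
            # Split "geo:lat=51.4989" into the attribute name and its value.
            name, _, value = tag.partition("=")
            values[name] = value
    try:
        return float(values["geo:lat"]), float(values["geo:lon"])
    except (KeyError, ValueError):
        # A coordinate is missing or is not a number.
        return None


print(parse_geotags("geotagged geo:lat=51.4989 geo:lon=-0.1786"))
```

Without the geotagged marker the function returns None, which mirrors the idea that the structured tags only carry their agreed meaning when the convention has been declared.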

So, based on the Geotagging approach, I switched to using a convention I've informally called "dctagging", where I apply a tag dctagged and then use structured tags of the form dc:xyz=sssss. Each such tag represents (using the terminology of the DCMI Abstract Model) a statement using a property from the Dublin Core metadata vocabularies, with the property URI represented by the "qualified name" dc:xyz (actually, I use names of the form dcterms:xyz as well) and the value string represented by the sssss part of the tag. So for items created by Tim Berners-Lee, I use the tag combination

dctagged dc:creator=Berners-LeeTim

which enables me to retrieve bookmarks in my collection for items created by Berners-Lee. Obviously this relies on a shared convention for the construction of "value strings" (the sssss part of the tag), and it would be more difficult to achieve that across the collections of multiple users.
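
As an illustration of how such tags can be read as statements, the following Python sketch expands the prefixes to property URIs and pairs each with its value string. The function name parse_dctags and the space-separated input are assumptions made for the example; the two namespace URIs are the ones DCMI publishes for the Dublin Core element set and the DCMI terms.

```python
# Prefix-to-namespace mapping implied by the "dctagged" marker tag.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def parse_dctags(tag_string):
    """Return a list of (property URI, value string) pairs.

    Assumption: tags are space-separated. An empty list is returned
    when the "dctagged" marker is absent.
    """
    tags = tag_string.split()
    if "dctagged" not in tags:
        return []
    statements = []
    for tag in tags:
        qname, equals, value = tag.partition("=")
        prefix, colon, local_name = qname.partition(":")
        # Keep only well-formed tags with a known prefix and a non-empty value.
        if equals and colon and prefix in NAMESPACES and local_name and value:
            statements.append((NAMESPACES[prefix] + local_name, value))
    return statements


print(parse_dctags("dctagged dc:creator=Berners-LeeTim semanticweb"))
```

Ordinary tags such as semanticweb are simply ignored, so the convention can sit alongside whatever other tags an entry carries.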

I have made a few uses of other DCMI properties e.g.

dctagged dc:publisher=DCMI

but for my own purposes of retrieval, I've tended to make use mainly of the DC "creator" and "contributor" properties. Fortunately del.icio.us incorporates a global tag replacement feature where you can replace all the instances of one tag in your collection with instances of one or more other tags, so it was relatively easy to convert my existing data to the new conventions - though it does have to be done on a tag-by-tag basis through a form, so I still haven't converted all my existing tags.
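
The mapping from the old convention to the new one is mechanical, as this small Python sketch shows. It is only an illustration of the rewrite applied to a local list of tags; convert_tags is a hypothetical helper and does not call or describe the del.icio.us replacement form itself.

```python
OLD_PREFIX = "creator_"


def convert_tags(tags):
    """Rewrite old-style creator_ tags into the dctagging convention.

    "creator_FieldingRoy" becomes "dc:creator=FieldingRoy", and the
    "dctagged" marker is added when a converted or existing DC tag is present.
    """
    converted = []
    for tag in tags:
        if tag.startswith(OLD_PREFIX) and len(tag) > len(OLD_PREFIX):
            converted.append("dc:creator=" + tag[len(OLD_PREFIX):])
        else:
            converted.append(tag)
    has_dc_tag = any(t.startswith(("dc:", "dcterms:")) for t in converted)
    if has_dc_tag and "dctagged" not in converted:
        converted.insert(0, "dctagged")
    return converted


print(convert_tags(["rest", "creator_FieldingRoy"]))
```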

The next step, I suppose, would be to produce an algorithm to extract this DC metadata from one of the XML formats exposed by del.icio.us and to make use of it as DC metadata description sets - but that is a job for some rainy Sunday afternoon back in the UK, not a hotel room in Mexico City!
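
A rough Python sketch of that extraction step might look like the following. It assumes an export in which each bookmark is a post element with href and tag attributes, the latter holding space-separated tags. The sample document, the example.org URLs and the function name extract_descriptions are all made up for illustration, and the result is a plain dictionary rather than a real DC description set serialisation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element and attribute names are an assumption.
SAMPLE = """<posts user="example">
  <post href="http://example.org/a" description="Item A"
        tag="dctagged dc:creator=Berners-LeeTim web"/>
  <post href="http://example.org/b" description="Item B"
        tag="rest dc:creator=FieldingRoy"/>
</posts>"""


def extract_descriptions(xml_text):
    """Map each dctagged bookmark URL to its (qualified name, value) pairs."""
    descriptions = {}
    for post in ET.fromstring(xml_text).iter("post"):
        tags = post.get("tag", "").split()
        # Skip entries that do not declare the convention.
        if "dctagged" not in tags:
            continue
        statements = []
        for tag in tags:
            name, equals, value = tag.partition("=")
            if equals and value and name.startswith(("dc:", "dcterms:")):
                statements.append((name, value))
        if statements:
            descriptions[post.get("href")] = statements
    return descriptions


print(extract_descriptions(SAMPLE))
```

In the sample, the second entry is skipped because it lacks the dctagged marker, even though one of its tags happens to look like a DC statement.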

Comments

I really like the ideas Pete puts forward here and have begun using them for the presentations that I'm uploading into Slideshare.

I have one point of disagreement in that I think the proposal to use the 'dc:' prefix is both unnecessary and likely to hold back adoption.

Once the 'dctagged' tag has been used it should be sufficient to simply use any DC property name with the '=valuestring' construct to create a new tag, e.g.

dctagged creator=PowellAndy audience=teachers

Yes, using the 'dc:' prefix is a useful convention to aid explicitness, but it also carries with it a certain techiness that might stop some people adopting this way of tagging stuff?

Hmmm, maybe.... I guess what I didn't say explicitly in the initial post is that the "dctagged" tag is (like the geotagged convention) effectively providing an implicit "namespace declaration" for both the "dc" and "dcterms" prefixes, and the use of those prefixes is saying explicitly that this is, e.g., the "creator" property as defined by DCMI, rather than as defined by me.

I take your point about avoiding unnecessary complexity, but OTOH I'm inclined to say that people interested in trying to apply DC in this fashion wouldn't be put off by the prefix - they are probably accustomed to seeing forms like "DC.creator" or "dc:creator" anyway.

Hi Pete,

I must admit I am a newbie... just wanted to share my thoughts.

Your tagging convention is an effective way to extract all the tags created by DC-aware taggers. That's similar to the convention which the geo community uses effectively.

We understand that someday someone (maybe one of the DC folks) might create an algorithm to extract these tags.

But if you analyse the metadata behind general tags, you don't really need any extraction program.

The tags can be served via RSS feeds. RSS 1.0 itself supports a few Dublin Core elements, but they are not exhaustive enough.

regards,
Ganesh
