References as Metadata, and More

How can one reference from within audio, video or PDF files? A follow-up to my previous post on the presentation of scientific knowledge in "new" media.

In my previous post on the presentation of scientific knowledge in non-traditional media I both called for a discussion about the issue and provided some early thoughts on how such a discussion should develop. I also argued for some key points, which are entirely essential to me - first of all, referencing.

Since then, there have been some developments that I want to outline in this blog post. The most important is certainly a tool (so far only a proof of concept, unfortunately) that embeds references in files using XMP metadata, but let's start with some related events that have taken place since.

Approaching People

I started my previous post by writing about how people told me that they or others were interested in the topic. Obviously, this is one of my personal motivations for even trying to work something out - there is a pool of potentially interested people out there. Beyond that, getting people working in the social sciences and humanities or related fields to present their work in a more approachable way without losing their strengths is just incredibly important, period. Unfortunately, having a pool of potentially interested people does not mean that people will heed one's call - even if it is just a call for discussion. Some PR work needs to be done. Since I am preparing to write my BA thesis (so I could not travel much), and I just left on a trip to Indonesia this week (so I won't be able to do any talking in Germany in the near future either), the first target group had to be students in Frankfurt. I thus concentrated on presenting my ideas at university.

In doing so, I wrote up a draft "syllabus" for an autonomous tutorial-cum-discussion group, which would essentially put into practice what I proposed in my previous blog post. It aims to incorporate both theoretical and practical elements: we would first analyze the different media available to us and then discuss how we might put what we learned into practice. I also gave a short presentation on the issue, outlining the need for us, as people in some way affiliated with academia, to consider "new" media. The presentation can be found here.

The next step will certainly need to be to approach more people - those already thinking about the same questions at university (who are mostly in other faculties) and those outside of university.

In Search of a Framing

In my previous blog post on the issue, I avoided giving serious reasons for why people should consider using anything but their usual media in presenting their work. A pool of people interested in that, sure, it is a motivation. Me seeing the topic's relevance, sure, it's one, too (for me). But these are no reasons yet. To engage people I am searching for a proper framing of the issue.

So far, I think there are three ways in which the general issue of "non-standard media" in the presentation of knowledge can be framed, if the main aim is to engage the audience:

The neutral framing: "Let's go with the times" : The "Let's go with the times" argument combines both positive and negative aspects. On the one hand, it lends a certain urgency to the issue. If we do not... and there goes the negative framing. On the other hand, positive aspects ("New opportunities") can be brought in using this argument, too. This framing requires enough time to actually have a proper exchange with whoever one is trying to convince. While generally preferable, it might not always be the right narrative to use. If sufficient time is not available, it might only be really useful with those who encounter both positive and negative effects of non-standard media on their work in daily practice.

The positive framing: "New opportunities" : The positive framing basically says "look at what cool stuff we can now do, too". While surely true, it requires the people one talks to to be relatively idealistic about what they do and to have some margin for failure. If people do not care about their work, presenting nice new models and tools will not matter to them either.

The negative framing: "If we do not expand our scope in terms of presentation, we'll have a problem" : This is an alarmist framing, and certainly the worst choice. But it might help with people who are happy with their current situation and don't want to lose it.

The Technical Side: Embedding References into Files' XMP Metadata

Perhaps the most important development on my side since my initial post on the issue is that I wrote a small tool that writes references to, and reads them from, the XMP metadata of a file. One of my earliest ideas was to create a framework to have references available with media files in a machine-readable way. Even in my previous post, I already hailed the concept of shownotes for podcast episodes. This is essentially what I was looking for.

I argued, however, that notes and references should be easy to copy and thus embedded into the file. Reading metadata out of the file itself takes longer than reading a sidecar file in which the metadata is stored, so the sidecar might be preferred; on the other hand, a sidecar makes copying everything along with the file harder. It might therefore be smartest to just keep both: metadata embedded into the file and metadata written into a sidecar file.
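The "keep both" idea can be sketched in a few lines of Python. This is only an illustration, not part of the tool: the function names, the convention of naming the sidecar after the media file with an .xmp extension, and the read_embedded callback (standing in for whatever actually extracts metadata from the file) are all assumptions made for the example.

```python
from pathlib import Path


def sidecar_path(media_file):
    """Return the assumed sidecar location: same name, .xmp extension."""
    return Path(media_file).with_suffix(".xmp")


def write_sidecar(media_file, xmp_packet):
    """Store a copy of the XMP packet (a string) next to the media file."""
    path = sidecar_path(media_file)
    path.write_text(xmp_packet, encoding="utf-8")
    return path


def load_xmp(media_file, read_embedded):
    """Prefer the quick-to-read sidecar, fall back to the embedded copy.

    read_embedded is a hypothetical callable that takes the media file
    path and returns the XMP packet embedded in it as a string.
    """
    path = sidecar_path(media_file)
    if path.exists():
        return path.read_text(encoding="utf-8")
    return read_embedded(media_file)
```

With this arrangement the sidecar acts as a fast cache, while the embedded copy remains the one that travels with the file when it is copied on its own.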

Since I want my solution to work for as many file formats as possible, I looked at XMP. XMP uses Dublin Core to describe files, which means that references are apparently already provided for in XMP - it's just that nobody seems to use them.

I thus started writing a tool to edit XMP metadata in files to include references using the dc:source tag (I am still undecided whether dc:relation would be a better fit). The result is an editor written in Python (using a Qt4 GUI), and a more limited version written in PHP. Both can be found here.
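To give an impression of what references stored in Dublin Core look like, here is a minimal sketch that builds and parses such an XMP packet using only Python's standard library. It is not the code of the editor described above, which relies on exempi. The function names and the placeholder reference strings are invented for the example. The sketch defaults to dc:relation because, as far as I can tell from the XMP specification, that property is defined as an unordered list (rdf:Bag) while dc:source holds a single text value; treat that choice as an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
# Make the serializer emit the familiar prefixes instead of ns0, ns1, ...
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Build a namespace-qualified tag name in ElementTree notation."""
    return "{%s}%s" % (NS[prefix], tag)


def build_xmp(references, prop="relation"):
    """Return an XMP document (str) listing references under dc:<prop>."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    field = ET.SubElement(desc, q("dc", prop))
    bag = ET.SubElement(field, q("rdf", "Bag"))
    for ref in references:
        ET.SubElement(bag, q("rdf", "li")).text = ref
    return ET.tostring(xmpmeta, encoding="unicode")


def read_references(xmp, prop="relation"):
    """Return the list of references stored under dc:<prop>."""
    root = ET.fromstring(xmp)
    path = ".//dc:%s/rdf:Bag/rdf:li" % prop
    return [li.text for li in root.findall(path, NS)]


if __name__ == "__main__":
    # Placeholder entries, not real publications.
    refs = ["Example Author (2012): Example Title", "Another Author (2010): Another Title"]
    packet = build_xmp(refs)
    print(packet)
    print(read_references(packet))
```

Getting this XML into an actual PDF, audio or video file is the harder part, and is what a library such as exempi takes care of.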

Both still have major problems. The Python version relies on exempi, which is unfortunately not available on Windows. Also, writing to files only works if there is already sufficient space for metadata in the file (this seems to result from writing plain text into a binary file). For now, I thus need to use JabRef to write some arbitrary XMP metadata into the PDF files I want to edit before I can start editing them. Similar problems are likely to occur with audio or video files, even though I have not tried that out yet.
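The space limitation can be illustrated with a small sketch of in-place packet replacement. It assumes that the file already contains an XMP packet between the usual "<?xpacket begin=...?>" and "<?xpacket end=...?>" markers, followed by whitespace padding, and that new metadata must fit into exactly that region so that no other byte in the file moves. This is my reading of why existing space is needed, not a description of what exempi does internally, and the function name is made up. Simply scanning for the markers is also naive: it will not work when the metadata is stored compressed, for instance.

```python
HEADER_START = b"<?xpacket begin="
TRAILER_START = b"<?xpacket end="


def replace_packet_in_place(data, new_xmp):
    """Swap the XMP packet body inside data (bytes) without changing its length.

    Raises ValueError if no packet wrapper is found or if new_xmp does
    not fit into the space occupied by the old body plus its padding.
    """
    head = data.find(HEADER_START)
    if head == -1:
        raise ValueError("no XMP packet found, nothing to overwrite")
    header_end = data.find(b"?>", head)
    trailer = data.find(TRAILER_START, head)
    if header_end == -1 or trailer == -1 or trailer < header_end:
        raise ValueError("incomplete XMP packet wrapper")
    body_start = header_end + 2
    available = trailer - body_start
    if len(new_xmp) > available:
        raise ValueError(
            "new metadata needs %d bytes, only %d available" % (len(new_xmp), available)
        )
    # Pad with spaces so every byte after the packet keeps its offset.
    padded = new_xmp + b" " * (available - len(new_xmp))
    return data[:body_start] + padded + data[trailer:]


if __name__ == "__main__":
    old_packet = (
        b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>'
        + b"<x/>" + b" " * 50
        + b'<?xpacket end="w"?>'
    )
    stub = b"leading bytes " + old_packet + b" trailing bytes"
    updated = replace_packet_in_place(stub, b"<x:new/>")
    print(len(stub) == len(updated))
```

A file without any packet offers no region to overwrite, which would explain why some metadata has to be written by another program first.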

Despite these problems, it's good news that the implementation was actually very easy and that the building blocks are already there. An example of a PDF file with references embedded into the metadata to be read out for the user later can be found here.

A screenshot of the Python-based editor in action