Events…

About a year ago I worked on a public data model for representing news stories as linked data. The model is simple and can be summed up by example in the following RDF statements:

<Storyline1>  :hasSlot  <StorylineSlotA>
<Storyline1>  :hasSlot  <StorylineSlotB>
 
<StorylineSlotA>  :contains   <Event1>
<StorylineSlotB>  :contains   <Event2>
 
<StorylineSlotB>  :follows  <StorylineSlotA> 

<Asset1>  :about  <Event1>
<Asset2>  :about  <Event1>
<Asset3>  :about  <Event1>
 
<Asset4>  :about  <Event2>
<Asset5>  :about  <Event2>
<Asset6>  :about  <Event2>
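
To make the shape of the model a bit more concrete, here is a minimal plain-Ruby sketch (no RDF library involved) that holds the storyline statements above as [subject, predicate, object] arrays and walks the :follows links to list a storyline's events in order. The array representation and the objects_of and ordered_events helpers are illustrative assumptions, not part of the Storyline model, and the sketch assumes the slots form a simple linear chain.

```ruby
# The storyline statements as plain [subject, predicate, object] arrays.
TRIPLES = [
  ["Storyline1", :hasSlot, "StorylineSlotA"],
  ["Storyline1", :hasSlot, "StorylineSlotB"],
  ["StorylineSlotA", :contains, "Event1"],
  ["StorylineSlotB", :contains, "Event2"],
  ["StorylineSlotB", :follows, "StorylineSlotA"]
]

# All objects of the triples that match a subject and predicate.
def objects_of(subject, predicate)
  TRIPLES.select { |s, p, _| s == subject && p == predicate }.map(&:last)
end

# Hypothetical helper: a storyline's events, ordered by the :follows chain.
def ordered_events(storyline)
  slots = objects_of(storyline, :hasSlot)
  # The first slot is the one that does not follow any other slot.
  current = slots.find { |slot| objects_of(slot, :follows).empty? }
  events = []
  while current
    events.concat(objects_of(current, :contains))
    # Move on to the slot that follows the current one, if there is one.
    current = slots.find { |slot| objects_of(slot, :follows).include?(current) }
  end
  events
end

puts ordered_events("Storyline1").inspect
```

Here the slot ordering is what carries the narrative: the events themselves say nothing about which comes first in the telling.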

In order to implement that in BBC News I took a strategic decision to allow Storyline instances to be the object (rdfs:range) of our :about predicates, effectively simplifying the model to enable a journalist to say:

<Asset7>  :about  <Storyline1>

We ran a pilot with a local newsroom in winter 2013/14 and this approach worked fine: content could be aggregated into collections (typically chronological streams of updates), with each asset annotated as being about that Storyline. This can be used to drive a user experience similar to http://www.itv.com/news.

In December 2013 I was fortunate to have Paul Rissen join me in News – Paul had been one of the original collaborators on the Storyline data model, and was the author of the Stories ontology which it was derived from.  Over the past few months Paul has helped me realise that while allowing Storyline instances to be used as tags may have been useful to promote its adoption, it is semantically wrong.  A Storyline is a particular telling of a story – a version of events unique to that journalist or newsroom:

<Asset1>  :about  <Journalist A's version of events>

Doesn’t sound right, does it? News assets are usually about events, and (as Yves pointed out long ago) events involve people and organisations, take place at locations, and can involve other factors. Storyline is the editorial layer on top of that basic annotation – a curation if you like. It is the decision process that goes into the selection of assets that describe that event or series of events.
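
One way to picture Storyline as a curation layer: assets stay annotated as being about events, and a storyline's assets are derived by following its slots to the events they contain. The sketch below is plain Ruby with made-up identifiers (Asset9 and Event3 are invented to show an event the storyline leaves out), and assets_for_storyline is a hypothetical helper, not anything from the newsroom tooling.

```ruby
# Basic annotation layer: each asset is about an event.
ABOUT = {
  "Asset1" => "Event1",
  "Asset2" => "Event1",
  "Asset4" => "Event2",
  "Asset9" => "Event3" # an event this storyline does not select
}

# Editorial layer: a storyline selects events via its slots.
STORYLINES = {
  "Storyline1" => {
    "StorylineSlotA" => ["Event1"],
    "StorylineSlotB" => ["Event2"]
  }
}

# Hypothetical helper: derive a storyline's assets from its curated events,
# instead of tagging the assets with the storyline directly.
def assets_for_storyline(storyline)
  events = STORYLINES.fetch(storyline).values.flatten
  ABOUT.select { |_asset, event| events.include?(event) }.keys
end

puts assets_for_storyline("Storyline1").inspect
```

The assets never mention the storyline; a different journalist could curate the same events into a different storyline without touching the annotations.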

Over the coming months Paul and I will be looking at how we can implement this distinction into the (now well established) newsroom tagging workflow, to make sure that the semantic annotations we are making are as accurate and useful as possible.

 

An extract from ‘Yoga for You’ by Claude Bragdon

The mistake is to regard love as an emotion, a feeling, a quality. Love is an element, as air is an element. It is a white fire that burns to destruction, or warms or illuminates or promotes growth or promotes frequencies. By the strange fact that people separate different frequencies of manifestation into spiritual or material, it is a material element, for no element is more material than any other. Love is an element as earth is an element. It is a chemical of the body without which the body is nothing. Let the focused ray of attention regard love as an element and its workings will be made clear to you.

In the true nature of love is the true nature of healing. In the knowledge of love is the knowledge of creation. In the practice of love is the science of the communal living of man. In the breathing of love is the secret of initiation.

Love is the only element of salvation. The peril of today is as nothing to the peril of the future if people continue to neglect the element of love. With it, the onrush of science will make abundance and leisure for all. Without it, the new knowledge will make unemployment and starvation. Love and mathematics are one. Without the order of mathematics love is injustice and discrimination.

Piety is outmoded. The gods do not want people on their knees every morning and night and behaving like apes in the meantime. They want them on their feet with their hands up and their arms outstretched to the light, looking, seeing, finding out and understanding. They want universal love, not because it is pious, good, religious, sentimental, or even good policy, but because it is pure, living scientific law. We all take part in involuntary creation, just as our bodies are filled with involuntary activities of chemical actions and reactions; but the powers given us to undertake voluntary creation have only been discovered by a very few people, and many have come to grief by using them selfishly.

There are only two things in the universe, power and substance. Love is the power which creates from the substance. There is the fashioner and the fashioned. All parts of creation take part in creation.

querying RDF in Ruby with RDF.rb

I recently had to make a tool for work that allowed me to see the linked data graphs that BBC journalists are starting to create as they annotate news content. Ruby is my hacking language of choice so this blog post describes how I used @gkellogg’s RDF.rb library to:

  • fetch RDF graphs from the BBC’s linked data platform’s HTTPS API (via Restclient)
  • parse the data with RDF::Turtle::Reader
  • query it with RDF::Query and process the resulting Solutions

Disclaimer – I’m an amateur programmer so some of this may look horribly hacky to a Ruby or RDF expert; in my defence all I can say is that it works 🙂

getting data from the API

The BBC’s linked data platform sits behind a REST API that uses HTTPS and requires RSA cert authentication (the guys working on it plan a public API sometime soon, but for now its use is internal only). Using the restclient gem makes getting data from this kind of API pretty straightforward:

require 'restclient'

# client certificate and private key for the API's cert authentication
SSL = {
  :ssl_client_cert => OpenSSL::X509::Certificate.new(File.read("/path/to/my/client.crt")),
  :ssl_client_key  => OpenSSL::PKey::RSA.new(File.read("/path/to/my/client.key"))
}

# fetch the graphs for a thing, asking for Turtle
def getThingGraph(guid)
  url = "https://api.live.bbc.co.uk/ldp-writer/thing-graphs?guid=" + guid
  data = RestClient::Resource.new(url, SSL).get({:accept => "application/rdf+turtle"})
end
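
As an aside, the method above builds the URL by string concatenation, which is fine for well-formed GUIDs. If the guid ever came from less trusted input, it could be escaped first with Ruby's standard URI module. This is only a sketch: thing_graph_url and BASE_URL are hypothetical names, and the endpoint is simply the one used above.

```ruby
require 'uri'

BASE_URL = "https://api.live.bbc.co.uk/ldp-writer/thing-graphs"

# Hypothetical helper: build the thing-graphs URL with the guid escaped
# as a query parameter.
def thing_graph_url(guid)
  BASE_URL + "?" + URI.encode_www_form(:guid => guid)
end

puts thing_graph_url("ffc9b446-97b0-4cec-9f4f-dbd5d8238dad")
```

A plain GUID passes through unchanged, so the resulting URL is the same one the concatenation produces.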

So now getThingGraph gives me a String object that contains some RDF/Turtle graphs. For the sake of completeness, here’s an example of what the API response looks like:

<http://www.bbc.co.uk/things/ffc9b446-97b0-4cec-9f4f-dbd5d8238dad#id>
      a       <http://www.bbc.co.uk/ontologies/cms/ManagedThing> , <http://www.bbc.co.uk/ontologies/news/Person> ;
      <http://www.w3.org/2000/01/rdf-schema#seeAlso>
              <http://www.chucknorris.com/> ;
      <http://www.bbc.co.uk/ontologies/coreconcepts/disambiguationHint>
              "Carlos Ray 'Chuck' Norris (born March 10, 1940) is an American martial artist and actor. After serving in the United States Air Force, he began his rise to fame as a martial artist, and has since founded his own school, Chun Kuk Do." ;
      <http://www.bbc.co.uk/ontologies/coreconcepts/preferredLabel>
              "Chuck Norris" ;
      <http://www.bbc.co.uk/ontologies/coreconcepts/sameAs>
              <http://dbpedia.org/resource/Chuck_Norris> .

<http://www.bbc.co.uk/contexts/85390773-6985-49c9-aef1-ec3763f258ab#id>
      a       <http://www.bbc.co.uk/ontologies/provenance/ThingGraph> ;
      <http://www.bbc.co.uk/ontologies/provenance/provided>
              "2013-11-07T17:20:39+00:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
      <http://www.bbc.co.uk/ontologies/provenance/provider>
              <mailto:jeremy.tarling@bbc.co.uk> .

The next step was to read the response into an RDF graph that can be queried – “get me all the objects (values) of triples with the predicate <core:sameAs>” sort of thing.

reading the API response into an RDF graph

This is the bit that I got a bit stuck on. There are some great examples linked to from the RDF.rb page but none of them seemed to do exactly what I wanted, namely to work with the in-memory String object that restclient had made for me.

I ended up with a two-step process: first to read the string using RDF::Turtle::Reader

rdf_doc = RDF::Turtle::Reader.new(data)

and then to append the resulting RDF data to an RDF::Graph.new object so it could be queried with RDF::Query

graph = RDF::Graph.new << rdf_doc

The getThingGraph method now looks like this:

# assumes require 'rdf' and require 'rdf/turtle' alongside restclient
def getThingGraph(guid)
  url = "https://api.live.bbc.co.uk/ldp-writer/thing-graphs?guid=" + guid
  data = RestClient::Resource.new(url, SSL).get({:accept => "application/rdf+turtle"})
  rdf_doc = RDF::Turtle::Reader.new(data)
  graph = RDF::Graph.new << rdf_doc
end

which results in an object that can now be queried.

querying the graph and processing the results

The RDF::Query class allows you to define a query pattern. In my example I’m going to define a simple query that looks for any triples that have the predicate rdf:type – a useful thing to get an idea of the sort of data you are dealing with:

@thingType = RDF::Query.execute(graph, {
  :thing => {RDF::URI("http://www.w3.org/1999/02/22-rdf-syntax-ns#type") => :type}
})

Executing the query gets you an RDF::Query::Solutions object which has some nice methods for examining graph datasets. Note that that’s ‘Solutions’, not ‘Solution’ – in other words it’s a collection so you can iterate over each solution that matched your query. In my case I’m presenting the results in a Sinatra app so they surface via an erb template:

<% @thingType.each do |thing| %>
  <%= thing[:type] %>
<% end %>

And there you have it – in my example graph above the result lists three types: cms:ManagedThing and news:Person for Chuck, plus provenance:ThingGraph for the context resource that records the graph’s provenance.
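
Because the query pattern leaves the subject as a variable, it matches every subject in the response, so it can help to group the types by the thing they belong to. The sketch below assumes only that each solution answers [] with :thing and :type, as in the erb template above. To keep it runnable without the API, plain hashes of strings stand in for the RDF::Query::Solution objects, and types_by_thing is a hypothetical helper.

```ruby
# Hypothetical helper: group rdf:type values by their subject.
# Works on any enumerable whose items respond to [] with :thing and :type.
def types_by_thing(solutions)
  solutions.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |solution, grouped|
    grouped[solution[:thing].to_s] << solution[:type].to_s
  end
end

# Stand-in data mirroring the example API response.
solutions = [
  { :thing => "http://www.bbc.co.uk/things/ffc9b446-97b0-4cec-9f4f-dbd5d8238dad#id",
    :type  => "http://www.bbc.co.uk/ontologies/cms/ManagedThing" },
  { :thing => "http://www.bbc.co.uk/things/ffc9b446-97b0-4cec-9f4f-dbd5d8238dad#id",
    :type  => "http://www.bbc.co.uk/ontologies/news/Person" },
  { :thing => "http://www.bbc.co.uk/contexts/85390773-6985-49c9-aef1-ec3763f258ab#id",
    :type  => "http://www.bbc.co.uk/ontologies/provenance/ThingGraph" }
]

types_by_thing(solutions).each do |thing, types|
  puts "#{thing}: #{types.join(', ')}"
end
```

In the Sinatra app the same helper could be handed @thingType directly, since it only relies on each and [].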

The Song of Uriel

(excerpt from The Song of Sano Taro, by Nancy Fullwood)

Good people of the realm of Earth, I, Uriel, greet you on this evening of grace. I invite you into my realm of White Fire, and beg that you give your undivided attention while I tell a story that is difficult for aught but alchemists to comprehend.

When the gods, under the order of the King, attempted to fuse the seven primal Forces into a sphere of balanced Forces, the experiment failed, in that the Forces took control of their own motion and madness beset them. So vital were they that they heeded not the command to obey the law of harmony in the fusion of their forces, and as harmony is the one law of being, they became lost in chaos when promiscuity became the rule. The gods were sad indeed, for well they knew that these elemental children of the King would destroy themselves and the work which had consumed countless ages would have to be begun all over again. So they held a council, and the master alchemists of the King’s domain announced that there was only one hope of saving the situation. To be sure, the method which must be employed would cause the Forces great suffering, for the law has decreed that when harmony is not regarded, pain results.

The plan, which was decided upon in the council of the gods was this: first the vibratory motion of the mad forces must be lowered, or slowed, if you will, so that they would find themselves in what seemed narrow confines and limitations, because of their slow movement. When the gods succeeded in bringing them to the lowest vibratory movement possible, they had them well in hand. Bear in mind that these elemental children of the King were imbued with Free Will and power of reproduction, so it was a delicate matter to hold them in their right relation to one another. Sad to relate, they knew only lust, which caused them unrest and disintegration. Love was among them but they knew him not because of their lack of balance. Without Love there can be no cohesion. So he who is nameless, though he has been called by various names, each of which is a symbol of the essence of life, which the gods call White Fire and the people call Love, volunteered to give his essence, which is Love, to save the mad forces from destroying themselves. To accomplish this, he spread his essence over and through the dark mass, thus shedding his blood to save them. The alchemy of the gods is profound indeed.

Imagery is the wisest method in which to proceed. I place for your consideration a sphere of dark colour whirling in the other. The sphere was composed of seven elemental forces, separated one from the other. Now Love was in their midst, but only through their harmonious fusion with each other could they become conscious of Love. Love, who brings Hope as handmaiden, is always conscious of himself, so you sense the truth that Love offered himself as a conscious sacrifice to save the mad Forces. Ere this sacrifice could be completed, Love, whose vibration is high and swift, must lower his own movement and fuse himself with the negative Forces whirling in chaos, suffering with them the pain of limitation and lying with them in the grave of dense matter until they were quickened by his Fire, receiving thereby an impulse which will drive them on to perfection.

The days of creation are of great length. Three days passed ere Love’s seed was planted in the very heart of the Earth. Then, trembling with ecstasy, Love cried: “It is finished!” and the motion of the planet Earth was quickened and Earth shone with dawning light and Love rejoiced.

Be not dismayed at the lack of understanding of the people. I speak truth when I say that when it was known that Love had volunteered to lower his swift motion that the rebellious children might be saved, I, Uriel, volunteered to become the body of Love, the great unifying or cohesive principle in the universe. Only thus could Love’s swift motion contact the discordant forces, and through the crossing of them with his essence, bring about their transmutation by quickening their movement. So, clothed with Love’s fire, I fell like a burning faggot into chaos, pledged to remain Love’s channel of expression until he brought his sacrifice to completion, and the seven primal Forces were saved in spite of themselves.

Love’s burial in matter has been the keynote of all religions which have stirred the spirit of the people of Earth, and the profound belief, though often unconscious, in the resurrection of matter through Love has kept Hope alive within them, and when Hope lives within the people the gods are assured that they will seek understanding and find it. Adieu

I, Uriel, have spoken.

Storylines vs object oriented news

I’ve recently been collaborating with some like-minded colleagues in the BBC and other media organisations on a model for story-telling in News. Building on the work that the BBC has been doing to utilise Linked Data driven content aggregations, we wanted to look at how we might model the relationship between events as told by journalists. As Michael Smethurst has pointed out, the who/what/where/when aspect of reporting events gets you so far but leaves out the more interesting elements of ‘why’ and ‘because’:

“The more interesting part (for me) is the dependencies and correlations that exist between events because why is always the most interesting question and because the most interesting answer. Getting the Daily Mail and The Guardian to agree that austerity is happening is relatively easy, getting them to agree on why, and on that basis what should happen next, much more difficult.”

I’m often reminded by colleagues at work that the BBC has a reputation for quality factual reporting and impartiality, as if to suggest that the editorialisation of news is something that only goes on in newspapers. I don’t think the BBC is just a glorified wire service aggregator and publisher; while it’s true that a good deal of BBC content does come direct from the wires, there’s an inevitable process of editorial selection: what to leave in or out, the order to present reported events in, and the links to make between events. Also, a lot of content produced by BBC journalists doesn’t fit into a neat event model: features, analysis, even republishing the odd bit of celebrity gossip.

(As Jonathan Stray has pointed out, this is not a new trend in journalism but has been a developing theme over the past century.)

From a data architecture point of view I’m particularly interested in modelling news stories as data. For the past decade the BBC News website has been a flat, page-based website where a page is equal to an article. Not a story mind you, but an article about a story. You might find the odd article that’s a one-off but the great majority of articles on the BBC News website will be multiple accounts of the same story, retold as new developments occur. There are three problems with this approach:

  • duplication of content – because each article stands alone it has to re-tell the events that make up the story so far
  • duplication in search engines – search engines will index each article separately, so when someone searches for details about a story they may get the BBC’s latest account or they may not be so lucky – most likely they’ll see multiple articles about the same storyline
  • link curation scaling – links between articles that are about the same story have to be manually created and curated and immediately decay from the moment an article is published

The BBC is in the process of migrating its News website from static page publishing to a dynamic publishing platform based on a typical three-tier architecture: presentation – service – data. This was done for the BBC Sport website last year, and it’s particularly exciting as the data tier consists of both a content store (for articles) and a triple store that holds semantic annotations about the articles in the content store. The opportunity for a BBC News website running on this platform is to move from a page-based model of multiple articles about the same story to a story-driven model where journalists publish updates about storylines to the same (persistent) URL for that story. This was one of the motivations for us to collaborate on the Storyline Ontology.

storyline data model

So what is an update in this story-driven approach? From a web perspective I see an update as a fragment of a story, a development if you like.  Physically it’s an asset: some text, an image, an audio clip, a video clip, a social media status update, etc. Updates might be represented in a URL structure as bbc.co.uk/news/storylineID#updateID, which could be a useful pattern for a few reasons:

  • users of the website could share individual updates via social media
  • updates could be presented in context of the wider narrative – an item in a timeline for example
  • search engines should ignore the fragment identifier (the hash and everything after it) thereby only indexing the story page and removing the duplication that I mentioned above.
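
The fragment-identifier point can be illustrated with Ruby's standard URI module: the part after the hash is parsed separately from the path, so every update shares one story URL. This is just a sketch, and the storyline and update IDs below are made up for illustration.

```ruby
require 'uri'

# Hypothetical update URL following the storylineID#updateID pattern.
update_url = URI.parse("https://www.bbc.co.uk/news/storyline-123#update-42")

# The fragment identifies the individual update within the story page.
puts update_url.fragment

# Dropping the fragment gives the single story URL that all updates share.
story_url = update_url.dup
story_url.fragment = nil
puts story_url.to_s
```

The fragment is resolved by the browser rather than sent to the server, which is why one persistent story page can serve every update.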

But coming back to Michael’s point at the start of this post, it’s not just the updates about reported events that are interesting in a storyline, it’s the selection that drives the narrative thread and points to things like causality – the ‘why’ rather than the ‘what’.

There’s been a fair bit of buzz lately about how some news outlets are paring back news to its bare bones – a ‘just the facts’ approach – and that these ‘facts’ can be treated like objects and instanced into news accounts. Object-oriented news is not a new idea, and I can see the attraction in a short-form-social-media-status-update driven world. But I think there’s a risk in this approach that if we overemphasise these fact-objects out of the context of a narrative thread then they take on a life of their own.

Building facts into a storyline involves an editorial process that (should) ensure provenance, attribution and maybe one day even openness about the editorial process that the journalist went through. I was workshopping up in Birmingham last week with the England editorial crew and Eileen Murphy used this phrase that has stuck in my mind: ‘a window on the newsroom’. Anything that increases the transparency of our journalism can only be a good thing.