A Time of Beginnings and Transitions
Posted by Philip Fennell on 07 January 2014 07:49 AM
The month of January is named after Janus, the ancient Roman god of beginnings and transitions, who is depicted with two faces so that he may look forward into the new year as well as back into the one that is past. It is a time for reflection and for looking forward. With the introduction of MarkLogic 7 late last year, we find ourselves at a point of transition where we can look back at how we used to structure our information and look forward to 7’s Semantic Web functionality, which enables us to bring our information and applications into the realm of Linked Data.

Regardless of the form your data currently takes, whether highly structured relational, semi-structured XML or even unstructured text, the Resource Description Framework (RDF) model is very flexible and ideally suited to joining disparate datasets within an organization, describing metadata about existing assets, or identifying entities and their relationships found as a result of enriching plain text.

In assessing the possibilities, we have to ask ourselves some important questions about the information we have: is it already interlinked, how should we identify the entities (things) within it, what questions would we like to ask of it, and how do we create the kinds of relationships, both internally amongst the data and externally to third-party data, that will enrich its value and allow more useful and valuable applications to emerge from it?

Is your data interlinked?

How do you identify the things?

What questions are we asking?

“As an end-user I want a list of all the books by author X.” As you delve further into the Semantic Web technologies you will find, I think, that the sentence-oriented structure of RDF assertions (subject, predicate, object) aligns quite nicely with natural language, and this is also true of RDF’s query language, SPARQL.
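To make the subject, predicate, object idea concrete, here is a minimal sketch in plain Python rather than in MarkLogic or SPARQL itself. It stores a handful of triples as tuples and answers the "all the books by author X" request by pattern matching. The `example.org` URIs, the book titles, and the helper names `match` and `books_by` are invented for illustration; the Dublin Core `creator` and `title` properties are used only as plausible predicates, not as a recommendation from the post.

```python
# Hypothetical data: every URI and title below is invented for illustration.
EX = "http://example.org/"
DC = "http://purl.org/dc/terms/"

# Each assertion is a (subject, predicate, object) tuple.
triples = {
    (EX + "book/1", DC + "creator", EX + "author/X"),
    (EX + "book/1", DC + "title", "First Book"),
    (EX + "book/2", DC + "creator", EX + "author/X"),
    (EX + "book/2", DC + "title", "Second Book"),
    (EX + "book/3", DC + "creator", EX + "author/Y"),
    (EX + "book/3", DC + "title", "Another Book"),
}


def match(store, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def books_by(store, author):
    """Titles of all books whose creator is the given author, sorted."""
    titles = []
    for book, _, _ in match(store, p=DC + "creator", o=author):
        for _, _, title in match(store, s=book, p=DC + "title"):
            titles.append(title)
    return sorted(titles)


# Roughly the same question in SPARQL, against the same hypothetical data:
#   PREFIX dcterms: <http://purl.org/dc/terms/>
#   SELECT ?title WHERE {
#     ?book dcterms:creator <http://example.org/author/X> ;
#           dcterms:title ?title .
#   }
print(books_by(triples, EX + "author/X"))  # ['First Book', 'Second Book']
```

The requirement reads almost word for word as the pattern: "a book whose creator is X, and that book's title".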
With this in mind, the domain experts in your organization, or your customer’s, may find that this model makes for more effective communication of requirements between teams.

How should I assert a relationship?

It is therefore important to consider how the way you express relationships will support the querying of your data. Sometimes it is enough to use simple relationships, which are easy to query, like “who went to Washington”; at other times it will be necessary to qualify them so you can ask “find me all the people who made journeys to Washington in 1939”. The judgment is yours to make, but one of the nice things about RDF is that you can easily extend simple relationships by adding qualified ones without disrupting your existing data. In that respect, RDF is a very agile data model that allows your data to evolve as and when you need it to.

Beginning the transition

Happy New Year.
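To illustrate the earlier point about simple and qualified relationships, here is a minimal sketch in plain Python, not MarkLogic or SPARQL. It starts with simple "went to" triples, then adds journey resources that carry a year, leaving the original triples untouched. The namespace, the people `smith` and `jones`, the predicate names, and the years are all hypothetical.

```python
EX = "http://example.org/"  # hypothetical namespace

# Simple relationships: easy to assert and easy to query.
triples = {
    (EX + "smith", EX + "wentTo", EX + "Washington"),
    (EX + "jones", EX + "wentTo", EX + "Washington"),
}

# Qualified relationships: each journey becomes a resource of its own,
# so it can carry extra facts such as the year. These are added alongside
# the simple triples; nothing already in the store is changed.
triples |= {
    (EX + "journey/1", EX + "traveller", EX + "smith"),
    (EX + "journey/1", EX + "destination", EX + "Washington"),
    (EX + "journey/1", EX + "year", 1939),
    (EX + "journey/2", EX + "traveller", EX + "jones"),
    (EX + "journey/2", EX + "destination", EX + "Washington"),
    (EX + "journey/2", EX + "year", 1941),
}


def subjects(store, p, o):
    """All subjects that have the given predicate and object."""
    return {s for s, p2, o2 in store if p2 == p and o2 == o}


def objects(store, s, p):
    """All objects for the given subject and predicate."""
    return {o for s2, p2, o in store if s2 == s and p2 == p}


def who_went_to(store, place):
    """Answer the simple question using the simple relationship only."""
    return sorted(subjects(store, EX + "wentTo", place))


def who_journeyed_to(store, place, year):
    """Answer the qualified question using the journey resources."""
    journeys = (subjects(store, EX + "destination", place)
                & subjects(store, EX + "year", year))
    people = set()
    for journey in journeys:
        people |= objects(store, journey, EX + "traveller")
    return sorted(people)


print(who_went_to(triples, EX + "Washington"))
print(who_journeyed_to(triples, EX + "Washington", 1939))
```

The simple query keeps working after the journey triples are added, which is the sense in which the model can evolve without disrupting existing data.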