Thursday, July 10, 2008

Which data model and query language for content management?

The paper "What Goes Around Comes Around" by Michael Stonebraker and Joey Hellerstein surveys the data models and associated query languages proposed over the past decades, with a view to avoiding a repeat of history when new data models are invented.

As mentioned in the paper, there have been 9 major data model proposals:

Hierarchical (IMS): late 1960’s and 1970’s
Directed graph (CODASYL): 1970’s
Relational: 1970’s and early 1980’s
Entity-Relationship: 1970’s
Extended Relational: 1980’s
Semantic: late 1970’s and 1980’s
Object-oriented: late 1980’s and early 1990’s
Object-relational: late 1980’s and early 1990’s
Semi-structured (XML): late 1990’s to the present

It's a good read for getting a grip on how data models evolved and the lessons learned along the way.

The authors predict that XML will become popular as an 'on-the-wire' format and as a data movement facilitator (e.g. SOAP) because of its ability to get through firewalls. However, they are pretty pessimistic about XML as a data model in a DBMS, mainly because of its complex query language (XQuery), the complexity of XML Schema, and the fact that it has only a limited number of real applications (the schema-later approach for semi-structured data) that cannot be handled by object-relational DBMSs. It seems that if you don't KISS ;-) (Keep It Simple, Stupid), you are going to lose.
