Storing Objects in a Database
Source: comp.database.object, comp.object
Anders Vännman asked:
I'm working on a project where we need to map between our object model and a traditional relational/navigational database. We need to be able to connect to various database types, such as Btrieve, Informix, and DB2, via ODBC and so on. That is, we need total independence from the database.
Has anyone done this before, and do you have any clues? Right now we are considering writing the layers ourselves, but that takes time, and time is costly.
Robert C. Martin explained the problems:
The problem actually has very little to do with OO and RDB. The problem has to do with OO and DB. That is, the problem exists with any kind of database, including OO databases.
The reason is that applications are dynamic, and databases are static. There is a paradigm boundary that must be crossed when communicating between these two domains.
Consider. A database, any type of database, is a repository of information. An application, on the other hand, is a manipulator of information. One is static and the other is dynamic. The database is a representation of information, and the application is a representation of behavior. Two different paradigms.
In the case of the applications, our goal is to encapsulate the data and expose the behavior. In the case of the database, our goal is to expose the data and remove the behavior altogether. This is the boundary that is difficult to cross.
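The contrast can be made concrete with a small sketch. The names here (`Account`, `AccountRow`, `withdraw`, `can_afford`) are invented for illustration and do not come from the thread: the application-side class hides its balance and offers only behavior, while the database-side record is nothing but exposed fields.

```python
from dataclasses import dataclass


class Account:
    """Application-side object: the data is hidden, the behavior is exposed."""

    def __init__(self, opening_balance: int) -> None:
        self._balance = opening_balance  # private by convention

    def can_afford(self, amount: int) -> bool:
        return amount <= self._balance

    def withdraw(self, amount: int) -> None:
        # The rule lives with the data it protects.
        if not self.can_afford(amount):
            raise ValueError("insufficient funds")
        self._balance -= amount


@dataclass
class AccountRow:
    """Database-side record: the data is exposed, there is no behavior."""

    id: int
    balance: int
```

Anything that moves an `Account` into or out of an `AccountRow` has to reach across exactly the boundary described above.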
Now you might think that in an OO database, the objects in the database possess complex behavior, and so the database can encapsulate data and expose behavior just like an application. However, there are forces that prevent this from being an effective strategy.
The database is typically depended upon by many different applications. The applications are not depended upon by the database. Thus the applications are free to change without affecting the database, but the database cannot change without affecting all the applications that use it. This means that the database is extremely stable (i.e. difficult to change) and the applications are relatively unstable (easier to change).
This difference in stability is the crux of the problem. If the behaviors used by the applications are part of the database, then those behaviors are as difficult to change as the database. That is, when you change the behavior of an object in the database, you affect all the other applications that depend upon the database. The more applications that use the database, the more intolerable this becomes.
So the designers of the database are driven to move the behavior out of the database and into the applications, which reinforces the paradigm crossing.
Why is the paradigm crossing difficult? There are two reasons. First the modeling needs are reversed. We must cross from a domain in which data is hidden to a domain in which data is exposed. This by itself would be enough to make matters difficult. However, there is another more significant factor.
Databases are difficult to change, and yet we need to change them often! Sometimes we simply need to change the schema. Other times we need to rip out a DBMS and put in a new one, for example swapping Sybase for Oracle.
In order to allow that to happen, we have to protect the applications from the database. We have to build a layer in each application that hides the details of the database from the rest of the application. This layer is what bridges the paradigm boundary and protects the application from changes to the database.
The layer depends upon the database, and depends upon the application. The application does not depend upon the layer. (For details see Chapter 4 of my book: Designing Object Oriented C++ Applications using the Booch Method.) The main idea is to use the Proxy pattern, or some other design pattern that inverts the dependencies between the application and the layer.
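A minimal sketch of that dependency structure, assuming a single hypothetical `account` table and using Python's standard `sqlite3` module as a stand-in for whatever DBMS is actually in use. All names (`Account`, `AccountProxy`, `pay_bonus`, `demo`) are invented for illustration; this is not the design from the book. The application owns the abstract `Account` interface, and the proxy in the layer implements it, so the layer depends on both the application and the schema while the application depends on neither the layer nor the database.

```python
import sqlite3
from abc import ABC, abstractmethod


# --- Application side: no knowledge of the database or the layer ---
class Account(ABC):
    @abstractmethod
    def balance(self) -> int: ...

    @abstractmethod
    def deposit(self, amount: int) -> None: ...


def pay_bonus(account: Account, amount: int) -> None:
    """Application behavior, written purely against the abstraction."""
    if amount <= 0:
        raise ValueError("bonus must be positive")
    account.deposit(amount)


# --- Layer: depends on the application's interface AND on the schema ---
class AccountProxy(Account):
    """Stands in for one row of the (hypothetical) account table."""

    def __init__(self, conn: sqlite3.Connection, account_id: int) -> None:
        self._conn = conn
        self._id = account_id

    def balance(self) -> int:
        row = self._conn.execute(
            "SELECT balance FROM account WHERE id = ?", (self._id,)
        ).fetchone()
        return row[0]

    def deposit(self, amount: int) -> None:
        self._conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (amount, self._id),
        )
        self._conn.commit()


def demo() -> int:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
    account = AccountProxy(conn, 1)
    pay_bonus(account, 50)  # the application never sees SQL
    return account.balance()
```

In this sketch, swapping the DBMS or renaming a column would touch only `AccountProxy`, while changing the `Account` interface would also force the proxy to change; `pay_bonus` is affected only by the latter.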
This layer is horribly dependent. When the schema changes, the layer changes too. When the application changes, the layer changes too. The layer is a focus for maintenance. It breeds nightmares.
BTW, the difficulty in crossing this layer is not simply a result of using OO. Any well designed application in any paradigm has the same problems. Encapsulation has been a principle of software design for much longer than OO has been popular. In any application where behavior is exposed and data is encapsulated, the application/database problem arises. OO heightens the problem somewhat because it enforces the encapsulation to the point where it is difficult to break the rules, but it is not the cause of the problem. The cause of the problem is the desire for well designed applications.
Joshua Duhl suggested some commercial solutions:
My first bit of advice is don't build it yourself -- it will be very costly to build and to maintain. For the money that you'd put into building it yourself, you can buy something that will get you a lot farther, with far fewer (though not necessarily zero) problems. That being said, there are a bunch of products on the market which may help you get where you want to go. Depending on the product you select (assuming you select one :-), you may in fact have to do some tailoring of the solution to fit your needs exactly....
What follows is my personal FAQ answer for this question (having researched and presented on the topic numerous times... perhaps I can submit it to a FAQ site? :-)
There are a host of products available for connecting OO languages to relational and possibly non-relational data stores. Each product can offer a range of features. The list below outlines some of the possible features, but you will need to check with the vendors for specific support of these features. They include:
There are other, smaller products, but they are much less well known. Which one you want to use depends largely on what your application needs, how much flexibility you need, what scale your application is, and how much you expect the tool to do for you.
And I add:
KORMA, generates Java and COM components based on a relational database schema
FORM, Framework for Object-Relational Mapping
JOP, Java Object Persistence (free library)
Barry & Associates, ODBMS FAQ
Kyle Brown and Bruce G. Whitenack, Crossing Chasms: A Pattern Language for Object-RDBMS Integration
Wolfgang Keller and Jens Coldewey, Relational Database Access Layers - A Pattern Language
H. Vogelsang and U. Brinkschulte, Persistent Objects In A Relational Database (postscript)
Mark L. Fussell, Foundations of Object-Relational Mapping
Arthur M. Keller (from Persistence), Papers on object-oriented databases
Joseph W. Yoder, Joe's Object-Oriented Mapping Pages
Cetus links about Databases: Mapping Objects to Relations