Data Model for OO Applications
Source: firstname.lastname@example.org
Problem: How do you design a data model for object oriented applications? Should the data model be designed first, or should it be derived from the object model?
Aswinee K. Rath wrote:
[...] The business model of my client is changed and they want to rewrite their existing critical applications to suit the new business needs. [...]
The main architect for this project wants the data model for the applications to be created as the very first step. We are spending a lot of time designing a data (class) model diagram in Rational Rose, where each entity represents a logical table. The model essentially looks like the databases of the existing applications. My feeling is that we are not on the right path. [...]
H. S. Lahman replied:
The key word I see here is 'applicationS' (i.e., plural). Each application is likely to need data in a specific format. At least some applications are likely to hybridize persistent data with local data (e.g., from the user or from intermediate processing). Each application is also likely to have different accessing needs. So...
[...] Assuming the various applications will share the same database resources, I think they each represent a suite of requirements on the database. In a single RAD application the database effectively is the application. But when there are multiple applications, the database becomes a service to the applications. As such, its design is driven by the applications' requirements for data.
Therefore, I think one has to examine the data requirements of the applications first. Since the applications are changing in this case, one needs to do at least some analysis on the applications first.
This is not to say, though, that the database is solely driven by the individual applications. It has its own concerns about performance and scalability that depend upon the interactions among applications. So eventually I think some compromises will have to be made between the applications and the database. Essentially that means there will be negotiation between the applications as a group and the data model.
(Depending upon complexity, access volume, and whatnot, this might result in a middleware layer between the applications and the database that provides its own reformatting, distributed caching, and other exotic services.)
Good application design usually isolates persistence within a package or layer that has a generic interface for the application based upon the application's specific needs. That package or layer understands the semantics of the particular persistence mechanisms. Depending upon how differently the application and database view data, that package or layer may be simple or complex.
What is in that client-side package or layer will be the result of a negotiation with the database developers. But that negotiation is a complex one that requires a systems engineering overview of all the applications. The requirements for the database itself should come out of that vision just as the requirements on the applications' persistence packages or layers should come out of that vision.
Bottom line: as described, I see this as a systems engineering problem that needs to start with the applications' data requirements; the database data model should not be driving the architecture.
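The persistence package or layer described above might look roughly like the following minimal Python sketch. Everything in it is an assumption made for illustration: the `Customer` type, the `CustomerStore` interface, the `SqliteCustomerStore` class, and the `customer` table layout are hypothetical and do not come from the thread. The point is only that the application codes against an interface shaped by its own needs, while a single class knows how the database actually stores the data and reformats between the two views.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Customer:
    """Application-side view of a customer (hypothetical example type)."""
    customer_id: int
    full_name: str


class CustomerStore(Protocol):
    """Interface shaped by what the application needs, not by the tables."""
    def save(self, customer: Customer) -> None: ...
    def find(self, customer_id: int) -> Optional[Customer]: ...


class SqliteCustomerStore:
    """The only class that knows the schema (assumed: separate name columns)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customer ("
            "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
        )

    def save(self, customer: Customer) -> None:
        # Reformat the application's single name field into two columns.
        first, _, last = customer.full_name.partition(" ")
        self._conn.execute(
            "INSERT OR REPLACE INTO customer (id, first_name, last_name) "
            "VALUES (?, ?, ?)",
            (customer.customer_id, first, last),
        )

    def find(self, customer_id: int) -> Optional[Customer]:
        row = self._conn.execute(
            "SELECT id, first_name, last_name FROM customer WHERE id = ?",
            (customer_id,),
        ).fetchone()
        if row is None:
            return None
        # Map the table's view back to the application's view.
        return Customer(row[0], f"{row[1]} {row[2]}".strip())


# Application code depends only on the CustomerStore interface.
store: CustomerStore = SqliteCustomerStore(sqlite3.connect(":memory:"))
store.save(Customer(1, "Ada Lovelace"))
found = store.find(1)
```

If the shared database later changes how it stores names, only `SqliteCustomerStore` would need to change in this sketch; how simple or complex that class becomes depends on how differently the application and the database view the data.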
Robert C. Martin added:
How do you know that the data model is correct until you have a program that uses that data model to get the job done? How do you know that your data model is not too complex unless you incrementally build that model based upon getting high priority features to work?
If it were me, I'd be trying to get the behaviors of the system working one feature at a time. I'd be migrating the database schema for each new feature. I'd be evolving the system from very humble and simple beginnings, adding each new feature in the order that the stakeholders want.
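One way to picture "migrating the database schema for each new feature" is a small ordered list of schema changes, applied one at a time as features are delivered. The sketch below is only an illustration under stated assumptions: the `orders` table, the two features, and the `migrate` helper are hypothetical, and SQLite's `PRAGMA user_version` is used here simply as a convenient place to record which migrations have already run. It is not presented as the approach any poster in the thread actually used.

```python
import sqlite3

# Hypothetical migrations, one per delivered feature, applied in order.
MIGRATIONS = [
    # Feature 1: record orders.
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)",
    # Feature 2: track order status.
    "ALTER TABLE orders ADD COLUMN status TEXT NOT NULL DEFAULT 'new'",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply migrations not yet recorded; return the resulting version."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for number, statement in enumerate(MIGRATIONS, start=1):
        if number > version:
            conn.execute(statement)
            # Record progress so a rerun skips this migration.
            conn.execute(f"PRAGMA user_version = {number}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
version = migrate(conn)
conn.execute("INSERT INTO orders (item) VALUES ('widget')")
status = conn.execute("SELECT status FROM orders").fetchone()[0]
```

Adding a third feature would mean appending one more statement to `MIGRATIONS`; running `migrate` again applies only what is new, so the schema grows in the same order as the features.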
Kyle Brown and Bruce G. Whitenack, Crossing Chasms - A Pattern Language for Object-RDBMS Integration
ootips, Storing Objects in a Database