The same data can be represented by many UML models.
Most fundamental is the model of the problem domain itself, independent of any implementation technology. This is the purest and simplest form of the model, because it deliberately ignores the details of implementation technologies in order to focus on the data itself. Of course there will often be differences of opinion about the best way to express the data in the domain model, so you can have more than one domain model.
For every implementation technology of interest (such as SQL or XML Schema) you can create a UML model that describes the implementation rather than the pure data. These implementation models let you describe an implementation in detail. As an example, XML Schema supports “sequence”, “choice”, and “all” content models, which are typically described by additional nodes in a UML diagram that uses the UML Profile For XML. Many people prefer to work at this very concrete level, because they already know the implementation technology and so find the model comfortable. There are at least three reasons why this level of concreteness is actually counterproductive:
The difference in complexity between an implementation model and a domain model is illustrated below. The top diagram is a fragment of a model reverse engineered from a functional relational database. The second diagram was derived from the first by deleting those elements that a machine (with reasonable presumptions) can deduce. I know which of the two I’d rather work with.
Please note that a focus on a UML domain model does not prevent working with XML Schema. In fact it makes it easier to develop not only an XML Schema, but a JSON schema, an SQL schema, an ASN.1 syntax, and many other concrete descriptions as well. When all the technology models are programmatically derived from the fundamental domain model, they necessarily describe exactly the same data, which makes it easier for diverse implementations to interact.
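To make the idea of programmatic derivation concrete, here is a minimal sketch in Python rather than UML tooling. Everything in it is an assumption made for illustration: the `Entity` and `Attribute` classes stand in for a domain model, the type mappings are hypothetical, and the `Person` example is invented. It shows only the principle that a JSON Schema and an SQL table definition generated from one source cannot disagree about the data.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """One attribute of a domain entity (hypothetical stand-in for a UML property)."""
    name: str
    type: str              # abstract domain type: "string", "integer", or "boolean"
    required: bool = True


@dataclass(frozen=True)
class Entity:
    """One domain entity (hypothetical stand-in for a UML class)."""
    name: str
    attributes: tuple


# Assumed mappings from abstract domain types to each technology's types.
JSON_TYPES = {"string": "string", "integer": "integer", "boolean": "boolean"}
SQL_TYPES = {"string": "VARCHAR(255)", "integer": "INTEGER", "boolean": "BOOLEAN"}


def to_json_schema(entity):
    """Derive a JSON Schema document (as a dict) from the domain entity."""
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": entity.name,
        "type": "object",
        "properties": {a.name: {"type": JSON_TYPES[a.type]} for a in entity.attributes},
        "required": [a.name for a in entity.attributes if a.required],
        "additionalProperties": False,
    }


def to_sql_ddl(entity):
    """Derive a CREATE TABLE statement from the same domain entity."""
    columns = []
    for a in entity.attributes:
        null = " NOT NULL" if a.required else ""
        columns.append(f"  {a.name} {SQL_TYPES[a.type]}{null}")
    return f"CREATE TABLE {entity.name} (\n" + ",\n".join(columns) + "\n);"


# Invented example entity; both outputs below come from this single definition.
person = Entity(
    "Person",
    (Attribute("name", "string"), Attribute("age", "integer", required=False)),
)

if __name__ == "__main__":
    print(json.dumps(to_json_schema(person), indent=2))
    print(to_sql_ddl(person))
```

Adding another target, such as XML Schema or ASN.1, would mean adding one more generator function while the domain definition stays untouched.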
Of course concrete UML models could be derived from the domain model, just as schemas can. Query/View/Transformation (QVT) can be used for that purpose. I’ll wish good luck to anybody who wants to go down that path.
Diagrams representing a concrete UML model of SQL (top) and a purer abstract domain model (bottom).
Of course there will be folks who claim that it’s the other way around: that the concrete model is the signal and the domain model is the noise. Anybody with this perspective ought to explain which concrete model is the signal. The SQL/XML/JSON/YAML proponents are welcome to duke it out among themselves.