Wednesday, October 25, 2006

Entity Modelling 101

This post came by the MSDN architecture forum and I thought it was a good 20 second thought for the day...

I have a really tough data modeling problem, and I'm dumbfounded. I hope there's somebody with an idea.

The problem really deals with aquatic habitat and vegetation, but I'll frame it in a more familiar context.

It's a very common scenario. An employee works for a company. The employee is assigned to a project.

Company --> Project --> Client
:               :
:               :

Employee --> Consultant

If you add this "consultant" relation between employee and project, isn't that a circular reference? Doesn't that violate like 4th normal form or something?

Is there anyone with any ideas?


There are a bunch of ways to go on this. Are we talking about database design or object model? It doesn't really matter in the abstract I guess...

You've defined your entities just fine from what I can see but the magic isn't in the entities. The real work happens in the relationships between entities and I think that's what you may be missing. The best description I can give is the classic: "Is-A v/s Has-A". Here goes my 101 explanation:

Company - HAS A - Project
Project - HAS A - Client
Company - HAS A - Employee (semantics aside...)
Consultant - HAS A - Project
Employee - IS A - Consultant

HAS A implies an ownership or property relationship. So the Company object would have a Project object property or Projects collection. Project would then have a Client object property etc.

IS A implies an inheritance relationship. So a Consultant object would inherit from the Employee object because an Employee could end up fulfilling the role of Consultant. Vice President would also inherit from Employee for the same reason but the base Employee functions would always be available. Also, does a Consultant have a Project or does a Project have a Consultant? Perhaps there is a context that you must come in with depending upon the situation. e.g. I am a Consultant and I have many Projects but each Project has one or more Consultants. Depending upon which direction you are viewing from, you could have a Project with a collection of Consultants or a Consultant with a collection of Projects. Both are correct depending upon what question you are planning to ask of the model.

The language is important here because you may decide that Client - HAS A - Project and this would dramatically change the way you created your model.

No comments: