This is part 5 of the series; part 4 is here.
On yesterday's blog entry I described an issue I found with the Rails ActiveRecord framework. Here is a summary:
Let's say that in my domain model I have a School and a Teacher, and both have an Address. Good database design suggests that you have separate tables for School, Teacher and Address, and that entries in the School and Teacher tables contain a key to their respective Address. Here's an article on IBM DeveloperWorks that says that same thing; see the section called "Complex Datatypes".
To recap, there's a one-to-one mapping between School and Address, a one-to-one mapping between Teacher and Address. The School and Teacher tables each contain a key to an Address.
In addition, I'd like there to be a dependency relationship between School and Address so that when a School is deleted, its associated Address is also automatically deleted. I'd also like the same dependency to exist between Teacher and Address.
Ruby on Rails includes a framework called ActiveRecord that directly supports access to relational databases without having to write any custom code. A scaffolding tool generates Ruby classes based on the names of your SQL tables, and the runtime system introspects SQL tables to add the appropriate accessors to the Ruby classes. The only thing you have to do is to "annotate" the Ruby classes with words like "has_one", "belongs_to" and "has_many" to indicate the relationships between the entities.
The current version of ActiveRecord requires that if model A has a dependency on model B, then class A must "belong_to" class B and the table for A must contain the key to B. In my case, since Address is dependent on School, it means that I'd have to put a key to the School into my Address table, which violates good database design principles.
Since I don't want to warp my database design to fit a current limitation of Rails, I added a few lines of custom code to my School and Teacher classes to implement the desired dependency. If you're interested, here's the code from School:
class School < ActiveRecord::Base
belongs_to :address # yuch# use callback hook to implement dependency
before_destroy { |m| m.address.destroy if m.address }# when new address is assigned, delete old one
def address=(new_address)
address.destroy if address
new_address.save if new_address
self[:address_id] = new_address.id
end
end
Clearly, this is not ideal. It would not be hard for Rails to directly support my common use case so that no custom code is needed.
Tomorrow I'm going to post some ideas on how to improve and simplify ActiveRecord.
Part 6 of this series is here.
Hi,
I've probably misunderstood what you ment to do, but couldn't you have said that a School has_one address and a Teacher has_one address?
Therefore in your Address model you'd have something like:
belongs_to :teacher
belongs_to :school
This would mean you would have to have a school_id and teacher_id column in the address table, so I'm not really sure this would bring you any closer to what so may say as "good" database design, but thats why Rails is considered opinionated software.
On the plus side in moving the belongs_to relationship around you can do the following in you Teacher model:
has_one :address, :dependent => :destroy
Which upon deletion of the teacher would delete the associated address row in the DB. The same can be done for the school as well.
Again apologies if I may have missunderstood what you originally ment.
Posted by: Jonathan Conway | Nov 27, 2005 at 03:45 PM
Hi Jonathan,
You're right, I *could* have done that, but it would mean violating good database design principles; the Address table should not contain keys to things like Teacher and School; it should be the other away around.
This particular database design principle is not something leading edge or controversial; it's the kind of thing normally taught in "Schema Design 101".
I like the Rails philosophies in general, which is one of the reasons I'm using it for my new application. But I do think that ActiveRecord should support common and correct (from a DB design perspective) use cases, which is why I'm spending some time on analyzing this issue and trying to come up with a way to improve Rails.
Regards,
Graham
Posted by: Graham Glass | Nov 27, 2005 at 04:15 PM
Hi Graham,
I totally understand where you're coming from. I suppose I've just grown so used to the ways of Activerecord to the point where I just accept it.
I look forward to any improvements you bring to Activerecord.
Cheers
Jon
Posted by: Jonathan Conway | Nov 28, 2005 at 05:54 AM
Okay, so what happens (hypothetically) if a school and a teacher have the same address? If that's not supposed to happen, how is that constraint evident in your database design?
Posted by: Tom | Nov 28, 2005 at 06:42 AM
Hi Tom,
I would not share the address entries; even if they were the same, I would have one copy for the School and one for the Teacher.
Regards,
Graham
Posted by: Graham Glass | Nov 28, 2005 at 09:39 AM
This is a great series of posts. At some point I'd be interested to hear what you think about developing in PHP vs. Rails. There's plenty of people who have told their story of transitioning from Java to Rails, but very little from the PHP world.
It would be interesting, for example (once your port is done) to compare the lines of code require in each case, and the approximate amount of development time for each (this one is a little unfair since a port is not the same a development from scratch).
I'm asking because I collect information like this for use in my Rails presentations.
Also, I'm the guy who is responsible for the installer that wouldn't let you select a path. :-( This is a long-standing problem with the NSIS installer that I use, and I haven't yet figured out a way to work around it (or what I might be doing in my installer script to cause it). I'm considering moving to a different open source installer.
Finally (and a little off topic), your Thailand trip cuaght my attention because my wife is from Thailand and we own a couple houses there (in the far northeast of Thailand). So, I was curious where you've been staying in Thailand.
Curt
Posted by: Curt Hibbs | Nov 28, 2005 at 12:49 PM
You mention that separating out the addresses is 'good database design'. I don't necessarily agreee with that statement.
There is nothing that says you HAVE to break out the address into a separate table and there is nothing saying that it is good database design to do so.
Other than the fact that the addresses have been segregated into a separate table, what other purpose does it serve?
Will it be easier to modify the data? Will it be easier to lookup the data? And the big one...will there be less storage space required?
I would agrue that you have now introduced more problems that you are solving.
You've created joins where none were necessary and introduced possible referential integrity issues. In addition, since the associations are all one-to-one, you are not saving on storage space (you will still have one address per school or teacher).
The advantages to keeping the address information in their respective tables (schools and teacher) is that you have eliminated the need to deal with unecessary associations.
Just because you have two tables that contain address information doesn't necessarily mean that you have to normalize.
I am also trying to figure out why you would want to delete an address, thereby deleting the school or student via association. That sounds backwards to me. Wouldn't you just delete the school or student itself?
So basically, you're not gaining any advantage by normalizing the addresses and are just making more work for your code/database to deal with.
This sounds like over-normalization to me.
Posted by: Chris Hall | Nov 29, 2005 at 10:26 AM
Hi Chris,
See the section called "Complex Datatypes" in the following article:
http://www-128.ibm.com/developerworks/web/library/wa-dbdsgn1.html
Regards,
Graham
Posted by: Graham Glass | Nov 29, 2005 at 12:16 PM
Hi, I haven't used RubyOnRails, however I readed fast your comments (and looking to maybe learn something by replying).
I saw in action thing you said would be bad design (like address) and linking the child to their respective parent (if a teacher you use teacher_id, if a school, you use school_id, etc. Higher generalization might even subject to use the object_id it belongs to, but ).
Maybe it is a matter of taste (or maybe I think it wrong), but I prefer to think of "object" then the table to persist my object.
And when persisting your object, you could do it in a lot of way (example: when doing generalisation and specification you may want to persist all in one table, one table per classe, one table per final classes, etc.)
Maybe when using some rule, you may choose to always use the same strategy (e.g. one table per classe) and I guess RubyOnRail give you a guideline on how thing should be done (and that is why it may be efficient, maybe I am totally wrong here ;-)).
And when you say good db design principile, are you saying it should be teacher.address_id? And if your Teacher could have multiple address would you do something like teacher, teacher_address(teacher_id, address_id), address? and the same with student, student_adress, address?
I won't say I have a strong opinion now, but any comments that will enlight it ;-) Of course there is always multiple way that it can be done...
If an object can have multiple children, example invoice 1---* invoice_items, I suppose you would persist them into invoice and then put the key into invoice_item.invoice_id.
Often when I think of this, I often say to myself there is no "one correct way" for those. But I would like to know why persisting in a your table model with address.teacher_id and address.school_id is really wrong? That is just your persistence structure. Actually you could have multiple class persisting in the same table (in that case something like [even if that won't be useful new object, I'm just trying to illustrate, hehe] AddressOfTeacher extends Address, AddressOfSchool extends Address, etc.)
If you had to support multiple address for teacher, what will be your preferred&accepted solution(s) [i.e. how would you refactor your Class and Table Model]?
Is my assumption about how you would persist invoice, invoice_items correct?
(I know those may sound like basics questions, but sometime it is those that are the most important ;-))
Posted by: Michael Muryn | Nov 30, 2005 at 01:26 PM
"But I would like to know why persisting in a your table model with address.teacher_id and address.school_id is really wrong?"
Michael: its wrong :) why? because when you scale up the types of "owner" objects, in this case schools and teachers, there is a heap of useless columns created for each row in the address table.
So an address for teacher also has an unused "schools", "students", "gym", "staff" etc.
Posted by: Ned Collyer | Dec 05, 2005 at 05:24 PM
Its good database design for the following pratical reason.
If you have a separate address table, it enables you to more easily have a one-many relationship btn a teacher and one or more addresses (shipping address and billing address, for example). This sort of thing is common in ecommerce systems & couldn't be implemented if the address was part of the teacher table. A school can have more than one address too!
Also, it makes your application more flexible - it may not be an ecommerce system at the moment, but it might evolve to be one. Also, say you want to geocode your address data at a later date, would it not be cleaner to add this sort of data to an address table (& then run a routine against that table) than store it against a teacher/school table?
Or am I barking up the wrong tree?
Cheers,
Eddie
Posted by: Eddie | Feb 18, 2007 at 08:38 AM