Database schema for a product with multiple categories and hierarchical categories

The way I am planning to implement this in the backend is:

Is this the right way to go?

One of my concerns with this approach: new categories won't be added very frequently, but whenever we do add one, we will have to make a whole lot of updates in the productID-categoryID table, since the category IDs would now be pointing to something different!

Is there a better way to deal with this?

EDIT: Modified preorder traversal is better known as the Nested Set model.

asked Dec 19, 2011 at 18:00 by brainydexter

2 Answers

What you are proposing is a good solution for your requirement of M:N products to categories and hierarchical categories.

You need to do two things to ensure that you don't have a lot of updates in your intersection table.

First, you need to be sure that your categories have a stable, persistent primary key.

Second, you need to link food items to leaf categories. Don't join Cherry to Red, Healthy, Fruit and Food - just join it to Red and Healthy. Your nested sets take care of all of the secondary (and higher level) associations.
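The two points above can be sketched roughly as follows, using Python's built-in `sqlite3` module. The table and column names (`category`, `product_category`, `lft`, `rgt`), the ID values, and the exact shape of the hierarchy (Food > Fruit > Red, with Healthy directly under Food) are assumptions made for illustration; they are not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id   INTEGER PRIMARY KEY,   -- stable surrogate key, never renumbered
    name TEXT NOT NULL,
    lft  INTEGER NOT NULL,      -- nested-set left number
    rgt  INTEGER NOT NULL       -- nested-set right number
);
CREATE TABLE product (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE product_category (  -- M:N intersection table
    product_id  INTEGER NOT NULL REFERENCES product(id),
    category_id INTEGER NOT NULL REFERENCES category(id),
    PRIMARY KEY (product_id, category_id)
);
""")

# Assumed hierarchy: Food > Fruit > Red, and Food > Healthy.
conn.executemany("INSERT INTO category VALUES (?, ?, ?, ?)", [
    (10, "Food", 1, 8),
    (20, "Fruit", 2, 5),
    (30, "Red", 3, 4),
    (40, "Healthy", 6, 7),
])
conn.execute("INSERT INTO product VALUES (1, 'Cherry')")
# Cherry is linked only to its leaf categories.
conn.executemany("INSERT INTO product_category VALUES (?, ?)", [(1, 30), (1, 40)])

def categories_of(product_id):
    """Return the linked categories plus all of their ancestors, ordered by lft."""
    rows = conn.execute("""
        SELECT DISTINCT anc.lft, anc.name
        FROM product_category pc
        JOIN category leaf ON leaf.id = pc.category_id
        JOIN category anc  ON anc.lft <= leaf.lft AND anc.rgt >= leaf.rgt
        WHERE pc.product_id = ?
        ORDER BY anc.lft
    """, (product_id,)).fetchall()
    return [name for _lft, name in rows]

print(categories_of(1))
```

In this sketch the intersection table holds only two rows for Cherry, yet the query still reports Fruit and Food, because the left/right range comparison recovers the ancestors.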

answered Dec 19, 2011 at 21:42 by Joel Brown

Thanks for the input. However, what do you think about the concern I mentioned? New categories won't be added very frequently, but whenever we do add one, the left and right values in the nested set model will need to be updated, and hence the referenced categoryIDs will need to be updated in the productID-categoryID table!

Commented Dec 20, 2011 at 7:12

Absolutely. The assumption behind visitation numbers (as Joe Celko calls them) is that you read them much more than you write them, and so they are worthwhile. You have two options: (1) integer left/right numbers, which mean the whole nested set is recalculated whenever any change is made to the categories; or (2) float left/right numbers, which let you restrict updates to the affected sub-tree by using fractional values. I'd recommend option 1 unless you have so many category updates that it becomes a performance issue. I'd guess that it won't be a problem.

Commented Dec 20, 2011 at 13:22
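To make option 1 concrete, here is a minimal sketch of adding a category with integer left/right numbers, again using Python's `sqlite3`. It assumes the intersection table references a stable `id` column, as the answer recommends, so only `lft`/`rgt` values in the category table move. The `add_child` helper, the table layout, and the sample IDs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)"
)
# Assumed starting tree: Food > Fruit > Red, and Food > Healthy.
conn.executemany("INSERT INTO category VALUES (?, ?, ?, ?)", [
    (10, "Food", 1, 8),
    (20, "Fruit", 2, 5),
    (30, "Red", 3, 4),
    (40, "Healthy", 6, 7),
])

def add_child(parent_id, new_id, name):
    """Insert a new last child under parent_id, shifting lft/rgt to make room."""
    (parent_rgt,) = conn.execute(
        "SELECT rgt FROM category WHERE id = ?", (parent_id,)
    ).fetchone()
    # Open a gap of two numbers at the parent's old right edge.
    conn.execute("UPDATE category SET rgt = rgt + 2 WHERE rgt >= ?", (parent_rgt,))
    conn.execute("UPDATE category SET lft = lft + 2 WHERE lft > ?", (parent_rgt,))
    # The new node occupies the gap; existing ids are untouched.
    conn.execute(
        "INSERT INTO category VALUES (?, ?, ?, ?)",
        (new_id, name, parent_rgt, parent_rgt + 1),
    )

add_child(20, 50, "Yellow")
rows = conn.execute("SELECT id, name, lft, rgt FROM category ORDER BY lft").fetchall()
for row in rows:
    print(row)
```

The left/right numbers of Food, Fruit and Healthy change, but every existing `id` stays the same, so rows in a product-to-category table keyed on `id` would not need to be rewritten.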

Cool. That helps. I am anticipating infrequent category updates, so I think I'll stick with option 1. Thanks.

Commented Dec 20, 2011 at 13:33

Another option is the Closure table. See this question for explanation: What is the most efficient/elegant way to parse a flat table into a tree? Just to clarify: I don't mean that this is better than the Nested Set approach.

Commented Jul 1, 2012 at 8:53

@ypercube - Quite so. Visitation numbers and closure tables both work more or less equally well. Depending on what your needs are, I think each approach has small advantages. Closure tables can definitely be better if you need distance between nodes since this can be recorded directly in the table.
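For comparison, a closure table that records distance directly might look something like the sketch below (Python `sqlite3`). The names `category_closure`, `ancestor`, `descendant`, `depth` and the `add_category` helper are illustrative assumptions, not something specified in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE category_closure (
    ancestor   INTEGER NOT NULL REFERENCES category(id),
    descendant INTEGER NOT NULL REFERENCES category(id),
    depth      INTEGER NOT NULL,   -- number of edges between the two nodes
    PRIMARY KEY (ancestor, descendant)
);
""")

def add_category(cat_id, name, parent_id=None):
    """Insert a category and the closure rows linking it to all its ancestors."""
    conn.execute("INSERT INTO category VALUES (?, ?)", (cat_id, name))
    # Every node is its own ancestor at distance 0.
    conn.execute("INSERT INTO category_closure VALUES (?, ?, 0)", (cat_id, cat_id))
    if parent_id is not None:
        # Inherit all of the parent's ancestors, one step further away.
        conn.execute("""
            INSERT INTO category_closure (ancestor, descendant, depth)
            SELECT ancestor, ?, depth + 1
            FROM category_closure
            WHERE descendant = ?
        """, (cat_id, parent_id))

add_category(10, "Food")
add_category(20, "Fruit", parent_id=10)
add_category(30, "Red", parent_id=20)

# Ancestors of Red together with their distance from it.
ancestors = conn.execute("""
    SELECT c.name, cc.depth
    FROM category_closure cc
    JOIN category c ON c.id = cc.ancestor
    WHERE cc.descendant = ?
    ORDER BY cc.depth
""", (30,)).fetchall()
print(ancestors)
```

Adding a category here only inserts new rows and never updates existing ones, at the cost of storing one row per ancestor-descendant pair.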