
Amazon Web Services DynamoDB, MongoDB, Couchbase, and MarkLogic are all examples of NoSQL databases that may offer a better way to store product catalog information.
How an ecommerce site stores and accesses product information can have a significant impact on site performance.
Online retail businesses, small and large, may benefit from a popular database technology that could offer a better way to store product information. To help understand this technology, which is called NoSQL, it can help to describe just how complex product information can be.
The Product Attribute Problem
Many — if not most — ecommerce platforms store product information in a relational database, where information is kept in tables similar to a spreadsheet. Imagine, as an example, a product listing for a small, red t-shirt. The shirt might have a unique ID, a stock keeping unit (SKU), a title, a size, and a color.
ID | SKU | title | size | color |
---|---|---|---|---|
1 | POI0987654321 | Awesome Shirt | s | red |
Each of these five attributes would take up a column in the database table. A row in the table represents the complete set of product information.
For a site that sells only t-shirts, this might work well. You could imagine additional rows in the database table representing additional shirts, say the “Awesome Shirt,” in four sizes and two colors.
ID | SKU | title | size | color |
---|---|---|---|---|
1 | POI0987654321 | Awesome Shirt | s | red |
2 | POI0987654322 | Awesome Shirt | m | red |
3 | POI0987654323 | Awesome Shirt | l | red |
4 | POI0987654324 | Awesome Shirt | xl | red |
5 | POI0987654325 | Awesome Shirt | s | blue |
6 | POI0987654326 | Awesome Shirt | m | blue |
7 | POI0987654327 | Awesome Shirt | l | blue |
8 | POI0987654328 | Awesome Shirt | xl | blue |
The database query to display this product information would be relatively simple to write and fast to load. Not much of a problem. But what if this t-shirt retailer also wants to sell pants?
Pants have a different set of attributes. Size is not measured in small, medium, and large, but typically in waist and inseam.
To solve this problem, one could add columns to the table.
ID | SKU | type | title | size | waist | inseam | color |
---|---|---|---|---|---|---|---|
1 | POI0987654321 | shirt | Awesome Shirt | s | red | ||
2 | POI0987654322 | shirt | Awesome Shirt | m | red | ||
3 | POI0987654323 | pants | Neat Pants | 30 | 30 | black | |
4 | POI0987654324 | pants | Neat Pants | 30 | 31 | black |
This will work, but it may not scale well. Imagine what would happen if the retailer added more product types.
- Belts. Need a length attribute and a material attribute.
- Hats. Need a hat type attribute (i.e., snap back or elastic).
- Shoes. Need an attribute for lace length and the number of islets.
- Stickers. Need a height, width, and adhesive type attribute.
- Music. Need an artist attribute and a way to store a list of tracks.
- Movies. Need a director attribute and attributes for actors, the movie’s rating, the year it was made, and more.
All this, and in the example, there is not yet an attribute for price, sale price, cost, weight, shipping dimensions, or similar.
Very quickly, the ecommerce product database table would be hundreds or thousands of columns wide. What’s worse, most of the columns would be empty. The small, red t-shirt would still need just five or so columns, leaving thousands blank. This single table solution can be expensive to host since it wastes memory.
An ecommerce solution could eliminate the wasted space if it assigned each new product type its own table.
There would be a product table just for t-shirts.
ID | SKU | title | size | color |
---|---|---|---|---|
1 | POI0987654321 | Awesome Shirt | s | red |
2 | POI0987654322 | Awesome Shirt | m | red |
A separate table would store information for pants.
ID | SKU | title | waist | inseam | color |
---|---|---|---|---|---|
3 | POI0987654323 | Neat Pants | 30 | 30 | black |
4 | POI0987654324 | Neat Pants | 30 | 31 | black |
This solution would work, but it would be very hard to manage.
For every new type of product, there would need to be a new, custom database table. The ecommerce platform would need to keep track of all of these tables, including understanding how they relate to each other.
Showing a list of products from a single manufacturer might take several database queries or “joins,” which are combinations of columns from different tables.
A third solution to the product attribute problem would be to implement an entity, attribute, value system — EAV. Popular ecommerce platforms like Magento take this approach.
EAV solutions frequently use multiple tables to store portions of the product information. Here’s the entity portion.
product ID | SKU | title |
---|---|---|
1 | POI0987654321 | Awesome Shirt |
2 | POI0987654322 | Awesome Shirt |
3 | POI0987654323 | Neat Pants |
4 | POI0987654324 | Neat Pants |
A different table would store the value for individual attributes.
product ID | attribute | value |
---|---|---|
1 | size | s |
1 | color | red |
2 | size | m |
2 | color | red |
3 | waist | 30 |
3 | inseam | 30 |
3 | color | black |
4 | waist | 30 |
4 | inseam | 31 |
4 | color | black |
Here two tables are storing the product information, but there are EAV database solutions for product catalogs that include three, four, five, or more separate database tables. This approach can make accessing product information quite slow. Nonetheless, EAV can be the best solution for relational databases, to manage a product catalog.
NoSQL for Ecommerce Product Information
NoSQL — non-relational databases — take a different approach to storing information. There are no tables of structured columns and rows. In a sense, NoSQL trades consistency for availability, speed, and flexibility.
Although non-relational databases or key-value pair databases have been around for decades, NoSQL has recently become popular with large Internet businesses, including Google, Facebook, eBay, and Amazon. These companies need to access significant amounts of complex information very quickly.
NoSQL architecture differs among implementations. But in general, NoSQL will store all of an item’s data in a single container.
Thus, one product equals one container (sometimes called an item or a document) in the database.
Sometimes, these items or documents will be in JSON format or similar, meaning that it will look very familiar to web developers.
For an ecommerce product, there will typically be a unique item ID, a set of common product attributes, and some attributes that are unique to a particular product. The small red shirt might look like the following.
{ unique_id: 'HY8765NBg6yYT77nBhNkln543NjNhYTR', sku: 'POI0987654321', type: ‘t-shirt’, title: 'Amazing Shirt'
attributes: { size: 's', color: 'red' } }
A particular pair of pants would have some things in common with the shirt, but in NoSQL there would be the flexibility to have different attributes.
{ unique_id: '99iIiGtg6GHGjfj776098JjHGtgfffhhh', sku: 'POI0987654323', type: 'pants', title: 'Neat Pants'
attributes: { waist: '30', inseam: '30', color: 'black' } }
With this approach, a retailer could sell any sort of product it wants without having to modify a database table or figure out how to keep track of the different sorts of attributes. A belt would simply be another container in the database.
{ unique_id: 'jjfJlkjsdfALKJljJLKDJ877KJNBB', sku: 'BE1888477', type: 'belt', title: 'Studded Belt'
attributes: { size: '30', length: '30', color: 'black', material: 'leather' } }
In this way, NoSQL is much more flexible. It is simply easier to store and manage product data. What’s more, querying for an individual product can be faster. This is especially true as the number of products in the database increases.
NoSQL Considerations
An online store using a SaaS ecommerce platform may not be able to take advantage of NoSQL. But retailers using licensed ecommerce platforms or custom solutions may find that NoSQL is a better choice for product catalog information.
There are plenty of applications for which relational databases make more sense, but NoSQL can be a better solution for product information.