TutorialMar 12, 202610 min read

Step-by-Step: Building an E-Commerce Database Schema from Scratch

Designing the database for an e-commerce application is one of the most common and most instructive database design exercises. In this guide, we'll build a complete e-commerce schema from the ground up β€” starting with the core entities and progressively adding the features that real stores need.

Designing the database for an e-commerce application is one of the most common and most instructive database design exercises. Whether you're building a Shopify competitor, a marketplace, or a simple online store, the core data model follows patterns that every developer should understand.

In this guide, we'll build a complete e-commerce schema from the ground up β€” starting with the core entities and progressively adding the features that real stores need. By the end, you'll have a production-ready data model with proper relationships, indexes, and design decisions explained.

The Core Entities

Every e-commerce application needs these fundamental tables. Let's start here and build outward.

Users

The users table is the foundation. At minimum, you need an id (primary key), email (unique, for authentication), password_hash (never store plain passwords), name, created_at, and updated_at. Depending on your auth strategy, you might also include email_verified_at, remember_token, and avatar_url.

A key design decision: should you separate authentication data from profile data? For simple applications, keeping them in one table is fine. For larger systems, splitting into users (auth) and profiles (display data) gives you more flexibility β€” especially if you later add OAuth providers or SSO.

Products

Products need an id, name, slug (URL-friendly version of the name), description (text/HTML for rich content), price (stored in cents as an integer to avoid floating-point issues), compare_at_price (for showing discounts), sku, stock_quantity, is_active (boolean to control visibility), weight and dimensions (for shipping calculations), and timestamps.

Store prices as integers in the smallest currency unit (cents for USD, pence for GBP). Using DECIMAL(10,2) works too, but integer math eliminates floating-point surprises entirely. Whichever you choose, be consistent across all tables.

Categories

Categories typically use a self-referencing structure: id, name, slug, description, parent_id (foreign key to the same table, nullable for root categories), and sort_order. The parent_id pattern creates a tree structure β€” "Electronics > Phones > Smartphones" β€” without needing a separate table for each level.

Products and categories have a many-to-many relationship (a product can be in multiple categories, and a category contains many products), so you'll need a junction table: product_categories with product_id and category_id.

Orders

The orders table captures the transaction: id, user_id (foreign key), status (enum: pending, processing, shipped, delivered, cancelled, refunded), subtotal, tax, shipping_cost, total, currency, shipping_address_id, billing_address_id, notes, placed_at, and timestamps.

Notice that we store the computed totals (subtotal, tax, total) directly on the order rather than calculating them from line items every time. This is intentional β€” prices, taxes, and discounts can change after an order is placed, but the order record should reflect what the customer actually paid.

Order Items (Line Items)

This is where orders connect to products: id, order_id (foreign key), product_id (foreign key), variant_id (foreign key, nullable), quantity, unit_price, total_price, product_name, product_sku.

Critical design decision: we store product_name, product_sku, and unit_price directly on the order item, even though we also reference product_id. Why? Because products change. A product might be renamed, repriced, or deleted after an order is placed. The order item must preserve what the customer actually bought at the time of purchase. The product_id foreign key is useful for analytics and linking, but the denormalized fields are the source of truth for the order.

Adding Product Variants

Most real e-commerce stores need product variants β€” a t-shirt comes in different sizes and colors, a laptop comes with different RAM and storage configurations.

Product Variants

Create a product_variants table: id, product_id (foreign key), sku (unique per variant), price (overrides product price if set), stock_quantity, weight, is_active, and timestamps.

Variant Options

For the attributes themselves (Size: S/M/L/XL, Color: Red/Blue/Green), you need two tables. An option_types table stores the attribute names: id, name (like "Size" or "Color"), sort_order. An option_values table stores the possible values: id, option_type_id (foreign key), value (like "Large" or "Red"), sort_order.

Then a junction table variant_option_values connects variants to their option values: variant_id and option_value_id. A "Large Red T-Shirt" variant would have two entries in this table β€” one linking it to "Large" and one linking it to "Red."

Addresses

Users need shipping and billing addresses, and they might have multiple of each.

Create an addresses table: id, user_id (foreign key), type (enum: shipping, billing, or both), is_default (boolean), first_name, last_name, address_line_1, address_line_2, city, state, postal_code, country_code (ISO 3166-1 alpha-2), phone, and timestamps.

Orders reference addresses via shipping_address_id and billing_address_id. Like with order items, consider whether to snapshot the address data on the order (in case the user later updates their address) or rely on the foreign key reference. For compliance and accuracy, snapshotting is safer β€” either by copying fields to the order or by making addresses immutable (creating a new record instead of updating).

Payments

The payments table tracks payment attempts and results: id, order_id (foreign key), payment_method (enum: credit_card, paypal, stripe, etc.), gateway_transaction_id (the ID from Stripe, PayPal, etc.), amount, currency, status (enum: pending, authorized, captured, failed, refunded), paid_at, metadata (JSONB for gateway-specific data), and timestamps.

An order can have multiple payment records β€” a failed attempt followed by a successful one, or a partial refund. This is why payments are a separate table rather than columns on the order.

Shopping Cart

There are two common approaches to cart design. Database-backed carts store cart data in your database: a carts table (id, user_id nullable for guest carts, session_id for guest carts, timestamps) and a cart_items table (id, cart_id, product_id, variant_id, quantity, timestamps). This approach supports persistent carts (user adds items on mobile, sees them on desktop) and allows you to analyze abandoned carts.

Session-based carts store cart data in the user's session or local storage and only create database records at checkout. This is simpler but loses the ability to track and recover abandoned carts.

For most e-commerce applications, database-backed carts are worth the extra complexity because abandoned cart recovery is a significant revenue driver.

Reviews and Ratings

A reviews table: id, product_id (foreign key), user_id (foreign key), rating (integer 1-5), title, body, is_verified_purchase (boolean β€” did this user actually buy this product?), is_approved (boolean β€” for moderation), helpful_count, and timestamps.

Add a unique constraint on (product_id, user_id) to prevent duplicate reviews. The is_verified_purchase flag is populated by checking whether the user has a completed order containing this product.

Coupons and Discounts

A coupons table: id, code (unique), type (enum: percentage, fixed_amount, free_shipping), value (the discount amount or percentage), minimum_order_amount, maximum_discount_amount (cap for percentage discounts), usage_limit (total uses allowed), usage_count (current uses), per_user_limit, starts_at, expires_at, is_active, and timestamps.

A coupon_usages table tracks who used what: id, coupon_id, user_id, order_id, discount_amount, used_at. This lets you enforce per-user limits and audit discount usage.

For product-specific coupons, add a junction table coupon_products with coupon_id and product_id.

Wishlists

Simple but valuable: a wishlists table with id, user_id, product_id, variant_id (nullable), and created_at. Add a unique constraint on (user_id, product_id, variant_id) to prevent duplicates.

Indexes You Shouldn't Forget

Beyond the automatic indexes on primary keys and foreign keys, add indexes on products.slug (unique, used in URLs), products.sku (unique, used in admin), products.is_active (filtered frequently), orders.user_id (users viewing their orders), orders.status (admin filtering), orders.placed_at (reporting, sorting), order_items.product_id (sales reporting), reviews.product_id (product page display), coupons.code (unique, looked up at checkout), and addresses.user_id (user's address book).

For PostgreSQL, consider partial indexes: CREATE INDEX idx_active_products ON products(id) WHERE is_active = true β€” this creates a smaller, faster index that only covers active products.

The Complete Schema Overview

At the end, your e-commerce schema has approximately 15-18 tables. The core transaction flow is: User β†’ Cart β†’ Order β†’ OrderItems β†’ Payment. Supporting entities branch out from there: Products with Variants, Categories, Reviews, Addresses, Coupons.

The relationships form a clear hierarchy. Users sit at the center, connecting to Orders, Addresses, Reviews, Wishlists, and Carts. Products connect to Categories (many-to-many), Variants, Reviews, and OrderItems. Orders connect to OrderItems, Payments, and Addresses.

Building This in ER Flow

You can create this entire schema visually in ER Flow. Start by creating the core entities (Users, Products, Orders, OrderItems), then progressively add supporting tables. Use ER Flow's visual grouping to organize tables by domain (core commerce, user management, product catalog, etc.) and color-code them for clarity.

Once your schema is complete, use ER Flow's checkpoint system to snapshot the design, then generate migrations for your target database (PostgreSQL, MySQL) or framework (Laravel, Phinx).

If you're using an AI coding assistant like Cursor, you can describe each section of this schema in natural language and let the AI create the tables through ER Flow's MCP Server β€” watching them appear on your canvas in real-time.