Best Practices · Mar 28, 2026 · 8 min read

Top 10 Database Design Mistakes Developers Make (and How to Avoid Them)

Bad database design doesn't announce itself on day one. It shows up six months later when queries take 30 seconds, when a "simple" feature requires refactoring half the schema, or when you realize you've been storing prices as floats and your financial reports don't add up.

These are the ten most common mistakes we see developers make when designing databases, and how to avoid each one.

1. Not designing before coding

The most expensive mistake happens before any SQL is written: jumping straight into code without thinking through the data model. Developers create tables as they need them, add columns reactively, and end up with a schema that reflects the order they built features rather than the logical structure of the domain.

The fix is straightforward: spend 30 minutes with an ER diagram before writing any code. Map out your entities, relationships, and core attributes. You don't need perfection β€” you need a starting point that captures the major structural decisions. A visual tool makes this faster than you think. You can design a 20-table schema in ER Flow in under 30 minutes, and the time saved from not having to refactor later is measured in days.

2. Using the wrong data type for money

Storing prices, totals, or any financial values as FLOAT or DOUBLE is a classic mistake. Floating-point arithmetic is inherently imprecise: 0.1 + 0.2 equals 0.30000000000000004 in most languages. Over millions of transactions, rounding errors accumulate into real financial discrepancies.

Use DECIMAL(10,2) for currency values, or store amounts as integers in the smallest currency unit (cents). A product priced at $29.99 is stored as 2999. Integer math is exact, and you convert to display format only at the presentation layer.
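The drift is easy to demonstrate. A quick sketch in Python, with Decimal and integer cents as the exact alternatives (the amounts here are illustrative):

```python
from decimal import Decimal

# Floating-point money drifts: sum $0.10 a million times.
float_total = sum(0.1 for _ in range(1_000_000))
print(float_total)  # close to, but not exactly, 100000.0

# Exact alternatives: Decimal, or plain integers in cents.
decimal_total = sum(Decimal("0.10") for _ in range(1_000_000))
cents_total = sum(10 for _ in range(1_000_000))  # 10 cents per row

print(decimal_total)      # exactly 100000.00
print(cents_total / 100)  # convert to dollars only at the display layer
```

The integer-cents approach has the added advantage of working identically in every language and database that touches the value.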

3. Not indexing foreign keys

Every foreign key column should have an index. When you query all orders for a user (WHERE user_id = 123), the database scans the entire orders table if user_id isn't indexed. Some databases (like MySQL with InnoDB) create indexes on foreign keys automatically. PostgreSQL does not.

Beyond foreign keys, add indexes on any column you frequently filter, sort, or join on: status, created_at, email, slug. But don't over-index: every index slows down writes and consumes storage. Index the columns your queries actually use, not every column that might theoretically be queried.
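To see the difference an index makes, inspect the query plan. A minimal sketch using SQLite (which, like PostgreSQL, does not index foreign keys automatically); the table names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         user_id INTEGER REFERENCES users(id));
""")

query = "SELECT * FROM orders WHERE user_id = ?"

# Without an index, the planner has to scan the whole orders table.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query, (123,)).fetchone()[-1]
print(plan_before)  # a full scan of orders

# Indexing the foreign key turns the scan into an index lookup.
conn.execute("CREATE INDEX idx_orders_user_id ON orders(user_id)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query, (123,)).fetchone()[-1]
print(plan_after)  # the plan now names idx_orders_user_id
```

The same EXPLAIN habit works in PostgreSQL and MySQL, and it's the quickest way to confirm a query is actually using the index you created.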

4. Storing repeated data instead of using relationships

When a developer stores customer_name, customer_email, and customer_phone directly on every order instead of referencing a customers table, they've created a denormalization problem. When the customer updates their email, you need to update it in every order. If you miss one, your data is inconsistent.

The fix is normalization: store each fact in one place and use foreign keys to reference it. Orders have a customer_id that points to the customers table. The customer's email exists in exactly one row.

There are legitimate reasons to denormalize: storing the product name and price on order items (because the product might change after the order), or caching computed values for performance. But denormalize intentionally, not accidentally. Know when you're making a trade-off versus when you're creating a mess.
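Here is the normalized shape in miniature, with SQLite standing in for your database and illustrative table names. One UPDATE on the customers row is immediately visible from every order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total_cents INTEGER NOT NULL
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'ada@example.com')")
conn.executemany("INSERT INTO orders VALUES (?, 1, ?)",
                 [(1, 2999), (2, 1499)])

# The email lives in exactly one row, so one UPDATE fixes every order.
conn.execute("UPDATE customers SET email = 'ada@newmail.example' WHERE id = 1")

rows = conn.execute("""
    SELECT o.id, c.email
    FROM orders o JOIN customers c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
print(rows)  # every order sees the new address
```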

5. Using a single "God table"

We've all seen it: a data table or a records table with 50+ columns, where different types of records use different subsets of columns. A row might be a user, a product, or an order, distinguished by a type column. Most columns are NULL for any given row.

This happens when developers try to avoid creating new tables. The result is a schema that's impossible to reason about, impossible to index efficiently, and a nightmare to maintain.

Every distinct entity deserves its own table. If users, products, and orders have different attributes and different lifecycles, they should be separate tables. The slight overhead of creating a new table and migration is trivial compared to the long-term cost of a God table.

6. Ignoring many-to-many relationships

When two entities have a many-to-many relationship (students enrolled in courses, products in categories, users with roles), some developers try to store it as a comma-separated list: categories = "1,2,5,12". This makes querying nearly impossible (how do you find all products in category 5?) and violates first normal form.

The correct approach is a junction table: product_categories with product_id and category_id columns, each with a foreign key. This is queryable, indexable, and allows additional attributes on the relationship (like sort_order or added_at).
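A junction-table sketch (SQLite, illustrative names) showing that "all products in category 5" becomes a plain join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_categories (
        product_id  INTEGER REFERENCES products(id),
        category_id INTEGER REFERENCES categories(id),
        sort_order  INTEGER,  -- extra attribute on the relationship itself
        PRIMARY KEY (product_id, category_id)
    );
""")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(1, "Desk"), (2, "Lamp"), (3, "Chair")])
conn.execute("INSERT INTO categories VALUES (5, 'Office')")
conn.executemany("INSERT INTO product_categories VALUES (?, ?, ?)",
                 [(1, 5, 1), (3, 5, 2)])

# "All products in category 5" is a join, not string parsing.
names = [r[0] for r in conn.execute("""
    SELECT p.name
    FROM products p
    JOIN product_categories pc ON pc.product_id = p.id
    WHERE pc.category_id = 5
    ORDER BY pc.sort_order
""")]
print(names)
```

The composite primary key also prevents duplicate links for free, something a comma-separated string can never guarantee.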

7. Not planning for soft deletes

Deleting a user from the database seems simple until you realize that their orders, comments, reviews, and other associated data either get cascade-deleted (losing business-critical records) or leave orphaned foreign keys.

Soft deletes (adding a deleted_at timestamp column that marks records as deleted without actually removing them) solve this cleanly. A non-null deleted_at means the record is "deleted." Your application filters these out by default, but the data remains intact for historical records, compliance, and audit trails.

Plan for this from the start. Adding soft deletes to an existing schema with millions of rows and dozens of queries is significantly harder than including deleted_at from day one.
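The mechanics fit in a few lines. A sketch with SQLite, using a TEXT timestamp for deleted_at (your database's native timestamp type is the better choice in production):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id         INTEGER PRIMARY KEY,
        email      TEXT NOT NULL,
        deleted_at TEXT  -- NULL means the row is live
    );
""")
conn.executemany("INSERT INTO users (id, email) VALUES (?, ?)",
                 [(1, "a@example.com"), (2, "b@example.com")])

# "Delete" by stamping the row instead of removing it.
conn.execute("UPDATE users SET deleted_at = datetime('now') WHERE id = 2")

# Default queries filter the stamped rows out...
live = conn.execute(
    "SELECT id FROM users WHERE deleted_at IS NULL").fetchall()
# ...but the data is still there for audits and restores.
total = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(live, total)
```

Many ORMs (Rails, Laravel, SQLAlchemy extensions) can apply the deleted_at IS NULL filter automatically, which is why wiring this in from day one is so cheap.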

8. Storing JSON for everything

PostgreSQL's jsonb and MySQL's JSON column types are powerful, but they're not a replacement for proper schema design. Storing structured data as JSON means you lose type safety: a "price" inside JSON could be a string, number, or null. You can't enforce constraints: no foreign keys, no NOT NULL, no unique constraints inside JSON. Index support is limited: GIN indexes help, but they're not as efficient as B-tree indexes on typed columns. And schema changes become invisible: anyone can add or remove a key in the JSON, with no migration or documentation.

JSON columns are excellent for truly unstructured data: user preferences, API response caches, metadata that varies per record. They're a poor choice for data that has consistent structure and relationships; that data belongs in properly typed columns and related tables.
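As a sketch of the dividing line (using SQLite and its built-in json_extract; the column names are illustrative): typed columns carry the structured facts and constraints, while the JSON blob holds only the free-form part.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id          INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE,  -- constraints work on typed columns
        preferences TEXT                   -- JSON: shape varies per user
    )
""")
prefs = {"theme": "dark", "digest": "weekly"}
conn.execute("INSERT INTO users VALUES (1, 'a@example.com', ?)",
             (json.dumps(prefs),))

# You can still reach inside the blob when you need to.
theme = conn.execute(
    "SELECT json_extract(preferences, '$.theme') FROM users WHERE id = 1"
).fetchone()[0]
print(theme)
```

If you ever find yourself writing json_extract in most of your queries, that's the signal the data had consistent structure all along and deserves real columns.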

9. Not considering query patterns during design

A normalized schema is theoretically correct, but if your most common query requires joining 8 tables, you have a performance problem. Database design should balance normalization with the queries your application actually runs.

Before finalizing your schema, list your ten most common queries. If a critical query requires expensive joins across many tables, consider strategic denormalization: add a computed column, create a materialized view, or restructure the tables to reduce join depth.

This is a design-time decision, not a "we'll optimize later" task. Adding a total_orders_count column to the users table during initial design is easy. Adding it after a year of production data requires a migration that updates millions of rows.
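One way to keep such a denormalized counter honest is to maintain it inside the database itself. A sketch using a SQLite trigger (the same idea works with triggers or application-level hooks in PostgreSQL and MySQL; the names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id                 INTEGER PRIMARY KEY,
        total_orders_count INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id)
    );
    -- Keep the denormalized count in sync on every insert.
    CREATE TRIGGER orders_count_ai AFTER INSERT ON orders
    BEGIN
        UPDATE users SET total_orders_count = total_orders_count + 1
        WHERE id = NEW.user_id;
    END;
""")
conn.execute("INSERT INTO users (id) VALUES (1)")
conn.executemany("INSERT INTO orders (user_id) VALUES (?)",
                 [(1,), (1,), (1,)])

count = conn.execute(
    "SELECT total_orders_count FROM users WHERE id = 1").fetchone()[0]
print(count)
```

A production version would also handle DELETE (and possibly UPDATE of user_id), but the point stands: the read stays a single-row lookup instead of a COUNT over the orders table.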

10. No naming conventions

Inconsistent naming makes schemas confusing: some tables are singular (user), others plural (orders). Some columns use camelCase (firstName), others use snake_case (first_name). Foreign keys are sometimes userId, sometimes user_id, sometimes fk_user.

Pick a convention and stick to it. The most common in the industry: plural, snake_case table names (users, order_items); singular, snake_case column names (first_name, created_at); foreign keys named singular_table_id (user_id, order_id); and boolean columns prefixed with is_ or has_ (is_active, has_verified).

Document your convention and enforce it in code reviews. A consistent schema is dramatically easier to work with than one where every table follows different rules.

Prevention Is Cheaper Than Cure

All of these mistakes share a common trait: they're easy to avoid at design time and expensive to fix later. Spending an hour designing your schema visually, seeing all tables, relationships, and data types in one view, catches most of these issues before they become embedded in production code and data.

Tools like ER Flow make this design step fast and collaborative. Visual design surfaces structural problems (missing relationships, denormalization issues, missing indexes) that are invisible in a linear list of migration files. And with features like AI-assisted design and migration generation, the design step doesn't slow down your development; it accelerates it.