Database Fundamentals - 3/4

From Rigid Tables to Flexible Data Structures

You’ve mastered SQL databases, complex JOINs, and ACID transactions. You can design normalized schemas, write efficient queries, and maintain data integrity under concurrent load. Your relational database skills are solid. But here’s the reality that every modern developer faces: not all data fits neatly into rows and columns, and not all applications need the overhead of strict consistency.

The SQL limitation reality check:

-- Your beautiful normalized user profile system
CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  email VARCHAR(255) NOT NULL
);

CREATE TABLE user_profiles (
  user_id INTEGER REFERENCES users(id),
  first_name VARCHAR(50),
  last_name VARCHAR(50),
  bio TEXT
);

CREATE TABLE user_addresses (
  user_id INTEGER REFERENCES users(id),
  address_type VARCHAR(20),
  street VARCHAR(255),
  city VARCHAR(100),
  country VARCHAR(100)
);

-- But then business requirements evolve:
-- "Users need custom profile fields"
-- "Support social media links (unknown number of them)"
-- "Store user preferences (different for each user type)"
-- "Add location data for mobile users"
-- "Support JSON resume data"

-- Your rigid schema becomes a maintenance nightmare
ALTER TABLE user_profiles ADD COLUMN twitter_handle VARCHAR(255);
ALTER TABLE user_profiles ADD COLUMN github_profile VARCHAR(255);
ALTER TABLE user_profiles ADD COLUMN linkedin_url VARCHAR(255);
-- ...endless ALTER statements for every new field

The schema flexibility problem:

// Real user data doesn't fit rigid structures
const userProfiles = [
  {
    email: "photographer@example.com",
    profile: {
      displayName: "Sarah Photos",
      portfolio: {
        featured: ["img1.jpg", "img2.jpg"],
        categories: ["wedding", "portrait"],
        equipment: ["Canon 5D", "85mm lens"],
      },
      socialMedia: {
        instagram: "@sarahphotos",
        facebook: "sarah.photography",
      },
    },
  },
  {
    email: "developer@example.com",
    profile: {
      username: "codemaster",
      skills: ["JavaScript", "Python", "React"],
      projects: [
        {
          name: "E-commerce App",
          tech: ["Node.js", "MongoDB"],
          github: "github.com/user/ecommerce",
        },
      ],
      contributions: {
        openSource: 45,
        stackOverflowRep: 15420,
      },
    },
  },
];

// Each user has completely different profile structure!
// How do you handle this with rigid SQL tables?

The uncomfortable truth: Modern applications deal with semi-structured data, rapid schema evolution, massive scale requirements, and varied consistency needs that SQL databases weren’t designed to handle efficiently.

Real applications need databases that handle:

  • Flexible schemas that adapt as requirements change without migrations
  • Horizontal scaling across multiple servers for massive datasets
  • Semi-structured data like JSON documents, logs, and user-generated content
  • High-performance scenarios where eventual consistency is acceptable
  • Specialized access patterns like caching, real-time messaging, and graph relationships

This article introduces you to the NoSQL universe: document databases for flexible schemas, key-value stores for high-performance caching, column-family databases for massive datasets, and graph databases for relationship-heavy applications. You’ll learn when to choose each type and how to model data without tables.


The NoSQL Landscape: Different Problems, Different Solutions

Document Databases: JSON-Native Storage

Document databases store data as JSON-like documents, making them perfect for applications that work with nested, variable-structure data:

// MongoDB: Natural JSON document storage
const db = client.db("blog");
const users = db.collection("users");

// Insert flexible user documents
await users.insertOne({
  email: "tech_blogger@example.com",
  profile: {
    displayName: "Tech Blogger",
    bio: "Full-stack developer sharing knowledge",
    avatar: "https://example.com/avatar.jpg",
    socialLinks: {
      twitter: "@tech_blogger",
      linkedin: "linkedin.com/in/tech-blogger",
      github: "github.com/tech-blogger",
      website: "techblog.dev",
    },
    skills: ["JavaScript", "Python", "Node.js", "React"],
    experience: [
      {
        company: "TechCorp",
        position: "Senior Developer",
        startDate: new Date("2020-01-15"),
        endDate: null,
        technologies: ["Node.js", "React", "MongoDB"],
      },
      {
        company: "StartupXYZ",
        position: "Full Stack Developer",
        startDate: new Date("2018-06-01"),
        endDate: new Date("2019-12-31"),
        technologies: ["Python", "Django", "PostgreSQL"],
      },
    ],
  },
  settings: {
    notifications: {
      email: true,
      push: false,
      sms: true,
    },
    privacy: {
      profilePublic: true,
      showEmail: false,
      allowMessages: true,
    },
    theme: "dark",
  },
  posts: [
    {
      title: "Introduction to NoSQL",
      slug: "intro-nosql",
      content: "NoSQL databases offer flexibility...",
      tags: ["database", "nosql", "mongodb"],
      publishedAt: new Date("2024-01-20"),
      stats: {
        views: 1250,
        likes: 89,
        comments: 23,
      },
    },
  ],
  createdAt: new Date(),
  updatedAt: new Date(),
});

// Query documents with flexible criteria
const bloggers = await users
  .find({
    "profile.skills": { $in: ["JavaScript", "Python"] },
    "posts.stats.views": { $gte: 1000 },
  })
  .toArray();

// Update nested fields without schema changes
await users.updateOne(
  { email: "tech_blogger@example.com" },
  {
    $set: {
      "profile.location": "San Francisco",
      "profile.timezone": "PST",
      "settings.language": "en",
    },
    $push: {
      posts: {
        title: "Advanced MongoDB Queries",
        slug: "advanced-mongodb",
        publishedAt: new Date(),
      },
    },
  }
);

MongoDB aggregation for complex queries:

// Complex analytics pipeline
const userEngagementStats = await users
  .aggregate([
    // Match active users
    { $match: { "settings.notifications.email": true } },

    // Unwind posts array to work with individual posts
    { $unwind: "$posts" },

    // Group by user and calculate stats
    {
      $group: {
        _id: "$_id",
        email: { $first: "$email" },
        displayName: { $first: "$profile.displayName" },
        totalPosts: { $sum: 1 },
        totalViews: { $sum: "$posts.stats.views" },
        totalLikes: { $sum: "$posts.stats.likes" },
        avgViews: { $avg: "$posts.stats.views" },
        mostPopularPost: {
          $max: {
            title: "$posts.title",
            views: "$posts.stats.views",
          },
        },
      },
    },

    // Calculate engagement rate
    {
      $addFields: {
        engagementRate: {
          $divide: ["$totalLikes", "$totalViews"],
        },
      },
    },

    // Sort by engagement rate
    { $sort: { engagementRate: -1 } },

    // Limit results
    { $limit: 10 },
  ])
  .toArray();

Key-Value Databases: High-Performance Simplicity

Key-value stores provide the fastest possible data access for simple get/set operations:

// Redis: In-memory key-value store
const redis = require("redis");
const client = redis.createClient();

// Simple key-value operations
await client.set(
  "user:123:session",
  JSON.stringify({
    userId: 123,
    username: "alice",
    role: "premium",
    loginTime: Date.now(),
    lastActivity: Date.now(),
  }),
  "EX",
  3600
); // Expires in 1 hour

const session = JSON.parse(await client.get("user:123:session"));

// Advanced Redis data structures
// Lists for activity feeds
await client.lpush(
  "user:123:feed",
  JSON.stringify({
    type: "post_liked",
    postId: 456,
    likedBy: "bob",
    timestamp: Date.now(),
  })
);

// Keep only latest 100 activities
await client.ltrim("user:123:feed", 0, 99);

// Sets for user relationships
await client.sadd("user:123:following", "456", "789", "321");
await client.sadd("user:456:followers", "123");

// Check if users follow each other
const isFollowing = await client.sismember("user:123:following", "456");
const mutualFollows = await client.sinter(
  "user:123:following",
  "user:456:following"
);

// Sorted sets for leaderboards
await client.zadd("user_scores", 1250, "alice", 980, "bob", 1340, "charlie");

// Get top 10 users
const topUsers = await client.zrevrange("user_scores", 0, 9, "WITHSCORES");

// Hash maps for complex objects
await client.hset("user:123:profile", {
  name: "Alice Johnson",
  email: "alice@example.com",
  posts: 45,
  followers: 1250,
});

// Get specific fields
const userInfo = await client.hmget("user:123:profile", "name", "followers");

Redis for real-time features:

// Real-time chat with Redis pub/sub
const publisher = redis.createClient();
const subscriber = redis.createClient();

// Chat room functionality
const sendMessage = async (roomId, userId, message) => {
  const messageData = {
    id: generateId(),
    userId,
    username: await getUserName(userId),
    message,
    timestamp: Date.now(),
  };

  // Store in chat history
  await client.lpush(`chat:${roomId}:messages`, JSON.stringify(messageData));
  await client.ltrim(`chat:${roomId}:messages`, 0, 999); // Keep last 1000 messages

  // Publish to subscribers
  await publisher.publish(`chat:${roomId}`, JSON.stringify(messageData));
};

// Subscribe to chat room
subscriber.subscribe("chat:room1");
subscriber.on("message", (channel, message) => {
  const messageData = JSON.parse(message);
  console.log(`${messageData.username}: ${messageData.message}`);
});

// Online presence tracking
const updateUserPresence = async (userId) => {
  await client.setex(`presence:${userId}`, 300, Date.now()); // 5 minute TTL
};

const getOnlineUsers = async () => {
  const keys = await client.keys("presence:*");
  return keys.map((key) => key.split(":")[1]);
};

Column-Family Databases: Massive Scale

Column-family databases like Cassandra handle massive datasets across distributed clusters:

// Cassandra: Wide-column store for time-series data
const cassandra = require("cassandra-driver");
const client = new cassandra.Client({
  contactPoints: ["127.0.0.1"],
  localDataCenter: "datacenter1",
});

// Create keyspace and tables
await client.execute(`
  CREATE KEYSPACE IF NOT EXISTS analytics 
  WITH REPLICATION = {
    'class': 'SimpleStrategy',
    'replication_factor': 3
  }
`);

await client.execute(`
  CREATE TABLE IF NOT EXISTS analytics.user_events (
    user_id UUID,
    event_date DATE,
    event_time TIMESTAMP,
    event_type TEXT,
    page_url TEXT,
    user_agent TEXT,
    session_id UUID,
    properties MAP<TEXT, TEXT>,
    PRIMARY KEY ((user_id, event_date), event_time)
  ) WITH CLUSTERING ORDER BY (event_time DESC)
`);

// Insert time-series events
await client.execute(
  `
  INSERT INTO analytics.user_events 
  (user_id, event_date, event_time, event_type, page_url, session_id, properties)
  VALUES (?, ?, ?, ?, ?, ?, ?)
`,
  [
    userId,
    new Date().toISOString().split("T")[0], // Today's date
    new Date(),
    "page_view",
    "/blog/nosql-fundamentals",
    sessionId,
    {
      referrer: "google.com",
      device: "mobile",
      browser: "chrome",
    },
  ]
);

// Query user's daily activity
const dailyEvents = await client.execute(
  `
  SELECT * FROM analytics.user_events
  WHERE user_id = ? AND event_date = ?
  ORDER BY event_time DESC
`,
  [userId, today]
);

// Aggregate queries with time windows
const hourlyPageViews = await client.execute(
  `
  SELECT 
    event_type,
    toHour(event_time) as hour,
    COUNT(*) as event_count
  FROM analytics.user_events
  WHERE user_id = ? 
    AND event_date = ?
    AND event_type = 'page_view'
  GROUP BY event_type, toHour(event_time)
`,
  [userId, today]
);

Graph Databases: Relationship-Centric Data

Graph databases excel at managing complex relationships:

// Neo4j: Graph database for social networks
const neo4j = require("neo4j-driver");
const driver = neo4j.driver(
  "bolt://localhost:7687",
  neo4j.auth.basic("neo4j", "password")
);

// Create nodes and relationships
const session = driver.session();

// Create user nodes
await session.run(`
  CREATE (alice:User {name: 'Alice', email: 'alice@example.com', age: 28})
  CREATE (bob:User {name: 'Bob', email: 'bob@example.com', age: 32})
  CREATE (charlie:User {name: 'Charlie', email: 'charlie@example.com', age: 25})
  CREATE (diana:User {name: 'Diana', email: 'diana@example.com', age: 30})
`);

// Create relationships
await session.run(`
  MATCH (alice:User {name: 'Alice'}), (bob:User {name: 'Bob'})
  CREATE (alice)-[:FOLLOWS {since: date('2024-01-15')}]->(bob)
  
  MATCH (bob:User {name: 'Bob'}), (charlie:User {name: 'Charlie'})
  CREATE (bob)-[:FOLLOWS {since: date('2024-01-20')}]->(charlie)
  
  MATCH (alice:User {name: 'Alice'}), (diana:User {name: 'Diana'})
  CREATE (alice)-[:FRIENDS {since: date('2023-06-10'), strength: 8}]->(diana)
  CREATE (diana)-[:FRIENDS {since: date('2023-06-10'), strength: 8}]->(alice)
`);

// Complex graph queries
// Find friends of friends (potential connections)
const friendSuggestions = await session.run(`
  MATCH (user:User {name: 'Alice'})-[:FRIENDS]->(friend)-[:FRIENDS]->(suggestion)
  WHERE user <> suggestion 
    AND NOT (user)-[:FRIENDS]-(suggestion)
    AND NOT (user)-[:FOLLOWS]-(suggestion)
  RETURN suggestion.name, suggestion.email, count(friend) as mutual_friends
  ORDER BY mutual_friends DESC
  LIMIT 5
`);

// Find shortest path between users
const connectionPath = await session.run(`
  MATCH path = shortestPath((alice:User {name: 'Alice'})-[*1..6]-(target:User {name: 'Diana'}))
  RETURN [node in nodes(path) | node.name] as connection_path
`);

// Analyze influence in network
const influencers = await session.run(`
  MATCH (user:User)-[r:FOLLOWS|FRIENDS]->(other:User)
  RETURN 
    user.name,
    count(r) as connections,
    avg(r.strength) as avg_strength
  ORDER BY connections DESC, avg_strength DESC
  LIMIT 10
`);

await session.close();

SQL vs NoSQL: Choosing the Right Tool

Decision Framework

Choose SQL databases when you need:

-- Complex relationships and joins
SELECT
  c.company_name,
  u.username,
  o.order_date,
  p.product_name,
  oi.quantity,
  oi.unit_price,
  (oi.quantity * oi.unit_price) as line_total
FROM customers c
JOIN users u ON c.user_id = u.id
JOIN orders o ON c.id = o.customer_id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE o.order_date BETWEEN '2024-01-01' AND '2024-12-31'
  AND o.status = 'completed'
ORDER BY o.order_date DESC;

-- ACID transactions for critical operations
BEGIN;
UPDATE accounts SET balance = balance - 1000 WHERE id = 1;
UPDATE accounts SET balance = balance + 1000 WHERE id = 2;
INSERT INTO transactions (from_account, to_account, amount) VALUES (1, 2, 1000);
COMMIT;

-- Complex analytical queries
SELECT
  DATE_TRUNC('month', order_date) as month,
  category,
  SUM(total_amount) as revenue,
  COUNT(*) as orders,
  AVG(total_amount) as avg_order_value,
  SUM(total_amount) - LAG(SUM(total_amount)) OVER (
    PARTITION BY category
    ORDER BY DATE_TRUNC('month', order_date)
  ) as month_over_month_change
FROM orders o
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
JOIN categories c ON p.category_id = c.id
GROUP BY DATE_TRUNC('month', order_date), category
ORDER BY month DESC, revenue DESC;

SQL is perfect for:

  • E-commerce systems with complex order management
  • Financial applications requiring ACID compliance
  • CRM systems with structured customer data
  • Reporting and business intelligence
  • Applications with stable, well-defined schemas

Choose NoSQL when you need:

// Flexible, evolving schemas
const blogPost = {
  title: "Understanding NoSQL",
  content: "...",
  author: {
    name: "Alice",
    bio: "Tech writer",
    socialLinks: ["twitter.com/alice", "github.com/alice"],
  },
  tags: ["database", "nosql", "mongodb"],
  metadata: {
    readTime: "8 minutes",
    difficulty: "intermediate",
    lastUpdated: new Date(),
    seoKeywords: ["nosql", "database", "mongodb", "cassandra"],
  },
  comments: [
    {
      author: "Bob",
      text: "Great explanation!",
      timestamp: new Date(),
      replies: [
        {
          author: "Alice",
          text: "Thanks!",
          timestamp: new Date(),
        },
      ],
    },
  ],
};

// Horizontal scaling across multiple servers
// Cassandra automatically distributes data based on partition key
await client.execute(
  `
  INSERT INTO posts (post_id, title, content, author_data)
  VALUES (uuid(), ?, ?, ?)
`,
  [title, content, JSON.stringify(authorData)]
);

// High-performance caching
await redis.setex(`popular_posts`, 3600, JSON.stringify(popularPosts));
await redis.setex(`user:${userId}:feed`, 1800, JSON.stringify(userFeed));

NoSQL is perfect for:

  • Content management with varying document structures
  • Real-time applications (chat, gaming, IoT)
  • User-generated content platforms
  • Mobile applications with offline sync
  • Big data and analytics platforms
  • Microservices with independent data models

Hybrid Approaches

Modern applications often use both:

// Polyglot persistence: Different databases for different needs
const userService = {
  // PostgreSQL for user authentication and profiles
  createUser: async (userData) => {
    return await pg.query(
      `
      INSERT INTO users (email, password_hash, created_at)
      VALUES ($1, $2, $3) RETURNING id
    `,
      [userData.email, userData.passwordHash, new Date()]
    );
  },

  // MongoDB for flexible user preferences and activity
  updatePreferences: async (userId, preferences) => {
    return await mongodb.users.updateOne(
      { userId: userId },
      {
        $set: { preferences, updatedAt: new Date() },
        $setOnInsert: { createdAt: new Date() },
      },
      { upsert: true }
    );
  },

  // Redis for session management and caching
  createSession: async (userId, sessionData) => {
    const sessionId = generateSessionId();
    await redis.setex(
      `session:${sessionId}`,
      86400,
      JSON.stringify({
        userId,
        ...sessionData,
        createdAt: Date.now(),
      })
    );
    return sessionId;
  },

  // Cassandra for user activity logging
  logActivity: async (userId, activity) => {
    await cassandra.execute(
      `
      INSERT INTO user_activities (user_id, activity_date, activity_time, activity_type, details)
      VALUES (?, ?, ?, ?, ?)
    `,
      [
        userId,
        new Date().toISOString().split("T")[0],
        new Date(),
        activity.type,
        activity.details,
      ]
    );
  },
};

Data Modeling in NoSQL: Design Patterns

Document Database Modeling

Embed vs Reference decision:

// ❌ Over-normalization (SQL mindset in NoSQL)
// Users collection
{
  _id: ObjectId("..."),
  email: "alice@example.com",
  username: "alice123"
}

// Posts collection
{
  _id: ObjectId("..."),
  title: "My Blog Post",
  authorId: ObjectId("..."), // Reference to users
  categoryId: ObjectId("...") // Reference to categories
}

// Comments collection
{
  _id: ObjectId("..."),
  postId: ObjectId("..."), // Reference to posts
  authorId: ObjectId("..."), // Reference to users
  content: "Great post!"
}

// This requires multiple queries for simple operations!

// ✅ Thoughtful embedding for common access patterns
{
  _id: ObjectId("..."),
  title: "My Blog Post",
  slug: "my-blog-post",
  content: "Full blog post content...",

  // Embed author info (won't change often)
  author: {
    id: ObjectId("..."),
    username: "alice123",
    displayName: "Alice Johnson",
    avatar: "https://example.com/alice.jpg"
  },

  // Embed category (stable reference data)
  category: {
    id: ObjectId("..."),
    name: "Technology",
    slug: "technology"
  },

  // Embed recent comments (limited subset)
  recentComments: [
    {
      id: ObjectId("..."),
      author: {
        username: "bob",
        displayName: "Bob Smith"
      },
      content: "Great post!",
      createdAt: ISODate("...")
    }
    // Only store last 5 comments here
  ],

  // Statistics that update frequently
  stats: {
    views: 1250,
    likes: 89,
    commentCount: 23
  },

  tags: ["javascript", "nodejs", "database"],
  publishedAt: ISODate("..."),
  updatedAt: ISODate("...")
}

Schema versioning for evolving documents:

// Handle schema evolution gracefully
const UserSchema = {
  v1: {
    email: String,
    profile: {
      name: String,
      bio: String,
    },
  },

  v2: {
    email: String,
    profile: {
      firstName: String, // Split name into first/last
      lastName: String,
      bio: String,
      avatar: String, // New field
    },
    schemaVersion: 2, // Track version
  },

  v3: {
    email: String,
    profile: {
      firstName: String,
      lastName: String,
      bio: String,
      avatar: String,
      socialLinks: {
        // New nested object
        twitter: String,
        linkedin: String,
      },
    },
    preferences: {
      // New top-level object
      newsletter: Boolean,
      notifications: Boolean,
    },
    schemaVersion: 3,
  },
};

// Migration function
const migrateUser = (user) => {
  if (!user.schemaVersion || user.schemaVersion === 1) {
    // v1 to v2: Split name field
    if (user.profile.name) {
      const [firstName, ...lastNameParts] = user.profile.name.split(" ");
      user.profile.firstName = firstName;
      user.profile.lastName = lastNameParts.join(" ");
      delete user.profile.name;
    }
    user.schemaVersion = 2;
  }

  if (user.schemaVersion === 2) {
    // v2 to v3: Add default preferences
    user.preferences = {
      newsletter: true,
      notifications: true,
    };
    user.schemaVersion = 3;
  }

  return user;
};

// Apply migration during reads
const getUser = async (userId) => {
  let user = await users.findOne({ _id: userId });
  if (user && user.schemaVersion < 3) {
    user = migrateUser(user);
    await users.updateOne({ _id: userId }, { $set: user });
  }
  return user;
};

Key-Value Modeling Patterns

Namespace your keys strategically:

// Poor key naming
await redis.set("alice", userData); // Conflicts with other data
await redis.set("session123", sessionData); // No context

// Better key naming with namespaces
await redis.set("user:alice:profile", JSON.stringify(userData));
await redis.set("session:123:data", JSON.stringify(sessionData));
await redis.set("cache:posts:popular", JSON.stringify(popularPosts));

// Time-based keys for analytics
const today = new Date().toISOString().split("T")[0];
await redis.incr(`stats:page_views:${today}`);
await redis.incr(`stats:user_signups:${today}`);

// Hash tags for Redis Cluster (keeps related data on same node)
await redis.set("user:{123}:profile", userData);
await redis.set("user:{123}:sessions", sessionData);
await redis.set("user:{123}:cart", cartData);
// All user:123 data stays on the same cluster node

Leverage Redis data structures:

// Activity feed using sorted sets
const addToFeed = async (userId, activityData) => {
  const score = Date.now(); // Timestamp as score
  const member = JSON.stringify(activityData);

  await redis.zadd(`feed:${userId}`, score, member);
  await redis.zremrangebyrank(`feed:${userId}`, 0, -101); // Keep only latest 100
};

const getFeed = async (userId, page = 0, pageSize = 20) => {
  const start = page * pageSize;
  const end = start + pageSize - 1;

  const activities = await redis.zrevrange(`feed:${userId}`, start, end);
  return activities.map(JSON.parse);
};

// Leaderboard with user scores
const updateScore = async (userId, points) => {
  await redis.zincrby("leaderboard:global", points, userId);

  // Also update monthly leaderboard
  const month = new Date().toISOString().slice(0, 7); // YYYY-MM
  await redis.zincrby(`leaderboard:${month}`, points, userId);
};

const getLeaderboard = async (type = "global", count = 10) => {
  return await redis.zrevrange(
    `leaderboard:${type}`,
    0,
    count - 1,
    "WITHSCORES"
  );
};

CAP Theorem: Understanding Trade-offs

The CAP Theorem Basics

CAP Theorem states you can only guarantee two of:

  • Consistency: All nodes see the same data simultaneously
  • Availability: System remains operational even during failures
  • Partition Tolerance: System continues despite network failures between nodes
// CP System (PostgreSQL with synchronous replication)
// Chooses Consistency + Partition Tolerance over Availability
const transferMoney = async (fromAccount, toAccount, amount) => {
  try {
    await pgClient.query("BEGIN");

    // Both operations must succeed on all replicas
    await pgClient.query(
      "UPDATE accounts SET balance = balance - $1 WHERE id = $2",
      [amount, fromAccount]
    );
    await pgClient.query(
      "UPDATE accounts SET balance = balance + $1 WHERE id = $2",
      [amount, toAccount]
    );

    await pgClient.query("COMMIT");
    // If network partition occurs, system becomes unavailable
    // rather than risk inconsistent data
  } catch (error) {
    await pgClient.query("ROLLBACK");
    throw error;
  }
};

// AP System (Cassandra/DynamoDB)
// Chooses Availability + Partition Tolerance over Consistency
const updateUserProfile = async (userId, profileData) => {
  // Write succeeds even if some nodes are unreachable
  await cassandraClient.execute(
    `
    UPDATE user_profiles 
    SET profile_data = ?, updated_at = ?
    WHERE user_id = ?
  `,
    [JSON.stringify(profileData), new Date(), userId],
    {
      consistency: cassandra.types.consistencies.one, // Only need one node to acknowledge
    }
  );

  // Different nodes might return different data temporarily
  // Eventually consistent - all nodes will sync up
};

// CA System (Traditional single-node databases)
// Consistency + Availability, but no Partition Tolerance
// If network fails, entire system goes down

Practical CAP Considerations

Most distributed systems choose AP (Availability + Partition Tolerance):

// MongoDB with eventual consistency
const updatePostStats = async (postId, increment) => {
  // Primary write for strong consistency on critical path
  await mongodb.posts.updateOne(
    { _id: postId },
    { $inc: { "stats.views": increment } },
    { writeConcern: { w: "majority", j: true } }
  );

  // Secondary write for analytics (eventual consistency okay)
  await mongodb.analytics.updateOne(
    { postId: postId, date: new Date().toISOString().split("T")[0] },
    {
      $inc: { views: increment },
      $setOnInsert: {
        postId: postId,
        date: new Date().toISOString().split("T")[0],
      },
    },
    { upsert: true, writeConcern: { w: 1 } } // Faster, eventual consistency
  );
};

Design for eventual consistency:

// E-commerce inventory with eventual consistency
const purchaseItem = async (userId, productId, quantity) => {
  // Optimistic inventory update
  const result = await cassandra.execute(
    `
    UPDATE inventory 
    SET quantity = quantity - ?
    WHERE product_id = ?
    IF quantity >= ?
  `,
    [quantity, productId, quantity]
  );

  if (!result.rows[0]["[applied]"]) {
    throw new Error("Insufficient inventory");
  }

  // Create order (might succeed even if inventory update fails on some nodes)
  await cassandra.execute(
    `
    INSERT INTO orders (id, user_id, product_id, quantity, status, created_at)
    VALUES (uuid(), ?, ?, ?, 'pending', toTimestamp(now()))
  `,
    [userId, productId, quantity]
  );

  // Background process reconciles inventory across all nodes
  await scheduleInventoryReconciliation(productId);
};

// Reconciliation process handles eventual consistency
const reconcileInventory = async (productId) => {
  // Read from multiple nodes to find inconsistencies
  const inventoryCounts = await Promise.all([
    cassandra.execute(
      "SELECT quantity FROM inventory WHERE product_id = ?",
      [productId],
      { consistency: "one" }
    ),
    cassandra.execute(
      "SELECT quantity FROM inventory WHERE product_id = ?",
      [productId],
      { consistency: "quorum" }
    ),
  ]);

  // Resolve conflicts (usually take the most recent timestamp)
  const correctQuantity = Math.min(
    ...inventoryCounts.map((r) => r.rows[0].quantity)
  );

  await cassandra.execute(
    `
    UPDATE inventory 
    SET quantity = ?
    WHERE product_id = ?
  `,
    [correctQuantity, productId],
    { consistency: "all" }
  );
};

Key Takeaways

NoSQL databases provide specialized solutions for modern application requirements that SQL databases handle poorly. Understanding when to use document stores, key-value databases, column-family systems, and graph databases enables you to choose the right tool for each specific use case.

The NoSQL decision framework you need:

  • Document databases (MongoDB): Flexible schemas, nested data, rapid prototyping
  • Key-value stores (Redis): High-performance caching, sessions, real-time features
  • Column-family (Cassandra): Massive scale, time-series data, high write throughput
  • Graph databases (Neo4j): Complex relationships, social networks, recommendation engines

What distinguishes professional NoSQL usage:

  • Strategic data modeling that matches access patterns to database strengths
  • Understanding CAP theorem trade-offs and designing for eventual consistency
  • Hybrid architectures that use the right database for each specific requirement
  • Schema evolution strategies that handle changing requirements gracefully

What’s Next

We’ve explored the diverse NoSQL landscape and learned when to choose each type over traditional SQL. In the final database fundamentals article, we’ll cover the operational aspects that make databases production-ready: connection pooling and management, ORM/ODM abstractions like Prisma and Mongoose, database migrations and versioning, backup and recovery strategies, and performance monitoring and tuning.

The data storage landscape is complete—you understand both SQL and NoSQL paradigms. Next, we’ll focus on the operational excellence that keeps databases running smoothly in production environments.

You’re no longer just choosing between SQL and NoSQL—you’re architecting data strategies that leverage the unique strengths of each paradigm to build scalable, maintainable systems that serve real business requirements efficiently.