Advanced Data Management - 1/2

From Basic CRUD to Data Management Mastery

You’ve mastered databases, can query efficiently across SQL and NoSQL systems, and understand when to use each paradigm. You can design schemas, handle complex relationships, and even manage database operations in production. Your data fundamentals are rock solid. But here’s the reality check that hits every developer building real applications: knowing how to store and retrieve data is just the beginning. Managing data effectively means handling caching, sessions, file uploads, media processing, and serialization at scale.

The data management gap that trips up developers:

// Your basic CRUD operations work great in development
const getUser = async (id) => {
  const user = await db.users.findOne({ id });
  return user;
};

const getPosts = async (userId) => {
  const posts = await db.posts.find({ userId }).sort({ createdAt: -1 });
  return posts;
};

// But in production with 10,000+ users:
// 1. Database gets hammered with repeated queries for the same data
// 2. User sessions are scattered across server memory, breaking load balancing
// 3. File uploads crash your server when users upload 100MB videos
// 4. Profile images load slowly and consume massive bandwidth
// 5. Data serialization becomes a performance bottleneck

The uncomfortable scalability truth:

// Development reality: Everything works beautifully
const userDashboard = async (userId) => {
  const user = await getUser(userId);
  const posts = await getPosts(userId);
  const followers = await getFollowers(userId);
  const analytics = await getAnalytics(userId);

  return { user, posts, followers, analytics }; // Fast and simple!
};

// Production reality: Same code, 1000x slower
// - 4 separate database queries per dashboard load
// - Popular users' data queried hundreds of times per second
// - Database connection pool exhausted
// - Response times measured in seconds, not milliseconds
// - Users abandoning your app for faster alternatives

The hard truth: Modern applications require sophisticated data management beyond basic CRUD operations. You need caching systems that eliminate redundant queries, session management that scales across servers, file handling that processes media efficiently, and serialization strategies that maintain performance as data complexity grows.

Production-ready data management requires mastery of:

  • Intelligent caching strategies that dramatically reduce database load and response times
  • Scalable session management that works across distributed server deployments
  • Professional file upload handling that processes images, videos, and documents safely
  • Efficient media processing that optimizes images and handles large files gracefully
  • Smart data serialization that maintains performance with complex data structures

This article transforms you from a database user into a data management architect. You’ll learn caching patterns that make applications feel instant, session strategies that scale horizontally across servers, file upload handling that stands up to hostile input, and media processing that stays fast as your content library grows.


Caching Strategies: Eliminate Database Load

The Performance Multiplier

Caching is the difference between good and great applications:

// ❌ Without caching: Every request hits the database
const getUserProfile = async (userId) => {
  // Database query every single time
  const user = await db.users.findOne({ id: userId });
  const posts = await db.posts.find({ userId }).limit(10);
  const followers = await db.follows.count({ followedId: userId });

  return { user, posts, followerCount: followers };

  // Problems:
  // - Popular profiles queried hundreds of times per minute
  // - Database becomes the bottleneck
  // - Response times increase with user count
  // - Server costs scale linearly with traffic
};

// ✅ With intelligent caching: Database rarely touched
const getUserProfileCached = async (userId) => {
  const cacheKey = `user:${userId}:profile`;

  // Check cache first
  let profile = await redis.get(cacheKey);
  if (profile) {
    return JSON.parse(profile); // Instant response!
  }

  // Cache miss: Query database
  const user = await db.users.findOne({ id: userId });
  const posts = await db.posts.find({ userId }).limit(10);
  const followers = await db.follows.count({ followedId: userId });

  profile = { user, posts, followerCount: followers };

  // Store in cache for 5 minutes
  await redis.setex(cacheKey, 300, JSON.stringify(profile));

  return profile;

  // Benefits:
  // - 95%+ requests served from cache (millisecond response)
  // - Database load reduced by 20x
  // - Handles traffic spikes gracefully
  // - Server costs stay flat as traffic grows
};
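
The check, miss, and populate steps repeat for every cached lookup, so it helps to factor them into a helper. A minimal sketch, assuming the same redis client and JSON-serializable values:

const getOrSet = async (key, ttlSeconds, loader) => {
  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached); // cache hit
  }

  const fresh = await loader(); // loader only runs on a cache miss
  await redis.setex(key, ttlSeconds, JSON.stringify(fresh));
  return fresh;
};

// The cached profile lookup above collapses to:
const getUserProfileViaHelper = (userId) =>
  getOrSet(`user:${userId}:profile`, 300, async () => {
    const user = await db.users.findOne({ id: userId });
    const posts = await db.posts.find({ userId }).limit(10);
    const followers = await db.follows.count({ followedId: userId });
    return { user, posts, followerCount: followers };
  });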

Multi-Level Caching Architecture

Professional applications use layered caching:

// Complete caching strategy with multiple levels
const redis = require("redis");

class CacheManager {
  constructor() {
    this.memoryCache = new Map(); // L1: In-memory cache (fastest)
    this.redisClient = redis.createClient(); // L2: Redis cache (shared)
    this.maxMemoryItems = 1000;
    this.memoryTTL = 60 * 1000; // 1 minute
    this.redisTTL = 300; // 5 minutes
  }

  async get(key) {
    // L1: Check memory cache first
    const memoryCached = this.memoryCache.get(key);
    if (memoryCached && Date.now() < memoryCached.expires) {
      console.log(`Cache HIT (memory): ${key}`);
      return memoryCached.data;
    }

    // L2: Check Redis cache
    const redisCached = await this.redisClient.get(key);
    if (redisCached) {
      console.log(`Cache HIT (redis): ${key}`);
      const data = JSON.parse(redisCached);

      // Populate L1 cache for next time
      this.setMemory(key, data);

      return data;
    }

    console.log(`Cache MISS: ${key}`);
    return null;
  }

  async set(key, data, ttl = this.redisTTL) {
    // Store in both levels
    this.setMemory(key, data);
    await this.redisClient.setex(key, ttl, JSON.stringify(data));
  }

  setMemory(key, data) {
    // Implement LRU eviction
    if (this.memoryCache.size >= this.maxMemoryItems) {
      const firstKey = this.memoryCache.keys().next().value;
      this.memoryCache.delete(firstKey);
    }

    this.memoryCache.set(key, {
      data,
      expires: Date.now() + this.memoryTTL,
    });
  }

  async invalidate(pattern) {
    // Clear memory cache entries matching pattern
    for (const [key] of this.memoryCache) {
      if (key.includes(pattern)) {
        this.memoryCache.delete(key);
      }
    }

    // Clear Redis cache entries
    const keys = await this.redisClient.keys(`*${pattern}*`);
    if (keys.length > 0) {
      await this.redisClient.del(keys);
    }
  }
}

const cache = new CacheManager();

// Smart cache patterns for different data types
const DataCache = {
  // User profiles: Cache aggressively, invalidate on updates
  async getUserProfile(userId) {
    const key = `user:${userId}:profile`;
    let profile = await cache.get(key);

    if (!profile) {
      profile = await db.users.findOne({ id: userId });
      await cache.set(key, profile, 1800); // 30 minutes
    }

    return profile;
  },

  // Posts feed: Short cache, frequently updated
  async getUserFeed(userId, page = 0) {
    const key = `feed:${userId}:${page}`;
    let feed = await cache.get(key);

    if (!feed) {
      feed = await db.posts
        .find({ userId })
        .sort({ createdAt: -1 })
        .skip(page * 20)
        .limit(20);

      await cache.set(key, feed, 300); // 5 minutes
    }

    return feed;
  },

  // Popular content: Cache with warming
  async getPopularPosts() {
    const key = "posts:popular";
    let posts = await cache.get(key);

    if (!posts) {
      posts = await db.posts
        .find({ published: true })
        .sort({ views: -1, likes: -1 })
        .limit(50);

      await cache.set(key, posts, 600); // 10 minutes

      // Background refresh before expiry
      setTimeout(() => {
        this.warmPopularPosts();
      }, 480000); // Refresh at 8 minutes
    }

    return posts;
  },

  async warmPopularPosts() {
    const posts = await db.posts
      .find({ published: true })
      .sort({ views: -1, likes: -1 })
      .limit(50);

    await cache.set("posts:popular", posts, 600);
  },

  // Cache invalidation on data changes
  async invalidateUserCache(userId) {
    await cache.invalidate(`user:${userId}`);
    await cache.invalidate(`feed:${userId}`);
  },

  async invalidatePostCache(postId, userId) {
    await cache.invalidate(`post:${postId}`);
    await cache.invalidate(`feed:${userId}`);
    await cache.invalidate("posts:popular");
  },
};
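
To see the payoff end to end, here is the dashboard from the introduction rebuilt on top of DataCache (a sketch; follower and analytics lookups would be wrapped the same way):

const cachedUserDashboard = async (userId) => {
  const [user, posts] = await Promise.all([
    DataCache.getUserProfile(userId),
    DataCache.getUserFeed(userId, 0),
  ]);

  return { user, posts };
};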

Write-through and write-back caching patterns:

// Write-through: Update cache and database together
const updateUserProfile = async (userId, updates) => {
  try {
    // Update database first
    const updatedUser = await db.users.findOneAndUpdate(
      { id: userId },
      { $set: updates },
      { returnDocument: "after" }
    );

    // Update cache immediately (write-through)
    const cacheKey = `user:${userId}:profile`;
    await cache.set(cacheKey, updatedUser, 1800);

    // Invalidate related caches
    await DataCache.invalidateUserCache(userId);

    return updatedUser;
  } catch (error) {
    // If cache update fails, invalidate to prevent stale data
    await cache.invalidate(`user:${userId}`);
    throw error;
  }
};

// Write-back: Update cache immediately, database asynchronously
const updateUserActivity = async (userId, activity) => {
  const cacheKey = `user:${userId}:activity`;

  // Get current activity from cache
  let currentActivity = (await cache.get(cacheKey)) || {
    lastSeen: null,
    actionsToday: 0,
    streak: 0,
  };

  // Update in memory
  currentActivity = {
    ...currentActivity,
    lastSeen: new Date(),
    actionsToday: currentActivity.actionsToday + 1,
    ...activity,
  };

  // Update cache immediately
  await cache.set(cacheKey, currentActivity, 300);

  // Schedule database update (write-back)
  setImmediate(async () => {
    try {
      await db.users.updateOne(
        { id: userId },
        { $set: { lastActivity: currentActivity } }
      );
    } catch (error) {
      console.error("Background activity update failed:", error);
      // Cache will expire, forcing fresh database read
    }
  });

  return currentActivity;
};

CDN and Asset Caching

Content Delivery Network integration:

// CDN-backed asset management
class AssetManager {
  constructor() {
    this.cdnBase = "https://cdn.yourdomain.com";
    this.s3Client = new AWS.S3();
    this.cloudfront = new AWS.CloudFront();
  }

  async uploadAsset(file, category) {
    const timestamp = Date.now();
    const fileName = `${category}/${timestamp}-${file.originalname}`;

    // Upload to S3
    const uploadParams = {
      Bucket: "your-assets-bucket",
      Key: fileName,
      Body: file.buffer,
      ContentType: file.mimetype,
      CacheControl: "public, max-age=31536000", // 1 year cache
      Metadata: {
        uploadedAt: timestamp.toString(),
        originalName: file.originalname,
      },
    };

    const uploadResult = await this.s3Client.upload(uploadParams).promise();

    // Generate CDN URLs for different sizes
    const urls = {
      original: `${this.cdnBase}/${fileName}`,
      thumbnail: `${this.cdnBase}/${fileName}?w=150&h=150&fit=crop`,
      medium: `${this.cdnBase}/${fileName}?w=800&h=600&fit=inside`,
      large: `${this.cdnBase}/${fileName}?w=1200&h=900&fit=inside`,
    };

    // Cache asset metadata
    const assetData = {
      id: generateId(),
      fileName,
      originalName: file.originalname,
      size: file.size,
      mimetype: file.mimetype,
      urls,
      uploadedAt: new Date(timestamp),
    };

    await cache.set(`asset:${assetData.id}`, assetData, 86400); // 24 hours

    return assetData;
  }

  async getAsset(assetId) {
    // Check cache first
    let asset = await cache.get(`asset:${assetId}`);

    if (!asset) {
      // Fallback to database
      asset = await db.assets.findOne({ id: assetId });
      if (asset) {
        await cache.set(`asset:${assetId}`, asset, 86400);
      }
    }

    return asset;
  }

  async invalidateAsset(fileName) {
    // Create CloudFront invalidation
    const invalidationParams = {
      DistributionId: "YOUR_CLOUDFRONT_DISTRIBUTION_ID",
      InvalidationBatch: {
        CallerReference: Date.now().toString(),
        Paths: {
          Quantity: 1,
          Items: [`/${fileName}*`], // Invalidate all sizes
        },
      },
    };

    await this.cloudfront.createInvalidation(invalidationParams).promise();

    // Clear local cache
    const keys = await cache.redisClient.keys(`asset:*${fileName}*`);
    if (keys.length > 0) {
      await cache.redisClient.del(keys);
    }
  }
}
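
Wiring the asset manager into an upload route is straightforward. A sketch assuming multer's memory storage (so req.file carries a buffer) and an authenticated request; the category whitelist is illustrative:

const assetManager = new AssetManager();

app.post("/assets", upload.single("asset"), async (req, res) => {
  try {
    // Never trust client-supplied categories; map anything unknown to a default
    const allowed = ["avatars", "covers", "documents"];
    const category = allowed.includes(req.body.category)
      ? req.body.category
      : "misc";

    const asset = await assetManager.uploadAsset(req.file, category);
    res.json({ success: true, asset });
  } catch (error) {
    console.error("Asset upload failed:", error);
    res.status(500).json({ error: "Asset upload failed" });
  }
});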

Session Management: Scalable User State

Beyond Server Memory Sessions

The session scalability problem:

// ❌ In-memory sessions: Don't scale beyond one server
const express = require("express");
const session = require("express-session");

const app = express();

app.use(
  session({
    secret: "your-secret",
    resave: false,
    saveUninitialized: false,
    cookie: { maxAge: 24 * 60 * 60 * 1000 }, // 24 hours
  })
);

// Problems with memory sessions:
// 1. Sessions lost when server restarts
// 2. Load balancing breaks (sticky sessions required)
// 3. Can't scale horizontally
// 4. Memory usage grows with concurrent users
// 5. No session sharing between services

Professional session management with Redis:

// ✅ Redis-backed sessions: Scale horizontally across servers
const RedisStore = require("connect-redis")(session);
const redis = require("redis");

const redisClient = redis.createClient({
  host: process.env.REDIS_HOST,
  port: process.env.REDIS_PORT,
  password: process.env.REDIS_PASSWORD,
  db: 1, // Separate database for sessions
  retry_strategy: (options) => {
    if (options.error && options.error.code === "ECONNREFUSED") {
      console.error("Redis server connection refused");
    }
    if (options.total_retry_time > 1000 * 60 * 60) {
      return new Error("Redis retry time exhausted");
    }
    if (options.attempt > 10) {
      return undefined; // Stop retrying
    }
    return Math.min(options.attempt * 100, 3000);
  },
});

app.use(
  session({
    store: new RedisStore({
      client: redisClient,
      prefix: "sess:", // Namespace sessions
      ttl: 86400, // 24 hours in seconds
    }),
    secret: process.env.SESSION_SECRET,
    resave: false,
    saveUninitialized: false,
    rolling: true, // Reset expiry on activity
    cookie: {
      secure: process.env.NODE_ENV === "production", // HTTPS only in prod
      httpOnly: true, // Prevent XSS
      maxAge: 24 * 60 * 60 * 1000,
      sameSite: "strict", // CSRF protection
    },
  })
);

// Benefits of Redis sessions:
// ✅ Survive server restarts
// ✅ Work with any load balancer
// ✅ Scale to millions of users
// ✅ Share sessions across microservices
// ✅ Persistent and reliable
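
Because session state lives in Redis rather than process memory, any instance behind the load balancer can read and write req.session. A quick sketch of handlers that behave identically no matter which server receives the request (the cart shape is illustrative):

// express-session loads the session from Redis on each request and persists changes
app.post("/cart/add", (req, res) => {
  req.session.cart = req.session.cart || [];
  req.session.cart.push({ productId: req.body.productId, qty: req.body.qty });
  res.json({ items: req.session.cart.length });
});

app.get("/cart", (req, res) => {
  // May be served by a different instance than the one that wrote the cart
  res.json({ items: req.session.cart || [] });
});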

Advanced Session Patterns

JWT vs Session Store comparison:

// Session-based authentication
const SessionAuth = {
  async createSession(userId, userAgent, ipAddress) {
    const sessionId = generateSecureId();
    const sessionData = {
      userId,
      createdAt: new Date(),
      lastActivity: new Date(),
      userAgent,
      ipAddress,
      isActive: true,
      permissions: await this.getUserPermissions(userId),
    };

    // Store in Redis with expiration
    await redisClient.setex(
      `session:${sessionId}`,
      86400, // 24 hours
      JSON.stringify(sessionData)
    );

    // Track user's active sessions
    await redisClient.sadd(`user:${userId}:sessions`, sessionId);
    await redisClient.expire(`user:${userId}:sessions`, 86400);

    return sessionId;
  },

  async getSession(sessionId) {
    const sessionData = await redisClient.get(`session:${sessionId}`);
    if (!sessionData) return null;

    const session = JSON.parse(sessionData);

    // Update last activity
    session.lastActivity = new Date();
    await redisClient.setex(
      `session:${sessionId}`,
      86400,
      JSON.stringify(session)
    );

    return session;
  },

  async invalidateSession(sessionId) {
    const session = await this.getSession(sessionId);
    if (session) {
      // Remove from user's active sessions
      await redisClient.srem(`user:${session.userId}:sessions`, sessionId);
    }

    await redisClient.del(`session:${sessionId}`);
  },

  async invalidateAllUserSessions(userId) {
    const sessionIds = await redisClient.smembers(`user:${userId}:sessions`);

    if (sessionIds.length > 0) {
      const pipeline = redisClient.pipeline();
      sessionIds.forEach((sessionId) => {
        pipeline.del(`session:${sessionId}`);
      });
      pipeline.del(`user:${userId}:sessions`);
      await pipeline.exec();
    }
  },
};

// JWT-based authentication (stateless)
const JWTAuth = {
  generateTokens(userId, permissions) {
    const accessToken = jwt.sign(
      {
        userId,
        permissions,
        type: "access",
        iat: Math.floor(Date.now() / 1000),
      },
      process.env.JWT_SECRET,
      { expiresIn: "15m" }
    );

    const refreshToken = jwt.sign(
      {
        userId,
        type: "refresh",
        iat: Math.floor(Date.now() / 1000),
      },
      process.env.JWT_REFRESH_SECRET,
      { expiresIn: "7d" }
    );

    return { accessToken, refreshToken };
  },

  async verifyAccessToken(token) {
    try {
      const decoded = jwt.verify(token, process.env.JWT_SECRET);

      // Check if token is blacklisted (for logout)
      const isBlacklisted = await redisClient.get(`blacklist:${token}`);
      if (isBlacklisted) {
        throw new Error("Token is blacklisted");
      }

      return decoded;
    } catch (error) {
      return null;
    }
  },

  async refreshTokens(refreshToken) {
    try {
      const decoded = jwt.verify(refreshToken, process.env.JWT_REFRESH_SECRET);

      // Verify user still exists and is active
      const user = await db.users.findOne({
        id: decoded.userId,
        isActive: true,
      });

      if (!user) {
        throw new Error("User not found or inactive");
      }

      const permissions = await this.getUserPermissions(decoded.userId);
      return this.generateTokens(decoded.userId, permissions);
    } catch (error) {
      throw new Error("Invalid refresh token");
    }
  },

  async blacklistToken(token) {
    const decoded = jwt.decode(token);
    if (decoded && decoded.exp) {
      const ttl = decoded.exp - Math.floor(Date.now() / 1000);
      if (ttl > 0) {
        await redisClient.setex(`blacklist:${token}`, ttl, "true");
      }
    }
  },
};
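
Either approach plugs into Express as middleware. A sketch of a route guard built on the JWTAuth object above (the header parsing and error shapes are illustrative, not a fixed convention):

const requireAuth = async (req, res, next) => {
  const header = req.get("Authorization") || "";
  const token = header.startsWith("Bearer ") ? header.slice(7) : null;

  if (!token) {
    return res.status(401).json({ error: "Missing access token" });
  }

  const decoded = await JWTAuth.verifyAccessToken(token);
  if (!decoded) {
    return res.status(401).json({ error: "Invalid or expired token" });
  }

  req.user = { id: decoded.userId, permissions: decoded.permissions };
  next();
};

app.get("/me", requireAuth, async (req, res) => {
  const profile = await DataCache.getUserProfile(req.user.id);
  res.json(profile);
});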

Session security and monitoring:

// Advanced session security
class SecureSessionManager {
  constructor() {
    this.maxSessionsPerUser = 5;
    this.sessionTimeout = 30 * 60; // 30 minutes of inactivity
    this.suspiciousActivityThreshold = 10;
  }

  async createSecureSession(userId, request) {
    const fingerprint = this.generateFingerprint(request);
    const sessionId = generateSecureId();

    // Check for suspicious activity
    await this.checkSuspiciousActivity(userId, request);

    // Limit concurrent sessions
    await this.enforceSessionLimit(userId);

    const sessionData = {
      userId,
      sessionId,
      fingerprint,
      createdAt: Date.now(),
      lastActivity: Date.now(),
      ipAddress: request.ip,
      userAgent: request.get("User-Agent"),
      loginMethod: "password", // or 'oauth', 'magic-link', etc.
      securityLevel: "standard",
      isActive: true,
    };

    await redisClient.setex(
      `session:${sessionId}`,
      this.sessionTimeout,
      JSON.stringify(sessionData)
    );

    // Track in user's sessions list
    await redisClient.sadd(`user:${userId}:sessions`, sessionId);
    await redisClient.expire(`user:${userId}:sessions`, 86400);

    // Log session creation
    await this.logSessionEvent(userId, sessionId, "created", request);

    return sessionId;
  }

  async validateSession(sessionId, request) {
    const sessionData = await redisClient.get(`session:${sessionId}`);
    if (!sessionData) {
      return { valid: false, reason: "session_not_found" };
    }

    const session = JSON.parse(sessionData);
    const currentFingerprint = this.generateFingerprint(request);

    // Check fingerprint consistency
    if (session.fingerprint !== currentFingerprint) {
      await this.logSessionEvent(
        session.userId,
        sessionId,
        "fingerprint_mismatch",
        request
      );

      // Don't immediately invalidate - could be legitimate browser update
      session.securityLevel = "elevated";
    }

    // Check for IP address changes
    if (session.ipAddress !== request.ip) {
      await this.logSessionEvent(
        session.userId,
        sessionId,
        "ip_change",
        request
      );

      session.securityLevel = "elevated";
      session.ipAddress = request.ip; // Update to new IP
    }

    // Update activity and extend session
    session.lastActivity = Date.now();
    await redisClient.setex(
      `session:${sessionId}`,
      this.sessionTimeout,
      JSON.stringify(session)
    );

    return { valid: true, session };
  }

  generateFingerprint(request) {
    const components = [
      request.get("User-Agent") || "",
      request.get("Accept-Language") || "",
      request.get("Accept-Encoding") || "",
      request.get("Accept") || "",
    ];

    return crypto
      .createHash("sha256")
      .update(components.join("|"))
      .digest("hex");
  }

  async checkSuspiciousActivity(userId, request) {
    const key = `activity:${userId}:${request.ip}`;
    const attempts = await redisClient.incr(key);

    if (attempts === 1) {
      await redisClient.expire(key, 300); // 5 minutes window
    }

    if (attempts > this.suspiciousActivityThreshold) {
      await this.logSessionEvent(userId, null, "suspicious_activity", request);

      // Implement rate limiting
      throw new Error("Too many login attempts. Please try again later.");
    }
  }

  async enforceSessionLimit(userId) {
    const sessions = await redisClient.smembers(`user:${userId}:sessions`);

    if (sessions.length >= this.maxSessionsPerUser) {
      // Remove oldest session
      const sessionDetails = await Promise.all(
        sessions.map(async (sessionId) => {
          const data = await redisClient.get(`session:${sessionId}`);
          return data ? { sessionId, ...JSON.parse(data) } : null;
        })
      );

      const validSessions = sessionDetails
        .filter(Boolean)
        .sort((a, b) => a.lastActivity - b.lastActivity);

      if (validSessions.length >= this.maxSessionsPerUser) {
        const oldestSession = validSessions[0];
        await this.invalidateSession(oldestSession.sessionId);
      }
    }
  }

  async invalidateSession(sessionId) {
    // Referenced by enforceSessionLimit: drop the session record and the
    // pointer in the user's session set
    const data = await redisClient.get(`session:${sessionId}`);
    if (data) {
      const session = JSON.parse(data);
      await redisClient.srem(`user:${session.userId}:sessions`, sessionId);
    }
    await redisClient.del(`session:${sessionId}`);
  }

  async logSessionEvent(userId, sessionId, event, request) {
    const eventData = {
      userId,
      sessionId,
      event,
      timestamp: Date.now(),
      ip: request.ip,
      userAgent: request.get("User-Agent"),
      endpoint: request.path,
    };

    // Store in time-series for analysis
    await redisClient.zadd(
      `events:${userId}`,
      Date.now(),
      JSON.stringify(eventData)
    );

    // Keep only last 1000 events per user
    await redisClient.zremrangebyrank(`events:${userId}`, 0, -1001);
  }
}
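
In practice the manager sits behind a middleware that resolves the session on every request. A minimal sketch, assuming cookie-parser is configured with a signing secret and the session ID travels in a signed cookie named sid:

const sessionManager = new SecureSessionManager();

const requireSession = async (req, res, next) => {
  const sessionId = req.signedCookies && req.signedCookies.sid; // assumes cookie-parser with a secret
  if (!sessionId) {
    return res.status(401).json({ error: "Not authenticated" });
  }

  const result = await sessionManager.validateSession(sessionId, req);
  if (!result.valid) {
    res.clearCookie("sid");
    return res.status(401).json({ error: "Session expired", reason: result.reason });
  }

  req.sessionInfo = result.session; // avoid clobbering express-session's req.session
  next();
};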

File Upload Handling: Safe and Scalable Media Processing

Secure File Upload Implementation

The file upload security nightmare:

// ❌ Dangerous file upload (don't do this)
app.post("/upload", upload.single("file"), (req, res) => {
  // Accepting any file type - SECURITY RISK!
  // No file size validation - RESOURCE EXHAUSTION!
  // No virus scanning - MALWARE RISK!
  // Storing with original filename - PATH TRAVERSAL!

  const file = req.file;
  fs.writeFileSync(`./uploads/${file.originalname}`, file.buffer);
  res.json({ url: `/uploads/${file.originalname}` });
});

// This code is a hacker's dream:
// - Upload ../../../etc/passwd to access system files
// - Upload 10GB files to crash the server
// - Upload executable files to inject malware
// - Upload files with malicious headers

Professional secure file upload system:

// ✅ Secure, production-ready file upload
const multer = require("multer");
const path = require("path");
const crypto = require("crypto");
const sharp = require("sharp"); // Image processing
const fileType = require("file-type"); // Real file type detection
const fs = require("fs"); // streams for serving files
const fsp = fs.promises; // promise API for async reads/writes

class SecureFileUploader {
  constructor() {
    this.allowedMimeTypes = {
      "image/jpeg": { ext: "jpg", maxSize: 10 * 1024 * 1024 }, // 10MB
      "image/png": { ext: "png", maxSize: 10 * 1024 * 1024 },
      "image/gif": { ext: "gif", maxSize: 5 * 1024 * 1024 }, // 5MB
      "image/webp": { ext: "webp", maxSize: 10 * 1024 * 1024 },
      "application/pdf": { ext: "pdf", maxSize: 25 * 1024 * 1024 }, // 25MB
      "text/plain": { ext: "txt", maxSize: 1 * 1024 * 1024 }, // 1MB
      "application/json": { ext: "json", maxSize: 1 * 1024 * 1024 },
    };

    this.uploadDir = process.env.UPLOAD_DIR || "./secure-uploads";
    this.maxFiles = 10; // Maximum files per request
    this.maxTotalSize = 100 * 1024 * 1024; // 100MB total per request

    this.setupMulter();
  }

  setupMulter() {
    const storage = multer.memoryStorage(); // Keep files in memory for processing

    this.upload = multer({
      storage,
      limits: {
        fileSize: Math.max(
          ...Object.values(this.allowedMimeTypes).map((t) => t.maxSize)
        ),
        files: this.maxFiles,
        fieldSize: 1024 * 1024, // 1MB field size
        fieldNameSize: 100,
        fields: 50,
      },
      fileFilter: (req, file, callback) => {
        this.validateFile(file, callback);
      },
    });
  }

  validateFile(file, callback) {
    // Check MIME type against whitelist
    if (!this.allowedMimeTypes[file.mimetype]) {
      return callback(new Error(`File type ${file.mimetype} not allowed`));
    }

    // Validate filename
    const filename = file.originalname;
    if (!/^[a-zA-Z0-9._-]+$/.test(filename)) {
      return callback(new Error("Invalid filename characters"));
    }

    if (filename.length > 255) {
      return callback(new Error("Filename too long"));
    }

    callback(null, true);
  }

  async processUpload(files, userId) {
    const processedFiles = [];
    let totalSize = 0;

    for (const file of files) {
      // Verify total size limit
      totalSize += file.size;
      if (totalSize > this.maxTotalSize) {
        throw new Error("Total file size exceeds limit");
      }

      // Detect real file type from magic bytes (plain-text formats have no
      // signature, so only binary types are verified this way)
      const detectedType = await fileType.fromBuffer(file.buffer);
      const isTextType = ["text/plain", "application/json"].includes(file.mimetype);
      if (!isTextType && (!detectedType || detectedType.mime !== file.mimetype)) {
        throw new Error(`File type mismatch for ${file.originalname}`);
      }

      // Verify against size limit for this file type
      const typeConfig = this.allowedMimeTypes[file.mimetype];
      if (file.size > typeConfig.maxSize) {
        throw new Error(`File ${file.originalname} exceeds size limit`);
      }

      // Generate secure filename
      const fileId = crypto.randomBytes(16).toString("hex");
      const secureFilename = `${fileId}.${typeConfig.ext}`;
      const filePath = path.join(this.uploadDir, secureFilename);

      // Virus scanning (integrate with ClamAV or similar)
      await this.scanForViruses(file.buffer);

      // Process based on file type
      let processedBuffer = file.buffer;
      let metadata = {};

      if (file.mimetype.startsWith("image/")) {
        const result = await this.processImage(file.buffer, file.mimetype);
        processedBuffer = result.buffer;
        metadata = result.metadata;
      }

      // Save to secure location
      await fsp.writeFile(filePath, processedBuffer);

      // Store file metadata in database
      const fileRecord = {
        id: fileId,
        originalName: file.originalname,
        filename: secureFilename,
        mimetype: file.mimetype,
        size: processedBuffer.length,
        uploadedBy: userId,
        uploadedAt: new Date(),
        metadata,
        status: "active",
      };

      await db.files.insertOne(fileRecord);

      processedFiles.push({
        id: fileId,
        url: `/files/${fileId}`,
        originalName: file.originalname,
        size: fileRecord.size,
        type: file.mimetype,
      });
    }

    return processedFiles;
  }

  async processImage(buffer, mimetype) {
    const image = sharp(buffer);
    const imageMetadata = await image.metadata();

    // Security: sharp strips EXIF (including GPS) from output by default;
    // rotate first so orientation survives the metadata removal
    let processedImage = image.rotate(); // Auto-rotate based on EXIF orientation

    // Validate image dimensions (prevent decompression bombs)
    if (imageMetadata.width > 10000 || imageMetadata.height > 10000) {
      throw new Error("Image dimensions too large");
    }

    // Convert to safe format if needed
    let outputBuffer;
    switch (mimetype) {
      case "image/jpeg":
        outputBuffer = await processedImage
          .jpeg({ quality: 85, mozjpeg: true })
          .toBuffer();
        break;

      case "image/png":
        outputBuffer = await processedImage
          .png({ compressionLevel: 6 })
          .toBuffer();
        break;

      case "image/webp":
        outputBuffer = await processedImage.webp({ quality: 85 }).toBuffer();
        break;

      default:
        outputBuffer = buffer;
    }

    return {
      buffer: outputBuffer,
      metadata: {
        width: imageMetadata.width,
        height: imageMetadata.height,
        format: imageMetadata.format,
        hasAlpha: imageMetadata.hasAlpha,
        colorSpace: imageMetadata.space,
      },
    };
  }

  async scanForViruses(buffer) {
    // Integrate with ClamAV or similar antivirus
    // This is a placeholder - implement actual virus scanning
    try {
      // const scanResult = await clamAV.scanBuffer(buffer);
      // if (scanResult.isInfected) {
      //   throw new Error('File contains malware');
      // }

      // Basic content inspection for now
      const content = buffer.toString(
        "ascii",
        0,
        Math.min(buffer.length, 1024)
      );

      // Check for suspicious patterns
      const suspiciousPatterns = [
        /\x00\x00\x00\x00.*shell/i,
        /eval\s*\(/i,
        /<script/i,
        /javascript:/i,
      ];

      for (const pattern of suspiciousPatterns) {
        if (pattern.test(content)) {
          throw new Error("Suspicious content detected");
        }
      }
    } catch (error) {
      throw new Error(`Security scan failed: ${error.message}`);
    }
  }

  // File serving with access control
  async serveFile(fileId, userId, req, res) {
    // Get file record
    const fileRecord = await db.files.findOne({ id: fileId, status: "active" });
    if (!fileRecord) {
      return res.status(404).json({ error: "File not found" });
    }

    // Check access permissions
    if (!(await this.checkFileAccess(fileRecord, userId))) {
      return res.status(403).json({ error: "Access denied" });
    }

    const filePath = path.join(this.uploadDir, fileRecord.filename);

    // Check if file exists on disk
    if (
      !(await fsp
        .access(filePath)
        .then(() => true)
        .catch(() => false))
    ) {
      return res.status(404).json({ error: "File not found on disk" });
    }

    // Set appropriate headers
    res.setHeader("Content-Type", fileRecord.mimetype);
    res.setHeader("Content-Length", fileRecord.size);
    res.setHeader(
      "Content-Disposition",
      `inline; filename="${fileRecord.originalName}"`
    );

    // Security headers
    res.setHeader("X-Content-Type-Options", "nosniff");
    res.setHeader("Content-Security-Policy", "default-src 'none'");

    // Serve file
    const fileStream = fs.createReadStream(filePath);
    fileStream.pipe(res);

    // Log file access
    await this.logFileAccess(fileId, userId, req);
  }

  async checkFileAccess(fileRecord, userId) {
    // Owner can always access
    if (fileRecord.uploadedBy === userId) {
      return true;
    }

    // Check if file is shared publicly
    const sharing = await db.fileSharing.findOne({
      fileId: fileRecord.id,
      isActive: true,
    });

    if (sharing) {
      if (sharing.shareType === "public") {
        return true;
      }

      if (
        sharing.shareType === "users" &&
        sharing.allowedUsers.includes(userId)
      ) {
        return true;
      }
    }

    return false;
  }
}

// Usage
const fileUploader = new SecureFileUploader();

app.post(
  "/upload",
  fileUploader.upload.array("files", 10),
  async (req, res) => {
    try {
      const userId = req.user.id;
      const files = req.files;

      if (!files || files.length === 0) {
        return res.status(400).json({ error: "No files provided" });
      }

      const processedFiles = await fileUploader.processUpload(files, userId);

      res.json({
        success: true,
        files: processedFiles,
      });
    } catch (error) {
      console.error("File upload error:", error);
      res.status(400).json({
        error: error.message || "Upload failed",
      });
    }
  }
);

app.get("/files/:fileId", async (req, res) => {
  try {
    const { fileId } = req.params;
    const userId = req.user?.id;

    await fileUploader.serveFile(fileId, userId, req, res);
  } catch (error) {
    console.error("File serve error:", error);
    res.status(500).json({ error: "Failed to serve file" });
  }
});

Image and Media Processing: Optimization at Scale

Automated Image Processing Pipeline

Real-time image optimization:

// Professional image processing service
const sharp = require("sharp");
const ffmpeg = require("fluent-ffmpeg");
const AWS = require("aws-sdk");
const path = require("path");
const fs = require("fs").promises; // promise-based fs for readFile/stat/unlink below

class MediaProcessor {
  constructor() {
    this.s3 = new AWS.S3();
    this.bucket = process.env.MEDIA_BUCKET;
    this.cdnBase = process.env.CDN_BASE_URL;

    this.imageFormats = {
      thumbnail: { width: 150, height: 150, quality: 80 },
      small: { width: 400, height: 300, quality: 85 },
      medium: { width: 800, height: 600, quality: 85 },
      large: { width: 1200, height: 900, quality: 90 },
      original: { quality: 95 },
    };

    this.videoFormats = {
      preview: { width: 320, height: 240, format: "mp4" },
      sd: { width: 640, height: 480, format: "mp4" },
      hd: { width: 1280, height: 720, format: "mp4" },
      fullhd: { width: 1920, height: 1080, format: "mp4" },
    };
  }

  async processImage(buffer, originalName, userId) {
    const fileId = generateId();
    const tasks = [];
    const results = { id: fileId, variants: {} };

    // Analyze original image
    const image = sharp(buffer);
    const metadata = await image.metadata();

    // Validate image
    if (metadata.width > 15000 || metadata.height > 15000) {
      throw new Error("Image dimensions too large");
    }

    // Generate all size variants
    for (const [sizeName, config] of Object.entries(this.imageFormats)) {
      tasks.push(
        this.generateImageVariant(buffer, fileId, sizeName, config, metadata)
      );
    }

    // Process all variants in parallel
    const variants = await Promise.all(tasks);

    // Upload to S3 and CDN
    const uploadTasks = variants.map((variant) =>
      this.uploadImageVariant(variant, fileId, variant.size)
    );

    await Promise.all(uploadTasks);

    // Store metadata
    const imageRecord = {
      id: fileId,
      type: "image",
      originalName,
      uploadedBy: userId,
      metadata: {
        width: metadata.width,
        height: metadata.height,
        format: metadata.format,
        size: buffer.length,
        hasAlpha: metadata.hasAlpha,
      },
      variants: variants.reduce((acc, variant) => {
        acc[variant.size] = {
          url: `${this.cdnBase}/${fileId}/${variant.size}.${variant.format}`,
          width: variant.width,
          height: variant.height,
          fileSize: variant.buffer.length,
        };
        return acc;
      }, {}),
      processedAt: new Date(),
      status: "ready",
    };

    await db.media.insertOne(imageRecord);
    return imageRecord;
  }

  async generateImageVariant(
    buffer,
    fileId,
    sizeName,
    config,
    originalMetadata
  ) {
    let image = sharp(buffer);

    // Progressive JPEG loading
    if (originalMetadata.format === "jpeg") {
      image = image.jpeg({
        quality: config.quality,
        progressive: true,
        mozjpeg: true,
      });
    }

    // Resize if dimensions specified
    if (config.width || config.height) {
      image = image.resize(config.width, config.height, {
        fit: sizeName === "thumbnail" ? "cover" : "inside",
        withoutEnlargement: true,
      });
    }

    // Apply optimizations (sharp strips EXIF/metadata from output by default
    // unless .withMetadata() is requested, so private location data never ships)
    image = image
      .rotate() // Auto-rotate based on EXIF
      .sharpen({ sigma: 0.5 }); // Light sharpening after resize

    const processedBuffer = await image.toBuffer({ resolveWithObject: true });

    return {
      size: sizeName,
      buffer: processedBuffer.data,
      width: processedBuffer.info.width,
      height: processedBuffer.info.height,
      format: processedBuffer.info.format,
      fileSize: processedBuffer.data.length,
    };
  }

  async uploadImageVariant(variant, fileId, sizeName) {
    const key = `images/${fileId}/${sizeName}.${variant.format}`;

    const uploadParams = {
      Bucket: this.bucket,
      Key: key,
      Body: variant.buffer,
      ContentType: `image/${variant.format}`,
      CacheControl: "public, max-age=31536000", // 1 year
      Metadata: {
        width: variant.width.toString(),
        height: variant.height.toString(),
        size: sizeName,
        processedAt: new Date().toISOString(),
      },
    };

    await this.s3.upload(uploadParams).promise();
    return key;
  }

  async processVideo(filePath, originalName, userId) {
    const fileId = generateId();
    const results = { id: fileId, variants: {}, thumbnails: [] };

    // Generate video metadata
    const metadata = await this.getVideoMetadata(filePath);

    // Generate thumbnail from video
    const thumbnailPath = await this.generateVideoThumbnail(filePath, fileId);
    const thumbnailBuffer = await fs.readFile(thumbnailPath);

    // Process thumbnail like regular image
    const thumbnail = await this.processImage(
      thumbnailBuffer,
      "thumbnail.jpg",
      userId
    );
    results.thumbnail = thumbnail;

    // Generate video variants for different qualities
    const videoTasks = [];

    for (const [quality, config] of Object.entries(this.videoFormats)) {
      if (metadata.width >= config.width || quality === "preview") {
        videoTasks.push(
          this.generateVideoVariant(filePath, fileId, quality, config)
        );
      }
    }

    const videoVariants = await Promise.all(videoTasks);

    // Upload video files
    const uploadTasks = videoVariants.map((variant) =>
      this.uploadVideoVariant(variant, fileId)
    );

    await Promise.all(uploadTasks);

    // Store video record
    const videoRecord = {
      id: fileId,
      type: "video",
      originalName,
      uploadedBy: userId,
      metadata: {
        duration: metadata.duration,
        width: metadata.width,
        height: metadata.height,
        format: metadata.format,
        size: metadata.size,
        bitrate: metadata.bitrate,
        fps: metadata.fps,
      },
      thumbnail: thumbnail.id,
      variants: videoVariants.reduce((acc, variant) => {
        acc[variant.quality] = {
          url: `${this.cdnBase}/${fileId}/${variant.quality}.${variant.format}`,
          width: variant.width,
          height: variant.height,
          fileSize: variant.size,
          bitrate: variant.bitrate,
        };
        return acc;
      }, {}),
      processedAt: new Date(),
      status: "ready",
    };

    await db.media.insertOne(videoRecord);

    // Cleanup temporary files
    await fs.unlink(filePath).catch(() => {});
    await fs.unlink(thumbnailPath).catch(() => {});

    return videoRecord;
  }

  async generateVideoVariant(inputPath, fileId, quality, config) {
    const outputPath = `/tmp/${fileId}_${quality}.${config.format}`;

    return new Promise((resolve, reject) => {
      ffmpeg(inputPath)
        .size(`${config.width}x${config.height}`)
        .videoBitrate("1000k")
        .audioBitrate("128k")
        .format(config.format)
        .videoCodec("libx264")
        .audioCodec("aac")
        .addOptions([
          "-preset fast",
          "-crf 23",
          "-movflags +faststart", // Enable progressive download
          "-profile:v baseline", // Better compatibility
          "-level 3.0",
        ])
        .on("end", async () => {
          const stats = await fs.stat(outputPath);
          resolve({
            quality,
            format: config.format,
            width: config.width,
            height: config.height,
            path: outputPath,
            size: stats.size,
            bitrate: "1000k",
          });
        })
        .on("error", reject)
        .save(outputPath);
    });
  }

  async generateVideoThumbnail(inputPath, fileId) {
    const thumbnailPath = `/tmp/${fileId}_thumb.jpg`;

    return new Promise((resolve, reject) => {
      ffmpeg(inputPath)
        .screenshots({
          timestamps: ["10%"], // Capture at 10% through video
          filename: path.basename(thumbnailPath),
          folder: path.dirname(thumbnailPath),
          size: "800x600",
        })
        .on("end", () => resolve(thumbnailPath))
        .on("error", reject);
    });
  }
}

// Background processing with queues
const Queue = require("bull");
const mediaQueue = new Queue("media processing", {
  redis: {
    host: process.env.REDIS_HOST,
    port: process.env.REDIS_PORT,
  },
});

mediaQueue.process("process-image", async (job) => {
  const { buffer, originalName, userId } = job.data;
  // Bull round-trips job data through Redis as JSON, so the Buffer arrives as
  // { type: "Buffer", data: [...] } and must be reconstructed before use
  const imageBuffer = Buffer.isBuffer(buffer) ? buffer : Buffer.from(buffer.data);
  const processor = new MediaProcessor();

  try {
    const result = await processor.processImage(
      imageBuffer,
      originalName,
      userId
    );
    return result;
  } catch (error) {
    console.error("Image processing failed:", error);
    throw error;
  }
});

mediaQueue.process("process-video", async (job) => {
  const { filePath, originalName, userId } = job.data;
  const processor = new MediaProcessor();

  try {
    const result = await processor.processVideo(filePath, originalName, userId);
    return result;
  } catch (error) {
    console.error("Video processing failed:", error);
    throw error;
  }
});

// Usage in upload endpoint
app.post("/upload-media", upload.single("media"), async (req, res) => {
  try {
    const file = req.file;
    const userId = req.user.id;

    if (file.mimetype.startsWith("image/")) {
      // Queue image processing
      const job = await mediaQueue.add("process-image", {
        buffer: file.buffer,
        originalName: file.originalname,
        userId,
      });

      res.json({
        success: true,
        jobId: job.id,
        message: "Image processing started",
      });
    } else if (file.mimetype.startsWith("video/")) {
      // Save to temp file for video processing
      const tempPath = `/tmp/${Date.now()}-${file.originalname}`;
      await fs.writeFile(tempPath, file.buffer);

      const job = await mediaQueue.add("process-video", {
        filePath: tempPath,
        originalName: file.originalname,
        userId,
      });

      res.json({
        success: true,
        jobId: job.id,
        message: "Video processing started",
      });
    }
  } catch (error) {
    console.error("Media upload error:", error);
    res.status(500).json({ error: "Upload failed" });
  }
});
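
Because processing happens in the background, clients poll (or listen over websockets) for completion. A minimal polling endpoint built on the Bull queue above (the route shape is illustrative):

app.get("/media-jobs/:jobId", async (req, res) => {
  const job = await mediaQueue.getJob(req.params.jobId);
  if (!job) {
    return res.status(404).json({ error: "Job not found" });
  }

  const state = await job.getState(); // "waiting" | "active" | "completed" | "failed" | ...

  res.json({
    id: job.id,
    state,
    result: state === "completed" ? job.returnvalue : null,
    error: state === "failed" ? job.failedReason : null,
  });
});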

Key Takeaways

Advanced data management separates basic web applications from professional, scalable systems. Intelligent caching eliminates database bottlenecks, proper session management enables horizontal scaling, secure file handling protects against attacks, and efficient media processing delivers great user experiences.

The data management mindset you need:

  • Cache strategically, not universally: Use multi-level caching with appropriate TTLs and invalidation strategies
  • Sessions must scale beyond memory: Redis-backed sessions enable load balancing and service distribution
  • File uploads are security battlegrounds: Validate everything, process safely, store securely
  • Media optimization is user experience: Responsive images and optimized videos keep users engaged

What distinguishes professional data management:

  • Caching architectures that reduce database load by 90%+ while maintaining data consistency
  • Session systems that handle millions of users across distributed infrastructure
  • File upload pipelines that prevent security breaches while processing media efficiently
  • Background processing that handles expensive operations without blocking user interactions

What’s Next

We’ve covered the foundational aspects of data management—caching, sessions, and file handling. In the next article, we’ll dive deeper into advanced data patterns: search implementations, data synchronization, background job processing, message queues, and streaming data architectures.

The data layer is more than storage—it’s the performance foundation that determines whether your application scales gracefully or crumbles under load. Master these patterns, and you can build systems that handle millions of users while maintaining the responsiveness of a local application.

You’re no longer just moving data around—you’re architecting data flows that optimize for performance, security, and scalability simultaneously. The foundation is solid. Time to build advanced data orchestration systems.