
From 31 Seconds to 50ms: MongoDB Aggregation Performance Optimization

Written by Namit Jain · April 19, 2025 · 3 min read

Recently, I encountered a performance issue in a web application that involved retrieving a user's recent chat history. The endpoint in question was painfully slow—taking over 31 seconds to respond.

In a world where users expect lightning-fast responses, this delay was unacceptable.

This is the story of how I diagnosed the problem, restructured the query using MongoDB's aggregation framework, and brought that response time down to just 50 milliseconds.


The Problem: Inefficient Data Fetching

The root cause was how we were handling chat session data.

  • Entire message arrays were being fetched from MongoDB.
  • All filtering and processing happened client-side, in Python.
  • We were running complex logic post-query, including locating the last user message in each chat session.

This meant we were pulling potentially hundreds of messages per session, only to discard most of them. Multiply that by several sessions per user, and the payloads—and processing time—became massive.
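In rough terms, the old flow looked like the sketch below. Function and variable names are illustrative, not the actual application code, and an in-memory list stands in for the pymongo cursor that would normally stream whole session documents back from MongoDB:

```python
# Simulation of the old client-side approach: fetch entire sessions,
# then scan every message array in Python to find the last user message.
# Field names (Session_id, Messages, Role, Message) mirror the post.

def last_user_messages(sessions):
    """Walk the FULL message array of every session, client-side."""
    results = []
    for session in sessions:                  # every session document...
        last = "(No title)"
        for msg in session["Messages"]:       # ...and every message in it
            if msg["Role"] == "user":
                last = msg["Message"]         # keep overwriting; last wins
        results.append({"Session_id": session["Session_id"],
                        "LastUserMessage": last})
    return results

sessions = [
    {"Session_id": 1, "Messages": [
        {"Role": "user", "Message": "hi"},
        {"Role": "assistant", "Message": "hello"},
        {"Role": "user", "Message": "how are you?"},
    ]},
]
print(last_user_messages(sessions))
```

Every byte of every message crosses the wire just so Python can throw most of it away, which is exactly the waste the aggregation rewrite removes.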


The Fix: Letting MongoDB Do the Heavy Lifting

To solve the problem, I restructured the logic using MongoDB’s aggregation pipeline. Here's what changed:

1. Match the Relevant User

We scoped the data early by matching on the user ID:

{ "$match": { "User_id": user_obj_id } }

2. Sort and Limit

We sorted sessions newest-first (a descending sort on `_id` works because ObjectIds embed a creation timestamp) and limited the result to 7:

{ "$sort": { "_id": -1 } },
{ "$limit": 7 }

3. Project Only What’s Needed

Using $project, we reduced the payload:

  • Fetched only Session_id, Created_at, and the last user message.
  • Used $reduce and $filter to extract that message within MongoDB, not Python.

{
  "$project": {
    "Session_id": 1,
    "Created_at": 1,
    "LastUserMessage": {
      "$reduce": {
        "input": {
          "$filter": {
            "input": "$Messages",
            "as": "msg",
            "cond": { "$eq": ["$$msg.Role", "user"] }
          }
        },
        "initialValue": "(No title)",
        "in": "$$this.Message"
      }
    }
  }
}

Now we’re only returning the most recent user message, not the entire conversation history.
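Putting the three stages together, the pipeline can be assembled in Python along these lines. The helper name, the `session_limit` parameter, and the `db.Sessions` collection mentioned in the comment are assumptions for illustration; `user_obj_id` is the same value matched in step 1:

```python
def build_chat_history_pipeline(user_obj_id, session_limit=7):
    """Assemble the three aggregation stages described above."""
    return [
        # 1. Scope to one user as early as possible.
        {"$match": {"User_id": user_obj_id}},
        # 2. Newest sessions first, capped at the most recent few.
        {"$sort": {"_id": -1}},
        {"$limit": session_limit},
        # 3. Project only what the endpoint needs. $filter keeps the
        #    user-role messages; $reduce keeps overwriting its
        #    accumulator, so the LAST user message wins.
        {"$project": {
            "Session_id": 1,
            "Created_at": 1,
            "LastUserMessage": {
                "$reduce": {
                    "input": {
                        "$filter": {
                            "input": "$Messages",
                            "as": "msg",
                            "cond": {"$eq": ["$$msg.Role", "user"]},
                        }
                    },
                    "initialValue": "(No title)",
                    "in": "$$this.Message",
                }
            },
        }},
    ]

# Executed entirely server-side, e.g.:
#   results = list(db.Sessions.aggregate(build_chat_history_pipeline(uid)))
pipeline = build_chat_history_pipeline("some-user-id")
print([next(iter(stage)) for stage in pipeline])
```

Because every stage runs inside MongoDB, the documents that reach Python are already trimmed to three small fields per session.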


The Result: 620x Faster

After deploying the change, the response time plummeted:

  • Before: ~31 seconds
  • After: ~50 milliseconds

That’s a 620x improvement—all by offloading the computation to the database engine designed to do exactly this.


Lessons Learned

  • Don't fetch more than you need. Always project only the required fields.
  • Avoid client-side filtering when the database can do it faster and more efficiently.
  • Use MongoDB’s aggregation pipeline to minimize payloads and reduce post-processing logic.
  • Stress test with real-world data. What seems fast with a few records may break down at scale.

TL;DR

  • Problem: Chat history endpoint took 31s due to pulling entire message arrays and filtering in Python.
  • Solution: Rewrote the query using MongoDB's aggregation pipeline with $project, $filter, and $reduce.
  • Result: Reduced query time to 50ms, a 620x performance gain.
  • Takeaway: Let MongoDB do the heavy lifting. Push logic into the database wherever possible.