Wednesday, November 12, 2014

Node.js Request Memory Leak

I have been a Node.js enthusiast for the past few years and have always loved its ability to spawn numerous requests in parallel. This ability has allowed me to write software that far exceeded my expectations of "fast".

As I've moved forward in my career, I've come to a place where I'm working with Big Data. These days, getting my feet wet consists of processing millions of records through these Node.js apps, often making hundreds of thousands of HTTP requests in the process.

On a recent project, I noticed that while looping through a 2.5-million-item XML feed, memory grew steadily (possibly a memory leak) until the app eventually ran out of memory and started back at 1... X( This was very frustrating and a bit hard to debug. With so many 3rd-party modules in use, the leak could have been anywhere.

After hours of debugging, I narrowed the issue down to a new HTTP request being made once per record and decided to focus my energy on that. I then learned what the real problem was, and it's quite simple and easy to fix!

In Node.js, HTTP requests use a connection pool managed by "http.Agent", which has the following properties:
  1. agent.maxSockets
  2. agent.sockets
  3. agent.requests

agent.maxSockets sets how many connections per host can be open at once and defaults to 5 in Node.js 0.10 (later releases default to Infinity)

agent.sockets is an object that contains the currently used sockets

agent.requests is an object of requests in queue waiting for an open socket
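
To make these three properties concrete, here is a minimal sketch that summarizes an agent's pool per host. The helper name describeAgent is hypothetical (it is not part of Node.js), and the sketch assumes that agent.sockets and agent.requests are objects whose values are arrays, keyed by a host name string.

```javascript
var http = require('http');

// Hypothetical helper (not part of Node.js): summarize an agent's pool.
// Assumes agent.sockets and agent.requests map a host key to an array.
function describeAgent(agent) {
  var summary = { maxSockets: agent.maxSockets, hosts: {} };

  // Add the length of each per-host array to the given field.
  function tally(map, field) {
    Object.keys(map).forEach(function (name) {
      if (!summary.hosts[name]) {
        summary.hosts[name] = { active: 0, queued: 0 };
      }
      summary.hosts[name][field] += map[name].length;
    });
  }

  tally(agent.sockets, 'active');  // sockets currently in use
  tally(agent.requests, 'queued'); // requests waiting for a socket
  return summary;
}

// A freshly created agent has no sockets and no queued requests yet.
var freshAgent = new http.Agent({ maxSockets: 5 });
console.log(describeAgent(freshAgent));
```

Calling describeAgent on the agent your app actually uses, for example from a setInterval timer, is one way to watch whether the queued count keeps climbing.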

So picture this... You're processing thousands of items per second and sending out HTTP requests, but only 5 of those requests are assigned a socket at a time, and the rest pile up in your agent.requests object. Chances are very good that the agent.requests object is going to keep growing and growing...and growing, until it eventually consumes all the memory in your system. This was the case for me.
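
The pile-up can be reproduced on a small scale. The sketch below is only an illustration under stated assumptions: the names countEntries and demoQueue, the 50 ms server delay, and the counts 20 and 5 are all made up for the demo. It fires requests at a deliberately slow local server through a capped agent and takes a snapshot right after the requests are issued.

```javascript
var http = require('http');

// Sum the lengths of every per-host array in agent.sockets or agent.requests.
function countEntries(map) {
  return Object.keys(map).reduce(function (sum, key) {
    return sum + map[key].length;
  }, 0);
}

// Hypothetical demo: send `total` requests to a slow local server through an
// agent capped at `maxSockets`, then report active vs. queued requests.
function demoQueue(total, maxSockets, done) {
  var agent = new http.Agent({ maxSockets: maxSockets });
  var remaining = total;

  // Each response is delayed, so requests are issued faster than they finish.
  var server = http.createServer(function (req, res) {
    setTimeout(function () { res.end('ok'); }, 50);
  });

  // Shut the server down once every request has completed or failed.
  function finishOne() {
    remaining -= 1;
    if (remaining === 0) server.close();
  }

  server.listen(0, '127.0.0.1', function () {
    var port = server.address().port;
    for (var i = 0; i < total; i++) {
      http.get({ host: '127.0.0.1', port: port, agent: agent }, function (res) {
        res.resume(); // drain the body so the socket can be released
        res.on('end', finishOne);
      }).on('error', finishOne);
    }
    // Snapshot taken immediately after all requests were issued.
    done({
      active: countEntries(agent.sockets),
      queued: countEntries(agent.requests)
    });
  });
}

demoQueue(20, 5, function (snapshot) {
  console.log(snapshot);
});
```

The snapshot should show no more than maxSockets requests active, with the remainder waiting in the queue. Scale 20 up to millions of records and that queue is where the memory goes.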

Have no fear! The fix is easy:

Simply create your own instance of http.Agent as follows:

New Custom Connection Pool
var http = require('http');

var pool = new http.Agent();
pool.maxSockets = 1000; // play with this number to adjust memory consumption for your app

Then, in your request options (this example uses the request module):

Add Pool to Request
var request = require('request');

request({
    url: 'http://myrequrl.com', // the URL needs a protocol, otherwise request rejects it
    pool: pool
}, callback);
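
If you are using the core http module directly rather than the request module, the equivalent is the agent option. This is a minimal sketch, not a drop-in implementation: getWithPool is a hypothetical helper name, and 1000 simply mirrors the value used above.

```javascript
var http = require('http');

// Dedicated agent; 1000 mirrors the value used above and is worth tuning.
var pool = new http.Agent({ maxSockets: 1000 });

// Hypothetical helper: GET a path from a host through the custom pool and
// pass the response body (or an error) to a Node-style callback.
function getWithPool(host, path, callback) {
  var req = http.get({ host: host, path: path, agent: pool }, function (res) {
    var chunks = [];
    res.on('data', function (chunk) { chunks.push(chunk); });
    res.on('end', function () {
      callback(null, Buffer.concat(chunks).toString());
    });
  });
  req.on('error', callback);
  return req;
}
```

A call would look like getWithPool('myrequrl.com', '/', callback), where callback receives (err, body).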

That should do the trick!! Please share and comment. Let me know if this helped you!