12 February 2017

Avoiding memory problems when working with huge amounts of data

There will be times when you run applications built on frameworks like Apache Storm, and you'll be surprised when they crash and restart for no apparent reason.

The reason in many cases (as you'll infer from the error logs) is that the application ran out of memory. You just didn't have enough RAM.

Why this happens:
If you specify (using -Xmx2g) that your application's JVM heap can grow to a maximum of 2GB, you have to remember that other applications running on the same machine may be free to use just as much memory.

Let's say you have 4GB RAM.
App1 takes up 2GB.
App2 takes up 2GB.

App1 discards some large data structures that it was using, before creating a new data structure.

The request to create the new data structure makes the JVM allocate more memory. Sometimes, operations happen so quickly that the Garbage Collector has not yet reclaimed the unused memory by the time the new memory is needed.

Once you've run out of RAM, the OS would typically move some memory pages out to disk so that your application can keep running. To do that, it needs swap space. If there is none, your application might crash with a message that there was insufficient memory. It might not even get a chance to catch an exception: on Linux, the kernel can simply kill the process.

On many cloud servers, such as the ones on AWS, the nodes don't have swap space by default. You have to create it yourself. It's very simple and completely worth it.

How much swap space to create?
Go for double the size of your RAM. I know that sounds like a lot, but believe me: I've seen a system with 4GB RAM use more than 4GB of swap after I had given it 10GB of swap space. Hardware is cheap, and we now have disks with plenty of space, so create large swap partitions or files.
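
If you'd rather not do the arithmetic by hand, here is a small sketch that applies the "double the RAM" rule. The function name suggested_swap_mb is my own invention for illustration, and it assumes a Linux-style /proc/meminfo where MemTotal is reported in kB.

```shell
# Suggest a swap size in MB: twice the RAM listed in a meminfo-style file.
# The file defaults to /proc/meminfo; pass another path to try it on a copy.
suggested_swap_mb() {
  ram_kb=$(awk '/^MemTotal:/ {print $2}' "${1:-/proc/meminfo}")
  # MemTotal is in kB, so double it and divide by 1024 to get MB.
  echo $(( ${ram_kb:-0} * 2 / 1024 ))
}

echo "Suggested swap size: $(suggested_swap_mb)M"
```

The printed value can be used as the size when creating the swap file, since fallocate accepts an M suffix in the same way it accepts G.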

How to create the swap space?
First check if it's already there with

free -m

and check how much disk space is available with

df -h

On an Ubuntu 14.04 system, just run these commands (the "6G" in the first one means 6GB of disk space):

sudo fallocate -l 6G /swapfile

sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
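
To confirm that the swap is really switched on, you can run free -m again, or use a small check like the sketch below. The function name swap_is_active is hypothetical, and it assumes the kernel reports a SwapTotal line in /proc/meminfo, as Linux does.

```shell
# Succeed only if the meminfo-style file reports a non-zero SwapTotal.
# The file defaults to /proc/meminfo.
swap_is_active() {
  total_kb=$(awk '/^SwapTotal:/ {print $2}' "${1:-/proc/meminfo}")
  [ "${total_kb:-0}" -gt 0 ]
}

if swap_is_active; then
  echo "Swap is active"
else
  echo "No swap is active"
fi
```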

To make the swap file permanent:

sudo vi /etc/fstab

and at the bottom of the fstab file, add this line (the fields can be separated by tabs or spaces):
/swapfile   none    swap    sw    0   0
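
If you're setting up several nodes, opening an editor on each one gets tedious. Here is a sketch of a scripted alternative. The helper swap_fstab_entry is a made-up name, and it assumes the swap file lives at /swapfile as in the commands above. It prints the line only when the file doesn't already have a /swapfile entry, so running it twice won't add a duplicate.

```shell
# Print the fstab line for /swapfile, but only if the given file
# does not already contain an entry that starts with /swapfile.
swap_fstab_entry() {
  if ! grep -qE '^/swapfile[[:space:]]' "$1"; then
    printf '/swapfile\tnone\tswap\tsw\t0\t0\n'
  fi
}

# Usage, run by hand (appending to /etc/fstab needs root):
#   swap_fstab_entry /etc/fstab | sudo tee -a /etc/fstab > /dev/null
```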

If you like tweaking things more, you can also tweak the swappiness setting, which determines how aggressively the OS moves data from RAM to swap space.
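
For the curious, here is a rough sketch of how swappiness can be inspected and changed. The two function names are my own, the value 10 is only an example and not a recommendation, and the 0 to 100 range is an assumption based on the kernel that Ubuntu 14.04 ships (where the default is 60).

```shell
# Print the current swappiness value.
current_swappiness() {
  cat "${1:-/proc/sys/vm/swappiness}"
}

# Build the sysctl setting for a value, rejecting anything outside 0-100.
swappiness_setting() {
  case "$1" in
    ''|*[!0-9]*) echo "swappiness must be a whole number" >&2; return 1 ;;
  esac
  if [ "$1" -gt 100 ]; then
    echo "swappiness must be between 0 and 100" >&2
    return 1
  fi
  echo "vm.swappiness=$1"
}

# Usage, run by hand (both need root):
#   sudo sysctl -w "$(swappiness_setting 10)"              # lasts until the next reboot
#   swappiness_setting 10 | sudo tee -a /etc/sysctl.conf   # kept across reboots
```

A lower value makes the kernel hold on to RAM longer before it starts swapping; a higher value makes it swap sooner.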
