Monday, July 8, 2013

Linux ate all memory

The memory is being used by the kernel; this is an old issue with the SLAB system. Most of the RAM is held by the "dentry" slab structure, which keeps a cache of the directory structure. Of course, 13 GB is much more than a directory cache should need, but the good news is that the kernel should free its caches if any process asks for more memory, so we should think of this memory not as already consumed but more as free.

On the other hand, a couple of weeks ago we saw the kernel start killing processes with the OOM killer, and after killing some important processes it decided to reboot. I believe that was related to the SLAB dentry cache.

We can flush the slab memory with "echo 2 > /proc/sys/vm/drop_caches", which I will try in a couple of minutes.

OK, tested on the stage-am3 server:

cat /proc/meminfo| grep ^Slab:
Slab: 13986092 kB

echo 2 > /proc/sys/vm/drop_caches

cat /proc/meminfo| grep ^Slab:
Slab: 59816 kB 

Verified:

#cat /proc/meminfo| grep ^Sl

#sync; echo 2 > /proc/sys/vm/drop_caches ; sync

Checked the prod logs to make sure everything was okay.
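
To see how much of the slab the kernel itself counts as reclaimable, /proc/meminfo also exposes SReclaimable and SUnreclaim next to Slab. Here is a minimal sketch that prints all three. The function names meminfo_kb and slab_report are made up for this example, and the optional file argument is only there so the functions can be pointed at a saved copy of meminfo.

```shell
#!/bin/sh
# Print one field from a meminfo-style file, in kB.
# Usage: meminfo_kb FIELD [FILE]   (FILE defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Print Slab together with its reclaimable and unreclaimable parts.
# Usage: slab_report [FILE]
slab_report() {
    for field in Slab SReclaimable SUnreclaim; do
        printf '%-14s %s kB\n' "$field:" "$(meminfo_kb "$field" "${1:-/proc/meminfo}")"
    done
}

# Example: slab_report
```

Running slab_report before and after the drop_caches step should show most of the change in the SReclaimable line, if the dentry cache really is what was holding the memory.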


----------------------------------------------------------------------

Dentry cache, inode cache and page cache in cached column.
Dentry cache: Directory Entry Cache, pathname (filename) lookup cache.
Inode cache: Cache for inode, not actual data block.
Page cache: Cache for actual data block

#awk '/dentry|inode/ { print $1,$2,$3,$4}' /proc/slabinfo 



#sum them up (bytes)

$awk '/dentry|inode/ { x=x+$3*$4} END {print x }' /proc/slabinfo
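
A byte count that large is hard to read, so here is a small sketch of the same sum reported in MiB. It assumes the slabinfo 2.x column order (name, active_objs, num_objs, objsize), which is what the one-liner above relies on as well. The function name dentry_inode_mib is made up for this example, and /proc/slabinfo is usually readable by root only.

```shell
#!/bin/sh
# Sum the memory held by dentry and inode slab caches, in MiB.
# Assumes slabinfo 2.x columns: $1 name, $2 active_objs, $3 num_objs, $4 objsize.
# Usage: dentry_inode_mib [FILE]   (FILE defaults to /proc/slabinfo)
dentry_inode_mib() {
    awk '/dentry|inode/ { bytes += $3 * $4 }
         END { printf "%.1f\n", bytes / 1048576 }' "${1:-/proc/slabinfo}"
}

# Example (as root): dentry_inode_mib
```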

view live stats with slabtop, sorted by cache size
$slabtop -s c

write dirty pages to disk

sync

#To free pagecache:

echo 1 > /proc/sys/vm/drop_caches

#To free dentries and inodes:

echo 2 > /proc/sys/vm/drop_caches

#To free pagecache, dentries and inodes:

echo 3 > /proc/sys/vm/drop_caches
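
The sync and echo steps above can be wrapped together so the mode is checked before anything is written. This is only a sketch: the function name drop_caches and its optional second argument are made up for this example, and writing to the real /proc/sys/vm/drop_caches needs root.

```shell
#!/bin/sh
# Flush dirty pages, then ask the kernel to drop clean caches.
# Usage: drop_caches MODE [TARGET]
#   MODE   1 = page cache, 2 = dentries and inodes, 3 = both
#   TARGET defaults to /proc/sys/vm/drop_caches
drop_caches() {
    case "$1" in
        1|2|3) ;;
        *) echo "usage: drop_caches 1|2|3 [target]" >&2; return 2 ;;
    esac
    sync    # dirty objects are not freed by drop_caches, so write them out first
    echo "$1" > "${2:-/proc/sys/vm/drop_caches}"
}

# Example (as root): drop_caches 2
```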
-----------------------------------------------

I did not like the approach of adding this to crontab, so I investigated further, asked on mailing lists, and learned that Linus himself says that "unused memory is dead memory", which is why the kernel is hungry. Still, I decided to reduce the hunger and added this to /etc/sysctl.conf:

vm.vfs_cache_pressure=10000

That did slow it down, but it was still growing. You can run sysctl -p to apply changes to the running kernel without restarting. Next I added these as well:

vm.overcommit_ratio=2
vm.dirty_background_ratio=5
vm.dirty_ratio=20
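
After sysctl -p it is worth checking that the running kernel really picked the values up. A minimal sketch, assuming the four settings named in this post; the function name show_vm_settings is made up for this example, and the optional directory argument only exists so it can be tried against a test directory instead of /proc/sys/vm.

```shell
#!/bin/sh
# Print the running value of each VM tunable mentioned in this post.
# Usage: show_vm_settings [DIR]   (DIR defaults to /proc/sys/vm)
show_vm_settings() {
    dir="${1:-/proc/sys/vm}"
    for name in vfs_cache_pressure overcommit_ratio dirty_background_ratio dirty_ratio; do
        if [ -r "$dir/$name" ]; then
            printf 'vm.%s=%s\n' "$name" "$(cat "$dir/$name")"
        else
            printf 'vm.%s is not readable\n' "$name" >&2
        fi
    done
}

# Example: show_vm_settings
```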

However, it was still growing, so I decided to leave it be and see what would happen: would my server crash, become unavailable, or something else? 24 hours later, dentry was again going up like crazy, and then it suddenly dropped. By itself. See the blue arrow in the screenshot. It seems the kernel figured out that RAM was about to be exhausted and that the filesystem cache should be reduced. After this point, everything went back to normal.

I tried this experiment again, about a week later, with the same results: a steep rise, a drop, and things going back to normal. So, if you're worried your dentry cache is growing like crazy, don't. Just tweak those settings in sysctl and wait for at least 48 hours before drawing any conclusions.
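
If you have no graphing set up, the 48-hour wait can be recorded with a small sampler. This is a sketch only: the function name log_slab_sample, the log path and the 10-minute interval are all assumptions made for this example.

```shell
#!/bin/sh
# Append one timestamped sample of the Slab and SReclaimable counters.
# Usage: log_slab_sample LOGFILE [MEMINFO]   (MEMINFO defaults to /proc/meminfo)
log_slab_sample() {
    awk -v now="$(date '+%Y-%m-%dT%H:%M:%S')" '
        $1 == "Slab:"         { slab = $2 }
        $1 == "SReclaimable:" { recl = $2 }
        END { print now, "slab_kb=" slab, "reclaimable_kb=" recl }
    ' "${2:-/proc/meminfo}" >> "$1"
}

# Example: one sample every 10 minutes for 48 hours (288 samples).
# i=0; while [ "$i" -lt 288 ]; do log_slab_sample /tmp/slab.log; sleep 600; i=$((i + 1)); done
```

A rise followed by a sudden drop in slab_kb in that log would match the pattern described above.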


http://www.backwardcompatible.net/139-Reducing-dentry-slab-usage-on-machines-with-a-lot-of-RAM

http://superuser.com/questions/393713/will-slab-memory-be-freed-when-needed-by-user-programs

http://www.ibm.com/developerworks/library/l-virtual-filesystem-switch/
