Open files rlimit 1024 reached for uid XXXX

Keywords:
Novell SUSE Linux Enterprise Server 9
Novell SUSE Linux Enterprise Server 9 Service Pack 3

Open files rlimit 1024 reached for uid XXXX


Symptom:
On Novell SUSE Linux Enterprise Server 9,

the following message appears in /var/log/messages:
Open files rlimit 1024 reached for uid XXXX

If you see this, congratulations: your machine may well go down soon. In my case, it rebooted.

The actual log entry was:

nbcsa kernel: open files rlimit 1024 reached for uid 0 pid 8003
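To see how often the message occurs and for which uid/pid pairs, the log can be summarized with a few standard tools. This is only a sketch: the function name summarize_rlimit_hits is made up for illustration, and it assumes the message format matches the log line shown above.

```shell
#!/bin/sh
# Hypothetical helper: count "open files rlimit" kernel messages
# per uid/pid pair. The log path defaults to /var/log/messages.
summarize_rlimit_hits() {
    grep -i 'open files rlimit' "${1:-/var/log/messages}" |
        sed -n 's/.*reached for uid \([0-9]*\) pid \([0-9]*\).*/uid=\1 pid=\2/p' |
        sort | uniq -c | sort -rn
}

# Usage (as root, since /var/log/messages is usually not world-readable):
#   summarize_rlimit_hits /var/log/messages
```

If the same pid shows up many times, the process is long-lived; if every hit has a different pid, it is more likely a short-lived job that is started repeatedly.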

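My server is an IBM x3650. Once this log entry appeared, it would reboot roughly every 3-4 days.

The reboots were very regular, always at around 4:15 in the morning.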


Solution:
You can fix it using one of the methods described here:
[url]http://www.novell.com/support/php/search.do?cmd=displayKC&docType=kc&externalId=3302273&sliceId=1&docTypeID=DT_TID_1_1&dialogID=63021602&stateId=0%200%2063025219[/url]

I tried the first method:
Modify the file "/etc/init.d/cron" adding the line "ulimit -n 65536" (without quotes)

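After the change, the problem persisted.

Looking back, I made the change too hastily, acting before I understood the situation.

The first thing to find out is which process pid 8003 is.

But ps -ef|grep 8003 did not find the process.

So I wrote a shell script that ran once a minute to try to catch it, but it still never found 8003.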
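A short-lived process can easily slip between two one-minute samples, so instead of looking for a single pid it may help to log which processes hold the most open file descriptors. The following is a minimal sketch, not the script I actually used: the function names, the script path, and the log path are all made up for illustration, and it relies only on /proc/<pid>/fd and /proc/<pid>/cmdline.

```shell
#!/bin/sh
# Hypothetical helper: count the open file descriptors of one pid by
# listing /proc/<pid>/fd. Prints 0 if the process is gone or unreadable.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Print "count pid command" for every process, busiest first.
# Run as root so that every /proc/<pid>/fd directory is readable.
top_fd_users() {
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        # cmdline is NUL-separated and empty for kernel threads.
        name=$(cat "$dir/cmdline" 2>/dev/null | tr '\0' ' ')
        echo "$(count_fds "$pid") $pid ${name:-[kernel thread]}"
    done | sort -rn | head -n "${1:-10}"
}

# Example crontab entry (paths are assumptions), sampling once a minute:
#   * * * * * /root/fdwatch.sh >> /var/log/fdwatch.log 2>&1
date
top_fd_users 10
```

A process whose count creeps toward 1024 in the log is a likely candidate, even if its pid is not 8003 by the time you look.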


So...

I went with the second method:
Use the "/etc/initscript" file

As stated in the man page for initscript (man initscript), it is possible to create the file "/etc/initscript", which can be used to set default values such as ulimit and umask for every process. The following initscript raises the hard limit for open files to 65536 for every process:

# Increase the hard file descriptor limit for all processes
# to 65536. The soft limit is still 1024, but any unprivileged
# process can increase its soft limit up to the hard limit
# with "ulimit -Sn xxx" (needs a 2.2.13 or later Linux kernel).
ulimit -Hn 65536

# Execute the program.
eval exec "$4"
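To check what limits a shell actually ended up with, both values can be printed side by side. This is a small sketch under the assumption of a POSIX shell whose ulimit builtin accepts -S and -H; the function names are made up for illustration.

```shell
#!/bin/sh
# Print the soft and hard open-file limits of the current shell.
show_nofile_limits() {
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

# Try to set the soft limit to $1 inside a subshell, so the calling
# shell is left untouched. Prints the resulting soft limit on success
# and returns non-zero if the value exceeds the hard limit.
try_soft_limit() {
    ( ulimit -Sn "$1" 2>/dev/null && ulimit -Sn )
}

show_nofile_limits
```

With the initscript above in effect, show_nofile_limits would be expected to report a soft limit of 1024 and a hard limit of 65536, and try_soft_limit 65536 should then succeed even for an unprivileged user.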

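I monitored the server for three weeks afterwards, and the problem was gone.

Open questions:
Which process was uid 0 pid 8003?

Why was this process not affected by PAM? (/etc/security/limits.conf had already been modified by orarun when Oracle 9i was installed.)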