Posted on December 25, 2024 (author: html源代码怎么看)
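Apparent hangs on SUSE Linux systems fall into several cases, which need to be looked at separately:
I. The system still responds on the network (for example, it answers ping), the kernel is still running, and the console still responds when you switch virtual terminals with Ctrl+Alt+F1 through F6. In that case the following method can be used to collect information.
On SUSE Linux, using the magic SysRq keys to capture the system state at the moment of the fault (memory usage, thread call traces, and so on) gives a clear picture of what the system was doing. The state can be written to the console by configuring a serial console and pressing the magic keys on the local keyboard.
The server must be configured before a hang occurs. The steps are:
1. When the following prompts appear, press the indicated key: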
“Press
Press
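Under the Server Management menu, select Console Redirection, that is, choose the serial port to use for console redirection, and set it to enabled.
Record the default serial port parameters, such as the baud rate, so that you can connect when the system hangs.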
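2. Enable SysRq on the machines that run production workloads:
echo 1 > /proc/sys/kernel/sysrq
This takes effect without a reboot. It sets the content of the sysrq file in that directory to 1, which means SysRq is enabled.
To make sure SysRq is enabled automatically after every boot, also edit the /etc/ file and add the following line to it:
=1
Then run:
sysctl -p
to apply the setting immediately.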
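As a quick check after step 2, the value stored in /proc/sys/kernel/sysrq can be turned into a readable status. The following is a minimal sketch, not part of the original procedure: `describe_sysrq` is a hypothetical helper name, and the reading of values greater than 1 as a bitmask is an assumption that holds for kernels with selective SysRq support.

```shell
#!/bin/sh
# describe_sysrq is a hypothetical helper, not a system command.
# It maps a SysRq setting to a short human-readable description.
describe_sysrq() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        # Empty or non-numeric input cannot be interpreted.
        ''|*[!0-9]*) echo "unknown" ;;
        # Assumption: other numbers are a bitmask of allowed functions.
        *) echo "bitmask $1 (only some functions enabled)" ;;
    esac
}

# Typical use on a live system:
#   describe_sysrq "$(cat /proc/sys/kernel/sysrq)"
```

If the result is anything other than "all functions enabled", the t, p and m keys used later in this procedure may be ignored by the kernel.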
3. When service on a machine is interrupted, first try to log in over the network. If that works, run the following commands:
echo t > /proc/sysrq-trigger
echo p > /proc/sysrq-trigger
echo m > /proc/sysrq-trigger
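If you cannot log in over the network, try logging in locally or through the serial port and run the three commands above on the console.
If you cannot log in locally either, press the following key combinations one after another on the keyboard of the hung server:
Alt + SysRq + t
Alt + SysRq + p
Alt + SysRq + m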
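When a shell is still available, the three sysrq-trigger writes from step 3 can be wrapped in a small function so that nothing is mistyped under pressure. This is only a sketch: `send_sysrq_keys` is a hypothetical name, and the trigger path is passed as an argument so the function can be tried against an ordinary file first.

```shell
#!/bin/sh
# send_sysrq_keys is a hypothetical helper, not a system command.
# Usage: send_sysrq_keys TRIGGER_FILE KEY...
send_sysrq_keys() {
    trigger=$1
    shift
    for key in "$@"; do
        # Each write requests one SysRq function; the kernel prints
        # the result to the console and the kernel log.
        printf '%s\n' "$key" > "$trigger" || return 1
    done
}

# Typical use as root on the affected server:
#   send_sysrq_keys /proc/sysrq-trigger t p m
#   dmesg > /tmp/sysrq-output.txt   # output file name is an example
```

The task list produced by t can be long, so the kernel log buffer read by dmesg may not hold all of it. A serial console, as configured in the article below, captures the complete output.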
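II. If the system is truly dead, that is, a kernel panic or kernel Oops has occurred, LKCD needs to be deployed to capture a crash dump so that the cause can be analyzed from the dump file. LKCD is configured as follows: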
1. Enable core dumps
1) Edit /etc/profile and comment out the following line:
ulimit -Sc 0
so that it reads:
#ulimit -Sc 0
2) Edit /etc/security/ and add two lines like:
* soft core unlimited
* hard core unlimited
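2. Configure LKCD
1) Edit /etc/sysconfig/dump
Modify and activate the following options:
DUMP_ACTIVE='1'
DUMPDEV='/dev/cciss/c0d0p2'
DUMPDIR='/var/log/dump' (directory where dump files are stored; the default is usually fine)
DUMP_LEVEL='4' (level of detail of the generated dump; the default is 8 on SLES8 and 2 on SLES9)
DUMP_FLAGS='0x80000000'
TARGET_HOST='' (enter the host's IP address between the quotes)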
2) Run the following commands to apply the configuration:
#lkcd config
#lkcd_config -q (prints the settings that lkcd config just applied)
#insserv /etc/init.d/
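3. Set LKCD to start at boot:
Run YaST.
Select System > Runlevel > Expert Mode, set the LKCD service to start in runlevels 3 and 5 at boot, then save and exit. Reboot the server and check whether LKCD has started.
To check whether LKCD has started, press the following key combinations one after another on the server's keyboard:
Alt + SysRq + t
Alt + SysRq + p
Alt + SysRq + m
If a dialog box pops up, LKCD has started.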
The following article from the official website is included for reference:
Configuring a Remote Serial Console for SLES
This document (3456486) is provided subject to the disclaimer at the end of this document.
Environment
Novell SUSE Linux Enterprise Server 10
Novell SUSE Linux Enterprise Server 9
Novell SUSE Linux Enterprise Server 8
Novell SUSE Linux Enterprise Desktop 10
Novell Linux Desktop
Novell SUSE Linux Openexchange Server 4.1
Novell SUSE Linux Standard Server 8
Serial Console
Remote Management
Situation
Purpose
Configure access to a system using a serial connection, e.g. in order to manually trigger a kernel
crash dump.
Resolution
Assumptions
Another Linux system is to be configured to act as the serial console for a server, rather than, say, a
data terminal or a Microsoft Windows system.
On both systems, the null modem cable is attached to the first serial port ("COM1" in
DOS-terminology).
The server is booted using GRUB.
The connection will use a baud rate of 115200, 8 data bits, no parity and 1 stop bit ("115200
8N1").
Configuration Steps
Connect a null modem cable between the system that will act as the console and the server. Refer to
the Wikipedia article Null modem for details, including pin mapping.
If the server's BIOS supports serial console, configure the BIOS for it. The details of this procedure
are dependent on the BIOS vendor - refer to vendor documentation.
Configure GRUB on the server to use the first serial port. In the file /boot/grub/, comment
out the color and gfxmenu lines and add the following lines:
serial --unit=0 --speed=115200
terminal --timeout=15 serial console
(place these lines above the boot title entries)
Configure the kernel (and hypervisor) on the server to use the serial port. This configuration differs
between Xen setups and non-Xen setups.
Non-Xen setup
In the file /boot/grub/, add the following options to the kernel command line:
console=tty0 console=ttyS0,115200
Kernel messages will be written to both tty0 and ttyS0, but OS messages will only be written to
ttyS0. OS messages go to the last console defined on the boot options line.
A sample /boot/grub/ file illustrating these changes:
#color white/blue black/light-gray
default 0
timeout 8
#gfxmenu (hd0,1)/boot/message
serial --unit=0 --speed=115200
terminal --timeout=15 serial console
title Linux ! SERIAL CONSOLE !
kernel (hd0,1)/boot/vmlinuz root=/dev/sda3 selinux=0 splash=0 resume=/dev/sda1 showopts elevator=cfq vga=791 console=tty0 console=ttyS0,115200
initrd (hd0,1)/boot/initrd
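Because OS messages go to the last console defined on the boot options line, it can help to confirm which device that is after editing the GRUB file. The sketch below is an illustration only: `last_console` is a hypothetical helper name, and it works on any string in kernel command line format.

```shell
#!/bin/sh
# last_console is a hypothetical helper, not a system command.
# It prints the device named by the last console= option in the
# kernel command line given as its first argument.
last_console() {
    last=""
    # Unquoted on purpose: split the command line into words.
    for word in $1; do
        case "$word" in
            console=*) last=${word#console=} ;;
        esac
    done
    # Drop serial options such as ",115200", leaving the device name.
    printf '%s\n' "${last%%,*}"
}

# Typical use on the running server:
#   last_console "$(cat /proc/cmdline)"
```

With the sample kernel line above, the helper prints ttyS0, which matches the statement that OS messages are written only to the serial port.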
Xen setup
When Xen virtualization is used, both the Xen hypervisor and the Dom0 kernel need to be
instructed to use the serial connection:
Add console=vga,com1 com1=115200 to the parameters for the hypervisor.
Add console=tty0 console=xvc0,115200 to the parameters for the Dom0 kernel.
A sample /boot/grub/ file illustrating these changes:
#color white/blue black/light-gray
default 0
timeout 8
#gfxmenu (hd1,0)/boot/message
serial --unit=0 --speed=115200
terminal --timeout=15 serial console
title Linux - Xen ! SERIAL CONSOLE !
kernel (hd0,1)/boot/ console=vga,com1 com1=115200
module (hd0,1)/boot/vmlinuz root=/dev/sda3 console=tty0 console=xvc0,115200
module (hd0,1)/boot/initrd
Configure the server to allow logins over the serial connection. In the file /etc/inittab, add the
following line.
S0:12345:respawn:/sbin/agetty -L 115200 console vt102
To allow single-user mode to work using the serial connection, additionally change the line
~~:S:respawn:/sbin/sulogin
in /etc/inittab to
~~:S:respawn:/sbin/sulogin /dev/console
NOTE: Single-user mode will only work on the serial console with this option. You will need to
change it back, to run on the local console.
Configure the serial port on the server as a secure port, so a login as root is possible on it without
the need to log in as a regular user first.
Add lines
console
ttyS0
xvc0
to the file /etc/securetty
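The three securetty entries can be added without creating duplicates if the step is ever repeated. The following is a minimal sketch under that assumption: `ensure_line` is a hypothetical helper name, and it appends a line only when the file does not already contain an identical one.

```shell
#!/bin/sh
# ensure_line is a hypothetical helper, not a system command.
# Usage: ensure_line FILE LINE
ensure_line() {
    file=$1
    line=$2
    # -x matches whole lines only, -F treats the text literally,
    # -q suppresses output; append only when no match is found.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Typical use as root on the server:
#   for tty in console ttyS0 xvc0; do
#       ensure_line /etc/securetty "$tty"
#   done
```

Running the loop a second time leaves the file unchanged, so the helper is safe to keep in a provisioning script.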
Ensure the package screen is installed on the server; this will be used later on to send control
sequences to it.
Triggering kernel crash dumps using the serial console
The serial console connection can be used to perform "magic SysRq" control of the server,
including triggering a kernel crash dump. This is particularly useful when analysing system hangs
where "magic SysRq" via the system's keyboard is not working.
Configure the server for kernel crash dump capture. Refer to the appropriate TID for details:
TID 3374462, Configure kernel core dump capture, documents the Kdump method for SLE 10.
TID 3044267, Configure lkcd to capture a kernel core dump, documents the lkcdutils method,
primarily used with SLES9 and related products.
Use a serial program like minicom on the serial console system to connect to the server over the
serial port.
Log in as root on the serial console system and run
screen -S console /dev/ttyS0 115200
This sets up a screen session connected to the first serial port. To use this session, do the following:
Log in as root to the serial console system from any machine on the network.
Run the following command:
screen -x -r console
On a reboot of the SUSE host, GRUB will prompt "Press any key to continue." If a key is pressed,
then the GRUB menu will be displayed on the device used. If no key is pressed, the GRUB menu
will be displayed on the serial console screen as defined by the terminal option in
the /boot/grub/ file.
The screen command allows for multiple users to attach and control the screen simultaneously. This
allows for multiple people to participate in the troubleshooting process if necessary.
To trigger a kernel crash dump:
Non-Xen setup
Send a break to the serial port and then the magic sysrq key. For example: Ctrl-A, Ctrl-B, d. Refer
to the screen man page for more commands.
Xen setup
With the Xen hypervisor, the magic sysrq key is Ctrl-O; send Ctrl-O, d to trigger a crash dump.
Source: 关于SUSE LINUX系统假死问题(魔术键和Serial Console配置) (On Apparent Hangs of SUSE Linux Systems: Magic SysRq Keys and Serial Console Configuration), http://roclinux.cn/p/1735189240a1641713.html