How to enable Kdump on RHEL 7 and CentOS 7

by Pradeep Kumar · Published March 6, 2016 · Updated August 3, 2017

Kdump is a kernel feature which is used to capture crash dumps when the system or kernel crash. For enabling kdump we have to reserve some portion of physical RAM which will be used to execute kdump kernel in the event of kernel panic or crash.

When a kernel crash or kernel panic occurs then running kernel runs ‘kexec(kdump kernel)‘ and it loads kdump kernel from reserve memory and then contents of RAM and Swap is copied to vmcore file either on local disk or on remote disk and finally reboot the box.

By analyzing the crash dumps we can find the reason or the root case of system failure. If you have OS support then you can share the crash dumps to the vendor for analysis.

In this article we will demonstrate how to enable kdump on RHEL 7 and CentOS 7

Step:1 Install ‘kexec-tools’ using yum command

Use the below yum command to install ‘kexec-tools’ package in case it is not installed.

[root@cloud ~]# yum install kexec-tools

Step:2 Update the GRUB2 file to Reserve Memory for Kdump kernel

Edit the GRUB2 file (/etc/default/grub), add the parameter ‘crashkernel=<Reserved_size_of_RAM>‘ in the line beginning with ‘GRUB_CMDLINE_LINUX‘

GRUB_CMDLINE_LINUX="rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=128M vconsole.keymap=us rhgb quiet"

Execute the below command to regenerate grub2 configuration.

[root@cloud ~]# grub2-mkconfig -o /boot/grub2/grub.cfg

In case of UEFI firmware, use the below command

[root@cloud ~]# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

Above command will inform bootlaoder to reserve 128 MB RAM after reboot.

Reboot the box now using below command :

[root@cloud ~]# shutdown -r now

Step:3 Update the dump location & default action in the file (/etc/kdump.conf)

To store crash dump or vmcore file on a local file system, edit the file ‘/etc/kdump.conf‘ and specify the location as per your setup. In my case i am using a separate local file system ( /var/crash). It is recommended that size of file system should be equivalent to the size of your system’s RAM or file system should have free space equivalent to the size of RAM. Kdump allows to compress the dump data using ‘core collector’ option (core_collector makedumpfile -c ) where -c is used for compression.

In case if kdump fails to store the dump file to specified location then default action will be performed which is mention in the default directive. In my case default action is reboot.

Update the below three directives in kdump.conf file.

[root@cloud ~]# vi /etc/kdump.conf

path /var/crash
core_collector makedumpfile -c
default reboot

Different Options to store dump :

Step:4 Start and enable kdump service

[root@cloud ~]# systemctl start kdump.service
[root@cloud ~]# systemctl enable kdump.service
[root@cloud ~]#

Step:5 Now Test Kdump by manually crashing the system

Before crashing your system , please verify whether the kdump service is running or not using below command.

[root@cloud crash]# systemctl is-active kdump.service
[root@cloud crash]# service kdump status

To test our kdump configuration we will manually crash our system with below commands.

[root@cloud ~]# echo 1 > /proc/sys/kernel/sysrq ; echo c > /proc/sysrq-trigger

This will create a crash dump file (vmcore ) under ‘/var/crash‘ file system.

[root@cloud ~]# ls -lR /var/crash
/var/crash:
total 0
drwxr-xr-x. 2 root root 42 Mar 4 03:02 127.0.0.1-2016-03-04-03:02:17

/var/crash/127.0.0.1-2016-03-04-03:02:17:
total 135924
-rw-------. 1 root root 139147524 Mar 4 03:02 vmcore
-rw-r--r--. 1 root root 35640 Mar 4 03:02 vmcore-dmesg.txt
[root@cloud ~]#

Step:6 Use ‘crash’ command to analyze and debug crash dumps

Crash is the utility or command to debug and analyze the crash dump or vmcore file.

To use the crash, make sure two packages are installed : ‘crash & kernel-debuginfo‘

[root@cloud ~]# yum install crash

To install ‘kernel-debuginfo’ package , first enable debug repo. Edit the repo file /etc/yum.repos.d/CentOS-Debuginfo.repo

change ‘enbled=0’ to ‘enabled=1’

[root@cloud ~]# yum install kernel-debuginfo

Once the kernel-debuginfo is installed , then try to execute below crash command, it will give us a crash prompt where we can run commands to find process info , list of open files when the system got crashed.

[root@cloud ~]# crash /var/crash/127.0.0.1-2016-03-04-14\:20\:06/vmcore /usr/lib/debug/lib/modules/`uname -r`/vmlinux

crash>

Type ‘ps‘ command to list the Process which were running when the system got crashed.

crash> ps

To view the files that were open when system got crashed , type ‘files’ command at crash prompt.

crash> files
PID: 5577 TASK: ffff88007b44f300 CPU: 0 COMMAND: "bash"
ROOT: / CWD: /root
 FD FILE DENTRY INODE TYPE PATH
 0 ffff880036b85000 ffff8800796fa540 ffff88007966f4d0 CHR /dev/pts/0
 1 ffff880036b73900 ffff880068c409c0 ffff8800794a8d10 REG /proc/sysrq-trigger
 2 ffff880036b85000 ffff8800796fa540 ffff88007966f4d0 CHR /dev/pts/0
 10 ffff880036b85000 ffff8800796fa540 ffff88007966f4d0 CHR /dev/pts/0
255 ffff880036b85000 ffff8800796fa540 ffff88007966f4d0 CHR /dev/pts/0
crash>

Type ‘sys’ command to list the system info when it got crashed.

crash> sys
 KERNEL: /usr/lib/debug/lib/modules/3.10.0-327.10.1.el7.x86_64/vmlinux
 DUMPFILE: /var/crash/127.0.0.1-2016-03-04-14:20:06/vmcore
 CPUS: 1
 DATE: Fri Mar 4 14:20:01 2016
 UPTIME: 00:02:00
LOAD AVERAGE: 0.75, 0.48, 0.19
 TASKS: 115
 NODENAME: cloud.linuxtechi.com
 RELEASE: 3.10.0-327.10.1.el7.x86_64
 VERSION: #1 SMP Tue Feb 16 17:03:50 UTC 2016
 MACHINE: x86_64 (2388 Mhz)
 MEMORY: 2 GB
 PANIC: "SysRq : Trigger a crash"
crash>

To get help of any command on crash prompt , type ‘help <command>‘ , example is shown below.

That’s conclude the article, Please don’t hesitate to share it if you have enjoyed.

Step:1 Install ‘kexec-tools’ using yum command

Step:2 Update the GRUB2 file to Reserve Memory for Kdump kernel

Step:3 Update the dump location & default action in the file (/etc/kdump.conf)

Step:4 Start and enable kdump service

Step:5 Now Test Kdump by manually crashing the system

Step:6 Use ‘crash’ command to analyze and debug crash dumps

You Might Also Like

震惊了！原来这才是kafka！

nginx 配置 google代理

centos服务器设置代理上网的方法

Leave a Reply Cancel reply