Bootstrap

Linux tracing之基于uprobe/kprobe的调试调优浅析

经过长期的发展, kprobes/uprobes 机制在事件(events)的基础上分别为内核态和用户态提供了追踪调试的功能, 这也构成了 tracepoint 机制的基础, 后期的很多工具, 比如 perf_eventsftrace 等都是在其基础上演化而来. 参考由 Brendan Gregg 提供的资料来看, kprobes/uprobes 在 Linux 动态追踪层面起到了基石的作用, 如下所示:

uprobe与krobe对应,动态附加到用户态调用函数的切入点称为uprobe,相比如kprobe 内核函数的稳定性,uprobe 的函数由开发者定义。uprobe是用户态的探针,它和kprobe是相对应的,kprobe是内核态的探针。uprobe需要制定用户态探针在执行文件中的位置,插入探针的原理和kprobe类似。

下面是对uprobe使用方法的简单介绍。比如跟踪下面程序中的callee函数调用:

#include <stdio.h>

int callee(void)
{
    printf("hello world.\n");

    return 0;
}

int main(void)
{    
    callee();

    return 0;
}

Makefile

分别编译地址加载无关和地址相关两个版本的应用程序,分别抓取其PROBE结果。

all:
	gcc -O0 -no-pie -fno-pic -g main.c -o main-nopie
	gcc -O0 -g main.c -o main
clean:
	rm -fr main main-nopie

增加一个新的uprobe event,命令如下(在可执行文件main的0x63a偏移处增加一个uprobe探针),当callee函数被调用时,截获调用事件信息:

$ readelf -s main|grep callee
    64: 000000000000063a    23 FUNC    GLOBAL DEFAULT   14 callee
$ 
# echo 'p:callee /home/zlcao/uprobe/main:0x63a' > /sys/kernel/debug/tracing/uprobe_events

增加一个返回事件信息:

echo 'r:callee_ret /home/zlcao/uprobe/main:0x63a' >> /sys/kernel/debug/tracing/uprobe_events

之后使用如下命令查看注册的EVENT事件

cat /sys/kernel/debug/tracing/uprobe_events

定义以后,使能所有的events:

echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable

之后执行测试用例,执行完毕后,通过如下命令查看探测记录。

cat /sys/kernel/debug/tracing/trace

可以看到,callee函数的调用和返回事件都被触发了,输出显示给我们uprobe被触发时:main-27731、程序PC 0x55f1b22cb63a,uretprobe被触发时:PC从函数入口0x55f1b22cb63a返回0x55f1b22cb65a。

之后,通过如下命令关闭探测。

echo 0 > /sys/kernel/debug/tracing/events/uprobes/enable

最后,使用如下命令清除掉所有注册的events。

echo > /sys/kernel/debug/tracing/uprobe_events

添加探测点的步骤比较麻烦,perf很贴心地添加了一键添加探测点的功能,只需要执行一个简单的命令即可:

# perf probe -x ./main-nopie callee
# perf probe -x ./main-nopie callee%return

之后使能探测,运行用例。

# echo 1 > /sys/kernel/debug/tracing/events/probe_main/enable
# ./main-nopie

或者直接执行如下指令运行:

root@zlcao-Vostro-3268:/home/zlcao/uprobe# perf record -e probe_main:callee -aR ./main-nopie 
hello world.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.537 MB perf.data (1 samples) ]
root@zlcao-Vostro-3268:/home/zlcao/uprobe# 

使用命令perf report查看生成的perf.data文件

查看当前所有的EVENT 探测点

root@zlcao-Vostro-3268:/home/zlcao/uprobe# perf probe -l
  kprobes:myprobe      (on _do_fork)
  kprobes:myretprobe   (on _do_fork%return with arg1)
  probe_libc:memcpy    (on __new_memcpy_ifunc@x86_64/multiarch/memcpy.c in /lib/x86_64-linux-gnu/libc-2.27.so)
  probe_main:callee    (on callee@/home/zlcao/uprobe/main.c in /home/zlcao/uprobe/main-nopie)
  probe_main:callee__return (on callee%return@/home/zlcao/uprobe/main.c in /home/zlcao/uprobe/main-nopie)
root@zlcao-Vostro-3268:/home/zlcao/uprobe# 

PIE和NO-PIE应用的区别

通过上面的例子可以看到,默认编译的PIE应用和非PIE应用都可以被事件机制探测调试,那么它们的区被是什么呢?通过下面几幅图可以看出,PIE应用的符号地址是相对的,加载地址是随机的,而非PIE应用的加载地址和运行地址完全是由应用符号表提供的,运行时加载器必须遵守,不能更改。

通过traceing_pipe查看探测结果,/sys/kernel/debug/tracing/trace_pipe文件为同步阻塞式读取文件,执行cat /sys/kernel/debug/tracing/trace_pipe后控制台处于阻塞状态,当运行测试用例时,实时更新探测结果。

通过内核模块截获uprobe事件
#include <linux/module.h>
#include <linux/ptrace.h>
#include <linux/uprobes.h>
#include <linux/namei.h>
#include <linux/moduleparam.h>

MODULE_AUTHOR("john doe");
MODULE_LICENSE("GPL v2");

static char *filename;
module_param(filename, charp, S_IRUGO);

static long offset;
module_param(offset, long, S_IRUGO);

static int handler_pre(struct uprobe_consumer *self, struct pt_regs *regs)
{
	pr_info("handler: arg0 = %d arg1 =%d \n", (int)regs->di, (int)regs->si);
	return 0;
}

static int handler_ret(struct uprobe_consumer *self,
                       unsigned long func,
                       struct pt_regs *regs)
{
	pr_info("ret_handler ret = %d \n", (int)regs->ax);
	return 0;
}

static struct uprobe_consumer uc = {
	.handler = handler_pre,
	.ret_handler = handler_ret,
};


static struct inode *inode;

static int __init uprobe_init(void)
{
	struct path path;
	int ret;

	ret = kern_path(filename, LOOKUP_FOLLOW, &path);
	if (ret < 0) {
		pr_err("kern_path failed, returned %d\n", ret);
		return ret;
	}

	inode = igrab(path.dentry->d_inode);
	path_put(&path);

	ret = uprobe_register(inode, offset, &uc);
	if (ret < 0) {
		pr_err("register_uprobe failed, returned %d\n", ret);
		return ret;
	}

	return 0;
}

static void __exit uprobe_exit(void)
{
	uprobe_unregister(inode, offset, &uc);
}

module_init(uprobe_init);
module_exit(uprobe_exit);

测试用例和MAKEFILE

#include <stdio.h>

int callee(int a, int b)
{
    printf("hello world %d.\n", a + b);

    return a+b;
}

int main(void)
{    
    callee(8, 9);

    return 0;
}
ifneq ($(KERNELRELEASE),)
obj-m:=uprobe.o
else
KERNELDIR:=/lib/modules/$(shell uname -r)/build
PWD:=$(shell pwd)
all:
	$(MAKE) -C $(KERNELDIR) M=$(PWD) modules
	gcc -O0 -no-pie -fno-pic -g main.c -o main-nopie
	gcc -O0 -g main.c -o main
 
clean:
	rm -rf *.o *.mod.c *.mod.o *.ko *.symvers *.mod .*.cmd *.order
format:
	astyle --options=linux.astyle *.[ch]
endif

测试

uprobe原理

uprobe基于int3 debug指令实现,以下是回调CALLSTACK。

设置uprobe,通过GDB查看callee函数入口,发现其指令已经变为INT3。

探测过程如下图所示,指定位置上的指令,头部修改为软件中断指令(同时原指令存档他处):

  1. 当执行到该位置时,触发软件中断,陷入内核
  2. 在内核,执行以 注入的 Handler
  3. 单步执行原指令
  4. 修正寄存器和栈,回到原有指令流

在这里插入图片描述

如果指令地址对应的页面还没有映射,会通过:

uprobe_write_opcode->get_user_pages_remote->__get_user_pages_locked->__get_user_pages->faultin_page->handle_mm_fault ...

进行PAGE FAULT的处理,之后在写入INT3指令。

Kprobe实践

root@zlcao-Vostro-3268:~# echo 'p:myprobe _do_fork' > /sys/kernel/debug/tracing/kprobe_events
root@zlcao-Vostro-3268:~# echo 'r:myretprobe _do_fork $retval' >> /sys/kernel/debug/tracing/kprobe_events
root@zlcao-Vostro-3268:~# echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
root@zlcao-Vostro-3268:~# echo 1 > /sys/kernel/debug/tracing/events/kprobes/myretprobe/enable
root@zlcao-Vostro-3268:~# cat /sys/kernel/debug/tracing/kprobe_events
p:kprobes/myprobe _do_fork
r4:kprobes/myretprobe _do_fork arg1=$retval
root@zlcao-Vostro-3268:~# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 4/4   #P:4
#
#                                _-----=> irqs-off
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| /     delay
#           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
#              | |         |   ||||      |         |
            bash-8783    [002] .... 150455.854930: myprobe: (_do_fork+0x0/0x3e0)
            bash-8783    [002] d... 150455.855143: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2322
            bash-8783    [001] .... 150464.091940: myprobe: (_do_fork+0x0/0x3e0)
            bash-8783    [001] d... 150464.092151: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2323
root@zlcao-Vostro-3268:~# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 80/80   #P:4
#
#                                _-----=> irqs-off
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| /     delay
#           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
#              | |         |   ||||      |         |
            bash-8783    [002] .... 150455.854930: myprobe: (_do_fork+0x0/0x3e0)
            bash-8783    [002] d... 150455.855143: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2322
            bash-8783    [001] .... 150464.091940: myprobe: (_do_fork+0x0/0x3e0)
            bash-8783    [001] d... 150464.092151: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2323
  gsd-media-keys-2620    [003] .... 150466.748575: myprobe: (_do_fork+0x0/0x3e0)
  gsd-media-keys-2620    [003] d... 150466.749024: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2324
  gsd-media-keys-8996    [001] .... 150466.749133: myprobe: (_do_fork+0x0/0x3e0)
  gsd-media-keys-8996    [001] d... 150466.749647: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2325
  gsd-media-keys-2620    [003] .... 150466.750599: myprobe: (_do_fork+0x0/0x3e0)
  gsd-media-keys-2620    [003] d... 150466.750610: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2326
           gdbus-2656    [000] .... 150466.750718: myprobe: (_do_fork+0x0/0x3e0)
           gdbus-2656    [000] d... 150466.750743: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2327
  gnome-terminal-8997    [001] .... 150466.797071: myprobe: (_do_fork+0x0/0x3e0)
  gnome-terminal-8997    [001] d... 150466.797090: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2328
  gnome-terminal-8997    [001] .... 150466.797825: myprobe: (_do_fork+0x0/0x3e0)
  gnome-terminal-8997    [001] d... 150466.798023: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2329
 gnome-terminal.-9001    [003] .... 150466.812982: myprobe: (_do_fork+0x0/0x3e0)
 gnome-terminal.-9001    [003] d... 150466.813005: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232a
 gnome-terminal.-9001    [003] .... 150466.813730: myprobe: (_do_fork+0x0/0x3e0)
 gnome-terminal.-9001    [003] d... 150466.813742: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232b
 gnome-terminal.-9001    [003] .... 150466.834816: myprobe: (_do_fork+0x0/0x3e0)
 gnome-terminal.-9001    [003] d... 150466.834839: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232c
 gnome-terminal--3324    [000] .... 150466.852493: myprobe: (_do_fork+0x0/0x3e0)
 gnome-terminal--3324    [000] d... 150466.852528: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232d
 gnome-terminal--3324    [000] .... 150466.852532: myprobe: (_do_fork+0x0/0x3e0)
 gnome-terminal--3324    [000] d... 150466.852544: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232e
            pool-9006    [003] .... 150466.853913: myprobe: (_do_fork+0x0/0x3e0)
            pool-9006    [003] d... 150466.854893: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232f
 xdg-desktop-por-2796    [003] .... 150466.858576: myprobe: (_do_fork+0x0/0x3e0)
 xdg-desktop-por-2796    [003] d... 150466.858605: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2330
 xdg-desktop-por-2796    [003] .... 150466.858609: myprobe: (_do_fork+0x0/0x3e0)
 xdg-desktop-por-2796    [003] d... 150466.858617: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2331
            bash-9007    [001] .... 150466.859319: myprobe: (_do_fork+0x0/0x3e0)
            bash-9007    [001] d... 150466.859386: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2332
            bash-9010    [002] .... 150466.859621: myprobe: (_do_fork+0x0/0x3e0)
            bash-9010    [002] d... 150466.859678: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2334
        lesspipe-9012    [001] .... 150466.860217: myprobe: (_do_fork+0x0/0x3e0)
        lesspipe-9012    [001] d... 150466.860247: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2335
        lesspipe-9012    [002] .... 150466.861003: myprobe: (_do_fork+0x0/0x3e0)
        lesspipe-9012    [002] d... 150466.861039: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2336
        lesspipe-9014    [003] .... 150466.864089: myprobe: (_do_fork+0x0/0x3e0)
        lesspipe-9014    [003] d... 150466.864147: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2337
            bash-9007    [001] .... 150466.865518: myprobe: (_do_fork+0x0/0x3e0)
            bash-9007    [001] d... 150466.865572: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2338
            bash-9016    [002] .... 150466.865788: myprobe: (_do_fork+0x0/0x3e0)
            bash-9016    [002] d... 150466.865841: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2339
            bash-9007    [000] .... 150467.633920: myprobe: (_do_fork+0x0/0x3e0)
            bash-9007    [000] d... 150467.634127: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x233a
             vim-9018    [000] .... 150467.760802: myprobe: (_do_fork+0x0/0x3e0)
             vim-9018    [000] d... 150467.760936: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x233b
            bash-9019    [001] .... 150467.761927: myprobe: (_do_fork+0x0/0x3e0)
            bash-9019    [001] d... 150467.761959: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x233c
             vim-9018    [000] .... 150468.007578: myprobe: (_do_fork+0x0/0x3e0)
             vim-9018    [000] d... 150468.007916: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x233d
             vim-9018    [000] .... 150468.069323: myprobe: (_do_fork+0x0/0x3e0)
             vim-9018    [000] d... 150468.069689: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x233e


参考资料

一种基于调试器的运行时非侵入调试方式_mips sdbbp_papaofdoudou的博客-CSDN博客

Linux内核调试利器|kprobe 原理与实现 - 知乎

Linux内核调试技术——kretprobe使用与实现_register_kretprobe_luckyapple1028的博客-CSDN博客

Linux kretprobe使用和原理_小立爱学习的博客-CSDN博客

Linux jprobe的使用和原理_小立爱学习的博客-CSDN博客

Linux内核调试技术——jprobe使用与实现_luckyapple1028的博客-CSDN博客

Linux内核文档,Documentation/trace/uprobetracer.rst

Uprobe-tracer: Uprobe-based Event Tracing — The Linux Kernel documentation

Linux uprobe: User-Level Dynamic Tracing

在kernel中如何实现应用层程序执行时候的指令监控和打印 - 简书

Linux 系统动态追踪技术介绍 | arstercz’s blog

Linux K/Uprobe 使用指南 · GitBook

深入ftrace uprobe原理和功能介绍_uprobe 原理_奇小葩的博客-CSDN博客

结束
;