nginx-1.24.0 Source Code Analysis on Ubuntu: the ngx_sprintf_num Function

`ngx_sprintf_num`

The forward declaration sits near the top of ngx_string.c:

static u_char *ngx_sprintf_num(u_char *buf, u_char *last, uint64_t ui64,
    u_char zero, ngx_uint_t hexadecimal, ngx_uint_t width);

Implementation of ngx_sprintf_num:

static u_char *
ngx_sprintf_num(u_char *buf, u_char *last, uint64_t ui64, u_char zero,
    ngx_uint_t hexadecimal, ngx_uint_t width)
{
    u_char         *p, temp[NGX_INT64_LEN + 1];
                       /*
                        * we need temp[NGX_INT64_LEN] only,
                        * but icc issues the warning
                        */
    size_t          len;
    uint32_t        ui32;
    static u_char   hex[] = "0123456789abcdef";
    static u_char   HEX[] = "0123456789ABCDEF";

    p = temp + NGX_INT64_LEN;

    if (hexadecimal == 0) {

        if (ui64 <= (uint64_t) NGX_MAX_UINT32_VALUE) {

            /*
             * To divide 64-bit numbers and to find remainders
             * on the x86 platform gcc and icc call the libc functions
             * [u]divdi3() and [u]moddi3(), they call another function
             * in its turn.  On FreeBSD it is the qdivrem() function,
             * its source code is about 170 lines of the code.
             * The glibc counterpart is about 150 lines of the code.
             *
             * For 32-bit numbers and some divisors gcc and icc use
             * a inlined multiplication and shifts.  For example,
             * unsigned "i32 / 10" is compiled to
             *
             *     (i32 * 0xCCCCCCCD) >> 35
             */

            ui32 = (uint32_t) ui64;

            do {
                *--p = (u_char) (ui32 % 10 + '0');
            } while (ui32 /= 10);

        } else {
            do {
                *--p = (u_char) (ui64 % 10 + '0');
            } while (ui64 /= 10);
        }

    } else if (hexadecimal == 1) {

        do {

            /* the "(uint32_t)" cast disables the BCC's warning */
            *--p = hex[(uint32_t) (ui64 & 0xf)];

        } while (ui64 >>= 4);

    } else { /* hexadecimal == 2 */

        do {

            /* the "(uint32_t)" cast disables the BCC's warning */
            *--p = HEX[(uint32_t) (ui64 & 0xf)];

        } while (ui64 >>= 4);
    }

    /* zero or space padding */

    len = (temp + NGX_INT64_LEN) - p;

    while (len++ < width && buf < last) {
        *buf++ = zero;
    }

    /* number safe copy */

    len = (temp + NGX_INT64_LEN) - p;

    if (buf + len > last) {
        len = last - buf;
    }

    return ngx_cpymem(buf, p, len);
}

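Purpose: format a 64-bit unsigned integer as a string and write it into the destination buffer.

buf: the destination buffer (the position where writing starts).

last: one past the last writable position of the destination buffer; writes never reach it, which prevents overflow.

ui64: the 64-bit unsigned value to convert.

zero: the padding character (for example '0' or a space).

hexadecimal: selects the base (0 = decimal, 1 = lowercase hexadecimal, 2 = uppercase hexadecimal).

width: the minimum width of the output; shorter numbers are padded on the left.

The return value points one past the last character written, so the caller can continue writing from there.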
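To try out the parameters described above without building nginx, here is a minimal standalone sketch of the same logic using only standard C types. The names demo_sprintf_num, demo_format and DEMO_INT64_LEN are hypothetical stand-ins, not part of nginx, and the 32-bit fast path is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the length macro so the sketch compiles without nginx. */
#define DEMO_INT64_LEN  (sizeof("-9223372036854775808") - 1)

/* Hypothetical stand-in for ngx_sprintf_num (not the nginx function). */
static unsigned char *
demo_sprintf_num(unsigned char *buf, unsigned char *last, uint64_t ui64,
    unsigned char zero, unsigned hexadecimal, size_t width)
{
    static const char  hex[] = "0123456789abcdef";
    static const char  HEX[] = "0123456789ABCDEF";
    unsigned char      temp[DEMO_INT64_LEN + 1], *p;
    size_t             len;

    assert(buf <= last);

    p = temp + DEMO_INT64_LEN;

    if (hexadecimal == 0) {
        do {
            *--p = (unsigned char) (ui64 % 10 + '0');
        } while (ui64 /= 10);

    } else {
        const char  *digits = (hexadecimal == 1) ? hex : HEX;

        do {
            *--p = (unsigned char) digits[ui64 & 0xf];
        } while (ui64 >>= 4);
    }

    /* left padding up to width, never writing at or past last */
    len = (size_t) ((temp + DEMO_INT64_LEN) - p);

    while (len++ < width && buf < last) {
        *buf++ = zero;
    }

    /* copy the digits, truncated to the space that remains */
    len = (size_t) ((temp + DEMO_INT64_LEN) - p);

    if (len > (size_t) (last - buf)) {
        len = (size_t) (last - buf);
    }

    memcpy(buf, p, len);

    return buf + len;
}

/* Formats v into out (cap bytes, NUL-terminated); returns the length. */
static size_t
demo_format(char *out, size_t cap, uint64_t v, unsigned char pad,
    unsigned hexadecimal, size_t width)
{
    unsigned char  *start = (unsigned char *) out;
    unsigned char  *end;

    /* reserve one byte for the terminator, which the formatter never adds */
    end = demo_sprintf_num(start, start + cap - 1, v, pad, hexadecimal, width);
    *end = '\0';

    return (size_t) (end - start);
}
```

For example, demo_format(out, sizeof(out), 255, '0', 1, 4) produces "00ff". Like the original, the formatter does not write a terminating '\0'; the wrapper adds one only so the result can be treated as a C string.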

u_char *p, temp[NGX_INT64_LEN + 1];

temp: scratch space that holds the converted characters.

p: the current position within that scratch space.

`NGX_INT64_LEN`

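Definition of NGX_INT64_LEN

In ngx_config.h:

#define NGX_INT64_LEN   (sizeof("-9223372036854775808") - 1)

"-9223372036854775808" is the string form of the smallest 64-bit signed integer. A 64-bit signed integer ranges from -9223372036854775808 to 9223372036854775807, so this string, 20 characters including the sign, is the longest decimal form such a value can take. The largest uint64_t, 18446744073709551615, is also 20 digits long, so the same length covers the unsigned value formatted here.

-1: sizeof applied to a string literal counts the terminating '\0', so subtracting 1 gives the number of visible characters.

Why is the array declared with NGX_INT64_LEN + 1 elements?

The comment says that only NGX_INT64_LEN elements are needed and that the extra one is there to silence a warning from the ICC compiler. One plausible explanation: when temp is filled from back to front, a compiler that checks array bounds strictly may report a false positive, and the extra element works around it.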
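The length arithmetic can be checked with a few lines of standard C. This is only an illustrative sketch: DEMO_INT64_LEN is a local copy of the macro and demo_uint64_max_digits is a hypothetical helper, neither of which exists in nginx.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Local copy of the macro so the sketch compiles without nginx headers. */
#define DEMO_INT64_LEN  (sizeof("-9223372036854775808") - 1)

/* Returns how many decimal digits the largest uint64_t needs. */
static size_t
demo_uint64_max_digits(void)
{
    char  buf[32];
    int   n;

    n = snprintf(buf, sizeof(buf), "%" PRIu64, (uint64_t) UINT64_MAX);
    assert(n > 0 && (size_t) n < sizeof(buf));

    return (size_t) n;
}
```

Both DEMO_INT64_LEN and demo_uint64_max_digits() evaluate to 20, which is why a scratch array sized from the signed minimum is also large enough for any unsigned 64-bit value.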

size_t          len;

len: the length of the converted string.

uint32_t        ui32;

If ui64 fits in 32 bits, it is converted to uint32_t first to avoid the cost of 64-bit division.

static u_char hex[] = "0123456789abcdef";  
static u_char HEX[] = "0123456789ABCDEF";

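Why are they static?

The contents of the two tables never change between calls. Declaring them static means they are initialized once rather than rebuilt on the stack on every call, which saves time and stack space.

Hexadecimal conversion also looks up the table once per digit, so having the table already in place helps performance.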

p = temp + NGX_INT64_LEN;

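At initialization p points at the end of temp. Is that out of bounds?

No. temp has NGX_INT64_LEN + 1 elements, so temp + NGX_INT64_LEN is the address of its last element, temp[NGX_INT64_LEN], which is never written. Digits are filled from back to front: *--p decrements the pointer before writing, so the first write lands in temp[NGX_INT64_LEN - 1]. At most NGX_INT64_LEN characters are produced, so p never moves before temp[0].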
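The index arithmetic can be observed directly with a small sketch. demo_fill_backward and DEMO_INT64_LEN are hypothetical names used only for this illustration; the loop body mirrors the decimal branch of the original.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the macro so the sketch compiles without nginx headers. */
#define DEMO_INT64_LEN  (sizeof("-9223372036854775808") - 1)

/*
 * Fills temp (DEMO_INT64_LEN + 1 bytes) from the back with the decimal
 * digits of v and returns the index of the most significant digit.
 */
static size_t
demo_fill_backward(unsigned char *temp, uint64_t v)
{
    unsigned char  *p;

    assert(temp != NULL);

    p = temp + DEMO_INT64_LEN;      /* last element, never written */

    do {
        *--p = (unsigned char) (v % 10 + '0');   /* first write: index 19 */
    } while (v /= 10);

    return (size_t) (p - temp);
}
```

A single-digit value leaves p at index 19, and the largest uint64_t, with its 20 digits, walks p down to index 0 and no further.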

if (hexadecimal == 0) {

hexadecimal == 0 selects decimal output.

if (ui64 <= (uint64_t) NGX_MAX_UINT32_VALUE) {
    ui32 = (uint32_t) ui64;

If ui64 fits in 32 bits (<= NGX_MAX_UINT32_VALUE), it is converted to uint32_t, which avoids the cost of 64-bit division.

NGX_MAX_UINT32_VALUE is defined in ngx_config.h:

#define NGX_MAX_UINT32_VALUE  (uint32_t) 0xffffffff

It is the largest value a 32-bit unsigned integer can hold.

do {
    *--p = (u_char) (ui32 % 10 + '0');
} while (ui32 /= 10);

The loop repeatedly takes the remainder modulo 10, converts each digit to its ASCII character, and fills temp from the end toward the front.
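The source comment mentions that compilers can turn an unsigned 32-bit "i32 / 10" into a multiplication and a shift. The following sketch checks that identity on a handful of values; demo_div10 and demo_div10_check are hypothetical names, and the explicit 64-bit widening is my reading of the comment's shorthand, since the product does not fit in 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divides by 10 using the multiply-and-shift form from the nginx comment. */
static uint32_t
demo_div10(uint32_t n)
{
    /* 0xCCCCCCCD is ceil(2^35 / 10); the product needs 64 bits. */
    return (uint32_t) (((uint64_t) n * 0xCCCCCCCDu) >> 35);
}

/* Checks demo_div10 against the / operator; returns the sample count. */
static size_t
demo_div10_check(void)
{
    static const uint32_t  samples[] = {
        0u, 1u, 9u, 10u, 11u, 99u, 100u,
        123456789u, 4294967289u, 4294967295u
    };
    size_t  i, n = sizeof(samples) / sizeof(samples[0]);

    for (i = 0; i < n; i++) {
        assert(demo_div10(samples[i]) == samples[i] / 10);
    }

    return n;
}
```

Whether a particular compiler actually emits this sequence depends on the target and optimization level; the sketch only shows that the arithmetic is equivalent.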

} else {
    do {
        *--p = (u_char) (ui64 % 10 + '0');
    } while (ui64 /= 10);
}

This branch handles values larger than the 32-bit unsigned maximum, using 64-bit arithmetic.

else if (hexadecimal == 1) {

    do {

        /* the "(uint32_t)" cast disables the BCC's warning */
        *--p = hex[(uint32_t) (ui64 & 0xf)];

    } while (ui64 >>= 4);

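This code handles lowercase hexadecimal (hexadecimal == 1).

Each pass of the do-while loop takes the lowest 4 bits of ui64 (ui64 & 0xf), maps them to a lowercase hexadecimal character, then shifts right by 4 bits, until no bits remain. The result is filled into temp from the end toward the front.

The right shift by 4 after each pass discards the nibble just processed and exposes the next one. For example, 0xABC yields C first; after the shift the value is 0xAB, the next pass yields B, and so on.

Why do-while rather than while

Even when ui64 is 0 initially, the body runs once and emits '0', so the value 0 is represented correctly.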
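The difference between the two loop forms is easy to see side by side. This is a sketch with hypothetical helpers (demo_hex_do_while and demo_hex_while); only the first mirrors what nginx does, the second is the variant the text argues against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const char  demo_hex_digits[] = "0123456789abcdef";

/* Lowercase hex via do-while: always emits at least one digit. */
static size_t
demo_hex_do_while(uint64_t v, char *out)
{
    char    temp[16], *p = temp + sizeof(temp);
    size_t  len;

    assert(out != NULL);

    do {
        *--p = demo_hex_digits[v & 0xf];   /* lowest nibble first */
    } while (v >>= 4);

    len = (size_t) ((temp + sizeof(temp)) - p);
    memcpy(out, p, len);
    out[len] = '\0';

    return len;
}

/* Same conversion with a plain while loop: emits nothing for v == 0. */
static size_t
demo_hex_while(uint64_t v, char *out)
{
    char    temp[16], *p = temp + sizeof(temp);
    size_t  len;

    assert(out != NULL);

    while (v != 0) {
        *--p = demo_hex_digits[v & 0xf];
        v >>= 4;
    }

    len = (size_t) ((temp + sizeof(temp)) - p);
    memcpy(out, p, len);
    out[len] = '\0';

    return len;
}
```

With an input of 0 the do-while version writes "0", while the while version writes an empty string. The out buffer is assumed to hold at least 17 bytes (16 hex digits plus the terminator).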

else { /* hexadecimal == 2 */

    do {

        /* the "(uint32_t)" cast disables the BCC's warning */
        *--p = HEX[(uint32_t) (ui64 & 0xf)];

    } while (ui64 >>= 4);
}

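This part handles uppercase hexadecimal.

The intended condition is hexadecimal == 2. The branch is a plain else, so any value other than 0 or 1 also ends up here; the comment documents what callers are expected to pass.

The logic matches the lowercase case, except that it indexes the HEX array, which holds uppercase letters.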

len = (temp + NGX_INT64_LEN) - p;

while (len++ < width && buf < last) {
    *buf++ = zero;
}

/* number safe copy */

len = (temp + NGX_INT64_LEN) - p;

if (buf + len > last) {
    len = last - buf;
}

return ngx_cpymem(buf, p, len);

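This part pads the output and copies the converted number into the buffer.

First, len = (temp + NGX_INT64_LEN) - p;

This computes the length of the converted digit string.

p points at the first valid character, and temp + NGX_INT64_LEN is the position just past the last digit, so the difference is the actual number of digits.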

Next comes the padding loop:

while (len++ < width && buf < last)

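Here len starts as the length of the converted digit string, and width is the total width requested by the caller.

If the number is shorter than width, the zero character (such as '0' or a space) is written in front of it until the requested width is reached.

Each pass writes one character and checks that buf has not reached last, which avoids overflowing the buffer.

Note that len is incremented inside the loop condition, so the loop runs width - len_initial times, provided the buffer has room. For example, with an initial len of 3 and a width of 5, the loop runs twice and writes two padding characters, bringing the total length to 5.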
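The loop count can be confirmed with an isolated version of the padding step. demo_pad is a hypothetical helper written for this illustration; it also reports the value of len after the loop, which shows that the post-increment runs one extra time on the failing test.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Runs the padding loop for a number that is num_len digits long.
 * Returns the count of pad characters written; *final_len receives
 * the value of len once the loop has finished.
 */
static size_t
demo_pad(unsigned char *buf, unsigned char *last, size_t num_len,
    size_t width, unsigned char pad, size_t *final_len)
{
    unsigned char  *start = buf;
    size_t          len = num_len;

    assert(buf <= last);

    while (len++ < width && buf < last) {
        *buf++ = pad;
    }

    *final_len = len;

    return (size_t) (buf - start);
}
```

With num_len 3 and width 5 the helper writes two pad characters and leaves len at 6, not 3, so len no longer holds the digit count afterwards. When the buffer has only four bytes left and width is 10, padding stops after four characters.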

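Next, len = (temp + NGX_INT64_LEN) - p; appears a second time.

The padding loop modified len, so the digit count has to be recomputed.

Then if (buf + len > last) checks whether the remaining buffer space can hold the whole digit string.

If it cannot, len is reduced to the remaining space, which truncates the output instead of overflowing.

Finally, ngx_cpymem copies the string from the scratch area into the buffer and returns the new buf position.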
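The truncating copy can be sketched in isolation as follows. demo_safe_copy is a hypothetical helper, with memcpy plus pointer arithmetic standing in for ngx_cpymem. As a deliberate variation, it compares len against the remaining space instead of forming buf + len, which expresses the same check without computing a pointer that might lie beyond the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies at most (last - buf) bytes of digits; returns the new position. */
static unsigned char *
demo_safe_copy(unsigned char *buf, unsigned char *last,
    const unsigned char *digits, size_t len)
{
    assert(buf <= last);

    if (len > (size_t) (last - buf)) {
        len = (size_t) (last - buf);    /* truncate rather than overflow */
    }

    memcpy(buf, digits, len);

    return buf + len;                   /* one past the last byte written */
}
```

Copying the six digits "123456" into a four-byte window writes "1234" and returns a pointer equal to last; a caller that needs to detect truncation has to compare lengths itself, because the return value alone does not signal it.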