Why does HashMap double its capacity every time it resizes?

Why does HashMap grow by a factor of 2, and why can a bitwise AND replace the modulo operation?

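In HashMap, the size of the bucket array is always a power of 2, and each resize doubles it. The main purpose is to make index computation cheap: because the length is a power of 2, the bucket index can be computed with a bitwise AND (&) instead of a modulo (%).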

1. HashMap resize rules

The array capacity of a HashMap is always a power of 2 (16, 32, 64, ...).
On resize, the capacity doubles.
The index is computed as (n - 1) & hash rather than hash % n.
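
The rules above can be sketched in a few lines of Java. This is only an illustration, not HashMap's actual source: the class name CapacityDemo and the helpers isPowerOfTwo and doubled are hypothetical names introduced here. It shows that doubling a power-of-2 capacity (one left shift) always yields another power of 2.

```java
// Illustrative sketch only; CapacityDemo and its helpers are hypothetical names.
final class CapacityDemo {
    // A positive int is a power of two exactly when it has a single 1 bit.
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Doubling is a one-bit left shift, so a power of two stays a power of two.
    static int doubled(int capacity) {
        return capacity << 1;
    }

    public static void main(String[] args) {
        int capacity = 16;
        for (int i = 0; i < 4; i++) {
            System.out.println(capacity + " is a power of two: " + isPowerOfTwo(capacity));
            capacity = doubled(capacity);
        }
    }
}
```

Note that the shift appears only in the doubling step; the index computation itself uses a bitwise AND.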

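2. Why does doubling allow a bitwise operation to replace the modulo?

(1) The performance cost of a conventional modulo

Typically, a hash table computes the index as: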

$$\text{index} = \text{hash} \bmod \text{table.length}$$

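However, % is comparatively slow: when table.length is not a power of 2, the CPU generally has to execute an integer division, which costs considerably more than a bitwise operation.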

(2) Using a power of 2 to optimize the index computation

If table.length = 2^n, a bitwise AND can replace the % operation:

$$\text{index} = \text{hash} \mathbin{\&} (\text{table.length} - 1)$$

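The reason:

For a non-negative hash, hash % (2^n) is equivalent to hash & (2^n - 1). For a negative hash, Java's % yields a negative remainder, whereas the & form always produces a valid index in [0, 2^n - 1].
& is cheaper than % (a single CPU instruction).

Example: with table.length = 16 (that is, 2^4), compute the index: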

int hash = 27;
int index1 = hash % 16;         // 27 % 16 = 11
int index2 = hash & (16 - 1);   // 27 & 15 = 11

How 27 & 15 is evaluated (15 = 1111):

  27  =  11011
& 15  =  01111
----------------
         01011  (= 11)

The results are the same, but & is cheaper.
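
The equivalence can be checked with a small Java sketch. The class IndexMaskDemo and its two helper methods are hypothetical names used only for illustration. The sketch also includes a negative hash, where the plain % result differs from the masked index and Math.floorMod is the remainder form that agrees with the mask.

```java
// Illustrative sketch only; IndexMaskDemo and its helpers are hypothetical names.
final class IndexMaskDemo {
    // Index via bit mask; only valid when capacity is a power of two.
    static int indexByMask(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    // Index via remainder; Math.floorMod keeps the result non-negative.
    static int indexByFloorMod(int hash, int capacity) {
        return Math.floorMod(hash, capacity);
    }

    public static void main(String[] args) {
        int capacity = 16;
        for (int hash : new int[] {27, 5, 21, -27}) {
            System.out.printf("hash=%d  %%=%d  floorMod=%d  mask=%d%n",
                    hash, hash % capacity,
                    indexByFloorMod(hash, capacity),
                    indexByMask(hash, capacity));
        }
    }
}
```

For hash = 27 all three forms give 11. For hash = -27, % gives -11, while both floorMod and the mask give 5, which is why the mask is safe to use as an array index.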

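3. Why does an element either stay in place or move by `+oldCap` after doubling?

When a HashMap resizes, the new capacity is twice the old capacity:

newCapacity = oldCapacity * 2;

The index computation for an element becomes:

newIndex = hash & (newCapacity - 1);

Let's analyze how an element's index changes.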

(1) Index computation before the resize

Assuming oldCapacity = 16, the index is computed as:

oldIndex = hash & (16 - 1)  // oldIndex = hash & 15

(2) Index computation after the resize

After the resize newCapacity = 32, and the index computation becomes:

newIndex = hash & (32 - 1)  // newIndex = hash & 31

where:

31 = 0001 1111
15 = 0000 1111

So what is the difference between hash & 31 and hash & 15?

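4. How the index changes after a resize

Suppose the hash value is 11011 (that is, 27). Compute oldIndex and newIndex:

hash      = 11011                  (27)
hash & 15 = 11011 & 01111 = 01011  (oldIndex = 11)
hash & 31 = 11011 & 11111 = 11011  (newIndex = 27 = 11 + 16)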

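What do we notice?

oldIndex = hash & 15
newIndex = hash & 31
newIndex is either equal to oldIndex or equal to oldIndex + oldCapacity.

Why? Because newCapacity = 2 × oldCapacity, the binary form of (newCapacity - 1) has exactly one more 1 bit than (oldCapacity - 1), at the top, in the bit position whose value is oldCapacity.

If the bit of hash at the oldCapacity position is 0, that is (hash & oldCapacity) == 0, the new index equals the old index.
If the bit of hash at the oldCapacity position is 1, the new index becomes oldIndex + oldCapacity.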

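5. Checking with examples

(1) hash = 5 (0000 0101)

oldIndex = 5 & 15 = 5
newIndex = 5 & 31 = 5
The position is unchanged.

(2) hash = 21 (0001 0101)

oldIndex = 21 & 15 = 5
newIndex = 21 & 31 = 21
The index becomes oldIndex + 16.

Conclusion:

The index either stays the same (the bit of hash at the oldCapacity position is 0)
or becomes oldIndex + oldCapacity (the bit of hash at the oldCapacity position is 1).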
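
The two cases can also be checked mechanically. The following Java sketch uses hypothetical names (ResizeIndexDemo, newIndexFromOld, newIndexDirect) and compares the index derived from a single bit test against the index computed directly with the doubled capacity.

```java
// Illustrative sketch only; ResizeIndexDemo and its helpers are hypothetical names.
final class ResizeIndexDemo {
    // New index derived from the old one by testing a single bit of the hash.
    static int newIndexFromOld(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    // New index computed directly against the doubled capacity.
    static int newIndexDirect(int hash, int oldCap) {
        return hash & (2 * oldCap - 1);
    }

    public static void main(String[] args) {
        int oldCap = 16;
        for (int hash : new int[] {5, 21, 27}) {
            System.out.println(hash + " -> " + newIndexFromOld(hash, oldCap)
                    + " (direct: " + newIndexDirect(hash, oldCap) + ")");
        }
    }
}
```

With oldCap = 16, the hashes 5, 21 and 27 map to 5, 21 and 27 respectively under both methods, matching the worked examples.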

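6. Why does this improve performance?

Less rehash work
- Only the bit of hash at the oldCapacity position needs to be tested (0 or 1); the hash itself is not recomputed.
Simpler migration
- No hash recomputation is needed; each element goes either to its original index or to oldIndex + oldCapacity.
- The migration is a single O(N) pass over the elements, with one bit test per element.
CPU-friendly bitwise operations
- & is typically cheaper than %, which speeds up the index computation on every lookup and insert.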
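
As a rough illustration of the migration step, the sketch below splits the hashes of one old bucket into a "stay" group and a "move" group with a single bit test each. It is a simplified model and not HashMap's actual resize code: it works on plain lists of hash values, and the names BucketSplitDemo and split are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; BucketSplitDemo and split are hypothetical names.
final class BucketSplitDemo {
    // Splits the hashes of one old bucket into the two buckets they can land in
    // after doubling: result.get(0) stays at oldIndex,
    // result.get(1) moves to oldIndex + oldCap.
    static List<List<Integer>> split(List<Integer> bucketHashes, int oldCap) {
        List<Integer> lo = new ArrayList<>();
        List<Integer> hi = new ArrayList<>();
        for (int hash : bucketHashes) {
            if ((hash & oldCap) == 0) {
                lo.add(hash);   // bit is 0: index unchanged
            } else {
                hi.add(hash);   // bit is 1: index grows by oldCap
            }
        }
        return List.of(lo, hi);
    }

    public static void main(String[] args) {
        // All four hashes share old index 5 when the capacity is 16.
        List<List<Integer>> parts = split(List.of(5, 21, 37, 53), 16);
        System.out.println("stay at 5:  " + parts.get(0));
        System.out.println("move to 21: " + parts.get(1));
    }
}
```

Here 5 and 37 stay at index 5, while 21 and 53 move to index 21. Each element is visited once and the relative order within each group is preserved.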

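7. Conclusion

HashMap doubles its capacity on resize, which keeps table.length a power of 2.
The index computation (n - 1) & hash replaces % n, making index computation cheaper.
After a resize, an index either stays the same or becomes oldIndex + oldCapacity, so migrating elements is efficient.
These binary properties avoid recomputing hashes and improve performance.

That is the core reason HashMap doubles its capacity and uses a bitwise AND (not a shift) to compute bucket indices.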