How C#'s Dictionary Is Implemented

1: Fields used in the implementation

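// For each bucket (hashCode modulo buckets.Length), the Index in entries of the first entry in that bucket's chain; -1 if the bucket is empty
int[] buckets;
// Array that stores the keys and values; the C# Dictionary organizes its data in arrays
Dictionary<TKey, TValue>.Entry[] entries;
private struct Entry
{
    public int hashCode; // cached hash code of the key; -1 marks a free slot
    public int next;     // Index of the next entry in the same chain; -1 at the end of the chain
    public TKey key;
    public TValue value;
}
// Number of slots in entries used so far, including the holes left by removals (the public Count property is count - freeCount)
int count;
// version records modifications to the dictionary. For example, if an item is removed while you are enumerating, version changes and the enumeration may fail. This field is not analyzed below.
int version;
// Removals leave holes inside the array, and these holes are chained together like a linked list. freeList is the Index in entries of one hole, from which all the other holes can be reached.
int freeList;
// Number of holes inside the array
int freeCount;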
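
To make the starting state concrete, here is a minimal sketch of how these fields might look right after construction. MiniDictState is a hypothetical teaching class with int keys and string values, not the real Dictionary<TKey, TValue>, and the -1 conventions follow the walk-through in this article rather than any particular runtime version.

```csharp
using System;

// Hypothetical teaching class; it only mirrors the fields listed above.
public class MiniDictState
{
    public struct Entry
    {
        public int hashCode; // -1 marks a free slot
        public int next;     // -1 marks the end of a chain
        public int key;
        public string value;
    }

    public int[] buckets;
    public Entry[] entries;
    public int count;
    public int freeList;
    public int freeCount;

    public MiniDictState(int size)
    {
        buckets = new int[size];
        for (int i = 0; i < size; i++)
        {
            buckets[i] = -1; // every bucket starts with no chain
        }
        entries = new Entry[size]; // same Length as buckets
        count = 0;
        freeList = -1;             // no holes yet
        freeCount = 0;
    }
}
```

With size = 5 this gives buckets => [-1, -1, -1, -1, -1] and five empty entries, which is the state assumed at the start of the example in the next section.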

2: Worked example

This section walks through adding, looking up, and removing entries.

For example:
Adding
Suppose we store the Id and name of the following 5 people: (1000, "Ally"), (1005, "Green"), (1010, "Curry"), (1011, "Tompson"), (1014, "Wiseman")
Dictionary<int, string> dict = new Dictionary<int, string>();
dict.Add(1000, "Ally");
dict.Add(1005, "Green");
dict.Add(1010, "Curry");
dict.Add(1011, "Tompson");
dict.Add(1014, "Wiseman");

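Note: to keep the analysis simple, assume that after initialization the Length of both entries and buckets is 5. (The two arrays always have the same length, but the initial length is not necessarily 5.)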

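After dict.Add(1000, "Ally"): the hashCode for key = 1000 is 1000. (The real implementation obtains it through a hash helper; here we assume the hashCode equals the key.)
1000 % 5 = 0 (5 is the Length of buckets), so:
entries => [{hashCode = 1000, next = -1, key = 1000, value = "Ally"}, {}, {}, {}, {}]
buckets => [0, -1, -1, -1, -1]
That is, the first entry whose hashCode has remainder 0 sits at Index 0 in entries.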
After dict.Add(1005, "Green"): the hashCode for key = 1005 is 1005, and 1005 % 5 = 0. The new entry becomes the first entry for remainder 0, so:
buckets => [1, -1, -1, -1, -1]
entries => [{hashCode = 1000, next = -1, key = 1000, value = "Ally"}, {hashCode = 1005, next = 0, key = 1005, value = "Green"}, {}, {}, {}]
Note that next of the 1005 entry points back to the first entry, at Index 0.
After dict.Add(1010, "Curry"): the hashCode for key = 1010 is 1010, and 1010 % 5 = 0. Again the new entry becomes the first entry for remainder 0, so:
buckets => [2, -1, -1, -1, -1]
entries => [{hashCode = 1000, next = -1, key = 1000, value = "Ally"}, {hashCode = 1005, next = 0, key = 1005, value = "Green"}, {hashCode = 1010, next = 1, key = 1010, value = "Curry"}, {}, {}]
All three hashCodes have remainder 0, so next chains the three entries together, and buckets[0] holds the Index of the first entry in that chain.
After dict.Add(1011, "Tompson"): 1011 % 5 = 1, so:
entries => [{hashCode = 1000, next = -1, key = 1000, value = "Ally"}, {hashCode = 1005, next = 0, key = 1005, value = "Green"}, {hashCode = 1010, next = 1, key = 1010, value = "Curry"}, {hashCode = 1011, next = -1, key = 1011, value = "Tompson"}, {}]
buckets => [2, 3, -1, -1, -1]
After dict.Add(1014, "Wiseman"): 1014 % 5 = 4, so:
entries => [{hashCode = 1000, next = -1, key = 1000, value = "Ally"}, {hashCode = 1005, next = 0, key = 1005, value = "Green"}, {hashCode = 1010, next = 1, key = 1010, value = "Curry"}, {hashCode = 1011, next = -1, key = 1011, value = "Tompson"}, {hashCode = 1014, next = -1, key = 1014, value = "Wiseman"}]
buckets => [2, 3, -1, -1, 4]
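
The five Add steps above can be sketched in code roughly as follows. MiniDictAdd is a hypothetical, simplified class: it fixes the key type to int and the value type to string, assumes hashCode equals the (non-negative) key as in the walk-through, hard-codes a capacity of 5, and leaves out resizing and free-slot reuse. It is a sketch of the idea, not the actual Dictionary source.

```csharp
using System;

// Hypothetical teaching class: only the "prepend to the bucket's chain" part of Add.
public class MiniDictAdd
{
    public struct Entry
    {
        public int hashCode;
        public int next;
        public int key;
        public string value;
    }

    public int[] buckets = { -1, -1, -1, -1, -1 }; // -1 means the bucket is empty
    public Entry[] entries = new Entry[5];
    public int count;

    public void Add(int key, string value)
    {
        if (count == entries.Length)
        {
            throw new InvalidOperationException("Full; resizing is outside this sketch.");
        }

        int hashCode = key & 0x7FFFFFFF;        // assumption: hashCode == key
        int bucket = hashCode % buckets.Length; // e.g. 1005 % 5 = 0

        // Walk the chain first so that a duplicate key is rejected.
        for (int i = buckets[bucket]; i >= 0; i = entries[i].next)
        {
            if (entries[i].hashCode == hashCode && entries[i].key == key)
            {
                throw new ArgumentException("An entry with the same key already exists.");
            }
        }

        int index = count++;                   // next unused slot in entries
        entries[index].hashCode = hashCode;
        entries[index].next = buckets[bucket]; // old chain head (or -1) becomes next
        entries[index].key = key;
        entries[index].value = value;
        buckets[bucket] = index;               // the new entry is now the chain head
    }
}
```

Adding the five people in the same order to a MiniDictAdd instance leaves buckets as [2, 3, -1, -1, 4], with entries[2].next = 1 and entries[1].next = 0, matching the final state shown above.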

Lookup
bool b1 = dict.ContainsKey(1010);
bool b2 = dict.ContainsKey(1040);

bool b1 = dict.ContainsKey(1010): the hashCode for key = 1010 is 1010, and 1010 % 5 = 0, so read buckets[0], which is 2. Check entries[2], which is {hashCode = 1010, next = 1, key = 1010, value = "Curry"}. It matches, so the result is true.
bool b2 = dict.ContainsKey(1040): the hashCode for key = 1040 is 1040, and 1040 % 5 = 0, so read buckets[0], which is 2.
Check entries[2], which is {hashCode = 1010, next = 1, key = 1010, value = "Curry"}. It does not match, but its next is 1.
Check entries[1], which is {hashCode = 1005, next = 0, key = 1005, value = "Green"}. It does not match, but its next is 0.
Check entries[0], which is {hashCode = 1000, next = -1, key = 1000, value = "Ally"}. It does not match, and its next is -1, so the key is not present and the result is false.
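
The two lookups above follow the same loop, which might be sketched like this. MiniDictLookup, FindEntry, SampleEntries and SampleBuckets are hypothetical names invented for this illustration; the sample data is simply the state reached after the five Add calls, and hashCode is again assumed to equal the key.

```csharp
using System;

// Hypothetical teaching class: chain traversal for a lookup.
public static class MiniDictLookup
{
    public struct Entry
    {
        public int hashCode;
        public int next;
        public int key;
        public string value;

        public Entry(int hashCode, int next, int key, string value)
        {
            this.hashCode = hashCode;
            this.next = next;
            this.key = key;
            this.value = value;
        }
    }

    // State after the five Add calls in the walk-through.
    public static Entry[] SampleEntries()
    {
        return new[]
        {
            new Entry(1000, -1, 1000, "Ally"),
            new Entry(1005, 0, 1005, "Green"),
            new Entry(1010, 1, 1010, "Curry"),
            new Entry(1011, -1, 1011, "Tompson"),
            new Entry(1014, -1, 1014, "Wiseman"),
        };
    }

    public static int[] SampleBuckets()
    {
        return new[] { 2, 3, -1, -1, 4 };
    }

    // Returns the Index of the matching entry, or -1 if the key is absent.
    public static int FindEntry(int[] buckets, Entry[] entries, int key)
    {
        int hashCode = key & 0x7FFFFFFF; // assumption: hashCode == key
        for (int i = buckets[hashCode % buckets.Length]; i >= 0; i = entries[i].next)
        {
            if (entries[i].hashCode == hashCode && entries[i].key == key)
            {
                return i;
            }
        }
        return -1; // reached next == -1 without a match
    }

    public static bool ContainsKey(int[] buckets, Entry[] entries, int key)
    {
        return FindEntry(buckets, entries, key) >= 0;
    }
}
```

With the sample state, ContainsKey for 1010 stops at entries[2] right away, while the search for 1040 visits entries[2], entries[1] and entries[0] before giving up.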

Removal
dict.Remove(1005);
dict.Remove(1010);

dict.Remove(1005): the hashCode for key = 1005 is 1005, and 1005 % 5 = 0, so read buckets[0], which is 2. Initialize Index = -1; it will remember the previous entry in the chain.
Check entries[2], which is {hashCode = 1010, next = 1, key = 1010, value = "Curry"}. It does not match, but its next is 1, so set Index = 2.
Check entries[1], which is {hashCode = 1005, next = 0, key = 1005, value = "Green"}. This is the entry to remove, and its next is 0, so set entries[Index].next, that is entries[2].next, to 0, and reset the removed entry. Afterwards:
entries => [{hashCode = 1000, next = -1, key = 1000, value = "Ally"}, {hashCode = -1, next = -1, key = 0, value = null}, {hashCode = 1010, next = 0, key = 1010, value = "Curry"}, {hashCode = 1011, next = -1, key = 1011, value = "Tompson"}, {hashCode = 1014, next = -1, key = 1014, value = "Wiseman"}]
buckets => [2, 3, -1, -1, 4]
This is where freeList comes into play: freeList = 1, the Index of the hole, and freeCount = 1.
dict.Remove(1010): the hashCode for key = 1010 is 1010, and 1010 % 5 = 0, so read buckets[0], which is 2. Initialize Index = -1.
Check entries[2], which is now {hashCode = 1010, next = 0, key = 1010, value = "Curry"} (its next was changed to 0 by the previous removal). This is the entry to remove. Its next is 0 and Index is still -1, which means it is the first entry in the chain, so its next is written directly into buckets: buckets[0] = 0. Afterwards:
entries => [{hashCode = 1000, next = -1, key = 1000, value = "Ally"}, {hashCode = -1, next = -1, key = 0, value = null}, {hashCode = -1, next = 1, key = 0, value = null}, {hashCode = 1011, next = -1, key = 1011, value = "Tompson"}, {hashCode = 1014, next = -1, key = 1014, value = "Wiseman"}]
buckets => [0, 3, -1, -1, 4]
freeList = 2, freeCount = 2
The holes now form a linked list of their own: its head is Index 2, and the entry at Index 2 has next = 1, which chains the holes together.
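
The two removals, and the way a later Add can reuse a hole, might be sketched as follows. MiniDictRemove is a hypothetical teaching class under the same assumptions as the walk-through (int keys, hashCode equal to the key, capacity 5, free slots linked through next with -1 as the terminator); duplicate-key checks and resizing are omitted to keep it short.

```csharp
using System;

// Hypothetical teaching class: Remove with a free list, plus an Add that reuses holes.
public class MiniDictRemove
{
    public struct Entry
    {
        public int hashCode;
        public int next;
        public int key;
        public string value;
    }

    public int[] buckets = { -1, -1, -1, -1, -1 };
    public Entry[] entries = new Entry[5];
    public int count;          // slots handed out so far, holes included
    public int freeList = -1;  // Index of the first hole, -1 if there is none
    public int freeCount;

    public void Add(int key, string value)
    {
        int hashCode = key & 0x7FFFFFFF; // assumption: hashCode == key
        int bucket = hashCode % buckets.Length;
        int index;
        if (freeCount > 0)
        {
            index = freeList;                // reuse the most recently freed slot
            freeList = entries[index].next;  // the next hole becomes the head
            freeCount--;
        }
        else
        {
            if (count == entries.Length)
            {
                throw new InvalidOperationException("Full; resizing is outside this sketch.");
            }
            index = count++;
        }
        entries[index] = new Entry { hashCode = hashCode, next = buckets[bucket], key = key, value = value };
        buckets[bucket] = index;
    }

    public bool Remove(int key)
    {
        int hashCode = key & 0x7FFFFFFF;
        int bucket = hashCode % buckets.Length;
        int last = -1; // the "Index" variable from the walk-through
        for (int i = buckets[bucket]; i >= 0; last = i, i = entries[i].next)
        {
            if (entries[i].hashCode != hashCode || entries[i].key != key)
            {
                continue;
            }
            if (last < 0)
            {
                buckets[bucket] = entries[i].next;    // removed entry was the chain head
            }
            else
            {
                entries[last].next = entries[i].next; // unlink from the middle of the chain
            }
            // Reset the slot and push it onto the front of the free list.
            entries[i] = new Entry { hashCode = -1, next = freeList, key = 0, value = null };
            freeList = i;
            freeCount++;
            return true;
        }
        return false;
    }
}
```

After the five Add calls followed by Remove(1005) and Remove(1010), this sketch ends with buckets[0] = 0, freeList = 2 and entries[2].next = 1, as in the walk-through. A further Add would then take slot 2 first, because the free list hands out the most recently freed slot.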

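That is essentially how adding, looking up, and removing work. The C# Dictionary is built on a hash table: entries whose hashCodes fall into the same bucket are chained together as a linked list, and a lookup walks that chain from the head. The holes left in the array by removals are also chained into a linked list so that they can be reused.
(My impression is that in unfavorable cases this can be slower than a structure such as a red-black tree: once the arrays are full, growing is slow because the arrays are copied and every entry has to be placed into a bucket again, and lookups also slow down when many keys collide on the same bucket.)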
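
To show why growing is costly, here is a rough sketch of the copy-and-rebucket step mentioned above. MiniDictResize and its Resize method are hypothetical; in particular, newSize is a parameter of this sketch (it must be at least count), whereas the real implementation chooses the new size itself. The point is only that every existing entry is copied once and then relinked, so the work is proportional to the number of entries.

```csharp
using System;

// Hypothetical teaching class: growing the arrays and rebuilding every chain.
public static class MiniDictResize
{
    public struct Entry
    {
        public int hashCode;
        public int next;
        public int key;
        public string value;
    }

    public static void Resize(ref int[] buckets, ref Entry[] entries, int count, int newSize)
    {
        var newBuckets = new int[newSize];
        for (int i = 0; i < newSize; i++)
        {
            newBuckets[i] = -1; // all chains start out empty
        }

        var newEntries = new Entry[newSize];
        Array.Copy(entries, 0, newEntries, 0, count); // copy the existing entries

        // The bucket of each entry depends on the array length, so every chain is rebuilt.
        for (int i = 0; i < count; i++)
        {
            if (newEntries[i].hashCode < 0)
            {
                continue; // skip free slots
            }
            int bucket = newEntries[i].hashCode % newSize;
            newEntries[i].next = newBuckets[bucket];
            newBuckets[bucket] = i;
        }

        buckets = newBuckets;
        entries = newEntries;
    }
}
```

For instance, growing the five-entry example from length 5 to length 11 moves 1000 and 1011 into the same bucket (both have remainder 10), while 1005, 1010 and 1014 each end up alone in their buckets, so the chains after a resize can look quite different from before.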