GCN
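However, not everything in the real world can be represented as a sequence or a grid: social networks, knowledge graphs, and complex file systems are examples. In other words, much real-world data is unstructured.
Compared with plain text and images, this kind of network-shaped, unstructured data is very complex. The difficulties in handling it include:
- A graph can be of any size and has a complex topology, with none of the spatial locality that images have
- A graph has no fixed node ordering, or in other words no reference node
- Graphs are often dynamic and carry multimodal features
So how should we model this kind of data? Can deep learning be extended to handle it? These questions drove the emergence and development of graph neural networks.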
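That said, GCN also has obvious drawbacks. First, GCN needs the whole graph in main memory and GPU memory, which is very memory-hungry and rules out large graphs. Second, training needs the structure of the whole graph, including the nodes to be predicted, and some real tasks cannot provide this (for example, if a graph model trained today is used to predict tomorrow's data, tomorrow's nodes are not available yet).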
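Four characteristics of GCN:
- GCN is a natural generalization of convolutional neural networks to the graph domain.
- It learns node features and structural information together, end to end, and is regarded here as one of the best current choices for learning tasks on graph data.
- Graph convolution is very widely applicable: it works for nodes and graphs of arbitrary topology.
- On tasks such as node classification and edge prediction, it has been reported to clearly outperform other methods on public datasets.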
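Frequently asked questions about GCN:
- For many networks we may have no node features. Can GCN still be used? Yes. For the club network, for example, the authors simply replaced the feature matrix X with the identity matrix I.
- I have no node class labels or any other annotations. Can I use GCN? Yes. As mentioned earlier, even an untrained GCN can be used to extract graph embeddings, and the results are reasonably good.
- How many layers should a GCN have? The authors of the paper compared GCNs of different depths and found that more layers do not help: 2 to 3 layers already work well.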
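The first two answers can be illustrated with a small sketch. It applies one propagation step of the normalized form from the GCN paper, D^-1/2 (A + I) D^-1/2 · X · W, to a toy 4-node path graph with X = I and random, untrained weights. The graph, the embedding size, and all function names are assumptions made for illustration; this is plain Python, not the DGL code used later in this post.

```python
import math
import random


def normalized_adjacency(num_nodes, edges):
    """Return D^-1/2 (A + I) D^-1/2 as a dense list-of-lists matrix."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        a[u][v] = 1.0
        a[v][u] = 1.0
    for i in range(num_nodes):
        a[i][i] = 1.0  # add a self-loop so a node keeps its own signal
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
            for i in range(num_nodes)]


def matmul(p, q):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def relu(m):
    return [[max(0.0, x) for x in row] for row in m]


random.seed(0)
num_nodes, embed_dim = 4, 2          # toy sizes, chosen for illustration
edges = [(0, 1), (1, 2), (2, 3)]     # hypothetical path graph 0-1-2-3

# With X = I the product X @ W is just W, so the weights can be used directly.
weights = [[random.uniform(-1, 1) for _ in range(embed_dim)]
           for _ in range(num_nodes)]

a_hat = normalized_adjacency(num_nodes, edges)
embeddings = relu(matmul(a_hat, weights))  # one untrained GCN layer
for node, vec in enumerate(embeddings):
    print(node, vec)
```

Even without training, each row mixes a node's own weight row with those of its neighbors, so nodes that are close in the graph tend to receive similar vectors.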
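GCN implementation in Python
I looked at a lot of GCN code online and could not really follow most of it. In the end I found a very detailed implementation based on the DGL graph neural network library, and I adapted it as needed to handle multi-class node classification. https://zhuanlan.zhihu.com/p/93828551
1. The karate club problem
The karate club is a social network of 34 members whose edges record pairwise interactions between members outside the club. The club later split into two groups, led by the instructor (node 0) and the club president (node 33). The whole network is visualized in the figure below:
The task is to predict which side (0 or 33) each node will join.
Create Zachary's karate club graph as follows: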
import dgl

def build_karate_club_graph():
    g = dgl.DGLGraph()
    # add 34 nodes into the graph; nodes are labeled from 0~33
    g.add_nodes(34)
    # all 78 edges as a list of tuples
    edge_list = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2),
        (4, 0), (5, 0), (6, 0), (6, 4), (6, 5), (7, 0), (7, 1),
        (7, 2), (7, 3), (8, 0), (8, 2), (9, 2), (10, 0), (10, 4),
        (10, 5), (11, 0), (12, 0), (12, 3), (13, 0), (13, 1), (13, 2),
        (13, 3), (16, 5), (16, 6), (17, 0), (17, 1), (19, 0), (19, 1),
        (21, 0), (21, 1), (25, 23), (25, 24), (27, 2), (27, 23),
        (27, 24), (28, 2), (29, 23), (29, 26), (30, 1), (30, 8),
        (31, 0), (31, 24), (31, 25), (31, 28), (32, 2), (32, 8),
        (32, 14), (32, 15), (32, 18), (32, 20), (32, 22), (32, 23),
        (32, 29), (32, 30), (32, 31), (33, 8), (33, 9), (33, 13),
        (33, 14), (33, 15), (33, 18), (33, 19), (33, 20), (33, 22),
        (33, 23), (33, 26), (33, 27), (33, 28), (33, 29), (33, 30),
        (33, 31), (33, 32)]
    # add edges from two lists of nodes: src and dst
    src, dst = tuple(zip(*edge_list))
    g.add_edges(src, dst)
    # edges are directional in DGL; make them bi-directional
    g.add_edges(dst, src)
    return g
Print the number of nodes and edges that were created:
G = build_karate_club_graph()
print('We have %d nodes.' % G.number_of_nodes())
print('We have %d edges.' % G.number_of_edges())
>>>We have 34 nodes.
>>>We have 156 edges.
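The edge count is twice the length of the edge list (78 × 2 = 156) because every undirected link is stored as two directed edges. A minimal plain-Python sketch of that doubling, using a hypothetical 3-edge toy list rather than the real graph:

```python
def make_bidirectional(edge_list):
    """Turn undirected (u, v) pairs into directed src/dst lists, both ways."""
    src, dst = zip(*edge_list)
    return list(src) + list(dst), list(dst) + list(src)


toy_edges = [(1, 0), (2, 0), (2, 1)]  # hypothetical triangle, for illustration
src, dst = make_bidirectional(toy_edges)

# Count how many directed edges arrive at each node.
in_degree = {}
for node in dst:
    in_degree[node] = in_degree.get(node, 0) + 1

print(len(src), in_degree)
```

Each node of the triangle ends up with two incoming edges, one per neighbor, which is what the later message-passing step relies on.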
Draw the graph with networkx:
import networkx as nx
import matplotlib.pyplot as plt
fig = plt.figure(dpi=150)
nx_G = G.to_networkx().to_undirected()
pos = nx.kamada_kawai_layout(nx_G)
nx.draw(nx_G, pos, with_labels=True, node_color=[[.7, .7, .7]])
plt.show()
2. Assigning features to nodes and edges
Graph neural networks associate features with nodes and edges for training. For our classification example, each node's input feature is a one-hot vector: node v_i's feature vector is [0,…,1,…,0], where the i-th position is one.
In DGL, you can add features for all nodes at once, using a feature tensor that batches node features along the first dimension. The code below adds the one-hot features for all nodes:
import torch
G.ndata['feat'] = torch.eye(34)
Print a few nodes to check the assigned features:
# print out node 2's input feature
print(G.nodes[2].data['feat'])
# print out node 10 and 11's input features
print(G.nodes[[10, 11]].data['feat'])
>>>tensor([[0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
>>>tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
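One way to read these one-hot features: multiplying a one-hot row by a weight matrix simply selects one row of that matrix, so the first linear layer effectively acts as a learnable per-node embedding table. The sketch below shows this in plain Python; the 3×2 weight values are made up for illustration, and the weight layout is not meant to match `nn.Linear` exactly.

```python
def one_hot(index, size):
    """Return a one-hot list with a 1.0 at the given position."""
    return [1.0 if k == index else 0.0 for k in range(size)]


def vec_mat(vec, mat):
    """Multiply a row vector by a matrix given as a list of rows."""
    return [sum(vec[k] * mat[k][j] for k in range(len(mat)))
            for j in range(len(mat[0]))]


# Hypothetical weights: 3 nodes, 2 output features.
weights = [[0.1, 0.2],
           [0.3, 0.4],
           [0.5, 0.6]]

x = one_hot(2, 3)          # feature vector of node 2
out = vec_mat(x, weights)  # picks row 2 of the weight matrix
print(out)
```

In the GCN layer defined below, neighbor features are summed before the linear transform, so a node's first-layer output combines the weight rows of its neighbors.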
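3. Defining a GCN
Here we define a simple graph convolutional network. (The original paper is recommended for more details.)
The whole procedure above can be viewed as an instance of the message-passing paradigm: each node receives information from its neighbors and uses it to update its own representation. Pictured graphically, information flows along the edges toward the target node.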
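For reference, the layer-wise rule from the GCN paper and the simplified, unnormalized variant that the tutorial code in this section computes can be written as follows. The symbols are assumptions chosen for this note: A is the adjacency matrix, I the identity, D̃ the degree matrix of A + I, H^(l) the node features at layer l, W^(l) and b^(l) the layer parameters, σ a nonlinearity, and N(v) the neighbors of node v.

```latex
% Normalized propagation rule (GCN paper)
H^{(l+1)} = \sigma\!\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{(l)}\,W^{(l)}\right),
\qquad \tilde{A} = A + I

% Simplified rule used in this tutorial (no normalization, no self-loop):
% sum the neighbors' features, then apply a linear layer
h_v^{(l+1)} = W^{(l)} \sum_{u \in \mathcal{N}(v)} h_u^{(l)} + b^{(l)}
```

The second form corresponds to a sum over incoming messages followed by a linear transform; the ReLU between layers is applied separately in the model.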
Next is an example of implementing GCN with DGL.
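import torch
import torch.nn as nn
import torch.nn.functional as F

# Define the message function and the reduce function.
# NOTE: to keep things easy to follow, this tutorial skips the normalization step.
def gcn_message(edges):
    # Argument: a batch of edges.
    # Returns the message for each edge; here it is simply the source node's feature.
    return {'msg' : edges.src['h']}

def gcn_reduce(nodes):
    # Argument: a batch of nodes.
    # Returns the new node feature: the sum of the messages in each node's mailbox.
    return {'h' : torch.sum(nodes.mailbox['msg'], dim=1)}

# Define the GCNLayer module
class GCNLayer(nn.Module):
    def __init__(self, in_feats, out_feats):
        super(GCNLayer, self).__init__()
        self.linear = nn.Linear(in_feats, out_feats)

    def forward(self, g, inputs):
        # g is the graph; inputs is the node feature matrix
        # set the node features on the graph
        g.ndata['h'] = inputs
        # trigger message passing on all edges
        # (send/recv is the DGL 0.4-style API used in this post; newer DGL
        # versions typically use g.update_all(gcn_message, gcn_reduce) instead)
        g.send(g.edges(), gcn_message)
        # trigger aggregation on all nodes
        g.recv(g.nodes(), gcn_reduce)
        # fetch the aggregated node features
        h = g.ndata.pop('h')
        # linear transform
        return self.linear(h)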
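In general, a node uses the message function to send its (possibly transformed) features along its edges, and the reduce function aggregates the features that each node has collected.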
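The message/reduce pair can be mimicked without any graph library. The sketch below is a plain-Python stand-in, not the DGL API: every directed edge delivers its source node's feature to the destination's mailbox, and each node then sums its mailbox. The function name and the toy path graph 0-1-2 are assumptions for illustration; nodes that receive no messages get zeros here, which may differ from the library's behavior.

```python
def message_passing_sum(num_nodes, src, dst, features):
    """One round of 'send source feature, sum at destination'."""
    dim = len(features[0])
    mailbox = {v: [] for v in range(num_nodes)}
    # Message step: each edge (u -> v) delivers u's feature to v.
    for u, v in zip(src, dst):
        mailbox[v].append(features[u])
    # Reduce step: each node sums the messages it received.
    return [[sum(msg[k] for msg in mailbox[v]) for k in range(dim)]
            for v in range(num_nodes)]


# Toy path graph 0 - 1 - 2, stored as directed edges in both directions.
src = [0, 1, 1, 2]
dst = [1, 0, 2, 1]
features = [[1.0, 0.0, 0.0],   # one-hot feature of node 0
            [0.0, 1.0, 0.0],   # node 1
            [0.0, 0.0, 1.0]]   # node 2

aggregated = message_passing_sum(3, src, dst, features)
print(aggregated)
```

With one-hot inputs, the aggregated row of a node is exactly its row of the adjacency matrix, which is why the unnormalized layer amounts to multiplying the features by A.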
Below we define a deeper GCN model with two GCN layers:
class GCN(nn.Module):
    def __init__(self, in_feats, hidden_size, num_classes):
        super(GCN, self).__init__()
        self.gcn1 = GCNLayer(in_feats, hidden_size)
        self.gcn2 = GCNLayer(hidden_size, num_classes)

    def forward(self, g, inputs):
        h = self.gcn1(g, inputs)
        h = torch.relu(h)
        h = self.gcn2(g, h)
        return h

# Using the karate club as the example:
# the first layer maps the 34-dimensional input to a hidden size of 5,
# the second layer maps the hidden features to the 2 output classes
net = GCN(34, 5, 2)
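4. Data preparation and initialization
We initialize the nodes with one-hot vectors. Because this is a semi-supervised setting, only the instructor (node 0) and the club president (node 33) are given labels: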
inputs = torch.eye(34)
labeled_nodes = torch.tensor([0, 33]) # only the instructor and the president nodes are labeled
labels = torch.tensor([0, 1]) # their labels are different
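5. Training and visualization
Training works like any other PyTorch model: (1) create an optimizer, (2) feed the inputs to the model, (3) compute the loss, (4) backpropagate and update the model.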
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)
all_logits = []
for epoch in range(30):
    logits = net(G, inputs)
    # we save the logits for visualization later
    all_logits.append(logits.detach())
    logp = F.log_softmax(logits, 1)
    # we only compute loss for labeled nodes
    loss = F.nll_loss(logp[labeled_nodes], labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    print('Epoch %d | Loss: %.4f' % (epoch, loss.item()))
output:
Epoch 0 | Loss: 0.4679
Epoch 1 | Loss: 0.3354
Epoch 2 | Loss: 0.2421
Epoch 3 | Loss: 0.1677
Epoch 4 | Loss: 0.1110
Epoch 5 | Loss: 0.0714
Epoch 6 | Loss: 0.0437
Epoch 7 | Loss: 0.0259
Epoch 8 | Loss: 0.0151
Epoch 9 | Loss: 0.0089
Epoch 10 | Loss: 0.0053
Epoch 11 | Loss: 0.0032
Epoch 12 | Loss: 0.0020
Epoch 13 | Loss: 0.0013
Epoch 14 | Loss: 0.0009
Epoch 15 | Loss: 0.0006
Epoch 16 | Loss: 0.0004
Epoch 17 | Loss: 0.0003
Epoch 18 | Loss: 0.0002
Epoch 19 | Loss: 0.0002
Epoch 20 | Loss: 0.0001
Epoch 21 | Loss: 0.0001
Epoch 22 | Loss: 0.0001
Epoch 23 | Loss: 0.0001
Epoch 24 | Loss: 0.0001
Epoch 25 | Loss: 0.0000
Epoch 26 | Loss: 0.0000
Epoch 27 | Loss: 0.0000
Epoch 28 | Loss: 0.0000
Epoch 29 | Loss: 0.0000
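The loss in this loop only looks at the labeled nodes; the logits of all other nodes affect it only indirectly, through shared weights and message passing. A plain-Python sketch of that masked negative log-likelihood is given below, averaging over the labeled nodes as the default reduction of `F.nll_loss` does. The helper names and the three toy logit rows are assumptions for illustration.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of one list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def masked_nll(logits, labeled_nodes, labels):
    """Mean negative log-likelihood over the labeled nodes only."""
    total = 0.0
    for node, label in zip(labeled_nodes, labels):
        total -= log_softmax(logits[node])[label]
    return total / len(labeled_nodes)


# Toy logits for 3 nodes and 2 classes; only nodes 0 and 2 carry labels.
logits = [[2.0, 0.0],
          [0.3, 0.1],   # unlabeled: does not enter the loss directly
          [0.0, 2.0]]
loss = masked_nll(logits, labeled_nodes=[0, 2], labels=[0, 1])
print(round(loss, 4))
```

Changing the middle row leaves the value unchanged, which mirrors how the training loop indexes `logp[labeled_nodes]` before computing the loss.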
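This is a very simple toy example; it does not even have validation or test sets. Because the model outputs a 2-dimensional vector for each node, we can easily visualize the process in 2D. The code below animates training from the initial state to the end, where all nodes are linearly separable.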
import matplotlib.animation as animation
import matplotlib.pyplot as plt

def draw(i):
    cls1color = '#00FFFF'
    cls2color = '#FF00FF'
    pos = {}
    colors = []
    for v in range(34):
        # use the 2-dimensional logits of epoch i as the node position
        pos[v] = all_logits[i][v].numpy()
        cls = pos[v].argmax()
        colors.append(cls1color if cls else cls2color)
    ax.cla()
    ax.axis('off')
    ax.set_title('Epoch: %d' % i)
    nx.draw_networkx(nx_G.to_undirected(), pos, node_color=colors,
            with_labels=True, node_size=300, ax=ax)

fig = plt.figure(dpi=150)
fig.clf()
ax = fig.subplots()
draw(0)  # draw the prediction of the first epoch
plt.close()
The animation below shows that after some training the model can correctly predict which group each node belongs to.
ani = animation.FuncAnimation(fig, draw, frames=len(all_logits), interval=200)
(1) Complete code:
import dgl
import torch
import torch.nn as nn
import torch.nn.functional as F
import networkx as nx
import matplotlib.animation as animation
import matplotlib.pyplot as plt
# Adapted from https://zhuanlan.zhihu.com/p/93828551

def build_karate_club_graph():
    g = dgl.DGLGraph()
    # add 34 nodes into the graph; nodes are labeled from 0~33
    g.add_nodes(34)
    # all 78 edges as a list of tuples
    edge_list = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2),
        (4, 0), (5, 0), (6, 0), (6, 4), (6, 5), (7, 0), (7, 1),
        (7, 2), (7, 3), (8, 0), (8, 2), (9, 2), (10, 0), (10, 4),
        (10, 5), (11, 0), (12, 0), (12, 3), (13, 0), (13, 1), (13, 2),
        (13, 3), (16, 5), (16, 6), (17, 0), (17, 1), (19, 0), (19, 1),
        (21, 0), (21, 1), (25, 23), (25, 24), (27, 2), (27, 23),
        (27, 24), (28, 2), (29, 23), (29, 26), (30, 1), (30, 8),
        (31, 0), (31, 24), (31, 25), (31, 28), (32, 2), (32, 8),
        (32, 14), (32, 15), (32, 18), (32, 20), (32, 22), (32, 23),
        (32, 29), (32, 30), (32, 31), (33, 8), (33, 9), (33, 13),
        (33, 14), (33, 15), (33, 18), (33, 19), (33, 20), (33, 22),
        (33, 23), (33, 26), (33, 27), (33, 28), (33, 29), (33, 30),
        (33, 31), (33, 32)]
    # add edges from two lists of nodes: src and dst
    src, dst = tuple(zip(*edge_list))
    g.add_edges(src, dst)
    # edges are directional in DGL; make them bi-directional
    g.add_edges(dst, src)
    return g

G = build_karate_club_graph()
print('We have %d nodes.' % G.number_of_nodes())
print('We have %d edges.' % G.number_of_edges())
G.ndata['feat'] = torch.eye(34)

# Define the message function and the reduce function.
# NOTE: to keep things easy to follow, this tutorial skips the normalization step.
def gcn_message(edges):
    # Argument: a batch of edges.
    # Returns the message for each edge; here it is simply the source node's feature.
    return {'msg' : edges.src['h']}

def gcn_reduce(nodes):
    # Argument: a batch of nodes.
    # Returns the new node feature: the sum of the messages in each node's mailbox.
    return {'h' : torch.sum(nodes.mailbox['msg'], dim=1)}

# Define the GCNLayer module
class GCNLayer(nn.Module):
    def __init__(self, in_feats, out_feats):
        super(GCNLayer, self).__init__()
        self.linear = nn.Linear(in_feats, out_feats)

    def forward(self, g, inputs):
        # g is the graph; inputs is the node feature matrix
        g.ndata['h'] = inputs
        # trigger message passing on all edges (DGL 0.4-style API)
        g.send(g.edges(), gcn_message)
        # trigger aggregation on all nodes
        g.recv(g.nodes(), gcn_reduce)
        # fetch the aggregated node features
        h = g.ndata.pop('h')
        # linear transform
        return self.linear(h)

class GCN(nn.Module):
    def __init__(self, in_feats, hidden_size, num_classes):
        super(GCN, self).__init__()
        self.gcn1 = GCNLayer(in_feats, hidden_size)
        self.gcn2 = GCNLayer(hidden_size, num_classes)

    def forward(self, g, inputs):
        h = self.gcn1(g, inputs)
        h = torch.relu(h)
        h = self.gcn2(g, h)
        return h

# Using the karate club as the example:
# the first layer maps the 34-dimensional input to a hidden size of 8,
# the second layer maps the hidden features to the 2 output classes
net = GCN(34, 8, 2)
inputs = torch.eye(34)
labeled_nodes = torch.tensor([0, 33])  # only the instructor and the president nodes are labeled
labels = torch.tensor([0, 1])  # their labels are different

# Three-class variant (its 3-dimensional logits cannot be used as 2D positions in draw):
# net = GCN(34, 8, 3)
# inputs = torch.eye(34)
# labeled_nodes = torch.tensor([0, 2, 33])  # nodes 0, 2 and 33 are labeled
# labels = torch.tensor([0, 1, 2])  # each labeled node gets its own class

optimizer = torch.optim.Adam(net.parameters(), lr=0.01)
all_logits = []
nx_G = G.to_networkx().to_undirected()

def draw(i):
    # Clear only this axes. Calling plt.clf() here would discard the axes
    # that ax refers to, so the title and drawing would land elsewhere.
    color = ['green', 'b', 'r', '#7FFFD4', '#FFC0CB', '#00022e', '#F0F8FF']
    pos = {}
    colors = []
    for v in range(34):
        # use the 2-dimensional logits of epoch i as the node position
        pos[v] = all_logits[i][v].numpy()
        cls = pos[v].argmax()
        colors.append(color[cls])
    ax.cla()
    ax.axis('off')
    ax.set_title('Epoch: %d' % i)
    nx.draw_networkx(nx_G, pos, node_color=colors,
            with_labels=True, node_size=200, ax=ax)

for epoch in range(40):
    logits = net(G, inputs)
    # we save the logits for visualization later
    all_logits.append(logits.detach())
    logp = F.log_softmax(logits, 1)
    # we only compute loss for labeled nodes
    loss = F.nll_loss(logp[labeled_nodes], labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    print('Epoch %d | Loss: %.4f' % (epoch, loss.item()))

fig = plt.figure(dpi=150)
fig.clf()
ax = fig.subplots()
# keep a reference to the animation so it is not garbage-collected
ani = animation.FuncAnimation(fig, draw, frames=len(all_logits), interval=200)
plt.pause(30)  # keep the window open for 30 seconds while the animation plays
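Animation of the semi-supervised learning process:
6. Improvements
Split the network's nodes into 3 classes: instead of the two leaders [0, 33], the three members [0, 2, 33] now split the club. Their labels are [0, 1, 2]. Modify this part of the code: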
net = GCN(34, 8, 3)
inputs = torch.eye(34)
labeled_nodes = torch.tensor([0, 2, 33])  # only nodes 0, 2 and 33 are labeled
labels = torch.tensor([0, 1, 2])  # each labeled node gets its own class
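Reusing the plotting code above as is does not work, because 2D coordinates cannot display 3-dimensional data. There are two ways to plot the classified nodes:
1. Reduce the dimension with PCA and plot in 2D coordinates, at the cost of some information loss;
2. Plot directly in 3D coordinates.
(1) Classification result after PCA
Comment out the earlier plotting code, then plot the final classification directly: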
# fig = plt.figure(dpi=150)
# fig.clf()
# ax = fig.subplots()
# # draw(0) # draw the prediction of the first epoch
# ani = animation.FuncAnimation(fig, draw, frames=len(all_logits), interval=200)
# plt.pause(30)
# # plt.close()
import matplotlib.pyplot as plt  # matplotlib for visualization
from sklearn.decomposition import PCA  # PCA implementation

x, y = [], []
for i in range(34):
    logit = all_logits[-1][i].numpy()  # logits from the last epoch
    x.append(logit)
    y.append(logit.argmax())
    print("{} {}".format(logit, logit.argmax()))

pca = PCA(n_components=2)  # keep 2 principal components
reduced_x = pca.fit_transform(x)  # project the samples to 2D
print(reduced_x)

# Visualization: one color per predicted class
color = ['b', 'r', '#7FFFD4', '#FFC0CB', '#00022e', '#F0F8FF', 'green']
for index, item in enumerate(reduced_x):
    plt.scatter(item[0], item[1], c=color[y[index]])
plt.show()
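To make the "some information loss" of the PCA route more concrete, here is a rough plain-Python sketch of what the projection does: center the data, estimate the covariance matrix, find the top directions by power iteration with deflation, and project onto them. This is not how scikit-learn implements PCA, the function name and the four toy points are assumptions, and the sign of each axis may differ from `PCA.fit_transform`.

```python
import math


def pca_project(rows, n_components=2, iters=200):
    """Project rows onto their top principal axes (power iteration + deflation)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(n_components):
        v = [1.0 / (k + 1) for k in range(d)]  # arbitrary non-zero start vector
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            # Deflation: remove the directions that were already found.
            for comp in components:
                dot = sum(w[a] * comp[a] for a in range(d))
                w = [w[a] - dot * comp[a] for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:      # no variance left in the remaining directions
                v = [0.0] * d
                break
            v = [x / norm for x in w]
        components.append(v)
    return [[sum(c[a] * comp[a] for a in range(d)) for comp in components]
            for c in centered]


# Toy 3D points that actually lie in a plane (third coordinate is constant).
points = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [2.0, 1.0, 0.0]]
reduced = pca_project(points)
for p in reduced:
    print([round(x, 3) for x in p])
```

In this toy case nothing is lost because the dropped direction has zero variance; for real 3-class logits the discarded third component generally carries some variance, which is the information loss mentioned above.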
Output of the last GCN training step and the corresponding class of each node:
(2) 3D plotting code:
# 3D display
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older Matplotlib

x, c = [], []
color = ['b', 'r', '#7FFFD4', '#FFC0CB', '#00022e', '#F0F8FF', 'green']
for i in range(34):
    logit = all_logits[-1][i].numpy()  # logits from the last epoch
    x.append(logit)
    c.append(logit.argmax())
    print("{} {}".format(logit, logit.argmax()))

# split the 3-dimensional logits into the three plot coordinates
x1 = [p[0] for p in x]
x2 = [p[1] for p in x]
x3 = [p[2] for p in x]
d = [color[k] for k in c]  # one color per predicted class

fig1 = plt.figure()  # create a figure
ax = fig1.add_subplot(111, projection='3d')  # add a 3D axes to the figure
ax.scatter(x1, x2, x3, c=d)  # scatter plot of the nodes in logit space
plt.show()  # show all figures
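Versions of all libraries on my machine, for reference (the code above relies on the DGL 0.4-style API, see dgl below):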
absl-py==0.9.0
appdirs==1.4.4
astartool==0.0.5
astor==0.8.1
astroid==2.4.2
astunparse==1.6.3
async-exit-stack==1.0.1
async-generator==1.10
async-service==0.1.0a8
atomicwrites==1.3.0
attrs==19.3.0
autograd==1.3
backcall==0.1.0
baidu-aip==2.2.18.0
baidu-api==0.0.2
base58==1.0.3
beautifulsoup4==4.8.0
bison==0.1.2
bleach==3.1.5
blockchain==1.4.4
blurhash==1.1.4
cachetools==4.1.0
certifi==2019.6.16
cffi==1.12.3
cftime==1.4.1
chardet==3.0.4
Click==7.0
cma==2.7.0
colorama==0.4.3
comtypes==1.1.7
contextvars==2.4
cryptography==2.9.2
cuckoopy==0.1.1
cycler==0.10.0
Cython==0.29.22
cytoolz==0.10.1
d2lzh==1.0.0
dataclasses==0.8
deap==1.3.1
decorator==4.4.2
defusedxml==0.6.0
dgl==0.4.3.post2
dill==0.3.1.1
distlib==0.3.1
Django==2.2.5
docopt==0.6.2
docx==0.2.4
dominate==2.6.0
entrypoints==0.3
enum-compat==0.0.3
et-xmlfile==1.0.1
eth-hash==0.2.0
eth-typing==2.2.1
eth-utils==1.8.4
filelock==3.0.12
Flask==1.1.1
Flask-Bootstrap==3.3.7.1
Flask-Cors==3.0.8
Flask-WTF==0.14.3
flex==6.14.1
future==0.18.2
gast==0.3.3
gcn==0.0.1
geatpy==2.5.1
gitdb==4.0.5
GitPython==3.1.3
gmpy2==2.1.0a1
google-auth==1.14.2
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
graphviz==0.5.1
greenlet==1.0.0
grpcio==1.28.1
gTTS==2.1.1
gTTS-token==1.1.3
h5py==2.10.0
hdfs==2.5.8
html5lib==1.0.1
hypothesis==6.13.11
idna==2.6
immutables==0.14
importlib-metadata==4.0.1
importlib-resources==3.0.0
ipykernel==5.3.0
ipython==7.14.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
isort==5.6.4
itsdangerous==1.1.0
jdcal==1.4.1
jedi==0.17.0
jieba==0.39
Jinja2==2.10.1
jmetalpy==1.5.5
joblib==0.14.0
jsonpointer==2.0
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==6.1.3
jupyter-console==6.1.0
jupyter-core==4.6.3
Keras==2.2.0
Keras-Applications==1.0.2
Keras-Preprocessing==1.0.1
kiwisolver==1.1.0
lazy-object-proxy==1.4.3
libsvm==3.23
llvmlite==0.30.0
lxml==4.4.1
Markdown==3.2.2
MarkupSafe==1.1.1
matplotlib==2.2.3
mccabe==0.6.1
mistune==0.8.4
mlogger==1.0.1a0
mock==4.0.2
more-itertools==8.2.0
multiaddr==0.0.9
multiprocess==0.70.9
mxnet==1.6.0
mypy==0.782
mypy-extensions==0.4.3
mysql==0.0.2
mysql-connector==2.2.9
mysql-replication==0.21
mysqlclient==1.4.4
nbconvert==5.6.1
nbformat==5.0.6
netaddr==0.8.0
netCDF4==1.5.6
networkx==2.1
neupy==0.8.2
nltk==3.5
noiseprotocol==0.3.1
notebook==6.0.3
numba==0.46.0
numpy==1.19.4
oauthlib==3.1.0
opencv-python==4.1.1.26
openpyxl==3.0.3
opt-einsum==3.2.1
outcome==1.0.1
packaging==20.3
paillierlib==0.0.2
pandas==0.25.1
pandocfilters==1.4.2
parso==0.7.1
patsy==0.5.1
pdfminer==20191125
pdfminer3k==1.3.1
perceptron==1.1.0
pexpect==4.8.0
phe==1.4.1.dev0
pickleshare==0.7.5
Pillow==8.0.1
pipenv==2020.11.15
plotly==4.14.3
pluggy==0.13.1
ply==3.11
prettytable==0.7.2
progressbar2==3.34.3
prometheus-client==0.8.0
prompt-toolkit==3.0.8
protobuf==3.11.3
ptyprocess==0.6.0
py==1.8.1
py-ecc==2.0.0
py4j==0.10.7
pyasn1==0.4.8
pyasn1-modules==0.2.8
pybas==0.0.2
pycparser==2.19
pycryptodome==3.9.7
pyDes==2.0.1
PyECCArithmetic==1.1.0
pyecharts==1.6.0
pygal==2.4.0
pygame==1.9.6
Pygments==2.7.3
pygsl==2.3.0
PyHDFS==0.3.1
pylint==2.6.0
pymc3==3.7
pymoo==0.4.1
pymultihash==0.8.2
PyMySQL==0.10.1
pyparsing==2.1.5
pypinyin==0.38.1
pypiwin32==223
pyrsistent==0.16.0
pyspark==2.4.4
pytest==5.4.1
python-dateutil==2.8.1
python-docx==0.8.10
python-utils==2.4.0
pyttsx==1.1
pyttsx3==2.71
pytz==2020.4
pyunit-prime==2020.3.22
PyWavelets==1.0.3
pywin32==227
pywinpty==0.5.7
PyYAML==5.3.1
pyzmq==20.0.0
qtconsole==4.7.4
QtPy==1.9.0
rarfile==3.1
redis==3.3.11
reedsolo==0.3
regex==2020.6.8
requests==2.24.0
requests-oauthlib==1.3.0
retrying==1.3.3
rfc3987==1.3.8
rsa==4.0
scapy==2.4.3
scikit-learn==0.23.2
scipy==1.5.4
secretsharing==0.2.6
Send2Trash==1.5.0
shamir-mnemonic==0.1.0
simplejson==3.16.0
siphash==0.0.1
six==1.15.0
smmap==3.0.4
sniffio==1.1.0
snowland-algorithm==0.0.7
snowland-smx==0.3.1
some-package==0.1
sortedcontainers==2.2.2
soupsieve==1.9.3
source==1.2.0
spark-parser==1.8.9
sqlparse==0.3.0
statsmodels==0.12.2
strict-rfc3339==0.7
style==1.1.0
tensorboard==1.13.1
tensorboard-plugin-wit==1.6.0.post3
tensorboardX==2.0
tensorflow==1.13.2
tensorflow-estimator==1.13.0
termcolor==1.1.0
terminado==0.8.3
testpath==0.4.4
tf==1.0.0
Theano==1.0.4
threadpoolctl==2.1.0
toml==0.10.2
toolz==0.10.0
torch==1.5.0+cu92
torchvision==0.6.0+cu92
tornado==6.1
tp==0.2.0
tqdm==4.36.1
traitlets==4.3.3
trio==0.16.0
trio-typing==0.3.0
typed-ast==1.4.1
typing-extensions==3.10.0.0
uncompyle6==3.7.4
update==0.0.1
urllib3==1.22
validate-email==1.3
varint==1.0.2
virtualenv==20.4.6
visitor==0.1.3
wcwidth==0.2.5
webencodings==0.5.1
Werkzeug==1.0.1
widgetsnbextension==3.5.1
wrapt==1.12.1
WTForms==2.3.3
xdis==5.0.6
xlrd==1.2.0
xlwt==1.3.0
xmltodict==0.12.0
yapf==0.30.0
z3-solver==4.8.5.0
zipp==3.4.1
This code is based on the blog post
https://zhuanlan.zhihu.com/p/93828551 (图神经网络库DGL零基础上手指南_01(半监督学习), a beginner's guide to the DGL graph neural network library, part 1: semi-supervised learning)
A very good blog post
https://blog.csdn.net/liuweiyuxiang/article/details/98957612
The GCN author's blog
http://tkipf.github.io/graph-convolutional-networks/
GCN code on GitHub: https://github.com/tkipf/gcn