[Paper reading notes] 2021_WWW_Self-Supervised Multi-Channel Hypergraph Convolutional Network for Social Recommendation

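Paper download: https://doi.org/10.1145/3442381.3449844
Venue: WWW 2021, IW3C2 (International World Wide Web Conference Committee)
Publication year: 2021
Authors and affiliations:

Datasets:

Code:

Articles by others:

Brief summary of the innovations (there are many implementation details worth reading):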

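  • (1) The paper emphasizes high-order relations throughout.
  • (2) The term "motif" is hard to translate.
  • (3) It approaches the problem from the hypergraph perspective.
    • Many details are designed so that the pieces of the framework stitch together successfully.
  • (4) Self-supervised learning.
  • (5) It is very likely adapted from DHCF (a recent hypergraph convolutional network method that models the high-order correlations among users and items for general recommendation): it carries DHCF over to social recommendation and adds its own innovations.
    • DHCF → MHCN
    • Both use hypergraph convolution.
    • Both focus on high-order correlations.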
    • Shuyi Ji, Yifan Feng, Rongrong Ji, Xibin Zhao, Wanwan Tang, and Yue Gao. 2020. Dual Channel Hypergraph Collaborative Filtering. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020–2029.

Abstract

  • (1) Most existing social recommendation models exploit pairwise relations to mine potential user preferences. However, real-life interactions among users are very complicated and user relations can be high-order.
  • (2) Hypergraph provides a natural way to model complex high-order relations, while its potentials for improving social recommendation are under-explored.
  • (3) In this paper, we fill this gap and propose a multi-channel hypergraph convolutional network to enhance social recommendation by leveraging high-order user relations.
  • (4) Technically, each channel in the network encodes a hypergraph that depicts a common high-order user relation pattern via hypergraph convolution.
  • (5) By aggregating the embeddings learned through multiple channels, we obtain comprehensive user representations to generate recommendation results.
  • (6) However, the aggregation operation might also obscure the inherent characteristics of different types of high-order connectivity information. (Note: this is the difficulty at the operational level.)
  • (7) To compensate for the aggregating loss, we innovatively integrate self-supervised learning into the training of the hypergraph convolutional network to regain the connectivity information with hierarchical mutual information maximization.

CCS Concepts

Information systems→ Recommender systems; Social recommendation.

Keywords

  • Social Recommendation,
  • Self-supervised Learning,
  • Hypergraph Learning,
  • Graph Convolutional Network,
  • Recommender System

1 Introduction

  • (1) It has been revealed that people may alter their attitudes and behaviors in response to what they perceive their friends might do or think, which is known as the social influence.

  • (2) Meanwhile, there are also studies [25] showing that people tend to build connections with others who have similar preferences with them, which is called the homophily.

  • (3) However, a key limitation of these GNNs-based social recommendation models is that they only exploit the simple pairwise user relations and ignore the ubiquitous high-order relations among users.

  • (4) Although the long-range dependencies of relations (i.e. transitivity of friendship), which are also considered high-order, can be captured by using $k$ graph neural layers to incorporate features from $k$-hop social neighbors, these GNNs-based models are unable to formulate and capture the complex high-order user relation patterns (as shown in Fig. 1) beyond pairwise relations.
    [Image: Fig. 1]

  • (5) For example, it is natural to think that two users who are socially connected and also purchased the same item have a stronger relationship than those who are only socially connected, whereas the common purchase information in the former is often neglected in previous social recommendation models. (Note: this explains where prior work falls short; it is inadequate in exactly this situation.)

  • (6) Hypergraph [4], which generalizes the concept of edge to make it connect more than two nodes, provides a natural way to model complex high-order relations among users. (Note: this is why hypergraphs are a suitable tool.)

  • (7) In this paper, we fill this gap by investigating the potentials of fusing hypergraph modeling and graph convolutional networks, and propose a Multi-channel Hypergraph Convolutional Network (MHCN) to enhance social recommendation by exploiting high-order user relations.

  • (8) Technically, we construct hypergraphs by unifying nodes that form specific triangular relations, which are instances of a set of carefully designed triangular motifs with underlying semantics (shown in Fig. 2).
    [Image: Fig. 2]

  • (9) As we define multiple categories of motifs which concretize different types of high-order relations such as ‘having a mutual friend’, ‘friends purchasing the same item’, and ‘strangers but purchasing the same item’ in social recommender systems, each channel of the proposed hypergraph convolutional network undertakes the task of encoding a different motif-induced hypergraph. (Note: this gives the real-world examples that the triangles correspond to.)

  • (10) By aggregating multiple user embeddings learned through multiple channels, we can obtain the comprehensive user representations which are considered to contain multiple types of high-order relation information and have the great potentials to generate better recommendation results with the item embeddings.

  • (11) However, despite the benefits of the multi-channel setting, the aggregation operation might also obscure the inherent characteristics of different types of high-order connectivity information [54], as different channels would learn embeddings with varying distributions on different hypergraphs. (Note: this is the difficulty that aggregation faces, and it motivates the solution.)

  • (12) To address this issue and fully inherit the rich information in the hypergraphs, we innovatively integrate a self-supervised task [15, 37] into the training of the multi-channel hypergraph convolutional network.

  • (13) Unlike existing studies which enforce perturbations on graphs to augment the ground-truth [53], we propose to construct self-supervision signals by exploiting the hypergraph structures, with the intuition that the comprehensive user representation should reflect the user node’s local and global high-order connectivity patterns in different hypergraphs.

    • Concretely, we leverage the hierarchy in the hypergraph structures and hierarchically maximize the mutual information between representations of the user, the user-centered sub-hypergraph, and the global hypergraph.
    • The mutual information here measures the structural informativeness of the sub- and the whole hypergraph towards inferring the user features through the reduction in local and global structure uncertainty.
    • Finally, we unify the recommendation task and the self-supervised task under a primary & auxiliary learning framework. By jointly optimizing the two tasks and leveraging the interplay of all the components, the performance of the recommendation task achieves significant gains.
  • (14) The major contributions of this paper are summarized as follows:

    • We investigate the potentials of fusing hypergraph modeling and graph neural networks in social recommendation by exploiting multiple types of high-order user relations under a multi-channel setting.
    • We innovatively integrate self-supervised learning into the training of the hypergraph convolutional network and show that a self-supervised auxiliary task can significantly improve the social recommendation task.
    • We conduct extensive experiments on multiple real-world datasets to demonstrate the superiority of the proposed model and thoroughly ablate the model to investigate the effectiveness of each component. (Note: an ablation study is included.)
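
To make the relations named in item (9) more tangible, here is a minimal sketch that separates "friends purchasing the same item" from "strangers but purchasing the same item" on toy data. It is only an illustration under simplifying assumptions: the function name `classify_user_pairs`, the user names, and the items are all hypothetical, social links are treated as undirected, and only user pairs are considered. The paper's actual motifs are triangular patterns, which this sketch does not reproduce.

```python
from itertools import combinations


def classify_user_pairs(follows, purchases):
    """Split user pairs that share an item into friends vs. strangers.

    follows:   set of directed (u, v) social links (direction is ignored here)
    purchases: dict mapping each user to the set of items they bought
    """
    users = sorted(purchases)
    # Simplification for this sketch: drop edge direction.
    linked = {frozenset(edge) for edge in follows}

    friends_same_item, strangers_same_item = [], []
    for u, v in combinations(users, 2):
        if not (purchases[u] & purchases[v]):
            continue  # no common purchase, so neither relation applies
        if frozenset((u, v)) in linked:
            friends_same_item.append((u, v))
        else:
            strangers_same_item.append((u, v))
    return friends_same_item, strangers_same_item


# Hypothetical toy data.
follows = {("alice", "bob"), ("bob", "carol")}
purchases = {
    "alice": {"book", "lamp"},
    "bob": {"book"},
    "carol": {"lamp"},
}

friends, strangers = classify_user_pairs(follows, purchases)
print(friends)    # [('alice', 'bob')]
print(strangers)  # [('alice', 'carol')]
```

A plain pairwise social graph would treat the alice-bob link the same as the bob-carol link, even though only the former is backed by a common purchase. That difference is the kind of information item (5) says earlier models tend to ignore.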

2 Related Work

2.1 Social Recommendation

  • (1) Early exploration of social recommender systems mostly focuses on matrix factorization (MF), which has a nice probabilistic interpretation with Gaussian prior and is the most used technique in social recommendation regime.

  • (2) The common ideas of MF-based social recommendation algorithms can be categorized into three groups:

    • co-factorization methods [22, 46],
    • ensemble methods [20],
    • and regularization methods [23].

    Besides, there are also studies using socially-aware MF to model
    • point-of-interest [48, 51, 52],
    • preference evolution [39],
    • item ranking [55, 61],
    • and relation generation [11, 57].

  • (3) Many research efforts demonstrate that deep neural models are more capable of capturing high-level latent preferences [49, 50].

    • Specifically, graph neural networks (GNNs) [63] have achieved great success in this area, owing to their strong capability to model graph data.
    • GraphRec [9] is the first to introduce GNNs to social recommendation by modeling the user-item and user-user interactions as graph data.
    • DiffNet [41] and its extension DiffNet++ [40] model the recursive dynamic social diffusion in social recommendation with a layer-wise propagation structure.
    • Wu et al. [42] propose a dual graph attention network to collaboratively learn representations for two-fold social effects.
    • Song et al. develop DGRec [34] to model both users’ session-based interests as well as dynamic social influences.
    • Yu et al. [58] propose a deep adversarial framework based on GCNs to address the common issues in social recommendation.
    • In summary, the common idea of these works is to model the user-user and user-item interactions as simple graphs with pairwise connections and then use multiple graph neural layers to capture the node dependencies.

2.2 Hypergraph in Recommender Systems

  • (1) Hypergraph [4] provides a natural way to model complex high-order relations and has been extensively employed to tackle various problems. With the development of deep learning, some studies combine GNNs and hypergraphs to enhance representation learning.

    • HGNN [10] is the first work that designs a hyperedge convolution operation to handle complex data correlation in representation learning from a spectral perspective.
    • Bai et al. [2] introduce hypergraph attention to hypergraph convolutional networks to improve their capacity. However, despite the great capacity in modeling complex data, the potentials of hypergraph for improving recommender systems have been rarely explored.
  • (2) Only a few studies focus on the combination of these two topics.

    • Bu et al. [5] introduce hypergraph learning to music recommender systems, which is the earliest attempt.
    • The most recent combinations are HyperRec [38] and DHCF [16], which borrow the strengths of hypergraph neural networks to model the short-term user preference for next-item recommendation and the high-order correlations among users and items for general collaborative filtering, respectively.
    • As for the applications in social recommendation, HMF [62] uses hypergraph topology to describe and analyze the interior relation of social network in recommender systems, but it does not fully exploit high-order social relations since HMF is a hybrid recommendation model.
    • LBSN2Vec [47] is a social-aware POI recommendation model that builds hyperedges by jointly sampling friendships and check-ins with random walk, but it focuses on connecting different types of entities instead of exploiting the high-order social network structures.

2.3 Self-Supervised Learning

  • (1) Self-supervised learning [15] is an emerging paradigm to learn with the ground-truth samples obtained from the raw data.

  • (2) It was first used in the image domain [1, 59] by rotating, cropping and colorizing the image to create auxiliary supervision signals.

    • The latest advances in this area extend self-supervised learning to graph representation learning [28, 29, 35, 37]. These studies mainly develop self-supervision tasks from the perspective of investigating graph structure.
    • Node properties such as degree, proximity, and attributes, which are seen as local structure information, are often used as the ground truth to fully exploit the unlabeled data [17].
      • For example, InfoMotif [31] models attribute correlations in motif structures with mutual information maximization to regularize graph neural networks.
      • Meanwhile, global structure information like node pair distance is also harnessed to facilitate representation learning [35].
  • (3) Besides, contrasting congruent and incongruent views of graphs with mutual information maximization [29, 37] is another way to set up a self-supervised task, which has also shown promising results.

  • (4) As the research of self-supervised learning is still in its infancy, there are only several works combining it with recommender systems [24, 44, 45, 64].

    • These efforts either mine self-supervision signals from future/surrounding sequential data [24, 45], or mask attributes of items/users to learn correlations of the raw data [64].
    • However, these thoughts cannot be easily adopted in social recommendation, where temporal factors and attributes may not be available.
    • The most relevant work to ours is GroupIM [32], which maximizes mutual information between representations of groups and group members to overcome the sparsity problem of group interactions.
  • (5) As the group can be seen as a special social clique, this work can be a corroboration of the effectiveness of social self-supervision signals.

3 Proposed Model

3.1 Preliminaries

  • (1) Notation
    • Let $U = \{u_1, u_2, ..., u_m\}$ denote the user set ($|U| = m$)
    • and $I = \{i_1, i_2, ..., i_n\}$ denote the item set ($|I| = n$).
    • $I(u)$ is the set of items consumed by user $u$.
    • $R \in \mathbb{R}^{m \times n}$ is a binary matrix that stores user-item interactions.
      • For each pair $(u, i)$, $r_{ui} = 1$ indicates that user $u$ consumed item $i$,
      • while $r_{ui} = 0$ means that item $i$ is unexposed to user $u$, or user $u$ is not interested in item $i$.
  • (2) In this paper, we focus on top-$K$ recommendation, and $\hat{r}_{ui}$ denotes the probability of item $i$ to be recommended to user $u$.
  • (3) As for the social relations, we use $S \in \mathbb{R}^{m \times m}$ to denote the relation matrix, which is asymmetric because we work on directed social networks.
  • (4) In our model, we have multiple convolutional layers, and we use $\{P^{(1)}, P^{(2)}, ..., P^{(l)}\} \in \mathbb{R}^{m \times d}$ and $\{Q^{(1)}, Q^{(2)}, ..., Q^{(l)}\} \in \mathbb{R}^{n \times d}$ to denote the user and item embeddings of size $d$ learned at each layer, respectively.
  • (5) In this paper, we use bold capital letters to denote matrices and bold lowercase letters to denote vectors.
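
The notation above can be made concrete with a small sketch. The toy data and the helper names `consumed_items` and `is_symmetric` are hypothetical and chosen only for illustration. Plain Python lists stand in for the matrices so that the example has no dependencies.

```python
# Hypothetical toy data: m = 3 users and n = 4 items.
m, n = 3, 4
interactions = [(0, 0), (0, 2), (1, 2), (2, 3)]  # (user, item) pairs
follows = [(0, 1), (1, 2)]                       # directed links: u follows v

# R: binary m x n user-item matrix, r_ui = 1 if user u consumed item i.
R = [[0] * n for _ in range(m)]
for u, i in interactions:
    R[u][i] = 1

# S: m x m social relation matrix; links are directed, so S need not be symmetric.
S = [[0] * m for _ in range(m)]
for u, v in follows:
    S[u][v] = 1


def consumed_items(R, u):
    """I(u): the set of items consumed by user u, read off row u of R."""
    return {i for i, r_ui in enumerate(R[u]) if r_ui == 1}


def is_symmetric(M):
    """True if M equals its transpose."""
    size = len(M)
    return all(M[a][b] == M[b][a] for a in range(size) for b in range(size))


print(consumed_items(R, 0))  # {0, 2}
print(is_symmetric(S))       # False
```

In this toy data user 0 follows user 1 but not the other way round, so $S_{01} = 1$ while $S_{10} = 0$. That is the asymmetry item (3) refers to.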

Definition 1

  • (1) Let $G = (V, E)$ denote a hypergraph, where $V$ is the vertex set containing $N$ unique vertices and $E$ is the edge set containing $M$ hyperedges.
  • (2) Each hyperedge $\varepsilon \in E$ can contain any number of vertices and is assigned a positive weight $W_{\varepsilon\varepsilon}$, and all the weights formulate a diagonal matrix $W \in \mathbb{R}^{M \times M}$.
  • The hypergraph can be represented by an incidence matrix $H \in \mathbb{R}^{N \times M}$, where $H_{i\varepsilon} = 1$ if the hyperedge $\varepsilon \in E$ contains the vertex $v_i \in V$, and $H_{i\varepsilon} = 0$ otherwise.
  • The vertex and edge degree matrices are diagonal matrices denoted by $D$ and $L$, respectively,
    • where $D_{ii} = \sum_{\varepsilon=1}^{M} W_{\varepsilon\varepsilon} H_{i\varepsilon}$ and $L_{\varepsilon\varepsilon} = \sum_{i=1}^{N} H_{i\varepsilon}$.
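
As a quick check of Definition 1, the following sketch computes the diagonals of $D$ and $L$ for a tiny hypergraph. The function name `hypergraph_degrees` and the toy incidence matrix are assumptions made for this illustration. The hyperedge weights are passed as a list holding the diagonal of $W$.

```python
def hypergraph_degrees(H, w):
    """Return the diagonals of D (vertex degrees) and L (edge degrees).

    H: N x M incidence matrix as a list of rows, H[i][e] = 1 if hyperedge e
       contains vertex i, else 0
    w: list of M positive hyperedge weights (the diagonal of W)
    """
    n_vertices = len(H)
    n_edges = len(w)
    # D_ii = sum over hyperedges e of W_ee * H_ie
    vertex_deg = [
        sum(w[e] * H[i][e] for e in range(n_edges)) for i in range(n_vertices)
    ]
    # L_ee = sum over vertices i of H_ie
    edge_deg = [
        sum(H[i][e] for i in range(n_vertices)) for e in range(n_edges)
    ]
    return vertex_deg, edge_deg


# Toy hypergraph with 4 vertices and 2 hyperedges:
# hyperedge 0 = {v0, v1, v2}, hyperedge 1 = {v2, v3}.
H = [
    [1, 0],
    [1, 0],
    [1, 1],
    [0, 1],
]
w = [1.0, 2.0]

vertex_deg, edge_deg = hypergraph_degrees(H, w)
print(vertex_deg)  # [1.0, 1.0, 3.0, 2.0]
print(edge_deg)    # [3, 2]
```

Vertex v2 belongs to both hyperedges, so its degree is the sum of both weights. Hyperedge 0 connects three vertices at once, which an ordinary graph edge cannot do.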