Graph Embedding Techniques

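Graph embedding maps every vertex of a graph to a vector, such that the closer two vertices are in the graph (the more tightly they are connected, or the more edges they share), the closer their vectors are in the embedding space.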
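
To make "closer in the embedding space" concrete, here is a minimal sketch that compares vertex vectors by cosine similarity. The three 2-D vectors are hand-picked for illustration only and are not the output of any trained model; `cosine_similarity` is a helper defined here, not part of the original post.

```python
import math

# Hypothetical 2-D embeddings, hand-picked for illustration (not learned).
embeddings = {
    'A': (1.0, 0.9),
    'B': (0.9, 1.0),    # assumed to be tightly connected to A
    'G': (-1.0, 0.2),   # assumed to be far from A in the graph
}

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; values near 1 mean similar direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sim_ab = cosine_similarity(embeddings['A'], embeddings['B'])
sim_ag = cosine_similarity(embeddings['A'], embeddings['G'])

# A good embedding should score the tightly connected pair higher.
print(sim_ab > sim_ag)
```

Cosine similarity is only one possible choice; Euclidean distance or a dot product would serve the same illustrative purpose.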

DeepWalk

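DeepWalk is built on random walks: start from any vertex, randomly pick one of the vertices it shares an edge with as the next vertex, and repeat. Each such walk yields a vertex sequence. Once enough sequences have been sampled, the more tightly two vertices are connected, the more often they co-occur in the same sequence. The sequences are then treated like sentences of words, which is why the code below calls one walk a sentence and the full set of walks a corpus.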

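import numpy as np

def deepwalk(graph, num, deep, start_node):
    '''
    Generate vertex sequences by random walk.
    Returns num walks, each starting at start_node and taking deep steps.
    '''
    corpus = []
    for i in range(num):
        sentence = [start_node]
        current_node = start_node  # every walk restarts from the start vertex
        count = 0
        while count < deep:
            count += 1
            node_list = []
            weight_list = []
            for node, weight in graph[current_node].items():
                node_list.append(node)
                weight_list.append(weight)
            if node_list == []:
                # Dead end: no outgoing edges, so stay on the current vertex.
                sel_node = current_node
            else:
                # Pick the next vertex with probability proportional to edge weight.
                total = float(sum(weight_list))
                ps = [weight / total for weight in weight_list]
                sel_node = str(np.random.choice(node_list, p=ps))
            sentence.append(sel_node)
            current_node = sel_node  # advance the walk
        corpus.append(sentence)
    return corpus

if __name__ == '__main__':
    nodes = ['A', 'B', 'C', 'D', 'E', 'F', 'G']

    # Weighted directed graph: Graph[u][v] is the weight of edge u -> v.
    Graph = {'A': {'B': 1, 'C': 2, 'D': 1},
             'B': {'E': 2},
             'C': {'D': 1, 'F': 1},
             'D': {'B': 2, 'E': 1, 'G': 2},
             'E': {},
             'F': {'D': 1, 'G': 1},
             'G': {'E': 1}}

    num = 10   # number of walks generated from each vertex
    deep = 20  # number of steps in each walk

    corpus = []
    for node in nodes:
        # Collect the walks from every start vertex instead of overwriting them.
        corpus.extend(deepwalk(Graph, num, deep, node))
    # print(corpus)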
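
The claim that tightly connected vertices co-occur more often can be checked by counting vertex pairs that fall within a small window of each other in the sampled walks. The sketch below is an illustration under stated assumptions, not part of the original listing: `random_walk` and `cooccurrence_counts` are hypothetical helpers, the window size of 2 is arbitrary, and the walk stops at a dead end instead of repeating the last vertex. It uses only the standard library and the same toy graph.

```python
import random
from collections import Counter

# Same toy weighted directed graph as in the listing above.
graph = {'A': {'B': 1, 'C': 2, 'D': 1},
         'B': {'E': 2},
         'C': {'D': 1, 'F': 1},
         'D': {'B': 2, 'E': 1, 'G': 2},
         'E': {},
         'F': {'D': 1, 'G': 1},
         'G': {'E': 1}}

def random_walk(graph, start, depth, rng):
    """One weighted random walk of at most `depth` steps; stops at a dead end."""
    walk = [start]
    node = start
    for _ in range(depth):
        neighbors = graph[node]
        if not neighbors:
            break
        # Choose the next vertex with probability proportional to edge weight.
        node = rng.choices(list(neighbors), weights=list(neighbors.values()))[0]
        walk.append(node)
    return walk

def cooccurrence_counts(walks, window):
    """Count unordered vertex pairs at most `window` positions apart in a walk."""
    counts = Counter()
    for walk in walks:
        for i, u in enumerate(walk):
            for v in walk[i + 1:i + 1 + window]:
                if u != v:
                    counts[frozenset((u, v))] += 1
    return counts

rng = random.Random(0)  # fixed seed so repeated runs sample the same walks
walks = [random_walk(graph, node, 20, rng) for node in graph for _ in range(10)]
counts = cooccurrence_counts(walks, window=2)

# Show the most frequently co-occurring pairs.
for pair, n in counts.most_common(5):
    print(sorted(pair), n)
```

Counts like these are the raw signal an embedding model learns from: pairs with high counts should end up with nearby vectors.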