Usage of nn.LSTM in NTM

Method Breakdown

nn.LSTM(in_dim, hidden_dim, num_layer, batch_first=True)   # LSTM recurrent neural network
  • in_dim: dimension of the input features;
  • hidden_dim: dimension of the hidden state, i.e. of each output feature vector;
  • num_layers: number of stacked LSTM layers; defaults to 1;
  • batch_first: True or False. By default nn.LSTM() expects input of shape (sequence length, batch_size, in_dim); with batch_first=True the input shape becomes (batch_size, sequence length, in_dim);
  • dropout: if non-zero, applies dropout to the output of every layer except the last; defaults to 0;
  • bidirectional: defaults to False, i.e. a unidirectional LSTM. A bidirectional LSTM reads the sequence once left-to-right and once right-to-left, so the output feature dimension doubles (see the sketch after this list).
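
Below is a minimal sketch, with dimensions chosen only for illustration, of how batch_first and bidirectional change the tensor shapes:

import torch
import torch.nn as nn

# Illustrative dimensions: in_dim=10, hidden_dim=20, one bidirectional layer.
rnn = nn.LSTM(input_size=10, hidden_size=20, num_layers=1,
              batch_first=True, bidirectional=True)

x = torch.randn(3, 5, 10)      # (batch_size, sequence_length, in_dim)
output, (h_n, c_n) = rnn(x)    # initial states omitted -> default to zeros

print(output.shape)  # torch.Size([3, 5, 40]): feature dim doubles to 2 * hidden_dim
print(h_n.shape)     # torch.Size([2, 3, 20]): num_layers * num_directions = 2

Note that batch_first only reorders the input and output tensors; h_n and c_n keep the (num_layers * num_directions, batch_size, hidden_dim) layout either way.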

Example

import torch
import torch.nn as nn

in_dim = 10
hidden_dim = 20
sequence_length = 5
batch_size = 3
num_layer = 2

rnn = nn.LSTM(input_size=in_dim, hidden_size=hidden_dim, num_layers=num_layer)
inputs = torch.randn(sequence_length, batch_size, in_dim)
# h_0 and c_0 have the same shape; c_0 holds the initial cell state for each
# sequence in the batch. If (h_0, c_0) is not passed in, both default to zeros.
h_0 = torch.randn(num_layer, batch_size, hidden_dim)
c_0 = torch.randn(num_layer, batch_size, hidden_dim)

# Call signature: rnn(input, (h_0, c_0)); returns (output, (h_n, c_n))
output, (h_n, c_n) = rnn(inputs, (h_0, c_0))
print("out:", output.shape)
print("h_n:", h_n.shape)
print("c_n:", c_n.shape)
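
With these settings the script prints out: torch.Size([5, 3, 20]), h_n: torch.Size([2, 3, 20]), and c_n: torch.Size([2, 3, 20]). As noted in the comment above, the initial states are optional; the equivalent call with the default zero states is:

output, (h_n, c_n) = rnn(inputs)   # h_0 and c_0 default to zero tensors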