
PyTorch Tutorial 16.2: Sentiment Analysis: Using Recurrent Neural Networks

Points required: 0 | Format: pdf | Size: 0.20 MB | 2023-06-05

王秀珍


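As with word similarity and analogy tasks, we can also apply pretrained word vectors to sentiment analysis. Since the IMDb review dataset in Section 16.1 is not very large, using text representations that were pretrained on large-scale corpora may reduce overfitting of the model. As a specific example, illustrated in Fig. 16.2.1, we will represent each token using the pretrained GloVe model and feed these token representations into a multilayer bidirectional RNN to obtain the text sequence representation, which is then transformed into sentiment analysis outputs (Maas et al., 2011). For the same downstream application, we will consider a different architectural choice later.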

https://file.elecfans.com/web2/M00/A9/CD/poYBAGR9PJyAB1TZAAKGTdnYvUk151.svg

Fig. 16.2.1 This section feeds pretrained GloVe to an RNN-based architecture for sentiment analysis.

# PyTorch implementation
import torch
from torch import nn
from d2l import torch as d2l

batch_size = 64
train_iter, test_iter, vocab = d2l.load_data_imdb(batch_size)
# MXNet implementation
from mxnet import gluon, init, np, npx
from mxnet.gluon import nn, rnn
from d2l import mxnet as d2l

npx.set_np()

batch_size = 64
train_iter, test_iter, vocab = d2l.load_data_imdb(batch_size)

16.2.1. Representing a Single Text with RNNs

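In text classification tasks such as sentiment analysis, a variable-length text sequence is transformed into a fixed-length category output. In the BiRNN class below, each token of a text sequence gets its individual pretrained GloVe representation via the embedding layer (self.embedding), while the entire sequence is encoded by a bidirectional RNN (self.encoder). More concretely, the hidden states (at the last layer) of the bidirectional LSTM at the initial and final time steps are concatenated as the representation of the text sequence. This single text representation is then transformed into output categories by a fully connected layer (self.decoder) with two outputs ("positive" and "negative").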

# PyTorch implementation
class BiRNN(nn.Module):
  def __init__(self, vocab_size, embed_size, num_hiddens,
         num_layers, **kwargs):
    super(BiRNN, self).__init__(**kwargs)
    self.embedding = nn.Embedding(vocab_size, embed_size)
    # Set `bidirectional` to True to get a bidirectional RNN
    self.encoder = nn.LSTM(embed_size, num_hiddens, num_layers=num_layers,
                bidirectional=True)
    self.decoder = nn.Linear(4 * num_hiddens, 2)

  def forward(self, inputs):
    # The shape of `inputs` is (batch size, no. of time steps). Because
    # LSTM requires its input's first dimension to be the temporal
    # dimension, the input is transposed before obtaining token
    # representations. The output shape is (no. of time steps, batch size,
    # word vector dimension)
    embeddings = self.embedding(inputs.T)
    self.encoder.flatten_parameters()
    # Returns hidden states of the last hidden layer at different time
    # steps. The shape of `outputs` is (no. of time steps, batch size,
    # 2 * no. of hidden units)
    outputs, _ = self.encoder(embeddings)
    # Concatenate the hidden states at the initial and final time steps as
    # the input of the fully connected layer. Its shape is (batch size,
    # 4 * no. of hidden units)
    encoding = torch.cat((outputs[0], outputs[-1]), dim=1)
    outs = self.decoder(encoding)
    return outs
# MXNet implementation
class BiRNN(nn.Block):
  def __init__(self, vocab_size, embed_size, num_hiddens,
         num_layers, **kwargs):
    super(BiRNN, self).__init__(**kwargs)
    self.embedding = nn.Embedding(vocab_size, embed_size)
    # Set `bidirectional` to True to get a bidirectional RNN
    self.encoder = rnn.LSTM(num_hiddens, num_layers=num_layers,
                bidirectional=True, input_size=embed_size)
    self.decoder = nn.Dense(2)

  def forward(self, inputs):
    # The shape of `inputs` is (batch size, no. of time steps). Because
    # LSTM requires its input's first dimension to be the temporal
    # dimension, the input is transposed before obtaining token
    # representations. The output shape is (no. of time steps, batch size,
    # word vector dimension)
    embeddings = self.embedding(inputs.T)
    # Returns hidden states of the last hidden layer at different time
    # steps. The shape of `outputs` is (no. of time steps, batch size,
    # 2 * no. of hidden units)
    outputs = self.encoder(embeddings)
    # Concatenate the hidden states at the initial and final time steps as
    # the input of the fully connected layer. Its shape is (batch size,
    # 4 * no. of hidden units)
    encoding = np.concatenate((outputs[0], outputs[-1]), axis=1)
    outs = self.decoder(encoding)
    return outs
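
The shape comments in the forward pass can be checked directly. The following is a minimal sketch in PyTorch, assuming small illustrative sizes (not the ones used in this section) and a random tensor in place of the GloVe embeddings. It also checks which states end up in the concatenation: the final forward state sits at the last time step of `outputs`, and the final backward state sits at the first time step.

```python
import torch
from torch import nn

# Illustrative sizes only (assumptions, not the values used in this section)
batch_size, num_steps, embed_size, num_hiddens, num_layers = 4, 7, 10, 8, 2

lstm = nn.LSTM(embed_size, num_hiddens, num_layers=num_layers,
               bidirectional=True)

# Time-major input, as in BiRNN.forward: (num_steps, batch_size, embed_size)
X = torch.randn(num_steps, batch_size, embed_size)
outputs, (h_n, c_n) = lstm(X)

# Last-layer states of both directions at every time step
assert outputs.shape == (num_steps, batch_size, 2 * num_hiddens)

# Concatenating the first and last time steps gives 4 * num_hiddens features
encoding = torch.cat((outputs[0], outputs[-1]), dim=1)
assert encoding.shape == (batch_size, 4 * num_hiddens)

# h_n[-2] is the last layer's final forward state, h_n[-1] its final
# backward state; both are contained in the concatenated encoding
forward_final = outputs[-1][:, :num_hiddens]
backward_final = outputs[0][:, num_hiddens:]
forward_matches = torch.allclose(h_n[-2], forward_final, atol=1e-6)
backward_matches = torch.allclose(h_n[-1], backward_final, atol=1e-6)
print(encoding.shape, forward_matches, backward_matches)
```

This is why the fully connected layer in the PyTorch class takes `4 * num_hiddens` inputs: two directions at each of two time steps.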

Let's construct a bidirectional RNN with two hidden layers to represent a single text for sentiment analysis.

# PyTorch implementation
embed_size, num_hiddens, num_layers, devices = 100, 100, 2, d2l.try_all_gpus()
net = BiRNN(len(vocab), embed_size, num_hiddens, num_layers)

def init_weights(module):
  if type(module) == nn.Linear:
    nn.init.xavier_uniform_(module.weight)
  if type(module) == nn.LSTM:
    for param in module._flat_weights_names:
      if "weight" in param:
        nn.init.xavier_uniform_(module._parameters[param])
net.apply(init_weights);
# MXNet implementation
embed_size, num_hiddens, num_layers, devices = 100, 100, 2, d2l.try_all_gpus()
net = BiRNN(len(vocab), embed_size, num_hiddens, num_layers)

net.initialize(init.Xavier(), ctx=devices)
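
The PyTorch initializer above reaches into private attributes of the LSTM module. As a hedged alternative, the sketch below performs the same Xavier initialization through the public `named_parameters()` method and checks that the resulting weights fall within the Xavier uniform bound sqrt(6 / (fan_in + fan_out)). The layer sizes are illustrative assumptions.

```python
import math
import torch
from torch import nn

def init_weights(module):
    # Same intent as the initializer above, using only public APIs
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
    if isinstance(module, nn.LSTM):
        for name, param in module.named_parameters():
            if "weight" in name:  # skip the bias vectors
                nn.init.xavier_uniform_(param)

# Illustrative sizes (assumptions, not the values used in this section)
lstm = nn.LSTM(10, 8, num_layers=2, bidirectional=True)
lstm.apply(init_weights)

# Input-to-hidden weights of layer 0: shape (4 * 8, 10) for the four gates
w = lstm.weight_ih_l0
fan_out, fan_in = w.shape
bound = math.sqrt(6.0 / (fan_in + fan_out))
max_abs = w.abs().max().item()
print(tuple(w.shape), max_abs <= bound + 1e-6)
```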

16.2.2. Loading Pretrained Word Vectors

Below we load the pretrained 100-dimensional GloVe embeddings (the dimension needs to be consistent with embed_size) for the tokens in the vocabulary.

# PyTorch implementation
glove_embedding = d2l.TokenEmbedding('glove.6b.100d')
Downloading ../data/glove.6B.100d.zip from http://d2l-data.s3-accelerate.amazonaws.com/glove.6B.100d.zip...
# MXNet implementation
glove_embedding = d2l.TokenEmbedding('glove.6b.100d')

Print the shape of the vectors for all the tokens in the vocabulary.

# PyTorch implementation
embeds = glove_embedding[vocab.idx_to_token]
embeds.shape
torch.Size([49346, 100])
# MXNet implementation
embeds = glove_embedding[vocab.idx_to_token]
embeds.shape
(49346, 100)
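
To make the lookup concrete, here is a minimal sketch of indexing a pretrained table with a list of vocabulary tokens. The tiny 3-dimensional table and the `lookup` helper are hypothetical, and mapping unknown tokens to a zero vector at index 0 is an assumption about how such a lookup might behave, not a statement about `d2l.TokenEmbedding`.

```python
import torch

# Hypothetical 3-dimensional "pretrained" table; index 0 is reserved for
# unknown tokens and holds a zero vector (an assumption for this sketch)
pretrained_tokens = ['<unk>', 'good', 'bad', 'movie']
pretrained_vecs = torch.tensor([[0.0, 0.0, 0.0],
                                [0.1, 0.2, 0.3],
                                [-0.1, -0.2, -0.3],
                                [0.5, 0.0, -0.5]])
token_to_idx = {tok: i for i, tok in enumerate(pretrained_tokens)}

def lookup(tokens):
    """Return one row per token; tokens missing from the table get row 0."""
    indices = [token_to_idx.get(tok, 0) for tok in tokens]
    return pretrained_vecs[torch.tensor(indices)]

# The dataset vocabulary may contain tokens that the table does not cover
vocab_tokens = ['<unk>', '<pad>', 'movie', 'good', 'awful']
embeds = lookup(vocab_tokens)
print(embeds.shape)
```

The result has one row per vocabulary token and one column per embedding dimension, which is the layout the embedding layer's weight matrix expects.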

We use these pretrained word vectors to represent tokens in the reviews and will not update these vectors during training.

# PyTorch implementation
net.embedding.weight.data.copy_(embeds)
net.embedding.weight.requires_grad = False
# MXNet implementation
net.embedding.weight.set_data(embeds)
net.embedding.collect_params().setattr('grad_req', 'null')
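
The effect of freezing can be verified on a small example. The sketch below is in PyTorch and uses a random matrix as a stand-in for the GloVe vectors and a hypothetical linear layer as the rest of the model. After a backward pass, the frozen embedding has no gradient while the trainable layer does.

```python
import torch
from torch import nn

# Random stand-in for pretrained vectors: 5 tokens, 3 dimensions (assumption)
pretrained = torch.randn(5, 3)

embedding = nn.Embedding(5, 3)
embedding.weight.data.copy_(pretrained)   # load the pretrained values
embedding.weight.requires_grad = False    # exclude them from training

head = nn.Linear(3, 2)                    # hypothetical trainable layer
tokens = torch.tensor([[0, 1, 2]])        # (batch size, no. of time steps)

out = head(embedding(tokens)).sum()
out.backward()

embedding_grad = embedding.weight.grad    # None: no gradient was computed
head_grad = head.weight.grad              # a tensor: this layer still trains
print(embedding_grad is None, head_grad is not None)
```

Parameters without gradients are skipped by the optimizer step, so the embedding keeps its pretrained values throughout training.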

16.2.3. Training and Evaluating the Model

Now we can train the bidirectional RNN for sentiment analysis.

# PyTorch implementation
lr, num_epochs = 0.01, 5
trainer = torch.optim.Adam(net.parameters(), lr=lr)
loss = nn.CrossEntropyLoss(reduction="none")
d2l.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, devices)
loss 0.311, train acc 0.872, test acc 0.850
574.5 examples/sec on [device(type='cuda', index=0), device(type='cuda', index=1)]
https://file.elecfans.com/web2/M00/A9/CD/poYBAGR9PJ6AJIk8AAECA4Wy71Y322.svg
# MXNet implementation
lr, num_epochs = 0.01, 5
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': lr})
loss = gluon.loss.SoftmaxCrossEntropyLoss()
d2l.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, devices)
loss 0.428, train acc 0.806, test acc 0.791
488.5 examples/sec on [gpu(0), gpu(1)]
https://file.elecfans.com/web2/M00/AA/48/pYYBAGR9PKGAE9v0AAEB8Qpd38M668.svg
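
The PyTorch loss above is created with `reduction="none"`, so it returns one value per example rather than a single scalar. The following is a minimal sketch of one training step with such a loss, assuming the training loop sums the per-example losses before backpropagation. A hypothetical linear classifier and random data stand in for the BiRNN and the IMDb minibatch.

```python
import torch
from torch import nn

torch.manual_seed(0)

net = nn.Linear(4, 2)                      # hypothetical stand-in for BiRNN
trainer = torch.optim.Adam(net.parameters(), lr=0.01)
loss = nn.CrossEntropyLoss(reduction="none")

X = torch.randn(8, 4)                      # random features for 8 examples
y = torch.randint(0, 2, (8,))              # random binary labels

before = net.weight.detach().clone()

net.train()
trainer.zero_grad()
pred = net(X)                              # logits: (8, 2)
l = loss(pred, y)                          # per-example losses: (8,)
l.sum().backward()                         # reduce to a scalar, then backprop
trainer.step()

acc = (pred.argmax(dim=1) == y).float().mean().item()
print(l.shape, acc)
```

Keeping the losses unreduced lets the loop report the summed loss and the number of examples separately, which is convenient when averaging over minibatches of different sizes.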

We define the following function to predict the sentiment of a text sequence using the trained model net.
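
A minimal sketch of such a prediction function in PyTorch is given here. It assumes that the vocabulary maps a list of tokens to a list of indices, that the model takes input of shape (batch size, no. of time steps), and that class index 1 means positive. `ToyVocab` and `ToyNet` are hypothetical stand-ins so the example runs on its own; they are not the trained model or vocabulary from this section.

```python
import torch
from torch import nn

class ToyVocab:
    """Hypothetical stand-in for the vocabulary: maps tokens to indices."""
    def __init__(self, tokens):
        self.idx_to_token = ['<unk>'] + sorted(set(tokens))
        self.token_to_idx = {t: i for i, t in enumerate(self.idx_to_token)}

    def __getitem__(self, tokens):
        if isinstance(tokens, (list, tuple)):
            return [self[t] for t in tokens]
        return self.token_to_idx.get(tokens, 0)  # unknown tokens -> index 0

class ToyNet(nn.Module):
    """Hypothetical classifier with the same input/output shapes as BiRNN."""
    def __init__(self, vocab_size, embed_size=8):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.decoder = nn.Linear(embed_size, 2)

    def forward(self, inputs):
        # inputs: (batch size, no. of time steps) -> logits: (batch size, 2)
        return self.decoder(self.embedding(inputs).mean(dim=1))

def predict_sentiment(net, vocab, sequence):
    """Return 'positive' or 'negative' for a whitespace-separated string."""
    device = next(net.parameters()).device
    indices = torch.tensor(vocab[sequence.split()], device=device)
    net.eval()
    with torch.no_grad():
        logits = net(indices.reshape(1, -1))   # add a batch dimension
    label = torch.argmax(logits, dim=1).item()
    # Assumption: class index 1 is positive and 0 is negative
    return 'positive' if label == 1 else 'negative'

vocab = ToyVocab('this movie is so great bad'.split())
net = ToyNet(len(vocab.idx_to_token))
result = predict_sentiment(net, vocab, 'this movie is so great')
print(result)
```

Since `ToyNet` is untrained, the printed label is arbitrary. With the trained model and the dataset vocabulary in its place, the same call would return the model's actual prediction.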

