PyTorch Tutorial 4.3: The Base Classification Model - 电子发烧友网

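You may have noticed that, in the case of regression, the implementation from scratch and the concise implementation using framework functionality were quite similar. The same is true for classification. Since many models in this book deal with classification, it is worth adding functionality that supports this setting specifically. This section provides a base class for classification models to simplify later code.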

# PyTorch
import torch
from d2l import torch as d2l

# MXNet
from mxnet import autograd, gluon, np, npx
from d2l import mxnet as d2l

npx.set_np()

# JAX
from functools import partial
import jax
import optax
from jax import numpy as jnp
from d2l import jax as d2l

						 


# TensorFlow
import tensorflow as tf
from d2l import tensorflow as d2l

4.3.1. The Classifier Class

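We define the Classifier class below. In validation_step we report both the loss value and the classification accuracy on a validation batch. We draw an update for every num_val_batches batches. This has the benefit of generating the averaged loss and accuracy over the whole validation data. These averages are not exactly correct if the last batch contains fewer examples, but we ignore this minor difference to keep the code simple.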

# PyTorch
class Classifier(d2l.Module):  #@save
    """The base class of classification models."""
    def validation_step(self, batch):
        # The last element of the batch is the label; the rest are inputs
        Y_hat = self(*batch[:-1])
        self.plot('loss', self.loss(Y_hat, batch[-1]), train=False)
        self.plot('acc', self.accuracy(Y_hat, batch[-1]), train=False)
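
To make the caveat about a smaller last batch concrete, here is a minimal framework-free sketch. The correctness flags and batch sizes are made up for illustration, and the sketch assumes the reported value is a plain average of per-batch means.

```python
# Hypothetical per-example correctness flags (1 = correct prediction),
# split into batches of size 4, 4, and 2; the last batch is smaller.
batches = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0],
]

# Averaging one mean per batch gives every batch the same weight.
batch_means = [sum(b) / len(b) for b in batches]
mean_of_means = sum(batch_means) / len(batch_means)

# The exact accuracy gives every example the same weight.
total_correct = sum(sum(b) for b in batches)
total_examples = sum(len(b) for b in batches)
exact_accuracy = total_correct / total_examples

print(f"mean of batch means: {mean_of_means:.3f}")
print(f"exact accuracy:      {exact_accuracy:.3f}")
```

The two numbers differ only because the short last batch is overweighted. With many full batches and one short one, the gap is usually small, which is why the simpler averaging is acceptable here.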

							 


# MXNet (same as the PyTorch version)
class Classifier(d2l.Module):  #@save
    """The base class of classification models."""
    def validation_step(self, batch):
        Y_hat = self(*batch[:-1])
        self.plot('loss', self.loss(Y_hat, batch[-1]), train=False)
        self.plot('acc', self.accuracy(Y_hat, batch[-1]), train=False)

							 

We also redefine the training_step method for JAX since all models that will subclass Classifier later will have a loss that returns auxiliary data. This auxiliary data can be used for models with batch normalization (to be explained in Section 8.5), while in all other cases we will make the loss also return a placeholder (empty dictionary) to represent the auxiliary data.

# JAX
class Classifier(d2l.Module):  #@save
    """The base class of classification models."""
    def training_step(self, params, batch, state):
        # Here value is a tuple since models with BatchNorm layers require
        # the loss to return auxiliary data
        value, grads = jax.value_and_grad(
            self.loss, has_aux=True)(params, batch[:-1], batch[-1], state)
        l, _ = value
        self.plot("loss", l, train=True)
        return value, grads

    def validation_step(self, params, batch, state):
        # Discard the second returned value. It is used for training models
        # with BatchNorm layers since loss also returns auxiliary data
        l, _ = self.loss(params, batch[:-1], batch[-1], state)
        self.plot('loss', l, train=False)
        self.plot('acc',
                  self.accuracy(params, batch[:-1], batch[-1], state),
                  train=False)
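
The has_aux=True convention can be seen in isolation with a small sketch. The toy loss loss_with_aux, its parameters, and the data below are hypothetical and not part of d2l; the only point is that the loss returns a (value, auxiliary data) pair and that jax.value_and_grad then returns ((value, aux), grads).

```python
import jax
from jax import numpy as jnp

def loss_with_aux(params, X, y, state):
    """Hypothetical squared loss that also returns auxiliary data."""
    y_hat = X @ params["w"] + params["b"]
    l = jnp.mean((y_hat - y) ** 2)
    # No BatchNorm statistics to pass back here, so the auxiliary
    # data is just an empty dictionary used as a placeholder.
    return l, {}

params = {"w": jnp.array([1.0, -1.0]), "b": jnp.array(0.0)}
X = jnp.array([[1.0, 2.0], [3.0, 4.0]])
y = jnp.array([0.0, 1.0])

# Gradients are taken with respect to the first argument (params) only.
(l, aux), grads = jax.value_and_grad(loss_with_aux, has_aux=True)(
    params, X, y, None)

print(l, aux, grads)
```

Only the first element of the returned pair is differentiated; the auxiliary data is passed through untouched.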

							 

# TensorFlow (same as the PyTorch version)
class Classifier(d2l.Module):  #@save
    """The base class of classification models."""
    def validation_step(self, batch):
        Y_hat = self(*batch[:-1])
        self.plot('loss', self.loss(Y_hat, batch[-1]), train=False)
        self.plot('acc', self.accuracy(Y_hat, batch[-1]), train=False)

							 

By default, we use a stochastic gradient descent optimizer operating on minibatches, just as we did in the context of linear regression.

# PyTorch
@d2l.add_to_class(d2l.Module)  #@save
def configure_optimizers(self):
    return torch.optim.SGD(self.parameters(), lr=self.lr)

							 

# MXNet
@d2l.add_to_class(d2l.Module)  #@save
def configure_optimizers(self):
    params = self.parameters()
    # Parameters defined from scratch come back as a plain list
    if isinstance(params, list):
        return d2l.SGD(params, self.lr)
    return gluon.Trainer(params, 'sgd', {'learning_rate': self.lr})

							 

# JAX
@d2l.add_to_class(d2l.Module)  #@save
def configure_optimizers(self):
    return optax.sgd(self.lr)

							 

# TensorFlow
@d2l.add_to_class(d2l.Module)  #@save
def configure_optimizers(self):
    return tf.keras.optimizers.SGD(self.lr)
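
For readers who want to see what such an optimizer does on each minibatch, the following is a minimal framework-free sketch of minibatch SGD. The toy data, the single parameter w, the learning rate, and the batch size are all illustrative assumptions; the framework optimizers above apply the same kind of update to every parameter of the model.

```python
import random

random.seed(0)

# Hypothetical noise-free toy data generated by y = 2 * x.
data = [(x, 2.0 * x) for x in [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]]

w = 0.0         # the single model parameter
lr = 0.05       # learning rate, playing the role of self.lr
batch_size = 2

for epoch in range(50):
    random.shuffle(data)
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        # Gradient with respect to w of the minibatch mean of
        # 0.5 * (w * x - y) ** 2
        grad = sum((w * x - y) * x for x, y in batch) / len(batch)
        # SGD update: step against the gradient
        w -= lr * grad

print(w)
```

Each update uses only the examples in the current minibatch, which is what distinguishes minibatch SGD from full-batch gradient descent.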

							 

4.3.2. Accuracy

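Given the predicted probability distribution y_hat, we typically choose the class with the highest predicted probability whenever we must output a hard prediction. Indeed, many applications require that we make a choice. For instance, Gmail must categorize an email as "Primary", "Social", "Updates", "Forums", or "Spam". It might estimate probabilities internally, but in the end it has to choose one of the classes.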

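A prediction is correct when it agrees with the label class y. The classification accuracy is the fraction of all predictions that are correct. Although accuracy can be difficult to optimize directly (it is not differentiable), it is often the performance measure that we care about most, and it is often the relevant quantity in benchmarks. We therefore almost always report it when training classifiers.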
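
The remark about differentiability can be illustrated with a small sketch in plain Python. The scores, labels, and helper functions below are hypothetical and written only for this illustration: nudging a score slightly leaves the accuracy unchanged, so accuracy gives no gradient signal, whereas the cross-entropy loss responds to the same nudge.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def accuracy(all_scores, labels):
    """Fraction of examples whose highest-scoring class equals the label."""
    preds = [max(range(len(s)), key=s.__getitem__) for s in all_scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def cross_entropy(all_scores, labels):
    """Mean negative log-probability assigned to the true class."""
    losses = [-math.log(softmax(s)[y]) for s, y in zip(all_scores, labels)]
    return sum(losses) / len(losses)

scores = [[2.0, 1.0, 0.1], [0.5, 1.5, 0.2]]
labels = [0, 2]
# Slightly raise the score of the true class of the first example.
nudged = [[2.01, 1.0, 0.1], [0.5, 1.5, 0.2]]

print(accuracy(scores, labels), accuracy(nudged, labels))
print(cross_entropy(scores, labels), cross_entropy(nudged, labels))
```

Accuracy only changes when a score crosses another one and flips the argmax, so it is piecewise constant in the model outputs. This is one reason training typically minimizes a differentiable surrogate such as cross-entropy and reports accuracy separately.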

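Accuracy is computed as follows. First, if y_hat is a matrix, we assume that the second dimension stores the prediction scores for each class. We use argmax to obtain the predicted class as the index of the largest entry in each row. Then we compare the predicted classes with the ground truth y elementwise. Since the equality operator == is sensitive to data types, we convert the data type of the predictions to match that of y. The result is a tensor containing entries of 0 (false) and 1 (true). Taking the sum yields the number of correct predictions, and taking the mean yields the accuracy.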

# PyTorch
@d2l.add_to_class(Classifier)  #@save
def accuracy(self, Y_hat, Y, averaged=True):
    """Compute the fraction of correct predictions.

    If averaged is False, return the per-example 0/1 correctness instead."""
    # Flatten all leading dimensions so each row holds the scores of one example
    Y_hat = Y_hat.reshape((-1, Y_hat.shape[-1]))
    preds = Y_hat.argmax(axis=1).type(Y.dtype)
    compare = (preds == Y.reshape(-1)).type(torch.float32)
    return compare.mean() if averaged else compare
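
As a quick check of the steps described above, here is a standalone sketch that mirrors the method as a plain function, without the d2l class machinery. The function name and the example tensors are illustrative assumptions.

```python
import torch

def accuracy(Y_hat, Y, averaged=True):
    """Standalone sketch of the accuracy computation."""
    # One row of class scores per example
    Y_hat = Y_hat.reshape((-1, Y_hat.shape[-1]))
    # Index of the largest score in each row, cast to the label dtype
    preds = Y_hat.argmax(dim=1).to(Y.dtype)
    # 1.0 where the prediction matches the label, 0.0 otherwise
    compare = (preds == Y.reshape(-1)).to(torch.float32)
    return compare.mean() if averaged else compare

# Two examples, three classes; the labels are class indices.
Y_hat = torch.tensor([[0.1, 0.3, 0.6], [0.3, 0.2, 0.5]])
Y = torch.tensor([0, 2])

print(accuracy(Y_hat, Y))                  # fraction of correct predictions
print(accuracy(Y_hat, Y, averaged=False))  # per-example correctness
```

Both rows have their largest score in class 2, so only the second example, whose label is 2, is counted as correct.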

							 

# MXNet
@d2l.add_to_class(Classifier)  #@save
def accuracy(self, Y_hat, Y, averaged=True):
    """Compute the fraction of correct predictions.

    If averaged is False, return the per-example 0/1 correctness instead."""
    Y_hat = Y_hat.reshape((-1, Y_hat.shape[-1]))
    preds = Y_hat.argmax(axis=1).astype(Y.dtype)
    compare = (preds == Y.reshape(-1)).astype(np.float32)
    return compare.mean() if averaged else compare

@d2l.add_to_class(d2l.Module)  #@save
def get_scratch_params(self):
    # Collect parameters stored directly as arrays, recursing into submodules
    params = []
    for attr in dir(self):
        a = getattr(self, attr)
        if isinstance(a, np.ndarray):
            params.append(a)
        if isinstance(a, d2l.Module):
            params.extend(a.get_scratch_params())
    return params

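@d2l.add_to_class(d2l.Module)  #@save
def parameters(self):
    params = self.collect_params()
    # NOTE: the source text is cut off after "and". The rest of this
    # expression is a reconstruction that assumes the from-scratch
    # parameters are the fallback when the ParameterDict is empty.
    return params if isinstance(params, gluon.parameter.ParameterDict) and len(
        params.keys()) else self.get_scratch_params()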
