PyTorch教程-20.2. 深度卷积生成对抗网络

电子说

1.3w人已加入

描述

20.1 节中,我们介绍了 GAN 工作原理背后的基本思想。我们展示了他们可以从一些简单的、易于采样的分布中抽取样本,比如均匀分布或正态分布,并将它们转换成看起来与某些数据集的分布相匹配的样本。虽然我们匹配 2D 高斯分布的示例说明了要点,但它并不是特别令人兴奋。

在本节中,我们将演示如何使用 GAN 生成逼真的图像。我们的模型将基于 Radford等人介绍的深度卷积 GAN (DCGAN)。2015 年我们将借用已经证明在判别计算机视觉问题上非常成功的卷积架构,并展示如何通过 GAN 来利用它们来生成逼真的图像。

import warnings
import torch
import torchvision
from torch import nn
from d2l import torch as d2l
from mxnet import gluon, init, np, npx
from mxnet.gluon import nn
from d2l import mxnet as d2l

npx.set_np()
import tensorflow as tf
from d2l import tensorflow as d2l

20.2.1。口袋妖怪数据集

我们将使用的数据集是从pokemondb获得的 Pokemon 精灵的集合 首先下载、提取和加载此数据集。

#@save
d2l.DATA_HUB['pokemon'] = (d2l.DATA_URL + 'pokemon.zip',
              'c065c0e2593b8b161a2d7873e42418bf6a21106c')

data_dir = d2l.download_extract('pokemon')
pokemon = torchvision.datasets.ImageFolder(data_dir)
Downloading ../data/pokemon.zip from http://d2l-data.s3-accelerate.amazonaws.com/pokemon.zip...
#@save
d2l.DATA_HUB['pokemon'] = (d2l.DATA_URL + 'pokemon.zip',
              'c065c0e2593b8b161a2d7873e42418bf6a21106c')

data_dir = d2l.download_extract('pokemon')
pokemon = gluon.data.vision.datasets.ImageFolderDataset(data_dir)
Downloading ../data/pokemon.zip from http://d2l-data.s3-accelerate.amazonaws.com/pokemon.zip...
#@save
d2l.DATA_HUB['pokemon'] = (d2l.DATA_URL + 'pokemon.zip',
              'c065c0e2593b8b161a2d7873e42418bf6a21106c')

data_dir = d2l.download_extract('pokemon')
batch_size = 256
pokemon = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir, batch_size=batch_size, image_size=(64, 64))
Downloading ../data/pokemon.zip from http://d2l-data.s3-accelerate.amazonaws.com/pokemon.zip...
Found 40597 files belonging to 721 classes.

我们将每个图像调整为64×64. 变换ToTensor 会将像素值投影到[0,1],而我们的生成器将使用 tanh 函数获取输出 [−1,1]. 因此我们用0.5意味着和0.5标准偏差以匹配值范围。

batch_size = 256
transformer = torchvision.transforms.Compose([
  torchvision.transforms.Resize((64, 64)),
  torchvision.transforms.ToTensor(),
  torchvision.transforms.Normalize(0.5, 0.5)
])
pokemon.transform = transformer
data_iter = torch.utils.data.DataLoader(
  pokemon, batch_size=batch_size,
  shuffle=True, num_workers=d2l.get_dataloader_workers())
batch_size = 256
transformer = gluon.data.vision.transforms.Compose([
  gluon.data.vision.transforms.Resize(64),
  gluon.data.vision.transforms.ToTensor(),
  gluon.data.vision.transforms.Normalize(0.5, 0.5)
])
data_iter = gluon.data.DataLoader(
  pokemon.transform_first(transformer), batch_size=batch_size,
  shuffle=True, num_workers=d2l.get_dataloader_workers())
def transform_func(X):
  X = X / 255.
  X = (X - 0.5) / (0.5)
  return X

# For TF>=2.4 use `num_parallel_calls = tf.data.AUTOTUNE`
data_iter = pokemon.map(lambda x, y: (transform_func(x), y),
            num_parallel_calls=tf.data.experimental.AUTOTUNE)
data_iter = data_iter.cache().shuffle(buffer_size=1000).prefetch(
  buffer_size=tf.data.experimental.AUTOTUNE)
WARNING:tensorflow:From /home/d2l-worker/miniconda3/envs/d2l-en-release-1/lib/python3.9/site-packages/tensorflow/python/autograph/pyct/static_analysis/liveness.py:83: Analyzer.lamba_check (from tensorflow.python.autograph.pyct.static_analysis.liveness) is deprecated and will be removed after 2023-09-23.
Instructions for updating:
Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089

让我们想象一下前 20 张图像。

warnings.filterwarnings('ignore')
d2l.set_figsize((4, 4))
for X, y in data_iter:
  imgs = X[:20,:,:,:].permute(0, 2, 3, 1)/2+0.5
  d2l.show_images(imgs, num_rows=4, num_cols=5)
  break
pytorch
d2l.set_figsize((4, 4))
for X, y in data_iter:
  imgs = X[:20,:,:,:].transpose(0, 2, 3, 1)/2+0.5
  d2l.show_images(imgs, num_rows=4, num_cols=5)
  break
pytorch
d2l.set_figsize(figsize=(4, 4))
for X, y in data_iter.take(1):
  imgs = X[:20, :, :, :] / 2 + 0.5
  d2l.show_images(imgs, num_rows=4, num_cols=5)
pytorch

20.2.2。发电机

生成器需要映射噪声变量 z∈Rd, 长度-d向量,到具有宽度和高度的 RGB 图像64×64. 14.11 节中我们介绍了全卷积网络,它使用转置卷积层(参考 14.10 节)来扩大输入尺寸。生成器的基本块包含一个转置卷积层,然后是批量归一化和 ReLU 激活。

class G_block(nn.Module):
  def __init__(self, out_channels, in_channels=3, kernel_size=4, strides=2,
         padding=1, **kwargs):
    super(G_block, self).__init__(**kwargs)
    self.conv2d_trans = nn.ConvTranspose2d(in_channels, out_channels,
                kernel_size, strides, padding, bias=False)
    self.batch_norm = nn.BatchNorm2d(out_channels)
    self.activation = nn.ReLU()

  def forward(self, X):
    return self.activation(self.batch_norm(self.conv2d_trans(X)))
class G_block(nn.Block):
  def __init__(self, channels, kernel_size=4,
         strides=2, padding=1, **kwargs):
    super(G_block, self).__init__(**kwargs)
    self.conv2d_trans = nn.Conv2DTranspose(
      channels, kernel_size, strides, padding, use_bias=False)
    self.batch_norm = nn.BatchNorm()
    self.activation = nn.Activation('relu')

  def forward(self, X):
    return self.activation(self.batch_norm(self.conv2d_trans(X)))
class G_block(tf.keras.layers.Layer):
  def __init__(self, out_channels, kernel_size=4, strides=2, padding="same",
         **kwargs):
    super().__init__(**kwargs)
    self.conv2d_trans = tf.keras.layers.Conv2DTranspose(
      out_channels, kernel_size, strides, padding, use_bias=False)
    self.batch_norm = tf.keras.layers.BatchNormalization()
    self.activation = tf.keras.layers.ReLU()

  def call(self, X):
    return self.activation(self.batch_norm(self.conv2d_trans(X)))

默认情况下,转置卷积层使用 kh=kw=4内核,一个sh=sw=2大步前进,一个 ph=pw=1填充。输入形状为 nh′×nw′=16×16,生成器块将使输入的宽度和高度加倍。

(20.2.1)nh′×nw′=[(nhkh−(nh−1)(kh−sh)−2ph]×[(nwkw−(nw−1)(kw−sw)−2pw]=[(kh+sh(nh−1)−2ph]×[(kw+sw(nw−1)−2pw]=[(4+2×(16−1)−2×1]×[(4+2×(16−1)−2×1]=32×32.
x = torch.zeros((2, 3, 16, 16))
g_blk = G_block(20)
g_blk(x).shape
torch.Size([2, 20, 32, 32])
x = np.zeros((2, 3, 16, 16))
g_blk = G_block(20)
g_blk.initialize()
g_blk(x).shape
(2, 20, 32, 32)
x = tf.zeros((2, 16, 16, 3)) # Channel last convention
g_blk = G_block(20)
g_blk(x).shape
TensorShape([2, 32, 32, 20])

如果将转置卷积层更改为4×4 核心,1×1步幅和零填充。输入大小为 1×1,输出的宽度和高度将分别增加 3。

x = torch.zeros((2, 3, 1, 1))
g_blk = G_block(20, strides=1, padding=0)
g_blk(x).shape
torch.Size([2, 20, 4, 4])
x = np.zeros((2, 3, 1, 1))
g_blk = G_block(20, strides=1, padding=0)
g_blk.initialize()
g_blk(x).shape
(2, 20, 4, 4)
x = tf.zeros((2, 1, 1, 3))
# `padding="valid"` corresponds to no padding
g_blk = G_block(20, strides=1, padding="valid")
g_blk(x).shape
TensorShape([2, 4, 4, 20])

生成器由四个基本块组成,将输入的宽度和高度从 1 增加到 32。同时,它首先将潜在变量投影到64×8通道,然后每次将通道减半。最后,使用转置卷积层生成输出。它进一步加倍宽度和高度以匹配所需的64×64形状,并将通道尺寸减小到 3. tanh 激活函数用于将输出值投影到(−1,1)范围。

n_G = 64
net_G = nn.Sequential(
  G_block(in_channels=100, out_channels=n_G*8,
      strides=1, padding=0),         # Output: (64 * 8, 4, 4)
  G_block(in_channels=n_G*8, out_channels=n_G*4), # Output: (64 * 4, 8, 8)
  G_block(in_channels=n_G*4, out_channels=n_G*2), # Output: (64 * 2, 16, 16)
  G_block(in_channels=n_G*2, out_channels=n_G),  # Output: (64, 32, 32)
  nn.ConvTranspose2d(in_channels=n_G, out_channels=3,
            kernel_size=4, stride=2, padding=1, bias=False),
  nn.Tanh()) # Output: (3, 64, 64)
n_G = 64
net_G = nn.Sequential()
net_G.add(G_block(n_G*8, strides=1, padding=0), # Output: (64 * 8, 4, 4)
     G_block(n_G*4), # Output: (64 * 4, 8, 8)
     G_block(n_G*2), # Output: (64 * 2, 16, 16)
     G_block(n_G),  # Output: (64, 32, 32)
     nn.Conv2DTranspose(
       3, kernel_size=4, strides=2, padding=1, use_bias=False,
       activation='tanh')) # Output: (3, 64, 64)
n_G = 64
net_G = tf.keras.Sequential([
  # Output: (4, 4, 64 * 8)
  G_block(out_channels=n_G*8, strides=1, padding="valid"),
  G_block(out_channels=n_G*4), # Output: (8, 8, 64 * 4)
  G_block(out_channels=n_G*2), # Output: (16, 16, 64 * 2)
  G_block(out_channels=n_G), # Output: (32, 32, 64)
  # Output: (64, 64, 3)
  tf.keras.layers.Conv2DTranspose(
    3, kernel_size=4, strides=2, padding="same", use_bias=False,
    activation="tanh")
])

生成一个 100 维的潜在变量来验证生成器的输出形状。

x = torch.zeros((1, 100, 1, 1))
net_G(x).shape
torch.Size([1, 3, 64, 64])
x = np.zeros((1, 100, 1, 1))
net_G.initialize()
net_G(x).shape
(1, 3, 64, 64)
x = tf.zeros((1, 1, 1, 100))
net_G(x).shape
TensorShape([1, 64, 64, 3])

20.2.3。判别器

判别器是一个普通的卷积网络,除了它使用一个 leaky ReLU 作为它的激活函数。鉴于 α∈[0,1], 它的定义是

(20.2.2)leaky ReLU(x)={xif x>0αxotherwise.

可以看出,如果α=0,以及一个身份函数,如果α=1. 为了α∈(0,1),leaky ReLU 是一个非线性函数,它为负输入提供非零输出。它旨在解决“垂死的 ReLU”问题,即神经元可能始终输出负值,因此无法取得任何进展,因为 ReLU 的梯度为 0。

alphas = [0, .2, .4, .6, .8, 1]
x = torch.arange(-2, 1, 0.1)
Y = [nn.LeakyReLU(alpha)(x).detach().numpy() for alpha in alphas]
d2l.plot(x.detach().numpy(), Y, 'x', 'y', alphas)
pytorch
alphas = [0, .2, .4, .6, .8, 1]
x = np.arange(-2, 1, 0.1)
Y = [nn.LeakyReLU(alpha)(x).asnumpy() for alpha in alphas]
d2l.plot(x.asnumpy(), Y, 'x', 'y', alphas)
pytorch
alphas = [0, .2, .4, .6, .8, 1]
x = tf.range(-2, 1, 0.1)
Y = [tf.keras.layers.LeakyReLU(alpha)(x).numpy() for alpha in alphas]
d2l.plot(x.numpy(), Y, 'x', 'y', alphas)
pytorch

判别器的基本块是一个卷积层,然后是一个批量归一化层和一个 leaky ReLU 激活。卷积层的超参数类似于生成器块中的转置卷积层。

class D_block(nn.Module):
  def __init__(self, out_channels, in_channels=3, kernel_size=4, strides=2,
        padding=1, alpha=0.2, **kwargs):
    super(D_block, self).__init__(**kwargs)
    self.conv2d = nn.Conv2d(in_channels, out_channels, kernel_size,
                strides, padding, bias=False)
    self.batch_norm = nn.BatchNorm2d(out_channels)
    self.activation = nn.LeakyReLU(alpha, inplace=True)

  def forward(self, X):
    return self.activation(self.batch_norm(self.conv2d(X)))
class D_block(nn.Block):
  def __init__(self, channels, kernel_size=4, strides=2,
         padding=1, alpha=0.2, **kwargs):
    super(D_block, self).__init__(**kwargs)
    self.conv2d = nn.Conv2D(
      channels, kernel_size, strides, padding, use_bias=False)
    self.batch_norm = nn.BatchNorm()
    self.activation = nn.LeakyReLU(alpha)

  def forward(self, X):
    return self.activation(self.batch_norm(self.conv2d(X)))
class D_block(tf.keras.layers.Layer):
  def __init__(self, out_channels, kernel_size=4, strides=2, padding="same",
         alpha=0.2, **kwargs):
    super().__init__(**kwargs)
    self.conv2d = tf.keras.layers.Conv2D(out_channels, kernel_size,
                       strides, padding, use_bias=False)
    self.batch_norm = tf.keras.layers.BatchNormalization()
    self.activation = tf.keras.layers.LeakyReLU(alpha)

  def call(self, X):
    return self.activation(self.batch_norm(self.conv2d(X)))

正如我们在第 7.3 节中演示的那样,具有默认设置的基本块会将输入的宽度和高度减半例如,给定一个输入形状nh=nw=16, 具有内核形状 kh=kw=4, 步幅sh=sw=2和填充形状ph=pw=1,输出形状将是:

(20.2.3)nh′×nw′=⌊(nh−kh+2ph+sh)/sh⌋×⌊(nw−kw+2pw+sw)/sw⌋=⌊(16−4+2×1+2)/2⌋×⌊(16−4+2×1+2)/2⌋=8×8.
x = torch.zeros((2, 3, 16, 16))
d_blk = D_block(20)
d_blk(x).shape
torch.Size([2, 20, 8, 8])
x = np.zeros((2, 3, 16, 16))
d_blk = D_block(20)
d_blk.initialize()
d_blk(x).shape
(2, 20, 8, 8)
x = tf.zeros((2, 16, 16, 3))
d_blk = D_block(20)
d_blk(x).shape
TensorShape([2, 8, 8, 20])

鉴别器是生成器的镜像。

n_D = 64
net_D = nn.Sequential(
  D_block(n_D), # Output: (64, 32, 32)
  D_block(in_channels=n_D, out_channels=n_D*2), # Output: (64 * 2, 16, 16)
  D_block(in_channels=n_D*2, out_channels=n_D*4), # Output: (64 * 4, 8, 8)
  D_block(in_channels=n_D*4, out_channels=n_D*8), # Output: (64 * 8, 4, 4)
  nn.Conv2d(in_channels=n_D*8, out_channels=1,
       kernel_size=4, bias=False)) # Output: (1, 1, 1)
n_D = 64
net_D = nn.Sequential()
net_D.add(D_block(n_D),  # Output: (64, 32, 32)
     D_block(n_D*2), # Output: (64 * 2, 16, 16)
     D_block(n_D*4), # Output: (64 * 4, 8, 8)
     D_block(n_D*8), # Output: (64 * 8, 4, 4)
     nn.Conv2D(1, kernel_size=4, use_bias=False)) # Output: (1, 1, 1)
n_D = 64
net_D = tf.keras.Sequential([
  D_block(n_D), # Output: (32, 32, 64)
  D_block(out_channels=n_D*2), # Output: (16, 16, 64 * 2)
  D_block(out_channels=n_D*4), # Output: (8, 8, 64 * 4)
  D_block(out_channels=n_D*8), # Outupt: (4, 4, 64 * 64)
  # Output: (1, 1, 1)
  tf.keras.layers.Conv2D(1, kernel_size=4, use_bias=False)
])

它使用带有输出通道的卷积层1作为获得单个预测值的最后一层。

x = torch.zeros((1, 3, 64, 64))
net_D(x).shape
torch.Size([1, 1, 1, 1])
x = np.zeros((1, 3, 64, 64))
net_D.initialize()
net_D(x).shape
(1, 1, 1, 1)
x = tf.zeros((1, 64, 64, 3))
net_D(x).shape
TensorShape([1, 1, 1, 1])

20.2.4。训练

第 20.1 节中的基本 GAN 相比,我们对生成器和鉴别器使用相同的学习率,因为它们彼此相似。此外,我们改变β1在 Adam 中(第 12.10 节)来自0.90.5. 它降低了动量的平滑度,即过去梯度的指数加权移动平均值,以处理快速变化的梯度,因为生成器和鉴别器相互争斗。此外,随机生成的噪声Z是一个 4-D 张量,我们正在使用 GPU 来加速计算。

def train(net_D, net_G, data_iter, num_epochs, lr, latent_dim,
     device=d2l.try_gpu()):
  loss = nn.BCEWithLogitsLoss(reduction='sum')
  for w in net_D.parameters():
    nn.init.normal_(w, 0, 0.02)
  for w in net_G.parameters():
    nn.init.normal_(w, 0, 0.02)
  net_D, net_G = net_D.to(device), net_G.to(device)
  trainer_hp = {'lr': lr, 'betas': [0.5,0.999]}
  trainer_D = torch.optim.Adam(net_D.parameters(), **trainer_hp)
  trainer_G = torch.optim.Adam(net_G.parameters(), **trainer_hp)
  animator = d2l.Animator(xlabel='epoch', ylabel='loss',
              xlim=[1, num_epochs], nrows=2, figsize=(5, 5),
              legend=['discriminator', 'generator'])
  animator.fig.subplots_adjust(hspace=0.3)
  for epoch in range(1, num_epochs + 1):
    # Train one epoch
    timer = d2l.Timer()
    metric = d2l.Accumulator(3) # loss_D, loss_G, num_examples
    for X, _ in data_iter:
      batch_size = X.shape[0]
      Z = torch.normal(0, 1, size=(batch_size, latent_dim, 1, 1))
      X, Z = X.to(device), Z.to(device)
      metric.add(d2l.update_D(X, Z, net_D, net_G, loss, trainer_D),
            d2l.update_G(Z, net_D, net_G, loss, trainer_G),
            batch_size)
    # Show generated examples
    Z = torch.normal(0, 1, size=(21, latent_dim, 1, 1), device=device)
    # Normalize the synthetic data to N(0, 1)
    fake_x = net_G(Z).permute(0, 2, 3, 1) / 2 + 0.5
    imgs = torch.cat(
      [torch.cat([
        fake_x[i * 7 + j].cpu().detach() for j in range(7)], dim=1)
       for i in range(len(fake_x)//7)], dim=0)
    animator.axes[1].cla()
    animator.axes[1].imshow(imgs)
    # Show the losses
    loss_D, loss_G = metric[0] / metric[2], metric[1] / metric[2]
    animator.add(epoch, (loss_D, loss_G))
  print(f'loss_D {loss_D:.3f}, loss_G {loss_G:.3f}, '
     f'{metric[2] / timer.stop():.1f} examples/sec on {str(device)}')
def train(net_D, net_G, data_iter, num_epochs, lr, latent_dim,
     device=d2l.try_gpu()):
  loss = gluon.loss.SigmoidBCELoss()
  net_D.initialize(init=init.Normal(0.02), force_reinit=True, ctx=device)
  net_G.initialize(init=init.Normal(0.02), force_reinit=True, ctx=device)
  trainer_hp = {'learning_rate': lr, 'beta1': 0.5}
  trainer_D = gluon.Trainer(net_D.collect_params(), 'adam', trainer_hp)
  trainer_G = gluon.Trainer(net_G.collect_params(), 'adam', trainer_hp)
  animator = d2l.Animator(xlabel='epoch', ylabel='loss',
              xlim=[1, num_epochs], nrows=2, figsize=(5, 5),
              legend=['discriminator', 'generator'])
  animator.fig.subplots_adjust(hspace=0.3)
  for epoch in range(1, num_epochs + 1):
    # Train one epoch
    timer = d2l.Timer()
    metric = d2l.Accumulator(3) # loss_D, loss_G, num_examples
    for X, _ in data_iter:
      batch_size = X.shape[0]
      Z = np.random.normal(0, 1, size=(batch_size, latent_dim, 1, 1))
      X, Z = X.as_in_ctx(device), Z.as_in_ctx(device),
      metric.add(d2l.update_D(X, Z, net_D, net_G, loss, trainer_D),
            d2l.update_G(Z, net_D, net_G, loss, trainer_G),
            batch_size)
    # Show generated examples
    Z = np.random.normal(0, 1, size=(21, latent_dim, 1, 1), ctx=device)
    # Normalize the synthetic data to N(0, 1)
    fake_x = net_G(Z).transpose(0, 2, 3, 1) / 2 + 0.5
    imgs = np.concatenate(
      [np.concatenate([fake_x[i * 7 + j] for j in range(7)], axis=1)
       for i in range(len(fake_x)//7)], axis=0)
    animator.axes[1].cla()
    animator.axes[1].imshow(imgs.asnumpy())
    # Show the losses
    loss_D, loss_G = metric[0] / metric[2], metric[1] / metric[2]
    animator.add(epoch, (loss_D, loss_G))
  print(f'loss_D {loss_D:.3f}, loss_G {loss_G:.3f}, '
     f'{metric[2] / timer.stop():.1f} examples/sec on {str(device)}')
def train(net_D, net_G, data_iter, num_epochs, lr, latent_dim,
     device=d2l.try_gpu()):
  loss = tf.keras.losses.BinaryCrossentropy(
    from_logits=True, reduction=tf.keras.losses.Reduction.SUM)

  for w in net_D.trainable_variables:
    w.assign(tf.random.normal(mean=0, stddev=0.02, shape=w.shape))
  for w in net_G.trainable_variables:
    w.assign(tf.random.normal(mean=0, stddev=0.02, shape=w.shape))

  optimizer_hp = {"lr": lr, "beta_1": 0.5, "beta_2": 0.999}
  optimizer_D = tf.keras.optimizers.Adam(**optimizer_hp)
  optimizer_G = tf.keras.optimizers.Adam(**optimizer_hp)

  animator = d2l.Animator(xlabel='epoch', ylabel='loss',
              xlim=[1, num_epochs], nrows=2, figsize=(5, 5),
              legend=['discriminator', 'generator'])
  animator.fig.subplots_adjust(hspace=0.3)

  for epoch in range(1, num_epochs + 1):
    # Train one epoch
    timer = d2l.Timer()
    metric = d2l.Accumulator(3) # loss_D, loss_G, num_examples
    for X, _ in data_iter:
      batch_size = X.shape[0]
      Z = tf.random.normal(mean=0, stddev=1,
                 shape=(batch_size, 1, 1, latent_dim))
      metric.add(d2l.update_D(X, Z, net_D, net_G, loss, optimizer_D),
            d2l.update_G(Z, net_D, net_G, loss, optimizer_G),
            batch_size)

    # Show generated examples
    Z = tf.random.normal(mean=0, stddev=1, shape=(21, 1, 1, latent_dim))
    # Normalize the synthetic data to N(0, 1)
    fake_x = net_G(Z) / 2 + 0.5
    imgs = tf.concat([tf.concat([fake_x[i * 7 + j] for j in range(7)],
                  axis=1)
             for i in range(len(fake_x) // 7)], axis=0)
    animator.axes[1].cla()
    animator.axes[1].imshow(imgs)
    # Show the losses
    loss_D, loss_G = metric[0] / metric[2], metric[1] / metric[2]
    animator.add(epoch, (loss_D, loss_G))
  print(f'loss_D {loss_D:.3f}, loss_G {loss_G:.3f}, '
     f'{metric[2] / timer.stop():.1f} examples/sec on {str(device._device_name)}')

我们用少量的 epochs 训练模型只是为了演示。为了获得更好的性能,可以将变量num_epochs设置为更大的数字。

latent_dim, lr, num_epochs = 100, 0.005, 20
train(net_D, net_G, data_iter, num_epochs, lr, latent_dim)
loss_D 0.030, loss_G 7.203, 1026.4 examples/sec on cuda:0
pytorch
latent_dim, lr, num_epochs = 100, 0.005, 20
train(net_D, net_G, data_iter, num_epochs, lr, latent_dim)
loss_D 0.224, loss_G 6.386, 2260.7 examples/sec on gpu(0)
pytorch
latent_dim, lr, num_epochs = 100, 0.0005, 40
train(net_D, net_G, data_iter, num_epochs, lr, latent_dim)
loss_D 0.112, loss_G 4.952, 1968.2 examples/sec on /GPU:0
pytorch

20.2.5。概括

  • DCGAN 架构有四个用于鉴别器的卷积层和四个用于生成器的“分数步”卷积层。

  • 鉴别器是一个 4 层跨步卷积,具有批量归一化(除了它的输入层)和 leaky ReLU 激活。

  • Leaky ReLU 是一种非线性函数,可为负输入提供非零输出。它旨在解决“垂死的 ReLU”问题,并帮助梯度更容易地通过架构。

20.2.6. 练习

  1. 如果我们使用标准 ReLU 激活而不是 leaky ReLU 会发生什么?

  2. 在 Fashion-MNIST 上应用 DCGAN,看看哪个类别效果好,哪个效果不好。


打开APP阅读更多精彩内容
声明:本文内容及配图由入驻作者撰写或者入驻合作网站授权转载。文章观点仅代表作者本人,不代表电子发烧友网立场。文章及其配图仅供工程师学习之用,如有内容侵权或者其他违规问题,请联系本站处理。 举报投诉

全部0条评论

快来发表一下你的评论吧 !

×
20
完善资料,
赚取积分