
PyTorch Tutorial 14.4: Anchor Boxes

Points required: 0 | Format: pdf | Size: 0.40 MB | 2023-06-05

Petc


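Object detection algorithms usually sample a large number of regions in the input image, determine whether these regions contain objects of interest, and adjust the boundaries of the regions so as to predict the ground-truth bounding boxes of the objects more accurately. Different models may adopt different region sampling schemes. Here we introduce one such method: it generates multiple bounding boxes with varying scales and aspect ratios centered on each pixel. These bounding boxes are called anchor boxes. We will design an object detection model based on anchor boxes in Section 14.7.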

First, let's modify the printing precision to get more concise output.

# PyTorch version
%matplotlib inline
import torch
from d2l import torch as d2l

torch.set_printoptions(2)  # Print tensors with 2 digits of precision
# MXNet version
%matplotlib inline
from mxnet import gluon, image, np, npx
from d2l import mxnet as d2l

np.set_printoptions(2)  # Print arrays with 2 digits of precision
npx.set_np()

14.4.1. Generating Multiple Anchor Boxes

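Suppose that the input image has height $h$ and width $w$. We generate anchor boxes of different shapes centered on each pixel of the image. Let the scale be $s \in (0, 1]$ and the aspect ratio (ratio of width to height) be $r > 0$. Then the width and height of the anchor box are $hs\sqrt{r}$ and $hs/\sqrt{r}$, respectively. Note that once the center position is given, an anchor box with known width and height is fully determined.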
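To make the size formula concrete, here is a minimal sketch that computes the pixel width and height of a single anchor box, assuming the width is $hs\sqrt{r}$ and the height is $hs/\sqrt{r}$. The helper name `anchor_size` and the sample values are illustrative assumptions, not part of the original tutorial.

```python
import math


def anchor_size(h, s, r):
    """Return (width, height) in pixels of one anchor box.

    Hypothetical helper for illustration: h is the image height in pixels,
    s in (0, 1] is the scale, and r > 0 is the aspect ratio (width / height).
    """
    if not 0 < s <= 1:
        raise ValueError("scale s must lie in (0, 1]")
    if r <= 0:
        raise ValueError("aspect ratio r must be positive")
    width = h * s * math.sqrt(r)
    height = h * s / math.sqrt(r)
    return width, height


# Assumed sample values: a 400-pixel-high image, scale 0.5, aspect ratio 4.
box_w, box_h = anchor_size(h=400, s=0.5, r=4.0)
print(box_w, box_h)
```

With these definitions the ratio of width to height equals $r$, and the box area equals $(hs)^2$ whatever the value of $r$, so the scale alone controls how large the box is.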

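To generate multiple anchor boxes of different shapes, let's set a series of scales $s_1, \ldots, s_n$ and a series of aspect ratios $r_1, \ldots, r_m$. When all combinations of these scales and aspect ratios are used with each pixel as the center, the input image has a total of $whnm$ anchor boxes. Although these anchor boxes may cover all the ground-truth bounding boxes, the computational cost can easily become too high. In practice, we consider only those combinations containing $s_1$ or $r_1$: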

$$(s_1, r_1), (s_1, r_2), \ldots, (s_1, r_m), (s_2, r_1), (s_3, r_1), \ldots, (s_n, r_1). \tag{14.4.1}$$

That is, the number of anchor boxes centered on the same pixel is $n+m-1$. For the entire input image, we generate a total of $wh(n+m-1)$ anchor boxes.
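The counting argument can be checked with a short sketch that lists the retained (scale, ratio) pairs and compares the reduced count with the full Cartesian product. The helper name `anchor_combinations`, the scale and ratio lists, and the image size are assumptions chosen for illustration.

```python
def anchor_combinations(sizes, ratios):
    """List the (scale, ratio) pairs that contain sizes[0] or ratios[0].

    Hypothetical helper: the pairs follow the order
    (s1, r1), ..., (s1, rm), (s2, r1), ..., (sn, r1).
    """
    first_scale_pairs = [(sizes[0], r) for r in ratios]
    first_ratio_pairs = [(s, ratios[0]) for s in sizes[1:]]
    return first_scale_pairs + first_ratio_pairs


sizes = [0.75, 0.5, 0.25]   # n = 3 scales (assumed values)
ratios = [1, 2, 0.5]        # m = 3 aspect ratios (assumed values)
w, h = 728, 561             # assumed image width and height in pixels

pairs = anchor_combinations(sizes, ratios)
n, m = len(sizes), len(ratios)

print(pairs)
print("per pixel:", len(pairs), "expected n + m - 1 =", n + m - 1)
print("whole image:", w * h * len(pairs), "vs. all combinations:", w * h * n * m)
```

For these assumed values, each pixel carries 5 anchor boxes instead of 9, which brings the image total down from 3,675,672 to 2,042,040.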

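The above method of generating anchor boxes is implemented in the following multibox_prior function. We specify the input image, a list of scales, and a list of aspect ratios, and the function returns all the anchor boxes.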

#@save
def multibox_prior(data, sizes, ratios):
  """Generate anchor boxes with different shapes centered on each pixel."""
  in_height, in_width = data.shape[-2:]
  device, num_sizes, num_ratios = data.device, len(sizes), len(ratios)
  boxes_per_pixel = (num_sizes + num_ratios - 1)
  size_tensor = torch.tensor(sizes, device=device)
  ratio_tensor = torch.tensor(ratios, device=device)
  # Offsets are required to move the anchor to the center of a pixel. Since
  # a pixel has height=1 and width=1, we choose to offset our centers by 0.5
  offset_h, offset_w = 0.5, 0.5
  steps_h = 1.0 / in_height # Scaled steps in y axis
  steps_w = 1.0 / in_width # Scaled steps in x axis

  # Generate all center points for the anchor boxes
  center_h = (torch.arange(in_height, device=device) + offset_h) * steps_h
  center_w = (torch.arange(in_width, device=device) + offset_w) * steps_w
  shift_y, shift_x = torch.meshgrid(center_h, center_w, indexing='ij')
  shift_y, shift_x = shift_y.reshape(-1), shift_x.reshape(-1)

  # Generate `boxes_per_pixel` number of heights and widths that are later
  # used to create anchor box corner coordinates (xmin, ymin, xmax, ymax)
  w = torch.cat((size_tensor * torch.sqrt(ratio_tensor[0]),
          sizes[0] * torch.sqrt(ratio_tensor[1:])))\
          * in_height / in_width # Handle rectangular inputs
  h = torch.cat((size_tensor / torch.sqrt(ratio_tensor[0]),
          sizes[0] / torch.sqrt(ratio_tensor[1:])))
  # Divide by 2 to get half height and half width
  anchor_manipulations = torch.stack((-w, -h, w, h)).T.repeat(
                    in_height * in_width, 1) / 2

  # Each center point will have `boxes_per_pixel` number of anchor boxes, so
  # generate a grid of all anchor box centers with `boxes_per_pixel` repeats
  out_grid = torch.stack([shift_x, shift_y, shift_x, shift_y],
        dim=1).repeat_interleave(boxes_per_pixel, dim=0)
  output = out_grid + anchor_manipulations
  return output.unsqueeze(0)
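To see what the corner arithmetic produces, the following sketch applies the same width, height, and half-offset computations to a single center point instead of every pixel. The helper name `anchors_at_center` and all sample values are assumptions for illustration; it is not part of the d2l library.

```python
import torch


def anchors_at_center(cx, cy, sizes, ratios, in_height, in_width):
    """Hypothetical helper: anchors (xmin, ymin, xmax, ymax) for one center.

    cx and cy are normalized to [0, 1]. Widths are rescaled by
    in_height / in_width so that boxes keep their aspect ratio in pixels.
    """
    size_t = torch.tensor(sizes, dtype=torch.float32)
    ratio_t = torch.tensor(ratios, dtype=torch.float32)
    # All scales with the first ratio, then the first scale with the
    # remaining ratios: n + m - 1 shapes in total.
    w = torch.cat((size_t * torch.sqrt(ratio_t[0]),
                   size_t[0] * torch.sqrt(ratio_t[1:]))) * in_height / in_width
    h = torch.cat((size_t / torch.sqrt(ratio_t[0]),
                   size_t[0] / torch.sqrt(ratio_t[1:])))
    # Half widths and heights, signed so that adding them to the center
    # yields the upper-left and lower-right corners.
    half = torch.stack((-w, -h, w, h), dim=1) / 2
    center = torch.tensor([cx, cy, cx, cy], dtype=torch.float32)
    return center + half


# Assumed sample input: the center of a 100 x 200 (height x width) image.
boxes = anchors_at_center(0.5, 0.5, sizes=[0.75, 0.5, 0.25],
                          ratios=[1, 2, 0.5], in_height=100, in_width=200)
print(boxes.shape)
print(boxes)
```

Each row holds normalized coordinates. Multiplying the x-coordinates by the image width and the y-coordinates by the image height recovers pixel boxes whose width-to-height ratios match the requested aspect ratios.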
#@save
def multibox_prior(data, sizes, ratios):
  """Generate anchor boxes with different shapes centered on each pixel."""
  in_height, in_width = data.shape[-2:]
  device, num_sizes, num_ratios = data.ctx, len(sizes), len(ratios)
  boxes_per_pixel = (num_sizes + num_ratios - 1)
  size_tensor = np.array(sizes, ctx=device)
  ratio_tensor = np.array(ratios, ctx=device)
  # Offsets are required to move the anchor to the center of a pixel. Since
  # a pixel has height=1 and width=1, we choose to offset our centers by 0.5
  offset_h, offset_w = 0.5, 0.5
  steps_h = 1.0 / in_height # Scaled steps in y-axis
  steps_w = 1.0 / in_width # Scaled steps in x-axis

  # Generate all center points for the anchor boxes
  center_h = (np.arange(in_height, ctx=device) + offset_h) * steps_h
  center_w = (np.arange(in_width, ctx=device) + offset_w) * steps_w
  shift_x, shift_y = np.meshgrid(center_w, center_h)
  shift_x, shift_y = shift_x.reshape(-1), shift_y.reshape(-1)

  # Generate `boxes_per_pixel` number of heights and widths that are later
  # used to create anchor box corner coordinates (xmin, ymin, xmax, ymax)
  w = np.concatenate((size_tensor * np.sqrt(ratio_tensor[0]),
            sizes[0] * np.sqrt(ratio_tensor[1:]))) \
            * 
