ShuffleNet v2 Network Structure Reproduction (PyTorch Version)

Summary: This post reproduces the ShuffleNet v2 backbone in PyTorch and verifies its layer shapes with torchsummary.


[Figure: ShuffleNet v2 network structure]

import torch
from torch import nn
from torchsummary import summary
# ---------------------------- ShuffleBlock start -------------------------------
# Channel shuffle: reorders channels so information can flow across groups
def channel_shuffle(x, groups):
    batchsize, num_channels, height, width = x.size()
    channels_per_group = num_channels // groups
    # reshape
    x = x.view(batchsize, groups,
               channels_per_group, height, width)
    x = torch.transpose(x, 1, 2).contiguous()
    # flatten
    x = x.view(batchsize, -1, height, width)
    return x
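# Illustrative example (assuming 6 channels, groups=2): the grouped layout
# [0 1 2 | 3 4 5] is interleaved into [0, 3, 1, 4, 2, 5]:
#   >>> t = torch.arange(6).view(1, 6, 1, 1)
#   >>> channel_shuffle(t, 2).flatten().tolist()
#   [0, 3, 1, 4, 2, 5]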
class CBRM(nn.Module):
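    """Stem: Conv-BN-ReLU followed by MaxPool; downsamples the input by 4x overall."""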
    def __init__(self, c1, c2):  # ch_in, ch_out
        super(CBRM, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(c1, c2, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(c2),
            nn.ReLU(inplace=True),
        )
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    def forward(self, x):
        return self.maxpool(self.conv(x))
class Shuffle_Block(nn.Module):
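    """ShuffleNet v2 unit: stride=1 keeps resolution via a channel split; stride>1 downsamples through two parallel branches."""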
    def __init__(self, inp, oup, stride):
        super(Shuffle_Block, self).__init__()
        if not (1 <= stride <= 3):
            raise ValueError('illegal stride value')
        self.stride = stride
        branch_features = oup // 2
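        # with stride=1 the input is split in half, so inp must equal 2 * branch_features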
        assert (self.stride != 1) or (inp == branch_features << 1)
        if self.stride > 1:
            self.branch1 = nn.Sequential(
                self.depthwise_conv(inp, inp, kernel_size=3, stride=self.stride, padding=1),
                nn.BatchNorm2d(inp),
                nn.Conv2d(inp, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(branch_features),
                nn.ReLU(inplace=True),
            )
        self.branch2 = nn.Sequential(
            nn.Conv2d(inp if (self.stride > 1) else branch_features,
                      branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.ReLU(inplace=True),
            self.depthwise_conv(branch_features, branch_features, kernel_size=3, stride=self.stride, padding=1),
            nn.BatchNorm2d(branch_features),
            nn.Conv2d(branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.ReLU(inplace=True),
        )
    @staticmethod
    def depthwise_conv(i, o, kernel_size, stride=1, padding=0, bias=False):
        # groups=i makes this a depthwise convolution: each channel is filtered independently
        return nn.Conv2d(i, o, kernel_size, stride, padding, bias=bias, groups=i)
    def forward(self, x):
        if self.stride == 1:
            x1, x2 = x.chunk(2, dim=1)  # split along the channel dimension (dim=1)
            out = torch.cat((x1, self.branch2(x2)), dim=1)
        else:
            out = torch.cat((self.branch1(x), self.branch2(x)), dim=1)
        out = channel_shuffle(out, 2)  # mix the two halves so information crosses branches
        return out
class ShuffleNetV2(nn.Module):
    def __init__(self):
        super(ShuffleNetV2, self).__init__()
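        # Backbone only (no classifier head); the spatial-size comments assume a 640x640 input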
        self.MobileNet_01 = nn.Sequential(
            CBRM(3, 32),                 # 160x160
            Shuffle_Block(32, 128, 2),   # 80x80
            Shuffle_Block(128, 128, 1),  # 80x80
            Shuffle_Block(128, 256, 2),  # 40x40
            Shuffle_Block(256, 256, 1),  # 40x40
            Shuffle_Block(256, 512, 2),  # 20x20
            Shuffle_Block(512, 512, 1),  # 20x20
        )
    def forward(self, x):
        x = self.MobileNet_01(x)
        return x
if __name__ == '__main__':
    shufflenetv2 = ShuffleNetV2()
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    model = shufflenetv2.to(device)
    # (C, H, W) of the input; use whichever device is actually available
    summary(model, (3, 640, 640), batch_size=1, device='cuda' if torch.cuda.is_available() else 'cpu')
    # print(shufflenetv2)
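To double-check the output shape without torchsummary, a quick forward pass can also be run (a minimal sketch; with a 640x640 input, the 4x-downsampling stem plus three stride-2 blocks should give a 512-channel 20x20 feature map):

x = torch.randn(1, 3, 640, 640)
net = ShuffleNetV2().eval()
with torch.no_grad():
    y = net(x)
print(y.shape)  # expected: torch.Size([1, 512, 20, 20])

Running the script prints the layer-by-layer summary: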
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [1, 32, 320, 320]             864
       BatchNorm2d-2          [1, 32, 320, 320]              64
              ReLU-3          [1, 32, 320, 320]               0
         MaxPool2d-4          [1, 32, 160, 160]               0
              CBRM-5          [1, 32, 160, 160]               0
            Conv2d-6            [1, 32, 80, 80]             288
       BatchNorm2d-7            [1, 32, 80, 80]              64
            Conv2d-8            [1, 64, 80, 80]           2,048
       BatchNorm2d-9            [1, 64, 80, 80]             128
             ReLU-10            [1, 64, 80, 80]               0
           Conv2d-11          [1, 64, 160, 160]           2,048
      BatchNorm2d-12          [1, 64, 160, 160]             128
             ReLU-13          [1, 64, 160, 160]               0
           Conv2d-14            [1, 64, 80, 80]             576
      BatchNorm2d-15            [1, 64, 80, 80]             128
           Conv2d-16            [1, 64, 80, 80]           4,096
      BatchNorm2d-17            [1, 64, 80, 80]             128
             ReLU-18            [1, 64, 80, 80]               0
    Shuffle_Block-19           [1, 128, 80, 80]               0
           Conv2d-20            [1, 64, 80, 80]           4,096
      BatchNorm2d-21            [1, 64, 80, 80]             128
             ReLU-22            [1, 64, 80, 80]               0
           Conv2d-23            [1, 64, 80, 80]             576
      BatchNorm2d-24            [1, 64, 80, 80]             128
           Conv2d-25            [1, 64, 80, 80]           4,096
      BatchNorm2d-26            [1, 64, 80, 80]             128
             ReLU-27            [1, 64, 80, 80]               0
    Shuffle_Block-28           [1, 128, 80, 80]               0
           Conv2d-29           [1, 128, 40, 40]           1,152
      BatchNorm2d-30           [1, 128, 40, 40]             256
           Conv2d-31           [1, 128, 40, 40]          16,384
      BatchNorm2d-32           [1, 128, 40, 40]             256
             ReLU-33           [1, 128, 40, 40]               0
           Conv2d-34           [1, 128, 80, 80]          16,384
      BatchNorm2d-35           [1, 128, 80, 80]             256
             ReLU-36           [1, 128, 80, 80]               0
           Conv2d-37           [1, 128, 40, 40]           1,152
      BatchNorm2d-38           [1, 128, 40, 40]             256
           Conv2d-39           [1, 128, 40, 40]          16,384
      BatchNorm2d-40           [1, 128, 40, 40]             256
             ReLU-41           [1, 128, 40, 40]               0
    Shuffle_Block-42           [1, 256, 40, 40]               0
           Conv2d-43           [1, 128, 40, 40]          16,384
      BatchNorm2d-44           [1, 128, 40, 40]             256
             ReLU-45           [1, 128, 40, 40]               0
           Conv2d-46           [1, 128, 40, 40]           1,152
      BatchNorm2d-47           [1, 128, 40, 40]             256
           Conv2d-48           [1, 128, 40, 40]          16,384
      BatchNorm2d-49           [1, 128, 40, 40]             256
             ReLU-50           [1, 128, 40, 40]               0
    Shuffle_Block-51           [1, 256, 40, 40]               0
           Conv2d-52           [1, 256, 20, 20]           2,304
      BatchNorm2d-53           [1, 256, 20, 20]             512
           Conv2d-54           [1, 256, 20, 20]          65,536
      BatchNorm2d-55           [1, 256, 20, 20]             512
             ReLU-56           [1, 256, 20, 20]               0
           Conv2d-57           [1, 256, 40, 40]          65,536
      BatchNorm2d-58           [1, 256, 40, 40]             512
             ReLU-59           [1, 256, 40, 40]               0
           Conv2d-60           [1, 256, 20, 20]           2,304
      BatchNorm2d-61           [1, 256, 20, 20]             512
           Conv2d-62           [1, 256, 20, 20]          65,536
      BatchNorm2d-63           [1, 256, 20, 20]             512
             ReLU-64           [1, 256, 20, 20]               0
    Shuffle_Block-65           [1, 512, 20, 20]               0
           Conv2d-66           [1, 256, 20, 20]          65,536
      BatchNorm2d-67           [1, 256, 20, 20]             512
             ReLU-68           [1, 256, 20, 20]               0
           Conv2d-69           [1, 256, 20, 20]           2,304
      BatchNorm2d-70           [1, 256, 20, 20]             512
           Conv2d-71           [1, 256, 20, 20]          65,536
      BatchNorm2d-72           [1, 256, 20, 20]             512
             ReLU-73           [1, 256, 20, 20]               0
    Shuffle_Block-74           [1, 512, 20, 20]               0
================================================================
Total params: 445,824
Trainable params: 445,824
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 4.69
Forward/backward pass size (MB): 270.31
Params size (MB): 1.70
Estimated Total Size (MB): 276.70
----------------------------------------------------------------
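For reference, the module hierarchy below is what the commented-out print(shufflenetv2) line produces: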
ShuffleNetV2(
  (MobileNet_01): Sequential(
    (0): CBRM(
      (conv): Sequential(
        (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    )
    (1): Shuffle_Block(
      (branch1): Sequential(
        (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=32, bias=False)
        (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): Conv2d(32, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4): ReLU(inplace=True)
      )
      (branch2): Sequential(
        (0): Conv2d(32, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=64, bias=False)
        (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (6): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (7): ReLU(inplace=True)
      )
    )
    (2): Shuffle_Block(
      (branch2): Sequential(
        (0): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=64, bias=False)
        (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (6): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (7): ReLU(inplace=True)
      )
    )
    (3): Shuffle_Block(
      (branch1): Sequential(
        (0): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=128, bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4): ReLU(inplace=True)
      )
      (branch2): Sequential(
        (0): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=128, bias=False)
        (4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (6): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (7): ReLU(inplace=True)
      )
    )
    (4): Shuffle_Block(
      (branch2): Sequential(
        (0): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128, bias=False)
        (4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (6): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (7): ReLU(inplace=True)
      )
    )
    (5): Shuffle_Block(
      (branch1): Sequential(
        (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=256, bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4): ReLU(inplace=True)
      )
      (branch2): Sequential(
        (0): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=256, bias=False)
        (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (6): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (7): ReLU(inplace=True)
      )
    )
    (6): Shuffle_Block(
      (branch2): Sequential(
        (0): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=256, bias=False)
        (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (6): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (7): ReLU(inplace=True)
      )
    )
  )
)