RepVGG网络中重参化网络结构解读【附代码】上

简介: 笔记

    YOLOv7论文中会遇到一个词叫“重参化网络”或者“重参化卷积”,YOLOV7则是用到了这种网络结构,里面参考的论文是“RepVGG: Making VGG-style ConvNets Great Again”。该网络是在预测阶段采用了一种类似于VGG风格的结构,均有3X3卷积层与ReLU激活函数组成,而在训练阶段网络结构类似ResNet具有多分支。训练和推理时候的解耦合是通过结构重参数(re-parameterization)技术实现的,所以叫RepVGG,而这种方法不仅拥有很好的准确率,同时也可以降低计算开销,提升速度。所以本文章也算是进一步理解YOLOV7的补充。



       在训练阶段网络结构长这个下面这个样子,作者说到RepVGG有5个stage,下面展示的仅是其中之一,其实这个结构与ResNet很相似,有1X1的Conv,也有一个残差边identity分支,在每个stage开始的阶段,会采用一个stride=2的卷积进行下采样,同时该层仅有1X1卷积,也能看到最上面这一部分是没有identity分支,这个分支仅在in_channels=out_channels且stride=1才有,代码里有体现:

20.png

RepVGG training


        在正常推理阶段,网络结构又会变成下面的这个样子,与上面的图对比,在推理阶段少了一些分支网络但主体结构是一样的,而这种结构与VGG很相似。但这个分支去哪了呢?是直接删了么?【我刚开始看到这里就是这样人为,因为有其他论文重要做过】

21.png


                  RepVGG inference


       其实上面的实现原理也是很简单,方法也很直接,可以使用原始3×3核、identity和1×1分支以及批归一化(BN)层来构造单个3×3核。就是用一个3X3代替原来的3X3 、1X1以及identity。在推理阶段会将卷积层与BN层进行一个融合,identity、1X1卷积分支,可以将其通过padding的方式扩成3X3卷积,最后将三个3X3的卷积相加后形成一个新的3X3卷积即可,这就好像把原来的分支与主干进行了融合,变成了一个卷积层。

22.png

先来看一下RepVGG的卷积块代码【也是网络基本组成单元】。

class RepVGGBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size,
                 stride=1, padding=0, dilation=1, groups=1, padding_mode='zeros', deploy=False, use_se=False):
        super(RepVGGBlock, self).__init__()
        self.deploy = deploy
        self.groups = groups 
        self.in_channels = in_channels
        assert kernel_size == 3
        assert padding == 1
        padding_11 = padding - kernel_size // 2
        self.nonlinearity = nn.ReLU()  # 激活函数
        if use_se:  # 是否使用注意力机制
            self.se = SEBlock(out_channels, internal_neurons=out_channels // 16)
        else:  # 残差边
            self.se = nn.Identity()
        if deploy:
            self.rbr_reparam = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride,
                                      padding=padding, dilation=dilation, groups=groups, bias=True, padding_mode=padding_mode)
        else:
            self.rbr_identity = nn.BatchNorm2d(num_features=in_channels) if out_channels == in_channels and stride == 1 else None
            self.rbr_dense = conv_bn(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, groups=groups)
            self.rbr_1x1 = conv_bn(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=stride, padding=padding_11, groups=groups)
            print('RepVGG Block, identity = ', self.rbr_identity)
    def forward(self, inputs):
        if hasattr(self, 'rbr_reparam'):
            return self.nonlinearity(self.se(self.rbr_reparam(inputs)))
        if self.rbr_identity is None:  # 是否使用identity,当in_channels ≠ out_channels,且stride!=1 则不使用,实际就是每个stage开始的时候不用
            id_out = 0
        else:
            id_out = self.rbr_identity(inputs)
        """
        x-->conv_bn--------add--->ReLu
        |_________>1X1______|
        |_________>BN_______|
        """
        return self.nonlinearity(self.se(self.rbr_dense(inputs) + self.rbr_1x1(inputs) + id_out))

从上面代码中也能看出identity使用的先决条件是in_channels与out_channels相等,并且stride=1。可以根据return的返回结果去自己写一下网络结构【我已画出】。为了可以更直观的看一下代码到底和论文中的结构以及我们自己画的结构一不一样,可以直接将这个RepVGGBlock作为一个单元,保存一下结构图看一下。这里我假设的输入shape为【1,32,224,224】,即输入通道32,大小为224X224,同时RepVGGBlock的in_channels和out_channels均等于32,stride=1,padding=1。



23.png

   可以看到从输入出来后有三个分支,图中那个BN分支就是identity,然后是3X3的主干与1X1的分支。三个分支最终相加后再经过Relu通道维度也未改变,与论文中是一致的。上面的结构就是组成训练阶段的基本网络单元。


       在RepVGG官方代码中提供网络融合的函数:repvgg_model_convert()。

def repvgg_model_convert(model:torch.nn.Module, save_path=None, do_copy=True):
    if do_copy:
        model = copy.deepcopy(model)  # 深拷贝
    for module in model.modules():
        if hasattr(module, 'switch_to_deploy'):
            module.switch_to_deploy()
    if save_path is not None:
        torch.save(model.state_dict(), save_path)
    return model

参数model:是训练时的网络模型,也就是含有各个分支的。


返回值model:时最终我们要的推理时的结构,即去掉了各个分支。


在代码中可以看到有个module.switch_to_deploy(),这个是调用model中的该属性,这个函数在源码中RepVGGBlock有定义。


以RepVGG_A0为例,打印的网络模型如下,一共有5个stage【从0~4】:

RepVGG(
  (stage0): RepVGGBlock(
    (nonlinearity): ReLU()
    (se): Identity()
    (rbr_dense): Sequential(
      (conv): Conv2d(3, 48, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (rbr_1x1): Sequential(
      (conv): Conv2d(3, 48, kernel_size=(1, 1), stride=(2, 2), bias=False)
      (bn): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (stage1): Sequential(
    (0): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_dense): Sequential(
        (conv): Conv2d(48, 48, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(48, 48, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(48, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (stage2): Sequential(
    (0): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_dense): Sequential(
        (conv): Conv2d(48, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(48, 96, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (2): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (3): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (stage3): Sequential(
    (0): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_dense): Sequential(
        (conv): Conv2d(96, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(96, 192, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (2): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (3): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (4): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (5): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (6): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (7): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (8): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (9): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (10): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (11): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (12): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (13): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_identity): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (stage4): Sequential(
    (0): RepVGGBlock(
      (nonlinearity): ReLU()
      (se): Identity()
      (rbr_dense): Sequential(
        (conv): Conv2d(192, 1280, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn): BatchNorm2d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (rbr_1x1): Sequential(
        (conv): Conv2d(192, 1280, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn): BatchNorm2d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (gap): AdaptiveAvgPool2d(output_size=1)
  (linear): Linear(in_features=1280, out_features=1000, bias=True)
)


目录
相关文章
|
6月前
|
机器学习/深度学习 算法 网络架构
【YOLOv8改进 - Backbone主干】EfficientRep:一种旨在提高硬件效率的RepVGG风格卷积神经网络架构
【YOLOv8改进 - Backbone主干】EfficientRep:一种旨在提高硬件效率的RepVGG风格卷积神经网络架构
|
2月前
|
机器学习/深度学习 自然语言处理 语音技术
Python在深度学习领域的应用,重点讲解了神经网络的基础概念、基本结构、训练过程及优化技巧
本文介绍了Python在深度学习领域的应用,重点讲解了神经网络的基础概念、基本结构、训练过程及优化技巧,并通过TensorFlow和PyTorch等库展示了实现神经网络的具体示例,涵盖图像识别、语音识别等多个应用场景。
73 8
|
5月前
|
机器学习/深度学习 资源调度 自然语言处理
不同类型的循环神经网络结构
【8月更文挑战第16天】
59 0
|
3月前
|
机器学习/深度学习 计算机视觉 网络架构
【YOLO11改进 - C3k2融合】C3k2融合YOLO-MS的MSBlock : 分层特征融合策略,轻量化网络结构
【YOLO11改进 - C3k2融合】C3k2融合YOLO-MS的MSBlock : 分层特征融合策略,轻量化网络结构
|
3月前
|
边缘计算 自动驾驶 5G
5G的网络拓扑结构典型模式
5G的网络拓扑结构典型模式
388 4
|
3月前
|
机器学习/深度学习 算法
神经网络的结构与功能
神经网络是一种广泛应用于机器学习和深度学习的模型,旨在模拟人类大脑的信息处理方式。它们由多层不同类型的节点或“神经元”组成,每层都有特定的功能和责任。
123 0
|
4月前
|
编解码 人工智能 文件存储
卷积神经网络架构:EfficientNet结构的特点
EfficientNet是一种高效的卷积神经网络架构,它通过系统化的方法来提升模型的性能和效率。
82 1
|
5月前
|
机器学习/深度学习 算法 文件存储
【博士每天一篇文献-算法】 PNN网络启发的神经网络结构搜索算法Progressive neural architecture search
本文提出了一种名为渐进式神经架构搜索(Progressive Neural Architecture Search, PNAS)的方法,它使用顺序模型优化策略和替代模型来逐步搜索并优化卷积神经网络结构,从而提高了搜索效率并减少了训练成本。
72 9
|
6月前
|
机器学习/深度学习 自然语言处理
像生物网络一样生长,具备结构可塑性的自组织神经网络来了
【7月更文挑战第24天】Sebastian Risi团队发布的arXiv论文探讨了一种模仿生物神经网络生长与适应特性的新型神经网络。LNDP利用结构可塑性和经验依赖学习,能根据活动与奖励动态调整连接,展现自我组织能力。通过基于图变换器的机制,LNDP支持突触动态增删,预先通过可学习随机过程驱动网络发育。实验在Cartpole等任务中验证了LNDP的有效性,尤其在需快速适应的场景下。然而,LNDP在复杂环境下的可扩展性及训练优化仍面临挑战,且其在大规模网络和图像分类等领域的应用尚待探索
117 20
|
5月前
|
机器学习/深度学习 人工智能 PyTorch
AI智能体研发之路-模型篇(五):pytorch vs tensorflow框架DNN网络结构源码级对比
AI智能体研发之路-模型篇(五):pytorch vs tensorflow框架DNN网络结构源码级对比
91 1

热门文章

最新文章