A Plain-Language Walkthrough of the Full Model Fine-Tuning Workflow
In machine learning, and especially in deep learning, pretrained models are widely used because of their strong generalization ability. Using a pretrained model as-is, however, often falls short of the best possible performance, particularly in a specialized domain or task. That is where fine-tuning comes in: adapting the model so it fits a new dataset better. This article walks through model fine-tuning step by step with a simple example, complete with a Python implementation.
Suppose we have an image classification task and want to fine-tune a pretrained ResNet-50. First, make sure PyTorch and its related dependencies are installed in your environment. Below is a complete walkthrough, from environment setup to model training and evaluation.
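As a quick sanity check (a minimal snippet, not required for the rest of the tutorial), you can print the installed versions and confirm whether a GPU is visible:

import torch
import torchvision

# Report library versions and GPU availability before starting
print('torch:', torch.__version__)
print('torchvision:', torchvision.__version__)
print('CUDA available:', torch.cuda.is_available())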
Step 1: import the required libraries:
import os

import torch
import torch.nn as nn
import torchvision.transforms as transforms
from torchvision import datasets, models
from torch.utils.data import DataLoader
Step 2: define the data preprocessing pipelines:
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),   # random crop + resize for augmentation
        transforms.RandomHorizontalFlip(),   # random flip for augmentation
        transforms.ToTensor(),
        # Normalize with the ImageNet per-channel means and standard deviations
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),          # deterministic crop for evaluation
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}
Step 3: load the dataset. Here we assume a subset of ImageNet as the training data, organized into train/ and val/ subfolders:
data_dir = 'path/to/dataset'  # expects train/ and val/ subdirectories
image_datasets = {
    x: datasets.ImageFolder(root=os.path.join(data_dir, x), transform=data_transforms[x])
    for x in ['train', 'val']
}
# Shuffle only the training set; validation order does not matter
dataloaders = {
    x: DataLoader(image_datasets[x], batch_size=32, shuffle=(x == 'train'), num_workers=4)
    for x in ['train', 'val']
}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
class_names = image_datasets['train'].classes
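Before moving on, it can help to pull a single batch and confirm the pipeline produces the shapes and labels you expect (an optional check, not part of the training flow):

# Fetch one batch and inspect it
inputs, labels = next(iter(dataloaders['train']))
print(inputs.shape)      # expected: torch.Size([32, 3, 224, 224])
print(labels[:8])        # integer class indices
print(class_names[:5])   # first few class names discovered by ImageFolder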
Step 4: load the pretrained ResNet-50 and replace its classification head:
# torchvision >= 0.13 uses the weights argument; older versions use pretrained=True
model_ft = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
num_ftrs = model_ft.fc.in_features
# Swap the 1000-class ImageNet head for one matching our dataset
model_ft.fc = nn.Linear(num_ftrs, len(class_names))
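A common variant at this step, not used in the run below, is to freeze the pretrained backbone and train only the new head; this is cheaper and often works well on small datasets. A minimal sketch:

# Optional: freeze every pretrained weight so only the new head learns
for param in model_ft.parameters():
    param.requires_grad = False
# Re-create the head AFTER freezing; fresh layers default to requires_grad=True
model_ft.fc = nn.Linear(num_ftrs, len(class_names))

If you go this route, pass only model_ft.fc.parameters() to the optimizer in the next step.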
Step 5: set up the device, loss function, optimizer, and learning-rate scheduler:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model_ft = model_ft.to(device)
criterion = nn.CrossEntropyLoss()
# SGD with momentum over all parameters: every layer gets fine-tuned
optimizer_ft = torch.optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)
# Decay the learning rate by 10x every 7 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
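Another common refinement, again optional and with illustrative values only, is to give the freshly initialized head a larger learning rate than the pretrained backbone by passing parameter groups to the optimizer:

# Smaller LR for pretrained layers, larger LR for the new head
backbone_params = [p for name, p in model_ft.named_parameters() if not name.startswith('fc')]
optimizer_ft = torch.optim.SGD([
    {'params': backbone_params, 'lr': 1e-4},
    {'params': model_ft.fc.parameters(), 'lr': 1e-3},
], momentum=0.9)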
Step 6: define the training loop:
def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    for epoch in range(num_epochs):
        print(f'Epoch {epoch}/{num_epochs - 1}')
        print('-' * 10)

        # Training phase
        model.train()
        running_loss = 0.0
        running_corrects = 0
        for inputs, labels in dataloaders['train']:
            inputs = inputs.to(device)
            labels = labels.to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            # Accumulate loss weighted by batch size, plus correct predictions
            running_loss += loss.item() * inputs.size(0)
            running_corrects += torch.sum(preds == labels.data)
        if scheduler is not None:
            scheduler.step()

        # Validation phase: no gradient tracking needed
        model.eval()
        val_loss = 0.0
        val_corrects = 0
        for inputs, labels in dataloaders['val']:
            inputs = inputs.to(device)
            labels = labels.to(device)
            with torch.no_grad():
                outputs = model(inputs)
                _, preds = torch.max(outputs, 1)
                loss = criterion(outputs, labels)
            val_loss += loss.item() * inputs.size(0)
            val_corrects += torch.sum(preds == labels.data)

        epoch_loss = running_loss / dataset_sizes['train']
        epoch_acc = running_corrects.double() / dataset_sizes['train']
        val_epoch_loss = val_loss / dataset_sizes['val']
        val_epoch_acc = val_corrects.double() / dataset_sizes['val']
        print(f'Train Loss: {epoch_loss:.4f} Acc: {epoch_acc:.4f}')
        print(f'Val Loss: {val_epoch_loss:.4f} Acc: {val_epoch_acc:.4f}')
    return model
# A short demo run; real fine-tuning usually needs more epochs
model_ft = train_model(model_ft, criterion, optimizer_ft, scheduler, num_epochs=2)
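Once training finishes you will usually want to persist the fine-tuned weights. A minimal sketch (the file name here is arbitrary):

# Save only the state dict, the recommended way to checkpoint in PyTorch
torch.save(model_ft.state_dict(), 'resnet50_finetuned.pth')

# Later: rebuild the architecture, load the weights back, and switch to eval mode
model = models.resnet50()
model.fc = nn.Linear(num_ftrs, len(class_names))
model.load_state_dict(torch.load('resnet50_finetuned.pth', map_location=device))
model = model.to(device)
model.eval()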
The steps above cover the basic fine-tuning workflow. Note that in practice it is common to tune hyperparameters such as the learning rate and batch size for your specific task, and to add techniques like early stopping to guard against overfitting (a sketch follows below). Hopefully this guide helps you better understand and practice model fine-tuning.
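For reference, early stopping can be as simple as tracking the best validation loss and stopping once it has not improved for a set number of epochs. A minimal sketch (the patience value is illustrative):

class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""
    def __init__(self, patience=5):
        self.patience = patience
        self.best_loss = float('inf')
        self.bad_epochs = 0

    def step(self, val_loss):
        # Returns True when training should stop
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        return self.bad_epochs >= self.patience

Inside the epoch loop of train_model, call step() with the epoch's validation loss after the validation phase and break out of the loop when it returns True.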