A Study of LSTM-Based Time Series Forecasting

2026-06-08

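LSTM Time Series Analysis and Forecasting: Table of Contents

Overview

Use LSTM neural networks to analyze and forecast time series data
Network models are developed with the Keras API on the TensorFlow framework
Covers data cleaning, feature extraction, modeling, and prediction

Project Resources

Autoregressive (AR, ARIMA) time series forecasting collection: see the card at the bottom of the page for the code
Deep learning time series forecasting collection: see the card at the bottom of the page for the code
NLP-based text analysis project collection: see the card at the bottom of the page for the code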

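Part 1: Basic LSTM Applications

1. Univariate LSTM Forecasting (shampoo-sales)

Univariate LSTM basics

Shampoo sales forecasting case

Data preprocessing

Scaling the observations
Transforming the time series to stationary data
Transforming the time series to supervised-learning data

Model development

Building the LSTM model
Complete LSTM example
Robustness-focused example

2. Multivariate LSTM Forecasting (air_pollution)

Data preparation

Multivariate data output
Preprocessing workflow

Model development

Data preprocessing for the LSTM
Model definition and training

3. Multi-Step LSTM Forecasting

Static model forecasting
Multi-step forecasting LSTM network implementation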

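Part 2: Advanced LSTM Applications (airline-passengers)

LSTM regression network (1→1)
Regression with a sliding window (3→1)
Regression with time steps (3→1)
LSTM with memory between batches
Stacked LSTM with memory between batches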

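Part 3: Core LSTM Features

1. Encoder-Decoder Architecture

Echo random sequence case

Data preparation
Sequence prediction
Model implementation
Simplified version with observable data

Input/output patterns

One-to-one LSTM
Many-to-one LSTM
Many-to-many LSTM (TimeDistributed)

Stateful network forecasting

Configuring input-output pairs
Data reshaping methods
Complete worked example

2. The Keras LSTM Life Cycle

Five-step workflow
Code walkthrough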

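Part 4: Data Preparation Techniques

1. Handling Missing Values

Learning from sequences with missing values
Ignoring missing values
Removing missing data
Replacing missing data

2. Data Standardization

Standardization methods
Normalization methods

3. Data Transformation

Differencing to remove seasonality
Differencing to remove trend

4. Feature Encoding

One-hot encoding implementations

With Keras
With scikit-learn
Manual implementation

5. Data Reshaping

Handling a single input sample
Handling multiple input features
Preparing a univariate time series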

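Part 5: LSTM Modeling Techniques

1. Network Architecture

Stacked LSTM implementation

2D-output version
3D-output version

2. Model Management

Saving and loading models

3. Model Diagnostics

Identifying underfitting (too few training epochs)
Criteria for a well-fitted model
Identifying overfitting
Evaluating across multiple fits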

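Part 6: Complete Case Studies

Case 1: Air Quality Forecasting (Multivariate)

Data preparation and visualization
Conversion to supervised-learning data
One-day forecasting model
Three-day forecasting model

Case 2: Shampoo Sales (Single-Step Forecasting)

Dataset analysis
Building the lag model
Supervised data structure
Differencing and scaling
LSTM implementation and evaluation
Testing on stock data

Forecasting on raw data
Validation-loss analysis

Case 3: Shampoo Sales (Multi-Step Forecasting)

Preparing the supervised data
Static forecast results
Neural network forecasting implementation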
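
Several outline entries above mention differencing to remove trend and converting a series into supervised-learning data. The following is a minimal sketch of both steps using only NumPy. The helper names (`difference`, `inverse_difference`, `to_supervised`) and the sample values are assumptions made for illustration; they are not taken from the project's code.

```python
import numpy as np

def difference(series, interval=1):
    """Return series[t] - series[t - interval] for every t >= interval."""
    series = np.asarray(series, dtype=float)
    return series[interval:] - series[:-interval]

def inverse_difference(first_values, diffed, interval=1):
    """Rebuild the original series from its first `interval` values and the differences."""
    restored = list(np.asarray(first_values, dtype=float))
    for d in diffed:
        # Each restored value is the value `interval` steps back plus the stored difference.
        restored.append(restored[-interval] + d)
    return np.array(restored)

def to_supervised(series, lag=1):
    """Pair each value with the `lag` values that precede it."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:]
    return X, y

# Made-up values with an upward trend (hypothetical data, for illustration only).
raw = np.array([10.0, 12.0, 15.0, 19.0, 24.0])
diffed = difference(raw)                         # first-order differences
restored = inverse_difference(raw[:1], diffed)   # undo the differencing
X, y = to_supervised(diffed, lag=2)              # inputs are the 2 previous differences
```

Differencing is applied before scaling and framing so that the model sees a roughly stationary signal; forecasts made on the differenced scale then have to be passed through the inverse step to return to the original units.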

Core Code

# coding=utf-8
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()  # run in TF1-style graph mode
# ---------------- Load data ----------------
df = pd.read_csv('dataset_1.csv')   # read the stock data
data = np.array(df['max'])          # series of daily high prices
data = data[::-1]                   # reverse so the data is in chronological order
# Plot data as a line chart
# plt.figure()
# plt.plot(data)
# plt.show()
normalize_data = (data - np.mean(data)) / np.std(data)  # standardize (z-score)
normalize_data = normalize_data[:, np.newaxis]          # add one dimension
# ---------------- Build the training set ----------------
time_step = 20      # time steps per sample
rnn_unit = 10       # hidden layer units
lstm_layers = 2     # number of stacked LSTM layers
batch_size = 60     # number of samples per training batch
input_size = 1      # input layer dimension
output_size = 1     # output layer dimension
lr = 0.0006         # learning rate
train_x, train_y = [], []  # training set
for i in range(len(normalize_data) - time_step - 1):
    x = normalize_data[i:i + time_step]
    y = normalize_data[i + 1:i + time_step + 1]  # the same window shifted one step ahead
    train_x.append(x.tolist())
    train_y.append(y.tolist())
# Shape of each X batch: (?, time_step, input_size)
X = tf.placeholder(tf.float32, [None, time_step, input_size])
# Shape of each Y batch: (?, time_step, output_size)
Y = tf.placeholder(tf.float32, [None, time_step, output_size])
# ---------------- Define the network variables ----------------
# Weights and biases for the input and output layers
weights = {
    'in': tf.Variable(tf.random_normal([input_size, rnn_unit])),
    'out': tf.Variable(tf.random_normal([rnn_unit, 1]))
}
print(weights)
biases = {
    'in': tf.Variable(tf.constant(0.1, shape=[rnn_unit, ])),
    'out': tf.Variable(tf.constant(0.1, shape=[1, ]))
}
print(biases)
# Parameter: number of samples in each batch fed to the network
def lstm(batch):
    w_in = weights['in']
    b_in = biases['in']
    print(X)
    # Flatten the tensor to 2-D for the matmul; the result feeds the hidden layer
    input_2d = tf.reshape(X, [-1, input_size])
    print(input_2d)
    input_rnn = tf.matmul(input_2d, w_in) + b_in
    # Reshape back to 3-D as the input to the LSTM cells
    input_rnn = tf.reshape(input_rnn, [-1, time_step, rnn_unit])
    cell = tf.nn.rnn_cell.MultiRNNCell([tf.nn.rnn_cell.BasicLSTMCell(rnn_unit) for i in range(lstm_layers)])
    init_state = cell.zero_state(batch, dtype=tf.float32)
    output_rnn, final_states = tf.nn.dynamic_rnn(cell, input_rnn, initial_state=init_state, dtype=tf.float32)
    output = tf.reshape(output_rnn, [-1, rnn_unit])  # input to the output layer
    w_out = weights['out']
    b_out = biases['out']
    pred = tf.matmul(output, w_out) + b_out
    return pred, final_states
def train_lstm():
    with tf.variable_scope("sec_lstm"):
        pred, _ = lstm(batch_size)
    # Loss function: mean squared error
    loss = tf.reduce_mean(tf.square(tf.reshape(pred, [-1]) - tf.reshape(Y, [-1])))
    train_op = tf.train.AdamOptimizer(lr).minimize(loss)
    saver = tf.train.Saver(tf.global_variables())
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Train for 100 epochs; increasing the number of iterations may give a better result
        for i in range(100):
            step = 0
            start = 0
            end = start + batch_size
            print("i = ", i)
            # Only full batches are used, because the initial state is built for batch_size samples
            while end < len(train_x):
                _, loss_ = sess.run([train_op, loss], feed_dict={X: train_x[start:end], Y: train_y[start:end]})
                start += batch_size
                end = start + batch_size
                if step % 100 == 0:  # save the parameters every 100 steps
                    print("Number of iterations:", i, " loss:", loss_)
                    print("model_save", saver.save(sess, 'model_save1\\modle.ckpt'))
                    # This was run on Windows 10, hence 'model_save1\\modle.ckpt'
                    # On Linux, use 'model_save1/modle.ckpt'
                step += 1
        print("Training has finished")
train_lstm()
# prediction()  # not defined in this listing
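
To make the shapes fed to the `X` and `Y` placeholders easier to check, the sketch below repeats the listing's standardization and windowing steps with NumPy alone and then maps the standardized targets back to the original scale. The functions `standardize` and `make_windows` and the placeholder `prices` array are assumptions introduced for illustration; the loop bound copies the one used in the listing.

```python
import numpy as np

def standardize(series):
    """Z-score a 1-D series and return the statistics needed to undo it."""
    mean, std = series.mean(), series.std()
    return (series - mean) / std, mean, std

def make_windows(column, time_step):
    """Build (x, y) pairs where y is the x window shifted one step ahead."""
    xs, ys = [], []
    for i in range(len(column) - time_step - 1):  # same bound as the listing
        xs.append(column[i:i + time_step])
        ys.append(column[i + 1:i + time_step + 1])
    return np.array(xs), np.array(ys)

# Placeholder series standing in for the 'max' price column (hypothetical data).
prices = np.arange(1.0, 31.0)
normalized, mean, std = standardize(prices)
column = normalized[:, np.newaxis]               # shape (30, 1): one feature per step
train_x, train_y = make_windows(column, time_step=20)

# Undo the standardization on the last target of every window.
restored = train_y[:, -1, 0] * std + mean
```

With 30 points and `time_step=20` this yields 9 samples of shape `(20, 1)` each, which matches the `[None, time_step, input_size]` placeholder layout. Any value the network predicts is on the standardized scale, so the same `* std + mean` step is needed before comparing predictions with actual prices.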
