Architecting Together - design thinking and POC code for the cloud-native serverless architecture of a real-time analytics project


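1. Preface - cloud native and multi-cloud/hybrid-cloud deployment architectures

Hello everyone, this is Ming Ge!

Against the backdrop of digital transformation, more and more enterprises are moving more and more of their applications to the cloud, and application architectures are leaning toward cloud native in order to support multi-cloud and hybrid-cloud deployments.

A while ago, I took part in the architecture design and POC development of a real-time analytics project on AWS. The project uses a serverless, cloud-native architecture, and here I share the details of the architecture design and the POC code. I hope you enjoy it.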

2. Project background and goals

The background and goals of the whole project are as follows:

[Figure: project background and goals]


Distilled and summarized, the project's background, core goals, and extra goals are as follows:

  • Background: Ingest, transform and prepare the netCDF data provided by the UK Met Office, and make it available for secure querying by our customer as soon as it arrives in the S3 bucket.
  • Core goals:
      • high availability (no downtime)
      • quick response
      • timely availability of new data
  • Extra goals:
      • security
      • cost effectiveness

3. Architecture overview

The final, complete architecture diagram is as follows:

[Figure: overall architecture diagram]


4. Architecture details and thought process

4.1 How to discover newly available data ASAP? - SQS

  • The UK Met Office prepares the original data in netCDF format and uploads it to an S3 bucket. Because S3 is an object store rather than a file system, listing a bucket is both expensive and slow, so we can’t rely on listing the bucket to discover new data quickly;
  • We noticed that the UK Met Office also sends a message to an SNS topic once new data is available in the S3 bucket, so we can subscribe an SQS queue to that SNS topic and get notified whenever new objects are created. This solution is low-latency, cost-effective and scalable;
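As a minimal sketch of what a consumer sees on the queue: when SNS delivers to SQS without raw message delivery, the SQS message body is a JSON envelope whose "Message" field holds the publisher's payload as a string. The payload field names used here ("bucket", "key") and the sample values are assumptions for illustration, not the Met Office's actual message schema.

```python
import json


def extract_s3_location(sqs_body: str) -> dict:
    """Unwrap an SNS notification delivered to SQS and return the announced object."""
    envelope = json.loads(sqs_body)
    # SNS stores the publisher's payload as a string in the "Message" field.
    payload = json.loads(envelope["Message"])
    # "bucket" and "key" are hypothetical field names used for illustration.
    return {"bucket": payload["bucket"], "key": payload["key"]}


# A hand-built sample body that mimics the SNS envelope.
sample_body = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({
        "bucket": "example-source-bucket",
        "key": "forecasts/sample.nc",
    }),
})

location = extract_s3_location(sample_body)
print(location)
```

If raw message delivery were enabled on the subscription, the envelope would be absent and the body could be parsed directly.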

4.2 Can we use the original S3 bucket used by the UK Met Office?

  • We noticed that the original data is kept in the bucket for only 7 days after the notification is sent, and is then deleted;
  • So we use our own S3 bucket to store the data, which gives us full control over the data, including its lifecycle, its security policies, etc;

4.3 How to serve our end users with quick response and high availability? - API Gateway + DynamoDB

  • Our end users typically ask questions like “What will the weather/humidity/temperature be like in city C1 at time T1? How about cities C2 and C3? How about time T2?”. To answer such a question, we first have to figure out which files in the S3 bucket contain forecast results for that specific time (every file contains forecast results for all the cities in the UK, so location is not a problem);
  • So we can use RDS or DynamoDB to store the metadata “which S3 file contains forecast results for which time”. When we receive a specific question from our customer, we first query RDS/DynamoDB to find the corresponding S3 file, and the customer can then query that S3 file to get all the forecast details, including weather/humidity/temperature etc, for all the UK cities;
  • RDS is a relational database, typically used for well-formed structured data, while DynamoDB is a fully managed key-value NoSQL data store. Both can fulfill our functional requirements, but since we don’t have highly structured data and DynamoDB shines in availability, scalability and performance, we go with DynamoDB;
  • We can use API Gateway as a proxy to DynamoDB and answer the end user’s request directly, without an extra Lambda layer between API Gateway and DynamoDB. The serving path is therefore shorter, which makes it faster, cheaper and less error-prone. API Gateway also provides many security mechanisms, including authentication, authorization, audit and encryption;
  • With API Gateway, DynamoDB and S3, the whole serving layer responds quickly with high availability, and is also cost-effective and secure;

4.4 How to ingest, transform and prepare the original data - Lambda!

  • To consume messages from SQS queues, we would normally follow an event-driven architecture and use a stream processing framework such as Spark Streaming, Flink or Kafka Streams. To use them, however, you first need to provision EC2 servers (and possibly use ECS/EKS), and then deploy, monitor and scale (both up and down) your app all by yourself, which is cumbersome and not cost-effective;
  • You could consider serverless Fargate, but you would have to handle the event-driven part yourself;
  • Lambda is both serverless and event-driven: it scales automatically with your data volume, integrates well with other AWS services like SQS, S3, DynamoDB and API Gateway, and lets you pay only for what you use, so it is a perfect match for our case!
  • We can create a Lambda function with an SQS trigger, so that as soon as events arrive in SQS, the Lambda function is invoked to transform the data and load it into the downstream DynamoDB table;
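The SQS-triggered flow described above can be sketched roughly as follows. This is not the project's actual Lambda code: the payload fields and the to_item mapping are hypothetical, and the DynamoDB writer is injected as a plain callable so the sketch runs locally without AWS. The event shape (a "Records" list whose entries carry the message in "body") is the one Lambda uses for SQS triggers.

```python
import json
from typing import Callable


def to_item(notification: dict) -> dict:
    """Map one notification to a metadata record (field names are hypothetical)."""
    return {
        "forecast_time": notification["forecast_time"],
        "forecast_reference_time": notification["forecast_reference_time"],
        "s3_key": notification["key"],
    }


def make_handler(write_items: Callable[[list], None]):
    """Build a Lambda-style handler; write_items stands in for the DynamoDB writer."""
    def handler(event: dict, context: object = None) -> dict:
        items = []
        for record in event.get("Records", []):
            envelope = json.loads(record["body"])            # SQS message body
            notification = json.loads(envelope["Message"])   # SNS payload inside it
            items.append(to_item(notification))
        write_items(items)
        return {"processed": len(items)}
    return handler


# Local simulation: collect the items in a list instead of writing to DynamoDB.
collected = []
handler = make_handler(collected.extend)
event = {"Records": [{"body": json.dumps({"Message": json.dumps({
    "forecast_time": "2022-04-16T22:45:00Z",
    "forecast_reference_time": "2022-04-16T20:00:00Z",
    "key": "forecasts/sample.nc",
})})}]}
result = handler(event)
print(result, collected)
```

Injecting the writer keeps the transform logic testable on its own; in a deployed function it would be replaced by a call into the AWS SDK.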

5. Component details and code samples

5.1 Component details and code samples - SQS and Lambda

  • SQS type: as there is no need for first-in-first-out message delivery or exactly-once processing, we can stay with a standard SQS queue, which offers better scalability;
  • SQS encryption: Amazon SQS provides in-transit encryption by default; we also added at-rest encryption to our queue by enabling server-side encryption with the Amazon SQS key option (SSE-SQS);
  • Lambda: the Lambda function has an SQS trigger, and for performance reasons it writes to DynamoDB in batches;
  • Lambda permission: to follow the least-privilege principle, we created a new IAM role with basic Lambda permissions (with just policies like AWSLambdaSQSQueueExecutionRole/AWSLambdaExecute/AWSLambdaDynamoDBExecutionRole)
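To illustrate the batch writes mentioned above, here is a small sketch that only builds the request bodies. DynamoDB's BatchWriteItem operation accepts at most 25 put or delete requests per call, so the items are split into chunks of 25. The table name forecast_metadata and the sample items are assumptions; actually sending the requests (and retrying any UnprocessedItems the service returns) is left out.

```python
from typing import Iterator

MAX_BATCH = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def chunked(items: list, size: int = MAX_BATCH) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batch_write_requests(table: str, items: list) -> list:
    """Return one BatchWriteItem request body per chunk of items."""
    return [
        {"RequestItems": {table: [{"PutRequest": {"Item": item}} for item in chunk]}}
        for chunk in chunked(items)
    ]


# 60 hypothetical items in DynamoDB's typed attribute format.
items = [
    {"forecast_period": {"N": str(i * 900)}, "forecast_time": {"S": "2022-04-16T22:45:00Z"}}
    for i in range(60)
]
requests = batch_write_requests("forecast_metadata", items)
batch_sizes = [len(r["RequestItems"]["forecast_metadata"]) for r in requests]
print(batch_sizes)  # [25, 25, 10]
```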

5.2 Component details and code samples - DynamoDB

  • DynamoDB is serverless and scales automatically with data volume and query load. To avoid hot-partition bottlenecks, we used forecast_period as the partition key (hash key) and forecast_time as the sort key (forecast_period is the difference between forecast_reference_time and forecast_time);
  • As end users typically query by time, we created a global secondary index (GSI) whose partition key is the time field forecast_time;
  • Encryption: we turned on encryption at rest, using an encryption key held in AWS Key Management Service that is managed by DynamoDB at no extra cost;
  • Permission: for the APIs that query DynamoDB, we followed the least-privilege principle and created an access control policy that grants only read access to the table and index;
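The key design above can be made concrete with a short sketch that derives forecast_period and assembles an item in DynamoDB's typed attribute format. Expressing the period in seconds, storing it as a number, and the s3_key attribute are assumptions made for illustration; the original only states that forecast_period is the difference between the two timestamps.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse timestamps of the form 2022-04-16T22:45:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def forecast_period_seconds(reference_time: str, forecast_time: str) -> int:
    """Lead time of the forecast, in seconds (the unit is an assumption)."""
    delta = parse_utc(forecast_time) - parse_utc(reference_time)
    return int(delta.total_seconds())


def build_item(reference_time: str, forecast_time: str, s3_key: str) -> dict:
    """Assemble one metadata item keyed as described in the text."""
    period = forecast_period_seconds(reference_time, forecast_time)
    return {
        "forecast_period": {"N": str(period)},    # table partition key
        "forecast_time": {"S": forecast_time},    # table sort key, GSI partition key
        "forecast_reference_time": {"S": reference_time},
        "s3_key": {"S": s3_key},                  # hypothetical attribute name
    }


item = build_item("2022-04-16T20:00:00Z", "2022-04-16T22:45:00Z", "forecasts/sample.nc")
print(item["forecast_period"])  # {'N': '9900'}
```

Because ISO 8601 UTC timestamps sort lexicographically in chronological order, keeping forecast_time as a string still allows range conditions on the sort key.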

5.3 Component details and code samples - API Gateway

  • API: I created two resources and methods, with paths like /times and /times/{time}, and configured the mapping templates of the integration request and integration response to carry out the scan and query on DynamoDB; the latter path uses the global secondary index (GSI) we created for the table;
  • API key: I configured the method request to require an API key;
  • Permission: to follow the least-privilege principle, we created a new IAM role with only the necessary permissions (with just policies like AmazonAPIGatewayPushToCloudWatchLogs, and the DynamoDB read-only policy we created earlier)
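In API Gateway the mapping templates themselves are written in its template language, so the following Python is only a sketch of what they need to produce: the DynamoDB Scan and Query request bodies on the way in, and a flattened JSON shape on the way out. The table name, the index name, and the assumption that /times maps to a scan while /times/{time} maps to a query on the GSI are illustrative.

```python
def scan_times(table: str) -> dict:
    """Request body for a Scan that returns only the forecast_time attribute."""
    return {"TableName": table, "ProjectionExpression": "forecast_time"}


def query_by_time(table: str, index: str, forecast_time: str) -> dict:
    """Request body for a Query on the index keyed by forecast_time."""
    return {
        "TableName": table,
        "IndexName": index,
        "KeyConditionExpression": "forecast_time = :t",
        "ExpressionAttributeValues": {":t": {"S": forecast_time}},
    }


def flatten(item: dict) -> dict:
    """Strip DynamoDB type descriptors, e.g. {"S": "x"} -> "x" (numbers stay strings)."""
    return {name: next(iter(value.values())) for name, value in item.items()}


# Hypothetical table and index names.
body = query_by_time("forecast_metadata", "forecast_time_index", "2022-04-16T22:45:00Z")
flat = flatten({"forecast_time": {"S": "2022-04-16T22:45:00Z"}, "forecast_period": {"N": "9900"}})
print(body["ExpressionAttributeValues"], flat)
```

A query on the index touches only the items for the requested time, whereas a scan reads the whole table, which is why the time-specific path is the one that benefits from the index.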

5.4 Component details and code samples - Lambda code

[Figure: Lambda code]

5.5 Component details and code samples - API code

[Figure: API code]

6. Automation using scripts - CloudFormation

  • I believe in IaC (Infrastructure as Code) and GitOps: humans make mistakes, and automation helps here (automation is also more efficient, and scripts are more repeatable);
  • So I tried to use CloudFormation templates to simplify infrastructure management (due to time constraints, I only finished the DynamoDB template);
  • Below is part of the CloudFormation script for creating the DynamoDB table;

[Figure: part of the CloudFormation script for the DynamoDB table]
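Since the original script is only available as an image, here is a rough sketch of what a CloudFormation template for such a table could look like, generated as JSON from Python. It is not the author's template: the logical ID, table name, index name, attribute types and on-demand billing mode are all assumptions. No SSESpecification is set, which leaves the table on DynamoDB's default at-rest encryption.

```python
import json


def dynamodb_template(table_name: str, index_name: str) -> dict:
    """Build a CloudFormation template with one table and one global secondary index."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "ForecastTable": {  # hypothetical logical ID
                "Type": "AWS::DynamoDB::Table",
                "Properties": {
                    "TableName": table_name,
                    "BillingMode": "PAY_PER_REQUEST",  # assumed on-demand capacity
                    "AttributeDefinitions": [
                        {"AttributeName": "forecast_period", "AttributeType": "N"},
                        {"AttributeName": "forecast_time", "AttributeType": "S"},
                    ],
                    "KeySchema": [
                        {"AttributeName": "forecast_period", "KeyType": "HASH"},
                        {"AttributeName": "forecast_time", "KeyType": "RANGE"},
                    ],
                    "GlobalSecondaryIndexes": [{
                        "IndexName": index_name,
                        "KeySchema": [
                            {"AttributeName": "forecast_time", "KeyType": "HASH"},
                        ],
                        "Projection": {"ProjectionType": "ALL"},
                    }],
                },
            }
        },
    }


template = dynamodb_template("forecast_metadata", "forecast_time_index")
print(json.dumps(template, indent=2))
```

The printed JSON can be saved to a file and used as a stack template; CloudFormation accepts templates in either JSON or YAML.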

7. End user query simulation results

[Figure: end user query simulation results]

  • IAM user with read only permission – IAM user name: arn:aws:iam::000435319421:user/demo
  • IAM user with read only permission – IAM user password: demo123@aws
  • End user request url: https://jye2m0pw20.execute-api.us-east2.amazonaws.com/v1/times/2022-04-16T22:45:00Z
  • End user request sample path parameters: 2022-04-17T22:30:00Z, 2022-04-16T22:45:00Z, etc;
  • End user request type: GET
  • End user request authorization type: API key
  • Key: x-api-key
  • Value: kNKmXfQGNx802XU1f75Mu9vRAFBvWIdM5uT7NmHa
  • Add to: header
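The request settings listed above translate into a client call along these lines. This sketch only builds the request object with Python's standard library and does not send it; the base URL and the API key value are placeholders, not the real endpoint or key.

```python
import urllib.request


def build_request(base_url: str, forecast_time: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for /times/{time} with the API key in the x-api-key header."""
    url = f"{base_url}/times/{forecast_time}"
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


# Placeholder base URL and key, for illustration only.
req = build_request("https://example.invalid/v1", "2022-04-16T22:45:00Z", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```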

8. Wrap up

  • High availability (no downtime): the solution uses components like SNS, SQS, Lambda, DynamoDB, API Gateway and S3, all of which are managed services that scale well and scale automatically, which ensures high availability (no downtime);
  • Quick response: the solution uses DynamoDB in the serving layer, which scales well and scales automatically, and with the careful design of the hash key, sort key and GSI (global secondary index), it offers quick response times to end users;
  • Timely availability of new data: the solution follows an event-driven architecture with SQS and Lambda, which ensures the timely availability of new data;
  • Cost effectiveness: the solution follows a serverless architecture and uses AWS serverless services, so we pay only for what we use, which makes it cost-effective;
  • Security:
      • Encryption: the AWS services use TLS between the user application and the AWS service, which provides encryption of data in transit, and we enabled encryption of data at rest;
      • Authentication and authorization: we followed the least-privilege principle when creating IAM roles and policies, and we also used an API key to protect our API Gateway endpoints from malicious access;
      • Audit: CloudWatch is used for auditing;