Quickly Batch-Delete OSS Buckets with a Python Script

Summary: To delete an OSS bucket with Python, it seems that a direct call to delete_bucket() should be enough. In practice, however, the call often fails with various errors. This is because OSS, to guard against accidental deletion, requires a bucket to be completely emptied before it can be deleted, including objects, multi-version objects, parts (incomplete multipart uploads), and LiveChannels. For scenarios that require quickly deleting OSS buckets in batches, this article provides a Python script that first clears the resources listed above from each bucket and then deletes the bucket.
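
For reference, a minimal sketch of the naive approach and the failure it typically runs into (the endpoint and bucket name are placeholders, and credentials are assumed to be set in the environment):

import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

# Credentials are read from OSS_ACCESS_KEY_ID / OSS_ACCESS_KEY_SECRET.
auth = oss2.ProviderAuth(EnvironmentVariableCredentialsProvider())
# Placeholder endpoint and bucket name; replace with your own.
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'example-bucket')

try:
    bucket.delete_bucket()  # succeeds only if the bucket is completely empty
except oss2.exceptions.OssError as e:
    # A non-empty bucket typically fails with the BucketNotEmpty error code.
    print(f"Deletion failed ({e.code}): empty the bucket first.")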

Disclaimer

This script automates the deletion of buckets in Object Storage Service (OSS). Deleting a bucket is a high-risk operation that may permanently destroy all data stored in it. Before running the script, make sure you understand all the potential risks and have made an informed decision. Also make sure you have taken the necessary backup measures to avoid irreversible data loss. You bear sole responsibility for all consequences of using this script. The author or provider of the script is not liable for any direct or indirect loss, data loss, financial loss, or damage of any kind caused by the use or misuse of this script. Using this script indicates that you have read and agree to this disclaimer. If you do not agree to these terms, or have any questions about the script's behavior or risks, do not run it.

Limitations

  • Buckets with an active retention policy. They can be deleted only after the retention policy expires.
  • Buckets with access points. Delete the access points manually in the OSS console first, then run the script.
  • Buckets with the OSS-HDFS service enabled. These can only be deleted in the OSS console.
  • The deletion task for a single bucket times out after 1 hour. In testing, deleting about 150 GB of data took roughly half an hour, so for very large buckets it is recommended to configure a lifecycle rule in the OSS console for T+1 automatic deletion instead, as sketched below.
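
For very large buckets, a lifecycle rule that expires everything after one day is usually the more practical route. A minimal sketch using the oss2 SDK (rule ID, endpoint, and bucket name are illustrative placeholders; the console achieves the same result):

import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider
from oss2.models import (BucketLifecycle, LifecycleRule,
                         LifecycleExpiration, AbortMultipartUpload)

auth = oss2.ProviderAuth(EnvironmentVariableCredentialsProvider())
# Placeholder endpoint and bucket name; replace with your own.
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'example-bucket')

# Expire every object (the empty prefix matches all keys) and abort stale
# multipart uploads one day after creation.
rule = LifecycleRule('expire-all-after-1-day', '',
                     status=LifecycleRule.ENABLED,
                     expiration=LifecycleExpiration(days=1),
                     abort_multipart_upload=AbortMultipartUpload(days=1))
bucket.put_bucket_lifecycle(BucketLifecycle([rule]))

For versioned buckets, noncurrent versions need their own expiration settings in the rule (for example NoncurrentVersionExpiration), so double-check the rule against the bucket's versioning state.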

Script

import oss2
import time
import signal
from oss2.credentials import EnvironmentVariableCredentialsProvider
from tqdm import tqdm

class TimeoutException(Exception):
    pass

def timeout_handler(signum, frame):
    raise TimeoutException()

signal.signal(signal.SIGALRM, timeout_handler)

def write_results_to_file(result_filename, bucket_count, buckets_detail, delete_success, delete_failed, delete_timings, include_buckets, exclude_buckets):
    with open(result_filename, 'w') as f:
        f.write(f"Total buckets listed: {bucket_count}\n")
        f.write("Bucket Details (Name, Region, Status):\n")
        for bucket in buckets_detail:
            status = 'Included' if bucket['bucket_name'] in include_buckets else 'Excluded' if bucket['bucket_name'] in exclude_buckets else 'Not Specified'
            f.write(f"{bucket['bucket_name']}: {bucket['region']} (status: {status})\n")
        f.write(f"\nTotal buckets successfully deleted: {len(delete_success)}\n")
        f.write("Successfully Deleted Buckets:\n")
        for bucket_name in delete_success:
            f.write(f"{bucket_name} (time taken: {delete_timings.get(bucket_name, 'N/A'):.2f}s)\n")
        total_failed = len(delete_failed)
        f.write(f"\nTotal buckets failed to delete: {total_failed}\n")
        f.write("Failed to Delete Buckets:\n")
        for failed_bucket in delete_failed:
            bucket_name = failed_bucket['bucket_name']
            error_details = failed_bucket['error']
            time_taken = delete_timings.get(bucket_name, 'N/A')
            # Customize the error message based on the error code (EC)
            if '0024-00000001' in error_details:
                reason = "Data lake storage is disabled. Please delete using the console."
            elif '0055-00000011' in error_details:
                reason = "Bucket still binding access points. Please delete access points, then delete the bucket."
            elif "WORM Locked state" in error_details:
                reason = "WORM enabled; not allowed to delete before expiration."
            else:
                reason = error_details  # Default reason if none of the specific EC codes are present
            # Write the failure reason to the file
            f.write(f"{bucket_name} (time taken: {time_taken}s, reason: {reason})\n")

def check_bucket_worm_status(bucket):
    try:
        worm_info = bucket.get_bucket_worm()
        if worm_info.state == 'Locked':
            print(f"Bucket '{bucket.bucket_name}' is in WORM state 'Locked', skipping deletion.")
            return False
    except oss2.exceptions.NoSuchWORMConfiguration:
        print(f"Bucket '{bucket.bucket_name}' does not have a WORM configuration.")
    except oss2.exceptions.RequestError as e:
        # Handle network-related errors
        print(f"Failed to check WORM status for bucket '{bucket.bucket_name}' due to network error: {e}")
        return False
    except oss2.exceptions.OssError as e:
        # Handle other OSS API errors
        print(f"Failed to check WORM status for bucket '{bucket.bucket_name}' due to OSS error: {e}")
        return False
    return True

def delete_all_objects(bucket):
    objects_to_delete = []
    object_count = 0  # Initialize the object counter
    for obj in oss2.ObjectIterator(bucket):
        objects_to_delete.append(obj.key)
        object_count += 1
        # Once enough objects have accumulated, delete them in one batch
        if len(objects_to_delete) >= 1000:
            print("Deleting batch of 1000 objects...")
            bucket.batch_delete_objects(objects_to_delete)
            objects_to_delete = []
            print("Deleted 1000 objects, continuing...")
    # Delete any remaining objects
    if object_count > 0 and objects_to_delete:
        print(f"Deleting final batch of {len(objects_to_delete)} objects...")
        bucket.batch_delete_objects(objects_to_delete)
    elif object_count == 0:
        print(f"No objects to delete in bucket '{bucket.bucket_name}'.")

def delete_all_live_channels(bucket):
    live_channel_count = 0
    for live_channel_info in oss2.LiveChannelIterator(bucket):
        name = live_channel_info.name
        bucket.delete_live_channel(name)
        live_channel_count += 1
    if live_channel_count > 0:
        print(f"All live channels deleted in bucket '{bucket.bucket_name}'.")
    else:
        print(f"No live channels to delete in bucket '{bucket.bucket_name}'.")

def delete_all_multipart_uploads(bucket):
    multipart_upload_count = 0
    for upload_info in oss2.MultipartUploadIterator(bucket):
        key = upload_info.key
        upload_id = upload_info.upload_id
        bucket.abort_multipart_upload(key, upload_id)
        multipart_upload_count += 1
    if multipart_upload_count > 0:
        print(f"All multipart uploads aborted in bucket '{bucket.bucket_name}'.")
    else:
        print(f"No multipart uploads to abort in bucket '{bucket.bucket_name}'.")

def delete_all_object_versions(bucket):
    next_key_marker = None
    next_versionid_marker = None
    while True:
        result = bucket.list_object_versions(key_marker=next_key_marker, versionid_marker=next_versionid_marker)
        # Check whether any versions or delete markers exist
        if not result.versions and not result.delete_marker:
            print(f"No object versions or delete markers to delete in bucket '{bucket.bucket_name}'.")
            break
        versions_to_delete = oss2.models.BatchDeleteObjectVersionList()
        # Append the object versions and delete markers to be removed
        for version_info in result.versions:
            versions_to_delete.append(oss2.models.BatchDeleteObjectVersion(version_info.key, version_info.versionid))
        for del_marker_info in result.delete_marker:
            versions_to_delete.append(oss2.models.BatchDeleteObjectVersion(del_marker_info.key, del_marker_info.versionid))
        # Perform the batch delete
        versions_count = len(result.versions) + len(result.delete_marker)
        print(f"Deleting {versions_count} object versions and/or delete markers...")
        bucket.delete_object_versions(versions_to_delete)
        # Update the markers for the next iteration
        next_key_marker = result.next_key_marker
        next_versionid_marker = result.next_versionid_marker
        # Exit the loop when no more versions or delete markers remain
        if not result.is_truncated:
            print("All object versions and delete markers deleted.")
            break

current_timestamp = time.strftime("%Y%m%d%H%M%S")
# exclude_buckets: buckets that must NOT be deleted.
exclude_buckets = set(['<your-bucket>'])
# include_buckets: buckets to delete. If empty, exclude_buckets applies; if not empty, only include_buckets applies.
include_buckets = set([])
auth = oss2.ProviderAuth(EnvironmentVariableCredentialsProvider())
service = oss2.Service(auth, 'oss-cn-hangzhou.aliyuncs.com')
bucket_list = list(oss2.BucketIterator(service))
if include_buckets:
    buckets_to_process = [b for b in bucket_list if b.name in include_buckets]
else:
    buckets_to_process = [b for b in bucket_list if b.name not in exclude_buckets]
bucket_count = len(buckets_to_process)
buckets_detail = [{'bucket_name': b.name, 'region': b.location} for b in buckets_to_process]
delete_success = []
delete_failed = []
delete_timings = {b.name: 'Not Attempted' for b in buckets_to_process}
pbar = tqdm(total=bucket_count, desc="Deleting Buckets", unit="bucket", leave=False)
try:
    for bucket_info in buckets_to_process:
        bucket_name = bucket_info.name
        region = bucket_info.location
        endpoint = f'https://{region}.aliyuncs.com'
        bucket = oss2.Bucket(auth, endpoint, bucket_name)
        # Call check_bucket_worm_status to check the WORM state
        can_delete = check_bucket_worm_status(bucket)
        if not can_delete:
            delete_failed.append({
                'bucket_name': bucket_name,
                'error': "Skipped due to WORM Locked state or unable to check status."
            })
            delete_timings[bucket_name] = 'Skipped'
            pbar.update(1)
            continue
        # The WORM state allows deletion; proceed
        print(f"Processing bucket '{bucket_name}'...")
        try:
            signal.alarm(3600)
            start_time = time.time()
            # Delete all live channels
            delete_all_live_channels(bucket)
            # Abort all multipart uploads
            delete_all_multipart_uploads(bucket)
            # Delete all objects
            delete_all_objects(bucket)
            # Delete all object versions
            delete_all_object_versions(bucket)
            # Try to delete the bucket itself
            print(f"Deleting bucket '{bucket_name}'...")
            bucket.delete_bucket()
            signal.alarm(0)
            end_time = time.time()
            time_taken = end_time - start_time
            delete_timings[bucket_name] = time_taken
            delete_success.append(bucket_name)
            print(f"Bucket '{bucket_name}' deleted successfully in {time_taken:.2f}s")
            pbar.update(1)
        except TimeoutException:
            print(f"Deleting bucket '{bucket_name}' timed out after 3600 seconds.")
            delete_failed.append({'bucket_name': bucket_name, 'error': 'Timeout after 3600 seconds'})
            delete_timings[bucket_name] = 'Timeout'
            pbar.update(1)
        except oss2.exceptions.OssError as e:
            signal.alarm(0)
            end_time = time.time()
            time_taken = end_time - start_time
            delete_timings[bucket_name] = time_taken
            error_message = str(e)
            print(f"Failed to delete bucket '{bucket_name}' in {time_taken:.2f}s: {error_message}")
            delete_failed.append({'bucket_name': bucket_name, 'error': error_message})
            pbar.update(1)
except KeyboardInterrupt:
    print("\nOperation interrupted by user.")
finally:
    pbar.close()
    result_filename = f"{current_timestamp}_bucket_delete_result.txt"
    write_results_to_file(result_filename, bucket_count, buckets_detail, delete_success, delete_failed, delete_timings, include_buckets, exclude_buckets)
    print(f"Script execution completed. Check {result_filename} for details.")

Run

  1. In the script, set the exclude_buckets variable to the set of bucket names you do not want to delete.
  2. (Optional) Set the include_buckets variable to the set of bucket names you do want to delete. When include_buckets is set, exclude_buckets is ignored.
  3. Run the following commands to execute the script:
pip install oss2
pip install tqdm
export OSS_ACCESS_KEY_ID=<your_ak_id>
export OSS_ACCESS_KEY_SECRET=<your_ak_secret>
python3 delete_buckets.py
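
Because the script is destructive, it can be worth doing a quick dry run first to confirm that the credentials work and that the include/exclude filters select the buckets you expect. A minimal sketch, assuming the same environment variables and endpoint as the script (the filter sets below are placeholders):

import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

auth = oss2.ProviderAuth(EnvironmentVariableCredentialsProvider())
service = oss2.Service(auth, 'oss-cn-hangzhou.aliyuncs.com')

exclude_buckets = {'<your-bucket>'}   # placeholder; mirror the script's setting
include_buckets = set()

# List the buckets the deletion script would process, without deleting anything.
for b in oss2.BucketIterator(service):
    selected = (b.name in include_buckets) if include_buckets else (b.name not in exclude_buckets)
    print(f"{b.name} ({b.location}): {'WOULD DELETE' if selected else 'skipped'}")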

Output

Total buckets listed: 19
Bucket Details (Name, Region, Status):
bucket1: oss-region1 (status: Not Specified)
bucket2: oss-region1 (status: Not Specified)
bucket3: oss-region1 (status: Not Specified)
bucket4: oss-region1 (status: Not Specified)
bucket5: oss-region2 (status: Not Specified)
bucket6: region1 (status: Not Specified)
bucket7: oss-region3 (status: Not Specified)
bucket8: oss-region3 (status: Not Specified)
bucket9: oss-region4 (status: Not Specified)
bucket10: oss-region2 (status: Not Specified)
bucket11: oss-region1 (status: Not Specified)
bucket12: oss-region5 (status: Not Specified)
bucket13: oss-region6 (status: Not Specified)
bucket14: oss-region3 (status: Not Specified)
bucket15: oss-region4 (status: Not Specified)
bucket16: oss-region1 (status: Not Specified)
bucket17: oss-region1 (status: Not Specified)
bucket18: oss-region4 (status: Not Specified)
bucket19: oss-region7 (status: Not Specified)
Total buckets successfully deleted: 0
Successfully Deleted Buckets:
Total buckets failed to delete: 19
Failed to Delete Buckets:
bucket1 (time taken: 0.082s, reason: Data lake storage is disabled. Please delete using the console.)
bucket2 (time taken: 0.116s, reason: Data lake storage is disabled. Please delete using the console.)
bucket3 (time taken: 0.083s, reason: Data lake storage is disabled. Please delete using the console.)
bucket4 (time taken: 0.070s, reason: Data lake storage is disabled. Please delete using the console.)
bucket5 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket6 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket7 (time taken: 1.126s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket8 (time taken: 1.024s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket9 (time taken: 0.213s, reason: Data lake storage is disabled. Please delete using the console.)
bucket10 (time taken: 0.108s, reason: Data lake storage is disabled. Please delete using the console.)
bucket11 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket12 (time taken: 0.156s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket13 (time taken: 1.533s, reason: Data lake storage is disabled. Please delete using the console.)
bucket14 (time taken: 1.021s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket15 (time taken: 0.212s, reason: Data lake storage is disabled. Please delete using the console.)
bucket16 (time taken: 0.083s, reason: Data lake storage is disabled. Please delete using the console.)
bucket17 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket18 (time taken: 0.203s, reason: Data lake storage is disabled. Please delete using the console.)
bucket19 (time taken: 0.320s, reason: Data lake storage is disabled. Please delete using the console.)


