Batch-Deleting OSS Buckets Quickly with a Python Script

Summary: To delete an OSS bucket with Python, it might seem that calling delete_bucket() is all you need. In practice, however, the call often fails with various errors. This is because, to guard against accidental deletion, OSS requires a bucket to be emptied of all its data before it can be deleted, including objects, multi-version objects, parts, and LiveChannels. For scenarios that require fast batch deletion of OSS buckets, this article provides a Python script that first clears the above resources from each bucket and then deletes the bucket itself.

Disclaimer

This script automates the deletion of buckets in Object Storage Service (OSS). Deleting a bucket is a high-risk operation that can permanently destroy all data stored in it. Before running the script, make sure you understand all the potential risks and have made an informed decision. Also make sure you have taken the necessary backup measures to avoid irreversible data loss. You bear sole responsibility for all consequences of using this script. The script's author or provider is not liable for any direct or indirect loss, data loss, financial loss, or damage of any kind arising from the use or misuse of this script. Using this script indicates that you have read and agreed to this disclaimer. If you do not agree to these terms, or have any questions about the script's behavior or risks, do not run it.

Limitations

  • Buckets within the effective period of a retention policy. They can only be deleted after the retention policy expires.
  • Buckets with access points attached. Delete the access points manually in the OSS console first, then run the script.
  • Buckets with the OSS-HDFS service enabled. These can only be deleted in the OSS console.
  • The deletion task for a single bucket times out after 1 hour. In testing, deleting 150 GB of files took about half an hour, so for very large buckets it is better to configure a lifecycle rule in the OSS console for T+1 automatic deletion.
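The 1-hour per-bucket timeout above is enforced in the script with signal.SIGALRM. A minimal, self-contained sketch of the same pattern, using a 1-second alarm for illustration (SIGALRM is Unix-only; run_with_timeout is a hypothetical helper name, not part of the script):

```python
import signal
import time

class TimeoutException(Exception):
    pass

def timeout_handler(signum, frame):
    raise TimeoutException()

# Install the handler once; each alarm() call then arms a one-shot timer.
signal.signal(signal.SIGALRM, timeout_handler)

def run_with_timeout(func, seconds):
    """Run func(); raise TimeoutException if it runs longer than `seconds`."""
    signal.alarm(seconds)
    try:
        return func()
    finally:
        signal.alarm(0)  # Always disarm the timer, success or failure

try:
    run_with_timeout(lambda: time.sleep(2), 1)
    timed_out = False
except TimeoutException:
    timed_out = True
print(timed_out)  # A 2-second task against a 1-second alarm times out
```

The `finally: signal.alarm(0)` is the important part: without it, a pending alarm could fire later and interrupt unrelated code.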

Script

import oss2
import time
import signal
from tqdm import tqdm
from oss2.credentials import EnvironmentVariableCredentialsProvider


class TimeoutException(Exception):
    pass


def timeout_handler(signum, frame):
    raise TimeoutException()


signal.signal(signal.SIGALRM, timeout_handler)


def write_results_to_file(result_filename, bucket_count, buckets_detail,
                          delete_success, delete_failed, delete_timings,
                          include_buckets, exclude_buckets):
    with open(result_filename, 'w') as f:
        f.write(f"Total buckets listed: {bucket_count}\n")
        f.write("Bucket Details (Name, Region, Status):\n")
        for bucket in buckets_detail:
            status = ('Included' if bucket['bucket_name'] in include_buckets
                      else 'Excluded' if bucket['bucket_name'] in exclude_buckets
                      else 'Not Specified')
            f.write(f"{bucket['bucket_name']}: {bucket['region']} (status: {status})\n")
        f.write(f"\nTotal buckets successfully deleted: {len(delete_success)}\n")
        f.write("Successfully Deleted Buckets:\n")
        for bucket_name in delete_success:
            f.write(f"{bucket_name} (time taken: {delete_timings.get(bucket_name, 'N/A'):.2f}s)\n")
        total_failed = len(delete_failed)
        f.write(f"\nTotal buckets failed to delete: {total_failed}\n")
        f.write("Failed to Delete Buckets:\n")
        for failed_bucket in delete_failed:
            bucket_name = failed_bucket['bucket_name']
            error_details = failed_bucket['error']
            time_taken = delete_timings.get(bucket_name, 'N/A')
            # Customize the error message based on the error code (EC)
            if '0024-00000001' in error_details:
                reason = "Data lake storage is disabled. Please delete using the console."
            elif '0055-00000011' in error_details:
                reason = "Bucket still binding access points. Please delete access points, then delete the bucket."
            elif "WORM Locked state" in error_details:
                reason = "WORM enabled; not allowed to delete before expiration."
            else:
                reason = error_details  # Default reason if none of the specific EC codes are present
            # Write the failure reason to the file
            f.write(f"{bucket_name} (time taken: {time_taken}s, reason: {reason})\n")


def check_bucket_worm_status(bucket):
    try:
        worm_info = bucket.get_bucket_worm()
        if worm_info.state == 'Locked':
            print(f"Bucket '{bucket.bucket_name}' is in WORM state 'Locked', skipping deletion.")
            return False
    except oss2.exceptions.NoSuchWORMConfiguration:
        print(f"Bucket '{bucket.bucket_name}' does not have a WORM configuration.")
    except oss2.exceptions.RequestError as e:
        # Handle network-related errors
        print(f"Failed to check WORM status for bucket '{bucket.bucket_name}' due to network error: {e}")
        return False
    except oss2.exceptions.OssError as e:
        # Handle other OSS API errors
        print(f"Failed to check WORM status for bucket '{bucket.bucket_name}' due to OSS error: {e}")
        return False
    return True


def delete_all_objects(bucket):
    objects_to_delete = []
    object_count = 0  # Initialize the object counter
    for obj in oss2.ObjectIterator(bucket):
        objects_to_delete.append(obj.key)
        object_count += 1
        # Once enough objects have accumulated, delete them in one batch
        if len(objects_to_delete) >= 1000:
            print("Deleting batch of 1000 objects...")
            bucket.batch_delete_objects(objects_to_delete)
            objects_to_delete = []
            print("Deleted 1000 objects, continuing...")
    # Delete the remaining objects, if any
    if object_count > 0 and objects_to_delete:
        print(f"Deleting final batch of {len(objects_to_delete)} objects...")
        bucket.batch_delete_objects(objects_to_delete)
    elif object_count == 0:
        print(f"No objects to delete in bucket '{bucket.bucket_name}'.")


def delete_all_live_channels(bucket):
    live_channel_count = 0
    for live_channel_info in oss2.LiveChannelIterator(bucket):
        name = live_channel_info.name
        bucket.delete_live_channel(name)
        live_channel_count += 1
    if live_channel_count > 0:
        print(f"All live channels deleted in bucket '{bucket.bucket_name}'.")
    else:
        print(f"No live channels to delete in bucket '{bucket.bucket_name}'.")


def delete_all_multipart_uploads(bucket):
    multipart_upload_count = 0
    for upload_info in oss2.MultipartUploadIterator(bucket):
        key = upload_info.key
        upload_id = upload_info.upload_id
        bucket.abort_multipart_upload(key, upload_id)
        multipart_upload_count += 1
    if multipart_upload_count > 0:
        print(f"All multipart uploads aborted in bucket '{bucket.bucket_name}'.")
    else:
        print(f"No multipart uploads to abort in bucket '{bucket.bucket_name}'.")


def delete_all_object_versions(bucket):
    next_key_marker = None
    next_versionid_marker = None
    while True:
        result = bucket.list_object_versions(key_marker=next_key_marker,
                                             versionid_marker=next_versionid_marker)
        # Check whether any versions or delete markers exist
        if not result.versions and not result.delete_marker:
            print(f"No object versions or delete markers to delete in bucket '{bucket.bucket_name}'.")
            break
        versions_to_delete = oss2.models.BatchDeleteObjectVersionList()
        # Append the object versions and delete markers to be removed
        for version_info in result.versions:
            versions_to_delete.append(oss2.models.BatchDeleteObjectVersion(version_info.key, version_info.versionid))
        for del_maker_info in result.delete_marker:
            versions_to_delete.append(oss2.models.BatchDeleteObjectVersion(del_maker_info.key, del_maker_info.versionid))
        # Run the batch delete
        versions_count = len(result.versions) + len(result.delete_marker)
        print(f"Deleting {versions_count} object versions and/or delete markers...")
        bucket.delete_object_versions(versions_to_delete)
        # Update the markers for the next iteration
        next_key_marker = result.next_key_marker
        next_versionid_marker = result.next_versionid_marker
        # Exit the loop when there are no more versions or delete markers
        if not result.is_truncated:
            print("All object versions and delete markers deleted.")
            break


current_timestamp = time.strftime("%Y%m%d%H%M%S")
# exclude_buckets lists the buckets that must NOT be deleted.
exclude_buckets = set(['<your-bucket>'])
# include_buckets lists the buckets to delete. If it is empty, exclude_buckets
# applies; if it is non-empty, only include_buckets applies.
include_buckets = set([])
auth = oss2.ProviderAuth(EnvironmentVariableCredentialsProvider())
service = oss2.Service(auth, 'oss-cn-hangzhou.aliyuncs.com')
bucket_list = list(oss2.BucketIterator(service))
if include_buckets:
    buckets_to_process = [b for b in bucket_list if b.name in include_buckets]
else:
    buckets_to_process = [b for b in bucket_list if b.name not in exclude_buckets]
bucket_count = len(buckets_to_process)
buckets_detail = [{'bucket_name': b.name, 'region': b.location} for b in buckets_to_process]
delete_success = []
delete_failed = []
delete_timings = {b.name: 'Not Attempted' for b in buckets_to_process}
pbar = tqdm(total=bucket_count, desc="Deleting Buckets", unit="bucket", leave=False)
try:
    for bucket_info in buckets_to_process:
        bucket_name = bucket_info.name
        region = bucket_info.location
        endpoint = f'https://{region}.aliyuncs.com'
        bucket = oss2.Bucket(auth, endpoint, bucket_name)
        # Call check_bucket_worm_status to check the WORM status first
        can_delete = check_bucket_worm_status(bucket)
        if not can_delete:
            delete_failed.append({
                'bucket_name': bucket_name,
                'error': "Skipped due to WORM Locked state or unable to check status."
            })
            delete_timings[bucket_name] = 'Skipped'
            pbar.update(1)
            continue
        # The WORM status allows deletion; proceed
        print(f"Processing bucket '{bucket_name}'...")
        try:
            signal.alarm(3600)
            start_time = time.time()
            # Delete all live channels
            delete_all_live_channels(bucket)
            # Abort all multipart uploads
            delete_all_multipart_uploads(bucket)
            # Delete all objects
            delete_all_objects(bucket)
            # Delete all object versions
            delete_all_object_versions(bucket)
            # Try to delete the bucket itself
            print(f"Deleting bucket '{bucket_name}'...")
            bucket.delete_bucket()
            signal.alarm(0)
            end_time = time.time()
            time_taken = end_time - start_time
            delete_timings[bucket_name] = time_taken
            delete_success.append(bucket_name)
            print(f"Bucket '{bucket_name}' deleted successfully in {time_taken:.2f}s")
            pbar.update(1)
        except TimeoutException:
            print(f"Deleting bucket '{bucket_name}' timed out after 3600 seconds.")
            delete_failed.append({'bucket_name': bucket_name, 'error': 'Timeout after 3600 seconds'})
            delete_timings[bucket_name] = 'Timeout'
            pbar.update(1)
        except oss2.exceptions.OssError as e:
            signal.alarm(0)
            end_time = time.time()
            time_taken = end_time - start_time
            delete_timings[bucket_name] = time_taken
            error_message = str(e)
            print(f"Failed to delete bucket '{bucket_name}' in {time_taken:.2f}s: {error_message}")
            delete_failed.append({'bucket_name': bucket_name, 'error': error_message})
            pbar.update(1)
except KeyboardInterrupt:
    print("\nOperation interrupted by user.")
finally:
    pbar.close()
    result_filename = f"{current_timestamp}_bucket_delete_result.txt"
    write_results_to_file(result_filename, bucket_count, buckets_detail,
                          delete_success, delete_failed, delete_timings,
                          include_buckets, exclude_buckets)
    print(f"Script execution completed. Check {result_filename} for details.")
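The 1,000-key batching in delete_all_objects exists because batch_delete_objects accepts at most 1,000 keys per request. That logic can be isolated into a pure helper and verified without touching the OSS API; chunk_keys below is a hypothetical name for illustration, not part of the script:

```python
def chunk_keys(keys, batch_size=1000):
    """Yield lists of at most batch_size keys, mirroring the script's batching."""
    batch = []
    for key in keys:
        batch.append(key)
        if len(batch) >= batch_size:
            yield batch  # Flush a full batch
            batch = []
    if batch:  # Final partial batch, if any
        yield batch

batches = list(chunk_keys([f"obj-{i}" for i in range(2500)]))
print([len(b) for b in batches])  # [1000, 1000, 500]
```

In the real script, each yielded batch would be passed to bucket.batch_delete_objects().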

Run

  1. In the script, set the exclude_buckets variable to the set of bucket names you do not want to delete.
  2. (Optional) Set the include_buckets variable to the set of bucket names you do want to delete. When include_buckets is set, exclude_buckets is ignored.
  3. Run the following commands to execute the script:
pip install oss2
pip install tqdm
export OSS_ACCESS_KEY_ID=<your_ak_id>
export OSS_ACCESS_KEY_SECRET=<your_ak_secret>
python3 delete_buckets.py
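The include/exclude semantics in steps 1 and 2 reduce to a small filter, shown here as a standalone function so the behavior can be checked in isolation (select_buckets is a hypothetical name for illustration):

```python
def select_buckets(all_names, include_buckets, exclude_buckets):
    """If include_buckets is non-empty, process only those buckets;
    otherwise process every bucket except those in exclude_buckets."""
    if include_buckets:
        return [n for n in all_names if n in include_buckets]
    return [n for n in all_names if n not in exclude_buckets]

names = ['bucket-a', 'bucket-b', 'bucket-c']
print(select_buckets(names, set(), {'bucket-b'}))        # ['bucket-a', 'bucket-c']
print(select_buckets(names, {'bucket-c'}, {'bucket-b'})) # ['bucket-c'] — exclude ignored
```

This matches the filtering the script performs on the result of oss2.BucketIterator.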

Output

Total buckets listed: 19
Bucket Details (Name, Region, Status):
bucket1: oss-region1 (status: Not Specified)
bucket2: oss-region1 (status: Not Specified)
bucket3: oss-region1 (status: Not Specified)
bucket4: oss-region1 (status: Not Specified)
bucket5: oss-region2 (status: Not Specified)
bucket6: region1 (status: Not Specified)
bucket7: oss-region3 (status: Not Specified)
bucket8: oss-region3 (status: Not Specified)
bucket9: oss-region4 (status: Not Specified)
bucket10: oss-region2 (status: Not Specified)
bucket11: oss-region1 (status: Not Specified)
bucket12: oss-region5 (status: Not Specified)
bucket13: oss-region6 (status: Not Specified)
bucket14: oss-region3 (status: Not Specified)
bucket15: oss-region4 (status: Not Specified)
bucket16: oss-region1 (status: Not Specified)
bucket17: oss-region1 (status: Not Specified)
bucket18: oss-region4 (status: Not Specified)
bucket19: oss-region7 (status: Not Specified)
Total buckets successfully deleted: 0
Successfully Deleted Buckets:
Total buckets failed to delete: 19
Failed to Delete Buckets:
bucket1 (time taken: 0.082s, reason: Data lake storage is disabled. Please delete using the console.)
bucket2 (time taken: 0.116s, reason: Data lake storage is disabled. Please delete using the console.)
bucket3 (time taken: 0.083s, reason: Data lake storage is disabled. Please delete using the console.)
bucket4 (time taken: 0.070s, reason: Data lake storage is disabled. Please delete using the console.)
bucket5 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket6 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket7 (time taken: 1.126s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket8 (time taken: 1.024s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket9 (time taken: 0.213s, reason: Data lake storage is disabled. Please delete using the console.)
bucket10 (time taken: 0.108s, reason: Data lake storage is disabled. Please delete using the console.)
bucket11 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket12 (time taken: 0.156s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket13 (time taken: 1.533s, reason: Data lake storage is disabled. Please delete using the console.)
bucket14 (time taken: 1.021s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket15 (time taken: 0.212s, reason: Data lake storage is disabled. Please delete using the console.)
bucket16 (time taken: 0.083s, reason: Data lake storage is disabled. Please delete using the console.)
bucket17 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket18 (time taken: 0.203s, reason: Data lake storage is disabled. Please delete using the console.)
bucket19 (time taken: 0.320s, reason: Data lake storage is disabled. Please delete using the console.)


