How to Use Boto3 to Download an Object from S3 Using AWS Resource?
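Problem Statement − Use the boto3 library in Python to download an object from S3 to a given local path or to the default path, overwriting an existing local file by default. For example, download test.zip from Bucket_1/testfolder in S3.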
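Approach/Algorithm to solve this problem
Step 1 − Import boto3 and the botocore exceptions to handle errors.
Step 2 − Import Path from pathlib to check whether the file already exists locally.
Step 3 − s3path, localPath and overwrite_existing_file are the three parameters of the function download_object_from_s3. By default, localPath = None and overwrite_existing_file = True. The user can pass these values to download into a given local path.
Step 4 − Validate that s3path is passed in the AWS format s3://bucket_name/key.
Step 5 − Create an AWS session using the boto3 library.
Step 6 − Create an AWS resource for S3.
Step 7 − Split the S3 path to separate the root bucket name from the path of the object to download.
Step 8 − Check whether overwrite_existing_file is set to False and the file already exists at the given local path; if both are true, do nothing.
Step 9 − Otherwise (if either of these conditions is false), download the object. If localPath is given, download there; otherwise download to the default path.
Step 10 − Handle the exception based on the response code to verify whether the file was downloaded.
Step 11 − Handle the generic exception if something goes wrong while downloading the file.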
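The validation and splitting in Steps 4 and 7 can be tried on their own, without any AWS access. The following is a minimal sketch using only string methods; the helper name split_s3_path and the choice of ValueError are assumptions made for illustration and are not part of boto3.

```python
def split_s3_path(s3path):
    """Split 's3://bucket_name/key' into (bucket_name, key, filename)."""
    prefix = 's3://'
    if not s3path.startswith(prefix):
        raise ValueError('Given path is not a valid s3 path: ' + s3path)
    # Everything up to the first '/' is the bucket, the rest is the object key
    bucket_name, _, key = s3path[len(prefix):].partition('/')
    if not bucket_name or not key or key.endswith('/'):
        raise ValueError('The s3 path does not point to an object: ' + s3path)
    # The file name is the last segment of the key
    filename = key.rsplit('/', 1)[-1]
    return bucket_name, key, filename


print(split_s3_path('s3://Bucket_1/testfolder/test.zip'))
# ('Bucket_1', 'testfolder/test.zip', 'test.zip')
```

Returning the bucket, key and file name together keeps the parsing in one place, so the download step only has to deal with values that have already been checked.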
Example
Use the following code to download a file from AWS S3 −
import boto3
from botocore.exceptions import ClientError
from pathlib import Path

def download_object_from_s3(s3path, localPath=None, overwrite_existing_file=True):
    # Accept only paths of the form s3://bucket_name/key
    if not s3path.startswith('s3://'):
        print('Given path is not a valid s3 path.')
        raise Exception('Given path is not a valid s3 path.')

    session = boto3.session.Session()
    s3_resource = session.resource('s3')

    # 's3://bucket/a/b.zip'.split('/') gives ['s3:', '', 'bucket', 'a', 'b.zip']
    s3_tokens = s3path.split('/')
    bucket_name = s3_tokens[2]
    filename = s3_tokens[-1]
    print('Filename: ' + filename)
    object_path = '/'.join(s3_tokens[3:])
    print('object: ' + object_path)

    # A relative file name is resolved against the current working directory
    target = Path(filename) if localPath is None else Path(localPath) / filename

    try:
        # Skip the download only when overwriting is disabled and the file exists
        if overwrite_existing_file or not target.is_file():
            s3_resource.meta.client.download_file(bucket_name, object_path, str(target))
        print('Filename: ' + filename)
        return filename
    except ClientError as error:
        if error.response['Error']['Code'] == '404':
            print(s3path + " File not found: ")
            raise Exception(s3path + " File not found: ") from error
        # Re-raise other client errors instead of silently returning None
        raise
    except Exception as error:
        print("Unexpected error in download_object function of s3 helper: " + str(error))
        raise Exception("Unexpected error in download_object function of s3 helper: " + str(error)) from error

#Download into default localpath
print(download_object_from_s3("s3://Bucket_1/testfolder/test.zip"))

#Download into given path
print(download_object_from_s3("s3://Bucket_1/testfolder/test.zip", "C://AWS"))

#File doesn’t exist in S3
print(download_object_from_s3("s3://Bucket_1/testfolder/abc.zip"))
Output
#Download into default localpath
Filename: test.zip
object: testfolder/test.zip
Filename: test.zip

#Download into given path
Filename: test.zip
object: testfolder/test.zip
Filename: test.zip

#File doesn’t exist in S3
Filename: abc.zip
object: testfolder/abc.zip
s3://Bucket_1/testfolder/abc.zip File not found:
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
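The last case above depends on the error code that botocore stores in error.response['Error']['Code']. As a rough sketch of how that lookup could be kept separate from the download call, the hypothetical helper below takes the response dictionary of a ClientError and builds a message. The helper name describe_s3_error and the wording of the messages other than "File not found" are assumptions. The codes '404' and '403' are what a HeadObject call typically reports, and 'NoSuchKey' and 'AccessDenied' are the named S3 equivalents.

```python
def describe_s3_error(s3path, response):
    """Build a message from the response dict attached to a ClientError."""
    # botocore keeps the service error details under response['Error']
    error = response.get('Error', {})
    code = str(error.get('Code', 'Unknown'))
    if code in ('404', 'NoSuchKey'):
        return s3path + ' File not found: '
    if code in ('403', 'AccessDenied'):
        return s3path + ' Access denied: '
    # Anything else is reported with its code and message (assumed wording)
    return s3path + ' Download failed with code ' + code + ': ' + error.get('Message', '')


# A response shaped like the one in the 404 case above
sample = {'Error': {'Code': '404', 'Message': 'Not Found'}}
print(describe_s3_error('s3://Bucket_1/testfolder/abc.zip', sample))
```

Inside an except ClientError as error block, the helper would be called as describe_s3_error(s3path, error.response).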
Note: When no local path is provided, the file is saved under a relative file name, so it is downloaded into the current working directory of the Python process. This is the directory that contains the function only when the script is run from there.
For example, if this function is written in S3_class, the class is located at C://AWS/src/S3_class, and the script is run from C://AWS/src, then test.zip is downloaded to C://AWS/src/test.zip.
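The choice of destination and the overwrite check from Steps 8 and 9 can also be exercised without S3. The sketch below uses only pathlib. The helper names resolve_target and should_download are assumptions for illustration, and Path.cwd() is used as the default because a relative file name is resolved against the current working directory.

```python
import tempfile
from pathlib import Path


def resolve_target(filename, localPath=None):
    """Return the local path the object would be written to."""
    # Without localPath, fall back to the current working directory
    base = Path.cwd() if localPath is None else Path(localPath)
    return base / filename


def should_download(target, overwrite_existing_file=True):
    """Skip the download only if overwriting is off and the file exists."""
    return overwrite_existing_file or not Path(target).is_file()


# Try the decision against a temporary directory instead of a real download
with tempfile.TemporaryDirectory() as tmp:
    target = resolve_target('test.zip', tmp)
    print(should_download(target, overwrite_existing_file=False))  # True: no file yet
    target.write_bytes(b'')  # simulate an earlier download
    print(should_download(target, overwrite_existing_file=False))  # False: file exists
    print(should_download(target))  # True: overwriting is the default
```

Checking the full target path, rather than the bare file name, makes the overwrite check look in the same directory the file would be written to.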