How to Use Boto3 to Download an Object from S3 Using AWS Resource?


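Problem Statement − Use the boto3 library in Python to download an object from S3 to a given local path or to the default path, with overwriting of an existing file set to true. For example, download test.zip from Bucket_1/testfolder of S3.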

Approach/Algorithm to solve this problem

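Step 1 − Import boto3 and the botocore exceptions to handle exceptions.

Step 2 − Import Path from pathlib to check whether the file already exists locally.

Step 3 − s3path, localPath and overwrite_existing_file are the three parameters of the function download_object_from_s3.

Step 4 − Validate that s3path is passed in the AWS format s3://bucket_name/key. By default, localPath = None and overwrite_existing_file = True. The user can also pass these values to download into a given local path.

Step 5 − Create an AWS session using the boto3 library.

Step 6 − Create an AWS resource for S3.

Step 7 − Split the S3 path to separate the root bucket name from the path of the object to download.

Step 8 − Check whether overwrite_existing_file is set to False and the file already exists at the given local path; if so, do nothing.

Step 9 − Otherwise (if either of these conditions does not hold), download the object. If localPath is given, download there; otherwise download to the default path.

Step 10 − Handle the exception based on the response code to verify whether the file was downloaded.

Step 11 − Handle the generic exception if something goes wrong while downloading the file.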
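
Steps 4 and 7 can be tried in isolation, without AWS credentials. The sketch below is one possible way to validate and split the path. The helper name split_s3_path is hypothetical and not part of boto3, and it uses plain string operations instead of the token loop shown in the example in this article.

```python
def split_s3_path(s3_path):
    """Split 's3://bucket/key' into (bucket_name, object_key, filename).

    Hypothetical helper for illustration; not a boto3 API.
    """
    prefix = "s3://"
    if not s3_path.startswith(prefix):
        raise ValueError("Given path is not a valid s3 path: " + s3_path)

    # Everything up to the first '/' after the prefix is the bucket name
    bucket_name, _, object_key = s3_path[len(prefix):].partition("/")
    if not bucket_name or not object_key:
        raise ValueError("Expected the form s3://bucket_name/key: " + s3_path)

    # The last path segment of the key is used as the local file name
    filename = object_key.rsplit("/", 1)[-1]
    return bucket_name, object_key, filename


print(split_s3_path("s3://Bucket_1/testfolder/test.zip"))
```

For s3://Bucket_1/testfolder/test.zip this yields the bucket Bucket_1, the key testfolder/test.zip and the file name test.zip.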

Example

Use the following code to download a file from AWS S3 −

import boto3
from botocore.exceptions import ClientError
from pathlib import Path

def download_object_from_s3(s3path, localPath=None,
                            overwrite_existing_file=True):

   # Expect the AWS URI form s3://bucket_name/key
   if not s3path.startswith('s3://'):
      print('Given path is not a valid s3 path.')
      raise Exception('Given path is not a valid s3 path.')

   session = boto3.session.Session()
   s3_resource = session.resource('s3')

   # 's3://bucket/a/b.zip'.split('/') -> ['s3:', '', 'bucket', 'a', 'b.zip']
   s3_tokens = s3path.split('/')
   bucket_name = s3_tokens[2]
   filename = s3_tokens[-1]
   print('Filename: ' + filename)

   # Everything after the bucket name is the object key
   object_path = '/'.join(s3_tokens[3:])
   print('object: ' + object_path)

   # Save under localPath when given, else under the bare file name
   target = Path(filename) if localPath is None else Path(localPath) / filename
   try:
      if not overwrite_existing_file and target.is_file():
         pass  # keep the existing local copy
      else:
         s3_resource.meta.client.download_file(bucket_name, object_path, str(target))
      print('Filename: ' + filename)
      return filename
   except ClientError as error:
      if error.response['Error']['Code'] == '404':
         print(s3path + " File not found: ")
         raise Exception(s3path + " File not found: ")
      raise  # do not silently swallow other AWS errors
   except Exception as error:
      print("Unexpected error in download_object function of s3 helper: " + str(error))
      raise Exception("Unexpected error in download_object function of s3 helper: " + str(error))

#Download into default localpath
print(download_object_from_s3("s3://Bucket_1/testfolder/test.zip"))
#Download into given path
print(download_object_from_s3("s3://Bucket_1/testfolder/test.zip","C://AWS"))
#File doesn't exist in S3
print(download_object_from_s3("s3://Bucket_1/testfolder/abc.zip"))
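
The overwrite check runs before any call to S3, so it can be exercised on the local disk alone. A minimal sketch, assuming a hypothetical helper named should_download that follows the same localPath and file name convention as the function above:

```python
from pathlib import Path


def should_download(filename, local_path=None, overwrite_existing_file=True):
    """Return (target, download_needed) for a prospective download.

    Hypothetical helper for illustration; it touches only the local disk.
    """
    # No local path means the file name resolves against the working directory
    target = Path(filename) if local_path is None else Path(local_path) / filename
    # Skip the download only when overwriting is off and the file already exists
    download_needed = overwrite_existing_file or not target.is_file()
    return target, download_needed
```

Note that is_file is an instance method of Path, so it has to be called on a Path object built from the full target path, not on the bare file name string.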

Output

#Download into default localpath
Filename: test.zip
object: testfolder/test.zip
Filename: test.zip

#Download into given path
Filename: test.zip
object: testfolder/test.zip
Filename: test.zip

#File doesn’t exist in S3
Filename: abc.zip
object: testfolder/abc.zip
s3://Bucket_1/testfolder/abc.zip File not found:
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
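
The 404 in the last run is the error code of the HeadObject call named in the message. The sketch below shows one way such a code could be classified. It works on a plain dictionary shaped like the response attribute of ClientError, so it runs without AWS access. The helper name is_missing_object is hypothetical, and treating NoSuchKey as equivalent to 404 is an assumption made for calls that report the missing key by name.

```python
def is_missing_object(error_response):
    """Return True when an error response says the object does not exist.

    error_response is expected to look like botocore's ClientError.response,
    for example {"Error": {"Code": "404", "Message": "Not Found"}}.
    Hypothetical helper for illustration; not a boto3 API.
    """
    code = error_response.get("Error", {}).get("Code", "")
    # HeadObject reports "404"; "NoSuchKey" is included as an assumption
    # for calls such as GetObject that name the error explicitly.
    return code in ("404", "NoSuchKey")


# Inside the download function this could be used as:
#     except ClientError as error:
#         if is_missing_object(error.response):
#             raise Exception(s3path + " File not found: ")
#         raise
```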

Note: If no local path is provided, the file is saved under its bare file name, which Python resolves against the current working directory of the running process. This is the directory the script is launched from, which is not necessarily the directory where this function is written.

For example, if this function is written into S3_class, this class is present at C://AWS/src/S3_class, and the script is run from C://AWS/src, then the file test.zip will be downloaded to C://AWS/src/test.zip
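
To see where a download without a local path would land, the bare file name can be resolved against the working directory explicitly. A minimal sketch; default_download_path is a hypothetical name used only for this illustration:

```python
from pathlib import Path


def default_download_path(filename):
    """Return the absolute path that a bare file name resolves to.

    Hypothetical helper for illustration.
    """
    # Relative paths are interpreted against the current working directory
    return Path.cwd() / filename


print(default_download_path("test.zip"))
```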

