
python – How to access AWS S3 data using boto3

Posted by: admin, February 24, 2020

Questions:

I am fairly new to both S3 and boto3. I am trying to read in some data in the following format:

https://blahblah.s3.amazonaws.com/data1.csv
https://blahblah.s3.amazonaws.com/data2.csv
https://blahblah.s3.amazonaws.com/data3.csv

I am importing boto3, and it seems like I would need to do something like:

import boto3
s3 = boto3.client('s3')

However, what should I do after creating this client if I want to read all of the files separately into memory? (I am not supposed to download this data locally.) Ideally, I would like to read each CSV file into its own pandas DataFrame (which I know how to do once I can access the S3 data).

Please understand that I’m fairly new to both boto3 and S3, so I don’t even know where to begin.

Answers:

Try this:

import boto3

s3 = boto3.resource('s3')
obj = s3.Object('<bucket-name>', '<key-of-file>')  # fill in your bucket name and object key
body = obj.get()['Body'].read()  # the file contents as bytes
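
Since the end goal is one pandas DataFrame per file, the bytes returned above can be fed straight to pandas. A minimal sketch, assuming the bucket is named blahblah and the key is data1.csv as in the question’s URLs:

import io
import boto3
import pandas as pd

s3 = boto3.resource('s3')
obj = s3.Object('blahblah', 'data1.csv')   # bucket and key assumed from the question's URLs
body = obj.get()['Body'].read()            # raw CSV bytes, nothing written to disk
df = pd.read_csv(io.BytesIO(body))         # parse the bytes into a DataFrame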

Answer:

You have two options, both of which you’ve already mentioned:

  1. Downloading the file locally using download_file (a short usage sketch appears at the end of this answer)
s3.download_file(
    "<bucket-name>", 
    "<key-of-file>", 
    "<local-path-where-file-will-be-downloaded>"
)

See download_file

  2. Loading the file contents into memory using get_object
response = s3.get_object(Bucket="<bucket-name>", Key="<key-of-file>")
contentBody = response.get("Body")
# You need to read the content as it is a Stream
content = contentBody.read()

See get_object
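
To tie this back to the question, here is a rough sketch that loads each of the three CSVs into its own DataFrame with get_object. The bucket name and keys are assumed from the URLs in the question, and pandas can consume the Body stream directly because it is file-like:

import boto3
import pandas as pd

s3 = boto3.client('s3')
bucket = 'blahblah'                              # assumed from the URLs in the question
keys = ['data1.csv', 'data2.csv', 'data3.csv']

frames = {}                                      # one DataFrame per file
for key in keys:
    response = s3.get_object(Bucket=bucket, Key=key)
    frames[key] = pd.read_csv(response['Body'])  # Body is a file-like stream, read in memory

print(frames['data1.csv'].head())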

Either approach is fine, and you can just choose whichever fits your scenario better.
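
For completeness, a short usage sketch of option 1 (download_file), then reading the local copy with pandas; the bucket, key, and local path are placeholders:

import boto3
import pandas as pd

s3 = boto3.client('s3')
# download_file writes the object to local disk; the path below is just a placeholder
s3.download_file('blahblah', 'data1.csv', '/tmp/data1.csv')
df = pd.read_csv('/tmp/data1.csv')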