Script Day: Upload Files to Amazon S3 Using Bash
Here is a very simple Bash script that uploads a file to Amazon's S3. I've looked for a simple explanation of how to do that without Perl scripts or C# code, and could find none. So after a bit of experimentation and some reverse engineering, here's the simple sample code:
#!/bin/bash
file="$1"
key_id="YOUR-AWS-KEY-ID"
key_secret="YOUR-AWS-KEY-SECRET"
path="some-directory/$file"
bucket="s3-bucket-name"
content_type="application/octet-stream"
# RFC 2616 date; force the C locale so day and month names are not localized
date="$(LC_ALL=C date -u +"%a, %d %b %Y %X %z")"
# Content-MD5: the binary MD5 digest of the file, Base64-encoded
md5="$(openssl md5 -binary < "$file" | base64)"
# Sign the upload details with the AWS secret (SHA1 HMAC, Base64-encoded)
sig="$(printf "PUT\n$md5\n$content_type\n$date\n/$bucket/$path" | openssl sha1 -binary -hmac "$key_secret" | base64)"
curl -T "$file" "http://$bucket.s3.amazonaws.com/$path" \
  -H "Date: $date" \
  -H "Authorization: AWS $key_id:$sig" \
  -H "Content-Type: $content_type" \
  -H "Content-MD5: $md5"
So what is going on here?
This example script uses what Amazon calls a "pre-authenticated URL", which basically means that instead of performing a "login process" for each upload to generate an authorization key (OAuth style), you generate the authorization key yourself by signing the upload details with your AWS secret.
So we first set up the upload details – these need to be consistent between the signature step and the actual upload request, so we save them in variables instead of using them directly:
- key_id and key_secret should be changed to your actual AWS key details (no, you can't have mine)
- path is the path under which the file will be stored in the S3 bucket (what Amazon calls the file's "key")
- bucket is the name of the S3 bucket where we will upload the file
- content_type is the MIME Content-Type of the file, which – unless you want to serve the file through a web site – shouldn't matter much, so I'm using "application/octet-stream" ("binary" for you systems folks)
- date is an RFC 2616 compliant date specification. I force LC_ALL to the POSIX "C" locale to prevent date from generating localized day and month names, which are not valid in RFC 2616
- md5 is an RFC 2616 Content-MD5 compliant MD5 checksum, which means it's the binary MD5 digest encoded using Base64 instead of the hex encoding most people use nowadays (see the example just below)
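To make the difference concrete, here is what the two encodings look like for an empty input (the output lines are what I would expect openssl to print, give or take the label):

openssl md5 < /dev/null
# (stdin)= d41d8cd98f00b204e9800998ecf8427e   <- the familiar hex digest, not what Content-MD5 wants
openssl md5 -binary < /dev/null | base64
# 1B2M2Y8AsgTpgAmY7PhCfg==                    <- binary digest, Base64-encoded, as used in the script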
Finally, it is time to generate the authorization key, which we do by creating an "upload details message" (an example of the full message follows the list) that includes:
- The HTTP method ("verb") that we will use for the upload – in our case "PUT", which is what curl's -T option sends
- The MD5 signature, which must be the same as the Content-MD5 header in the curl request
- The MIME content type, which must be the same as the Content-Type header in the curl request
- The current date, which must be the same as the Date header in the curl request
- And finally, the S3 resource key for the file that we are uploading, which is composed of the bucket name and the file key
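Put together, the message that gets signed looks something like this (the values are illustrative):

printf "PUT\n$md5\n$content_type\n$date\n/$bucket/$path"
# PUT
# 1B2M2Y8AsgTpgAmY7PhCfg==
# application/octet-stream
# Tue, 27 Mar 2007 19:36:42 +0000
# /s3-bucket-name/some-directory/backup.tar.gz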
The upload details message is signed using an SHA1 HMAC with our AWS secret key and then Base64 ASCII-armoured, after which we can issue the request: the authorization type is AWS (meaning AWS pre-signed URLs) and the authorization token is our AWS key ID followed by the upload signature.
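The resulting header would look roughly like this (the key ID and signature here are made up):

Authorization: AWS AKIAIOSFODNN7EXAMPLE:frJIUN8DYpKDtOLCwo//yllqDzg=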
If all goes well, the script should output nothing; otherwise there is some problem and the AWS response should provide the necessary details. It's important to note that all the headers I'm using in the curl command are required, though AWS supports additional AWS-specific headers that start with X-AMZ- and allow you to provide additional meta-data, such as the uploader's name and the required permissions for the file. If you use X-AMZ headers, then you must also add these headers to the "upload details message" that is signed – check the AWS documentation for more details.
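As a sketch of what that would look like – based on my reading of the AWS documentation, so treat it as an assumption rather than tested code – adding an x-amz-acl header means including its lowercased name and value in the signed message, between the date and the resource:

acl="public-read"
sig="$(printf "PUT\n$md5\n$content_type\n$date\nx-amz-acl:$acl\n/$bucket/$path" | openssl sha1 -binary -hmac "$key_secret" | base64)"
curl -T "$file" "http://$bucket.s3.amazonaws.com/$path" \
  -H "Date: $date" \
  -H "Authorization: AWS $key_id:$sig" \
  -H "Content-Type: $content_type" \
  -H "Content-MD5: $md5" \
  -H "x-amz-acl: $acl"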
Why not use s3cmd or aws-cli?
Because installing s3cmd or aws-cli requires installing a lot of dependencies, some of which are not available on long-term-support operating systems. Also, in production, it introduces an element of risk that you don't get with a simple shell script.
Hello,
I have a simple web hosting plan, but still the ability to run bash scripts.
So I came across the script you wrote and I am now getting the error:
Code: NoSuchWebsiteConfiguration
Message: The specified bucket does not have a website configuration
What do I have to configure to have everything working?
Thanks for your help!
cgdu
You mean that you get this error when you run the script? That's bizarre – you should get it if you try to GET files from an S3 bucket that is not configured for website hosting, but not when you use PUT (which is what the -T flag in my script does).
[…] the years – an earlier version (I think version 1) I implemented in a previous blog post: upload files to Amazon S3 using Bash, with new APIs and newer versions of existing APIs opt in to use the newer signing […]
You are awesome. Great article, worked for me in a minute without any error. Thanks 🙂