Script Day: Upload Files to Amazon S3 Using Bash

Here is a very simple Bash script that uploads a file to Amazon’s S3. I looked for a simple explanation of how to do that without Perl scripts or C# code, and could find none. So after a bit of experimentation and some reverse engineering, here’s the simple sample code:


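#!/bin/bash
# The file to upload is passed as the first argument
file="$1"

# AWS credentials and upload target
key_id="YOUR-AWS-KEY-ID"
key_secret="YOUR-AWS-KEY-SECRET"
path="some-directory/$file"
bucket="s3-bucket-name"
content_type="application/octet-stream"
# Format the date in the C locale so day and month names are not localized
date="$(LC_ALL=C date -u +"%a, %d %b %Y %X %z")"
# Base64 of the binary MD5 digest, as the Content-MD5 header expects
md5="$(openssl md5 -binary < "$file" | base64)"

# Sign the upload details with HMAC-SHA1; %s keeps the values out of the printf format string
sig="$(printf 'PUT\n%s\n%s\n%s\n/%s/%s' "$md5" "$content_type" "$date" "$bucket" "$path" | openssl sha1 -binary -hmac "$key_secret" | base64)"

# Upload with PUT (-T); every header must match the value that was signed
curl -T "$file" "http://$bucket.s3.amazonaws.com/$path" \
    -H "Date: $date" \
    -H "Authorization: AWS $key_id:$sig" \
    -H "Content-Type: $content_type" \
    -H "Content-MD5: $md5"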

So what is going on here?

This example script uses what Amazon calls a “pre-authenticated URL”, which basically means that instead of performing a “login process” for each upload to generate an authorization key (OAUTH style), you generate the authorization key yourself by signing the upload details with your AWS secret.

So we first set up the upload details – these need to be consistent between the signature step and the actual upload request, so we save them in variables instead of using them directly:

  • key_id and key_secret should be changed to your actual AWS key details (no, you can’t have mine)
  • path is the path of the file to be stored under the S3 bucket (what Amazon calls the file’s “key”)
  • bucket is the name of the S3 bucket where we will upload the file
  • content_type is the MIME Content-Type of the file, which – unless you want to serve the file through a web site – shouldn’t matter much so I’m using “application/octet-stream” (“binary” for you systems folks)
  • date is an RFC 2616 compliant date specification. I force LC_ALL to the POSIX “C” locale to prevent date from generating localized day and month names, which are not valid in RFC 2616
  • md5 is an RFC 2616 Content-MD5 compliant MD5 checksum, which means it’s the binary MD5 digest encoded using Base64 (instead of the hex encoding most people use nowadays)
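
To make the last two bullets concrete, here is a minimal sketch of the two computed values on their own. The helper names content_md5 and http_date are mine, made up for illustration, and the sample file is a throwaway created with mktemp:

```shell
#!/bin/bash
# Hypothetical helper: Base64 of the *binary* MD5 digest (the Content-MD5 form)
content_md5() {
    openssl md5 -binary < "$1" | base64
}

# Hypothetical helper: UTC date with C-locale day and month names,
# using the same format string as the script above
http_date() {
    LC_ALL=C date -u +"%a, %d %b %Y %X %z"
}

sample="$(mktemp)"
printf 'hello' > "$sample"

content_md5 "$sample"      # Base64 form, used for the Content-MD5 header
openssl md5 < "$sample"    # hex form of the same digest, for comparison
http_date

rm -f "$sample"
```

Both digest lines describe the same 16 bytes; only the Base64 one is accepted as a Content-MD5 value.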

Finally it is time to generate the authorization key, which we do by creating an “upload details message” that includes:

  • The HTTP method (“verb”) that we will use for the upload, which is “PUT” (this is what curl’s -T option sends)
  • The MD5 signature, which must be the same as the Content-MD5 header in the curl request
  • The MIME content type, which must be the same as the Content-Type header in the curl request
  • The current date, which must be the same as the Date header in the curl request
  • And finally, the S3 resource key for the file that we are uploading, which is composed of the bucket name and the file key
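
The message built from the list above can be sketched as a small function. The name string_to_sign and the sample values are assumptions for illustration only; the point is that the five fields are joined with newlines and that there is no trailing newline after the resource:

```shell
#!/bin/bash
# Hypothetical helper: join the upload details into the message that gets signed.
# $1 = HTTP verb, $2 = Content-MD5, $3 = Content-Type, $4 = Date, $5 = /bucket/key
string_to_sign() {
    # %s placeholders keep the values out of the printf format string
    printf '%s\n%s\n%s\n%s\n%s' "$1" "$2" "$3" "$4" "$5"
}

# Sample values, made up for the demonstration
string_to_sign "PUT" "XUFAKrxLKna5cZ2REBfFkg==" "application/octet-stream" \
    "Tue, 27 Mar 2007 21:15:45 +0000" "/s3-bucket-name/some-directory/hello.txt"
echo    # add a newline for the terminal only; it is not part of the message
```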

The upload details message is signed using HMAC-SHA1 with our AWS secret key, and the result is Base64 ASCII-armoured. We can then issue the request, using the authorization type AWS (meaning AWS pre-signed URLs) with an authorization token made of our AWS key ID and the upload signature.
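
The signing step on its own can be sketched like this. The function name sign is hypothetical, and the key and message below are a widely published HMAC-SHA1 test vector rather than anything AWS related, which makes it easy to check that the openssl pipeline behaves as expected before pointing it at real credentials:

```shell
#!/bin/bash
# Hypothetical helper: HMAC-SHA1 of a message, Base64 encoded.
# $1 = secret key, $2 = message to sign
sign() {
    printf '%s' "$2" | openssl sha1 -binary -hmac "$1" | base64
}

# Well-known test vector (not an AWS key)
sign "key" "The quick brown fox jumps over the lazy dog"
# → 3nybhbi3iqa8ino29wqQcBydtNk=
```

The value after the colon in the Authorization header is the output of this step, computed with the real secret over the real upload details message.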

If all goes well, the script should output nothing, otherwise there is some problem and the AWS response should provide the necessary details. It’s important to note that all the headers I’m using in the curl command are required, though AWS supports additional AWS specific headers that start with X-AMZ- and allow you to provide additional meta-data, such as the uploader’s name and the required permissions for the file. If you use X-AMZ headers, then you must also add these headers in the “upload details message” that is signed – check the AWS documentation for more details.
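
As a rough sketch of what that looks like with a single extra header: as far as I understand the AWS documentation for this signing scheme, each X-AMZ header goes into the signed message as a lowercase name:value line, sorted by name, between the date and the resource. The function name and the sample values below are assumptions for illustration, so verify the exact rules against the AWS documentation before relying on this:

```shell
#!/bin/bash
# Hypothetical helper: upload details message with one x-amz-acl header included.
# $1 = Content-MD5, $2 = Content-Type, $3 = Date, $4 = ACL value, $5 = /bucket/key
string_to_sign_with_acl() {
    # The x-amz- header line sits between the date and the resource (assumed layout)
    printf 'PUT\n%s\n%s\n%s\nx-amz-acl:%s\n%s' "$1" "$2" "$3" "$4" "$5"
}

# Sample values, made up for the demonstration
string_to_sign_with_acl "XUFAKrxLKna5cZ2REBfFkg==" "application/octet-stream" \
    "Tue, 27 Mar 2007 21:15:45 +0000" "public-read" \
    "/s3-bucket-name/some-directory/hello.txt"
echo    # newline for the terminal only; it is not part of the message
```

The curl command then needs a matching -H "x-amz-acl: public-read" option, otherwise the signature will not match the request.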


6 Responses to “Script Day: Upload Files to Amazon S3 Using Bash”

  1. Alon:

    Why not use s3cmd or aws-cli?

  2. Oded:

    Because installing s3cmd or aws-cli requires installing a lot of dependencies – some are not available on long-term-supported operating systems. Also, in production, it introduces an element of risk that you don’t get with a simple shell script.

  3. cgdu:

    Hello,

    I do have a simple web hosting plan but still the ability to run bash scripts.
    So I came across the script you wrote and I am now getting the error:

    Code: NoSuchWebsiteConfiguration
    Message: The specified bucket does not have a website configuration

    What do I have to configure to have everything working?

    Thanks for your help!
    cgdu

    • Oded:

      You mean that you get this error when you run the script? That’s bizarre – you should get it if you try to GET files from an S3 bucket that is not configured for website hosting, but not when you use PUT (which is the -T flag in my script).

  4. Script day – Amazon AWS Signature Version 4 :: Things n' Stuff:

    […] the years – an earlier version (I think version 1) I implemented in a previous blog post: upload files to Amazon S3 using Bash, with new APIs and newer versions of existing APIs opt in to use the newer signing […]

  5. shweta:

    You are awesome. Great article worked for me in a minute without any error..Thanks 🙂
