Also available at my website http://tosh.me/ and on Twitter @toshafanasiev

Tuesday, 4 May 2010

Amazon S3 url gotcha

I have recently been tasked with producing a 'one click' operation to provision a Windows server on EC2 to host an SQL Server database, a web application and an FTP server ( I won't go into the details of my solution as they warrant a post of their own ).

In order to get a production-ready server online and secure from scratch I've had to:
  • provision storage with S3
  • configure Elastic Block Store ( EBS ) volumes
  • upload and register a previously bundled Amazon Machine Image ( AMI )
  • configure Security Groups ( AWS firewalls )
  • manage key pairs
  • allocate Elastic IP addresses
All in all, it has been a really engaging introduction to the Amazon Web Services API.

One interesting featurette of the S3 service that particularly stood out was the effect of the AWS Region on how objects stored in S3 can be accessed.

Under the default Region, US - Standard ( East Coast ), objects can be accessed in one of two ways: the subdomain way and the path way. The URLs http://<bucket-name>.s3.amazonaws.com/<key-name> ( subdomain ) and http://s3.amazonaws.com/<bucket-name>/<key-name> ( path ) are both valid.
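To make the two forms concrete, here is a minimal sketch in Python that builds each URL from a bucket name and a key. The function names are my own for illustration and are not part of any AWS SDK; the sketch also assumes a bucket name that is usable as a DNS label, which the subdomain form requires.

```python
from urllib.parse import quote

S3_HOST = "s3.amazonaws.com"


def subdomain_url(bucket, key):
    """Subdomain form: the bucket name becomes part of the host name."""
    # quote() percent-encodes unsafe characters but leaves '/' in the key alone.
    return "http://%s.%s/%s" % (bucket, S3_HOST, quote(key, safe="/"))


def path_url(bucket, key):
    """Path form: the bucket name is the first segment of the path."""
    return "http://%s/%s/%s" % (S3_HOST, bucket, quote(key, safe="/"))


if __name__ == "__main__":
    print(subdomain_url("my-bucket", "images/logo.png"))
    # http://my-bucket.s3.amazonaws.com/images/logo.png
    print(path_url("my-bucket", "images/logo.png"))
    # http://s3.amazonaws.com/my-bucket/images/logo.png
```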

To reduce latency you could choose to deploy in one of the other Regions ( US - Northern California, EU - Ireland or the newly added Asia Pacific - Singapore ), as the geographical distribution of your user base dictates. If you do, note that the path form of S3 object URLs is not valid there and the subdomain form must be used instead ( at the time of writing 2006-03-01 was the latest version of the S3 API; see http://docs.amazonwebservices.com/AmazonS3/2006-03-01/index.html for details ).

This poses no problem so long as you control the source of the URLs or your code can be made to follow 301 redirects. The problem I ran into was that, without modification, neither the official Java SDK ( http://aws.amazon.com/sdkforjava/ ) nor the .NET SDK ( http://aws.amazon.com/sdkfornet/ ) can register AMIs from bundled data stored in S3 buckets in any Region other than US - Standard. Both prepend 'http://s3.amazonaws.com/' to any string you give them that does not already start that way, and for every Region but US - Standard the path form they insist upon results in a 301.
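For the case where you do control the source of the URLs, one option is to rewrite path-form URLs into the subdomain form before using them. The following is only a sketch under that assumption: it is written in Python rather than in the language of either SDK, the helper name to_subdomain_url is hypothetical, and it does nothing to fix the SDKs' own prefixing behaviour.

```python
from urllib.parse import urlsplit, urlunsplit

S3_HOST = "s3.amazonaws.com"


def to_subdomain_url(url):
    """Rewrite a path-form S3 URL to the subdomain form.

    URLs that are not in the path form are returned unchanged.
    Assumes the bucket name is usable as a DNS label.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() != S3_HOST:
        return url  # already subdomain form, or not an S3 URL at all
    # The first path segment is the bucket; the remainder is the key.
    bucket, _, key = parts.path.lstrip("/").partition("/")
    if not bucket:
        return url  # no bucket in the path, nothing to rewrite
    host = "%s.%s" % (bucket, S3_HOST)
    return urlunsplit((parts.scheme, host, "/" + key, parts.query, parts.fragment))


if __name__ == "__main__":
    print(to_subdomain_url("http://s3.amazonaws.com/my-bucket/images/logo.png"))
    # http://my-bucket.s3.amazonaws.com/images/logo.png
```

A URL that is already in the subdomain form passes through untouched, so the helper can be applied to a mixed list of URLs without first checking which form each one uses.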

As it happens, the user base of the product I am working on is never expected to grow beyond quite a small number, so I can afford the slight latency hit and simply deploy to the default Region.

I've written this post in case anyone else encounters the same problem and can either save themselves some trawling or let me know a simple way around this for future reference.
