Saturday, June 4, 2011

Things your IT can do in the cloud - #1: anonymous proxy

Why anonymize your business?


In many people's minds, browsing via an anonymizing proxy is associated with either hard-core anarchists or opponents of oppressive regimes.

However, there are several very good reasons to use anonymizing proxies in your business.
You may want to hide from your competitors the fact that you visit their web sites looking for competitive intelligence. You may also want to hide these visits from third-party web sites, which may cooperate with your competition more closely than they do with you.

When researching a new product, a firm searches various databases, vendor sites and academic sources for information on the relevant materials, processes, possible suppliers, and markets. With subscription-based information sources, the firm's confidentiality is protected contractually. That is not so with public sources. In many industries, knowing in advance that a certain firm is looking for information about certain processes and materials is worth a lot of money.


Before any merger or acquisition, a team of analysts researches the firm to be merged with or acquired. Just as in regular R&D, exposing your intentions to the researched firm or to a third party can be very costly.

Of course, there is no need to anonymize all of the traffic leaving your firm's network. The problem can usually be pinpointed to a few individuals who really need this protection, and even they do not need it all of the time.

Using EC2 to anonymize your users


The idea of using an Amazon EC2 server as an anonymous proxy is not a new one. There are a number of articles already on the web explaining the technicalities of setting up proxy servers and tunneling in EC2. However, the existing articles aim at highly technical individuals, and the setup they describe is usually much more complex than what I am about to show.

Technically speaking, there are several ways to achieve our goal. You can tunnel the traffic via SSH, set up a SOCKS server, or use anonymizing networks like Tor and I2P. I chose to showcase a simple HTTP proxy using standard Apache, because it is so easy to set up and yet so effective. Using AWS CloudFormation, you can have a 1-Click proxy up and running in no time!

I assume that the readers of this blog post already have an AWS EC2 account, along with their credentials, certificates and keys. You should also have some familiarity with the AWS self-service portal. I will focus on specifically what you need to do to quickly and easily deploy an anonymizing Apache proxy server. Scroll down to see the CloudFormation template used to automate the deployment.


Recipe materials:
  • 1 EC2 security group
  • 1 Elastic IP
  • 1 EC2 Micro Linux server
  • 1 Apache httpd
  • A dash of configuration changes

Create an EC2 security group


We will use the EC2 security group to limit access to the proxy server. After all, we do not want free riders to use our proxy, especially since we will be paying for all of the traffic.
First, you have to know the outbound address of your network. In this example, I used a bogus address of 79.181.46.194. Create a new security group, and add a rule that allows inbound access only from that address, like the one in this screenshot:
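
The same rule can be written down as data. Here is a minimal sketch in Python, assuming the all-TCP-ports rule used by the CloudFormation template at the end of this post. The function name `proxy_ingress_rule` is made up for illustration and is not part of any AWS library.

```python
import ipaddress


def proxy_ingress_rule(office_ip):
    """Build an ingress rule that admits TCP traffic from a single IPv4 address."""
    # IPv4Address raises a ValueError subclass on a malformed address,
    # which catches typos before they reach the security group.
    addr = ipaddress.IPv4Address(office_ip)
    return {
        "IpProtocol": "tcp",
        "FromPort": "0",
        "ToPort": "65535",
        # A /32 suffix means "this one address and nothing else".
        "CidrIp": f"{addr}/32",
    }


print(proxy_ingress_rule("79.181.46.194"))
```

If your network leaves through more than one outbound address, you would need one such rule per address.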

Start a new EC2 instance


For a small number of users, a Linux micro instance is more than enough. Start a new instance, select the basic 32-bit Amazon Linux AMI, and then choose a micro size server.

The Amazon Linux servers have almost nothing preinstalled on them, but they do have the cloud-init service. The cloud-init service, developed by Canonical for Ubuntu, lets you pass bootstrap configuration data, parameters and commands to the server. Our instance will read the user data passed to it during initialization and use it as the input for cloud-init.


Here are the actual contents to be used for the user data.
The "packages" section installs the latest Apache httpd service from Amazon's yum repository.

The "runcmd" section appends the minimal set of Apache configuration directives that are required for the Apache proxy, and then restarts Apache. Port 443 is added to support browser configurations that use it for SSL proxying.

Before you copy and paste, take care to replace the IP address with that of your own network. Although the EC2 security group should take care of unwanted network access, I think that it is good practice to include some access control here as well.

#cloud-config

# Install Apache httpd from Amazon's yum repository
packages:
- httpd

# Append the proxy directives to the stock configuration, then restart Apache
runcmd:
- echo Listen 443 >> /etc/httpd/conf/httpd.conf
- echo ProxyRequests On >> /etc/httpd/conf/httpd.conf
- echo ProxyVia Block >> /etc/httpd/conf/httpd.conf
- echo "<Proxy *>" >> /etc/httpd/conf/httpd.conf
- echo Order deny,allow >> /etc/httpd/conf/httpd.conf
- echo Deny from all >> /etc/httpd/conf/httpd.conf
# Replace 79.181.46.194 with the outbound address of your own network
- echo Allow from 79.181.46.194 >> /etc/httpd/conf/httpd.conf
- echo "</Proxy>" >> /etc/httpd/conf/httpd.conf
- service httpd restart
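
If you would rather not edit the address by hand, the user data can be generated. The following is only a sketch, assuming the same directives as above. The helper name `build_user_data` is hypothetical, and it quotes every directive so the shell leaves `<`, `>` and `*` alone.

```python
import ipaddress

HTTPD_CONF = "/etc/httpd/conf/httpd.conf"


def build_user_data(allowed_ip):
    """Return cloud-config text with the given address in the Allow rule."""
    ipaddress.IPv4Address(allowed_ip)  # raises ValueError if malformed
    directives = [
        "Listen 443",
        "ProxyRequests On",
        "ProxyVia Block",
        "<Proxy *>",
        "Order deny,allow",
        "Deny from all",
        f"Allow from {allowed_ip}",
        "</Proxy>",
    ]
    lines = ["#cloud-config", "", "packages:", "- httpd", "", "runcmd:"]
    # One echo per directive, appended to the stock Apache configuration.
    lines += [f'- echo "{d}" >> {HTTPD_CONF}' for d in directives]
    lines.append("- service httpd restart")
    return "\n".join(lines) + "\n"


print(build_user_data("79.181.46.194"))
```

Paste the printed text into the user data field when you launch the instance.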


The last thing to do before you launch the instance is to assign the previously created EC2 security group to our new micro instance.


Before we turn to configuring the end users, we may want to associate an Elastic IP with our new instance. We want a predictable environment, where the configuration changes to end users and our IT infrastructure are minimal. A new server instance in EC2 gets an unpredictable IP address, whereas with an Elastic IP we can keep a known IP address and even use a DNS record to point at our proxy. The Elastic IP can be associated with the instance only once the instance is up and running.


That's it – start the server, wait 3 minutes, and you have a private yet anonymous proxy.
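
A quick way to check the proxy from a machine inside your network is to send a request through it. This is a minimal sketch using only Python's standard library. The address 50.51.52.53 is the placeholder Elastic IP from the template below, port 80 assumes Apache's default Listen directive is still in place, and both function names are mine.

```python
import urllib.request


def build_proxy_opener(proxy_host, port=80):
    """Return an opener that sends HTTP and HTTPS requests through the proxy."""
    proxy_url = f"http://{proxy_host}:{port}"
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_status(opener, url, timeout=10.0):
    """Fetch url through the proxy and return the HTTP status code."""
    with opener.open(url, timeout=timeout) as response:
        return response.status


# Building the opener does not touch the network; fetch_status does.
opener = build_proxy_opener("50.51.52.53")
```

Calling `fetch_status(opener, ...)` against a page that echoes the caller's address should show the Elastic IP rather than your office address. From a machine outside the allowed network the request should be refused.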
 

How much does it cost?

If you are a new customer, you are entitled to 12 months of free tier discount, which reduces the costs significantly.
Assuming that your business needs the proxy 50% of the time and uses 50 GB of bandwidth a month, the setup used in this post costs $26 a month, or $15 after the discount.
If you plan to keep the proxy online 100% of the time, it will cost you a total of $30 a month, or $12 after the discount.
The reason for this seemingly strange pricing is the cost of the Elastic IP. You pay nothing for it while it is in use, but you do pay for the time you keep it reserved without using it.
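
The shape of that calculation can be sketched in a few lines. This is an illustration only: the function and its parameters are my own, the 730-hour month is an assumption, and every rate is supplied by the caller, so look up the current prices for your region rather than relying on any number shown here.

```python
HOURS_PER_MONTH = 730  # average month length; an assumption for this sketch


def monthly_cost(uptime_fraction, instance_rate, idle_eip_rate, gb_out, gb_rate):
    """Estimate the monthly bill in dollars from caller-supplied rates."""
    running_hours = HOURS_PER_MONTH * uptime_fraction
    idle_hours = HOURS_PER_MONTH - running_hours
    compute = running_hours * instance_rate
    # The Elastic IP is billed only for the hours it sits reserved but unused.
    eip = idle_hours * idle_eip_rate
    traffic = gb_out * gb_rate
    return compute + eip + traffic


# Placeholder rates, NOT real AWS prices.
half_time = monthly_cost(0.5, 0.02, 0.01, 50, 0.10)
full_time = monthly_cost(1.0, 0.02, 0.01, 50, 0.10)
print(half_time, full_time)
```

The point of the sketch is the `idle_hours` term: switching the server off saves instance hours but starts the meter on the reserved address, so half the uptime does not mean half the bill.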



AWS CloudFormation Template

CloudFormation allows you to bundle all of the resources needed for an application launch into a single, automated job. In our case, we have three resources: a security group, an Elastic IP, and a server instance. The template expects an Elastic IP that you have already allocated and only associates it with the new instance. The following template starts everything in one step.

{
 "AWSTemplateFormatVersion" : "2010-09-09",

 "Description" : "Instant Anonymizing proxy",

 "Parameters" : {
    "KeyName" : {
      "Description" : "Name of an existing EC2 KeyPair to enable SSH access to the instance",
      "Type" : "String",
      "Default" : "MySSHKeypair"
    },
    "MyNetwork" : {
      "Description" : "Outbound IP Address of your corporate network",
      "Type" : "String",
      "Default" : "100.101.102.103"
    },
    "MyEIP" : {
      "Description" : "Existing Elastic IP",
      "Type" : "String",
      "Default" : "50.51.52.53"
    },
    "InstanceType" : {
      "Description" : "Type of EC2 instance to launch",
      "Type" : "String",
      "Default" : "t1.micro"
    }
   
  },

  "Mappings" : {
    "RegionMap" : {
      "us-east-1" : {"AMI" : "ami-8c1fece5"},
      "us-west-1" : {"AMI" : "ami-3bc9997e"},
      "eu-west-1" : {"AMI" : "ami-47cefa33"},
      "ap-southeast-1" : {"AMI" : "ami-6af08e38"},
      "ap-northeast-1" : {"AMI" : "ami-300ca731"}
    }
  },


  "Resources" : {
    "Ec2Instance" : {
      "Type" : "AWS::EC2::Instance",
      "Properties" : {
        "KeyName" : { "Ref" : "KeyName" },
        "InstanceType" : { "Ref" : "InstanceType" },
        "SecurityGroups" : [ { "Ref" : "InstanceSecurityGroup" } ],
        "ImageId" : { "Fn::FindInMap" : [ "RegionMap", { "Ref" : "AWS::Region" }, "AMI" ]},
        "Tags" : [
            {
                "Key" : "Name",
                "Value" : "MyProxy"
            }
        ],
        "UserData" : { "Fn::Base64" : { "Fn::Join" : ["",[
            "#cloud-config","\n",
            "\n",
            "packages:","\n",
            "- httpd","\n",
            "\n",
            "runcmd:","\n",
            "- echo listen 443  >> /etc/httpd/conf/httpd.conf","\n",
            "- echo ProxyRequests On >> /etc/httpd/conf/httpd.conf","\n",
            "- echo ProxyVia Block >> /etc/httpd/conf/httpd.conf","\n",
            "- echo \"<proxy *>\"  >> /etc/httpd/conf/httpd.conf","\n",
            "- echo Order deny,allow >> /etc/httpd/conf/httpd.conf","\n",
            "- echo Deny from all >> /etc/httpd/conf/httpd.conf","\n",
            "- echo Allow from " , { "Ref" : "MyNetwork" } , " >> /etc/httpd/conf/httpd.conf","\n",
            "- echo \"</proxy>\"  >> /etc/httpd/conf/httpd.conf","\n",
            "- service httpd restart","\n" ]]}}
      }
    },

    "InstanceSecurityGroup" : {
      "Type" : "AWS::EC2::SecurityGroup",
      "Properties" : {
        "GroupDescription" : "All ports access from my corporate network",
        "SecurityGroupIngress" : [ {
          "IpProtocol" : "tcp",
          "FromPort" : "0",
          "ToPort" : "65535",
          "CidrIp" : { "Fn::Join" : [ "/" , [ { "Ref" : "MyNetwork" } ,"32" ] ] }
        } ]
      }
    },
    "IPAssoc" : {
      "Type" : "AWS::EC2::EIPAssociation",
      "Properties" : {
        "InstanceId" : { "Ref" : "Ec2Instance" } ,
        "EIP" : { "Ref" : "MyEIP" }
      }
    }
  },

  "Outputs" : {
    "ProxyIP" : {
      "Description" : "The IP address for the newly created Proxy server",
      "Value" : { "Ref" : "MyEIP" }
    }
  }
}
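
Writing the "UserData" list by hand is tedious, because every line of the cloud-config has to become a separate string followed by "\n". The following sketch shows one way to generate it from plain text. The `@MyNetwork@` marker is a hypothetical convention of mine, not a CloudFormation feature; it simply marks where the reference to the MyNetwork parameter should go.

```python
import json

PLACEHOLDER = "@MyNetwork@"  # hypothetical marker, replaced by a Ref below


def user_data_property(cloud_config):
    """Wrap cloud-config text in the Fn::Base64 / Fn::Join structure."""
    parts = []
    for line in cloud_config.splitlines():
        pieces = line.split(PLACEHOLDER)
        for index, piece in enumerate(pieces):
            if index > 0:
                # Each marker becomes a reference to the template parameter.
                parts.append({"Ref": "MyNetwork"})
            if piece:
                parts.append(piece)
        parts.append("\n")
    return {"Fn::Base64": {"Fn::Join": ["", parts]}}


sample = "runcmd:\n- echo Allow from @MyNetwork@ >> /etc/httpd/conf/httpd.conf"
print(json.dumps(user_data_property(sample), indent=2))
```

The printed structure can be pasted in as the value of "UserData" in the template above.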





Advanced proxy setup


You may consider a more customized Apache configuration.

Optimizing your Apache installation for performance and security is a good idea. It means stripping out all unnecessary Apache modules, keeping only the bare minimum for our required functionality, and modifying some Apache directives for increased security.

Adding better access control with some kind of user authentication is a good idea. Basic authentication is very easy to set up and adds another layer of security to your proxy.
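
To make it concrete, this is what a client sends once Basic authentication is enabled on a proxy. It is a sketch of the standard HTTP mechanism, not of any particular Apache setup, and the function name is mine. Note that the credentials are only Base64-encoded, not encrypted, so Basic authentication complements the IP restriction rather than replacing it.

```python
import base64
from typing import Tuple


def proxy_basic_auth_header(username: str, password: str) -> Tuple[str, str]:
    """Return the (name, value) pair of a Proxy-Authorization header."""
    if ":" in username:
        # The colon separates user name from password, so it cannot
        # appear inside the user name itself.
        raise ValueError("user name must not contain ':'")
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return ("Proxy-Authorization", f"Basic {token}")


name, value = proxy_basic_auth_header("analyst", "s3cret")
print(f"{name}: {value}")
```

Browsers build this header for you after prompting for the user name and password, so end users never see it.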

You should decide what to do with the server logs. If you keep them, then you need to define a log retention policy and configure Linux/Apache accordingly. But maybe the best idea is to give up on logs altogether. After all, we wanted an anonymous proxy, didn't we?

Dealing with the road warriors is another matter. You will have to either limit their proxy access via the enterprise network, or change the entire security scheme by using a VPN or another encrypted channel between the proxy and the end users.

The downside to all of the advanced setup ideas is that you will most likely have to configure a private AMI and add more software to support your needs. A future post will deal with some advanced scenarios.


How to configure your users' browsers


This is beyond the scope of this article, but I added a few links nonetheless:
  • Firefox: http://support.mozilla.com/en-US/kb/Options%20window%20-%20Advanced%20panel?s=proxy&as=s#w_network-tab
  • Chrome: http://www.google.com/support/chrome/bin/answer.py?answer=96815
  • Internet Explorer: http://support.microsoft.com/kb/135982

In a corporate environment, you should also take a look at Proxy Auto-Config (PAC) scripts: http://en.wikipedia.org/wiki/Proxy_auto-config
You can set up a code-based PAC server that supplies different PAC files to different users based on various criteria.
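
As an illustration of what such a server might hand out, here is a sketch that generates a PAC file sending only selected domains through the proxy. The helper `build_pac` and the example domain are hypothetical. `FindProxyForURL` and `dnsDomainIs` are the standard PAC entry point and helper, and the proxy address is the placeholder Elastic IP from the template.

```python
import json


def build_pac(proxy_host, port, proxied_domains):
    """Return PAC (JavaScript) text that proxies only the listed domains."""
    # json.dumps yields a valid JavaScript string literal for each domain.
    checks = " || ".join(
        f"dnsDomainIs(host, {json.dumps(domain)})" for domain in proxied_domains
    )
    condition = checks or "false"  # no domains listed: never use the proxy
    return (
        "function FindProxyForURL(url, host) {\n"
        f"  if ({condition}) {{\n"
        f'    return "PROXY {proxy_host}:{port}";\n'
        "  }\n"
        '  return "DIRECT";\n'
        "}\n"
    )


print(build_pac("50.51.52.53", 80, [".example.com"]))
```

A PAC server could call this with a different domain list per user, so that only the analysts who need the proxy, and only the sites they research, go through EC2.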

Caveats


Do remember that a proxy like the one we just discussed hides only the origin IP address. It does not hide any cookies the user accumulated in previous browser sessions, and it certainly does not hide you if you log in to a subscription-based service with your corporate email. I recommend using a browser dedicated only to sensitive operations. At the very least, use a private session; all modern browsers support this feature. For example, in Chrome, pressing Ctrl+Shift+N opens a new incognito window.
One more thing: it turns out that Flash does not respect the browser's proxy settings.