This is part 3 of a 4-part project to build a generic TCP proxy. This proxy will be capable of handling any TCP-based protocol, not just HTTP.
So far, the fake DNS server that we built in part 2 is tricking your smartphone into sending TCP data to your laptop. In this section we’re going to build our actual TCP proxy server, and run it on your laptop. Its job will be to listen for data sent from your phone; forward it to the real, remote server; and finally forward any responses from the server back to your phone. In other words, act like a proxy.
Our proxy will run on your laptop and listen for incoming TCP connections on port 80 (by convention, the unencrypted HTTP port). When it receives one, presumably from your smartphone, it will first make another TCP connection, this time with our target hostname’s remote server. Second, it will take any data that it receives over the connection with your smartphone, and re-send it over its new connection with the remote server. Third, it will listen for response data coming back from the server. Fourth and finally, it will relay this response data back to your smartphone, completing a 4-step loop. Your smartphone will be able to talk to the remote server as normal, taking only a slight detour via our proxy.
We will use HTTP for testing instead of some other TCP-based protocol because it’s easier. To get your phone to make an HTTP request, all you have to do is visit a website. Our proxy isn’t “non-HTTP” - it’s “non-HTTP-specific”. In addition, the first version of our TCP proxy will not be capable of handling TLS encryption. We will therefore have to take care to test using websites that use unencrypted HTTP, not HTTPS. We will add TLS support in part 4.
Let’s take a closer look at each stage of our 4-step loop: from smartphone, to laptop, to remote server, and back again.
The first step is almost completely taken care of by our DNS server from the previous section of the project. Your phone has already been tricked into sending its TCP connections for our target hostname to your laptop, and all that remains is to ensure that our proxy receives them safely.
The second step, from laptop to remote server, is handled by our proxy. When our proxy receives a TCP connection from your phone, it will turn around and initiate a second TCP connection, this time with the target remote server. It will then re-send any data that it receives from your phone to the remote server.
As we discussed in part 2, we will hardcode the hostname of the remote server that our proxy should make its second connection with. We will also take care to ensure that this hardcoded hostname matches the hardcoded hostname in our fake DNS server.
Once our proxy has sent the remote server the data that it received from your smartphone, all that remains for it to do is to send the response data that it receives from the remote server back to your phone. In this third, server-to-laptop stage we will make sure that our proxy can receive response data from the remote server.
Finally, our proxy will send this response data back to your phone. Your phone will receive the data in exactly the same form as if it had been talking to the remote server directly, and it will assume that everything that just happened was completely normal.
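To make the 4-step loop concrete, here is a minimal sketch of the same relay written with Python’s standard-library asyncio instead of twisted. It is an illustration under assumptions, not the proxy we build below: the names pipe, make_handler and start_proxy are hypothetical, and the remote host and port are passed in as parameters rather than hardcoded.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes from reader to writer until the sending side closes."""
    try:
        while True:
            data = await reader.read(4096)
            if not data:  # an empty read means the peer closed its side
                break
            writer.write(data)
            await writer.drain()
    finally:
        # Simplification: close the other side as soon as one side is done.
        writer.close()


def make_handler(remote_host, remote_port):
    """Build the callback that asyncio runs for each incoming connection."""

    async def handle_client(client_reader, client_writer):
        # Step 1 has already happened: a client (the phone) connected to us.
        # Step 2: open a second connection, this time to the remote server.
        server_reader, server_writer = await asyncio.open_connection(
            remote_host, remote_port
        )
        await asyncio.gather(
            pipe(client_reader, server_writer),  # step 2: phone -> server
            pipe(server_reader, client_writer),  # steps 3 and 4: server -> phone
        )

    return handle_client


async def start_proxy(listen_host, listen_port, remote_host, remote_port):
    """Start listening and return the asyncio server object."""
    return await asyncio.start_server(
        make_handler(remote_host, remote_port), listen_host, listen_port
    )
```

Nothing in this sketch inspects the bytes it copies, which is what makes the relay protocol-agnostic. Listening on port 80 itself usually requires elevated privileges, so for local experiments a high port number is easier.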
I’ve written us an example proxy using Python’s twisted networking framework. I found that twisted gave the right amount of control over the innards of the proxy, whilst requiring very little boilerplate. In order to achieve this it introduces some of its own, new abstractions. These abstractions make twisted code very terse, but also a little cryptic for the uninitiated.
Twisted is designed around “event-driven callbacks”. This means that it automatically runs particular methods (or “callbacks”) whenever a specific event occurs. The events that we are interested in are “connection made” and “data received”. We can tell twisted what to do when these events occur by defining a Protocol class with methods called connectionMade and dataReceived. When twisted sees a “connection made” event it runs our connectionMade method, and you can probably guess what it does when it sees a “data received” event.
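If the callback style is new to you, this small sketch may help. It uses the standard library’s asyncio.Protocol, which follows the same event-driven pattern but is not twisted: its callbacks are spelled connection_made and data_received rather than connectionMade and dataReceived. The LoggingProtocol class and the demo function are hypothetical names invented for this illustration.

```python
import asyncio


class LoggingProtocol(asyncio.Protocol):
    """Records each event that the event loop reports for one connection."""

    def __init__(self, events):
        self.events = events

    def connection_made(self, transport):
        # Called once by the event loop when a peer connects.
        self.transport = transport
        self.events.append("connection made")

    def data_received(self, data):
        # Called whenever bytes arrive on the connection.
        self.events.append(("data received", data))
        self.transport.close()


async def demo():
    events = []
    loop = asyncio.get_running_loop()
    # Port 0 asks the operating system for any free port.
    server = await loop.create_server(
        lambda: LoggingProtocol(events), "127.0.0.1", 0
    )
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"ping")
    await writer.drain()
    await reader.read()  # returns once the server closes the connection
    writer.close()
    server.close()
    return events
```

We never call connection_made or data_received ourselves. We only define them, and the event loop invokes them when the matching event occurs.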
Here’s my code. It’s followed by a more detailed explanation of the different components.
Let’s take a closer look at this code. You might find it useful to open the code on GitHub.
TCPProxyProtocol is our main Protocol class. It handles communicating with your phone, and delegates communicating with the remote server to the ProxyToServerProtocol class. We initialize our proxy server by instantiating one of these TCPProxyProtocol objects, and telling twisted to use it to listen on port 80 - by convention, the unencrypted HTTP port. Next, nothing happens until your laptop receives a TCP connection on port 80 (presumably from your phone). When twisted sees this “connection made” event, it invokes the connectionMade callback on our TCPProxyProtocol. At this point our proxy has made a connection with your smartphone, and step 1 of our 4-step process is complete.
Step 2, from proxy to remote server, is handled by the ProxyToServerProtocol class. When our TCPProxyProtocol#connectionMade method is called, it creates an instance of a ProxyToServerProtocol, and instructs this instance to connect to our target remote server on port 80. If TCPProxyProtocol receives any data from your phone before the ProxyToServerProtocol’s connection to the remote server is complete, it adds the data to a buffer to make sure it doesn’t get dropped. Once the connection is ready, ProxyToServerProtocol sends any data that the buffer has collected to the remote server. At this point our proxy has opened separate connections with both your smartphone and the remote server, and is sending data from your smartphone on to the remote server. Step 2 complete.
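The buffering logic is the subtlest part of step 2, so here is a stripped-down sketch of the idea in plain Python with no networking at all. BufferingRelay and FakeTransport are hypothetical names for this illustration, not classes from the proxy or from twisted. The only assumption is that a “transport” is any object with a write method.

```python
class BufferingRelay:
    """Holds client data until the connection to the server is ready."""

    def __init__(self):
        self.server_transport = None  # set once the server connection is up
        self.buffer = b""

    def data_from_client(self, data):
        if self.server_transport is None:
            # Server connection not ready yet: keep the bytes for later.
            self.buffer += data
        else:
            self.server_transport.write(data)

    def server_connected(self, transport):
        self.server_transport = transport
        if self.buffer:
            # Flush everything that arrived while we were still connecting.
            transport.write(self.buffer)
            self.buffer = b""


class FakeTransport:
    """Stand-in for a real connection; remembers what was written to it."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)
```

Calling data_from_client twice before server_connected and once afterwards results in two writes to the transport: one containing the combined buffered bytes, then one containing the later bytes.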
Finally, when the ProxyToServerProtocol receives data back from the remote server, twisted invokes the dataReceived callback. The code in this callback instructs the original TCPProxyProtocol to send the data that the ProxyToServerProtocol received from the remote server back to your phone. Steps 3 and 4 complete.
Since we have not yet implemented TLS support for our proxy, we need to test our proxy using a website that does not have HTTPS enabled. I recommend nonhttps.com, a handy development hostname that, as promised, does not use HTTPS.
Before you begin testing, make sure that:
Then start both scripts and visit nonhttps.com on your phone. You should see your fake DNS server spoof the DNS request and return the IP address of your laptop. You should then see your TCP proxy receive HTTP data from your smartphone, and log its contents to the terminal. Next, it should log the corresponding HTTP response that comes back from nonhttps.com. Finally, nonhttps.com should load in your phone’s browser, as though nothing at all miraculous had just happened.
If this doesn’t work then it’s time for some debugging.
tcp port 80. Do you see anything that looks like an error? Do you see anything at all?
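Another quick check is whether anything is accepting TCP connections on the proxy’s port at all. The helper below is a small sketch using only Python’s standard library. can_connect is a hypothetical name, and the host and port you pass in are up to you (for example, your laptop’s local IP address and port 80).

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

If this returns False for your laptop’s address while the proxy script is running, the proxy is probably not listening where you expect, or a firewall is blocking the connection.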
Now you can proxy any TCP request that doesn’t use TLS encryption. Even though we have been testing using HTTP requests for simplicity, notice that nowhere in our code do we even mention HTTP. We see only a generic, TCP-transported stream of bytes that can have any structure and use any application protocol that it likes.
All that remains is for us to make our proxy capable of handling TCP requests that do use TLS encryption. That’s in the fourth and final section of this project.
Read on - Part 4: Fake Certificate Authority