CS 261 Homework 3

Instructions

This problem set is due Thursday, November 6th.

You may work together and discuss the questions on this homework with others, but the writeup you turn in must be your own, and you should list anyone who you collaborated with. You may use any source you like (including other papers or textbooks), but if you use any source not discussed in class, you must cite it.

Question 1

A Stanford student has heard that there are lots of hackers out there probing web servers and sending them malicious requests. So, as an experiment, he decides to set up a simple server running on his desktop machine that listens on TCP port 80 and records all data sent to it (the HTTP request, HTTP headers, etc.). Moreover, he sets it up so that the entire log of this data is automatically published on his personal web page, so that any member of his research group can check out the latest set of requests to his web server.

You can assume that he regularly logs onto Stanford web sites, does e-banking at Wells Fargo, and generally surfs the web from his desktop machine.

What could go wrong? List a security risk introduced by this experiment, i.e., describe a way that a hacker might be able to take advantage of this experimental setup to compromise the Stanford student's security. You can assume that the software is free of implementation flaws: it works as described, and has no bugs.

(Hint: Try looking up the DNS record for localhost.stanford.edu. Does it give you any ideas?)

Question 2

Many versions of the Internet Explorer browser do "MIME content-type sniffing", which is intended to let IE display websites that serve content labelled with the wrong MIME type. The way this works is that IE scans the first 512 bytes of all the content it receives (e.g., documents, images, etc.), looking for character sequences that tend to be representative of HTML (e.g., <HTML, <BODY, <BR, <TITLE, and so on). It keeps a count of the number of matches, and if the number of matches is high enough, it ignores the MIME type provided by the web server and instead treats the content as HTML.
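
To make the mechanism concrete, here is a minimal Python sketch of a sniffing heuristic of the kind described above. It is an illustration only, not IE's actual algorithm: the marker list is taken from the examples in the text, while the names HTML_MARKERS, MATCH_THRESHOLD, count_html_markers, and effective_type, and the threshold value of 2, are assumptions made up for this sketch.

```python
# Toy model of content-type sniffing. The real heuristic is not documented;
# the marker list and threshold below are assumptions for illustration.
HTML_MARKERS = (b"<html", b"<body", b"<br", b"<title")
SNIFF_WINDOW = 512     # number of leading bytes examined, per the description
MATCH_THRESHOLD = 2    # assumed value; the real cutoff is unknown


def count_html_markers(content: bytes) -> int:
    """Count HTML-looking markers in the first SNIFF_WINDOW bytes."""
    window = content[:SNIFF_WINDOW].lower()
    return sum(window.count(marker) for marker in HTML_MARKERS)


def effective_type(declared_type: str, content: bytes) -> str:
    """Return the type the browser acts on: the declared one, or HTML."""
    if count_html_markers(content) >= MATCH_THRESHOLD:
        return "text/html"  # the server's declared type is ignored
    return declared_type


if __name__ == "__main__":
    page = b"<HTML><BODY>hello</BODY></HTML>"
    print(effective_type("text/plain", page))
    print(effective_type("text/plain", b"just some plain text"))
```

In this toy model the first call is treated as HTML and the second keeps its declared type. Markers that appear only after the first 512 bytes are not counted.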

(a) Explain the security implications of this for a site like MySpace, which allows users to upload an image of themselves onto their page, or Flickr, which allows users to upload photo albums. In other words, describe an attack that is enabled by IE's content-type sniffing.

(b) Suggest a robust defense that MySpace could use to protect themselves. (Keep in mind that the details of IE's content-type sniffing heuristic are not clearly documented by Microsoft, and might even change in future versions of IE. For robustness, you probably ought to assume that hackers may know more about how content-type sniffing works than you do; and ideally your defense would remain secure even if Microsoft makes minor changes to IE's content-type sniffing heuristic in the future.)

Question 3

This question asks you to explore some of the consequences of active networks, where packets can contain mobile code that is executed by the routers along the path.

For concreteness, we can think of 'adaptive routing' as a sample application: if your TCP connection to France is too slow because of poor bandwidth on the transatlantic link and for some reason you happen to know that there is a much faster route to France via China, you might wish to adaptively update the route your TCP packets take. In this case, you would "push" some mobile code into each router along the way; the mobile code would run at each router before the packet is forwarded and select which interface to send it out over.

We describe below a series of extensions to the IP protocol suite that allow for progressively more sophisticated active network applications. For each of the four parts below, first list the security threats that might arise for that extension; then explain how those threats could be addressed/mitigated. The purpose of this question is to study issues that are inherent in the functionality; you may ignore the risk of implementation bugs such as buffer overruns.

  1. In the simplest variant, we'd extend the IP packet format to allow an optional extra header which contains some mobile code to run at each router. The mobile code is specified in the BPF (Berkeley Packet Filter) bytecode language. Each router which receives such a packet first verifies that the bytecode contains no backwards jumps, and then interprets the bytecode. The only memory locations the bytecodes are allowed to read are (1) the packet itself, and (2) a global list of interfaces available at the router. (Each interface in the list is annotated with a little bit of relevant information that can be read by the handler, such as the IP address of the next hop along that interface. No writes to memory are allowed.) There are no function calls, computed gotos, exceptions, or other forms of indirect control flow. Just before exiting, the bytecode should store the name of the desired outbound interface in a fixed register, and the router will forward the packet out via that interface on towards its destination.
  2. One obvious performance issue with the previous scheme is that it requires an overhead of potentially hundreds of bytes of code in every packet. So we introduce the notion of "flows" to amortize the cost of specifying the mobile code. Each packet is associated with a flow. In TCP, the flow ID might be the (src host, dst host, src port, dst port) tuple. For other protocols, we might simply extend the packet format to allow for a 32-bit flow ID. We add a "set handler" IP option which allows endpoints to specify a single chunk of mobile code which will be run at the router every time a packet is received on the same flow. Thus one endpoint can send a packet that carries the "set handler" IP option and a lengthy chunk of mobile code; that mobile code will then be applied to all subsequent packets on that flow, and does not need to be sent again. This allows us to specify a chunk of mobile code once; then all subsequent packets in the flow will inherit the same code without incurring any bandwidth overhead.
  3. It occurs to us that we might like to allow the mobile code to make routing policy decisions based on the payload of the packets, or even to compress packets for us on the fly when bandwidth is scarce. Since this might require scanning the entire packet and possibly interpreting higher-level protocols, we will need to be able to write loops in bytecode. Therefore, we eliminate the restriction on backwards jumps, and allow arbitrary control flow in the bytecode. To implement compression, the handler will need to be able to modify the contents of the packet. Therefore, we also relax our security policy so that handlers are allowed both read and write access to the packet itself. If the handler modifies the packet during execution, the router will forward the modified packet instead of the original contents. Also, we allow handlers to maintain state across packet reception events. Thus, when a new flow is created, we set aside a chunk of memory for use by that flow's handler; the handler is allowed read and write access only to its own chunk of memory.
  4. An astute reader points out that decompression may increase the size of a packet. If this exceeds the network's MTU, our decompression handler may need to send multiple packets. Therefore, we extend the scheme so that handlers can construct whole IP packets in their own memory space and invoke a special operation to send those packets over the wire.
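
For readers who want a concrete picture of parts 1 and 2, the following Python sketch models the router behavior described there: check that a handler has no backwards jumps, interpret it, and remember one handler per flow. It is a toy under stated assumptions, not real BPF. The opcodes ("ldb", "jeq", "jmp", "ret"), the Insn and Router types, and the interface names are all hypothetical, and the sketch makes no attempt to address the security questions posed above.

```python
from typing import Dict, NamedTuple, Sequence, Tuple


class Insn(NamedTuple):
    """One toy instruction (hypothetical encoding, not real BPF)."""
    op: str          # "ldb", "jeq", "jmp", or "ret"
    arg: int = 0     # packet offset, comparison value, or interface index
    target: int = 0  # absolute index of the jump target (jump ops only)


JUMP_OPS = {"jeq", "jmp"}


def has_only_forward_jumps(program: Sequence[Insn]) -> bool:
    """True if every jump lands strictly later in the program."""
    return all(
        pc < insn.target < len(program)
        for pc, insn in enumerate(program)
        if insn.op in JUMP_OPS
    )


def run(program: Sequence[Insn], packet: bytes, interfaces: Sequence[str]) -> str:
    """Interpret a verified program and return the chosen interface name."""
    acc, pc = 0, 0
    while pc < len(program):
        insn = program[pc]
        if insn.op == "ldb":    # read one packet byte into the accumulator
            if not 0 <= insn.arg < len(packet):
                raise ValueError("load outside the packet")
            acc = packet[insn.arg]
            pc += 1
        elif insn.op == "jeq":  # jump if accumulator equals arg, else fall through
            pc = insn.target if acc == insn.arg else pc + 1
        elif insn.op == "jmp":  # unconditional jump
            pc = insn.target
        elif insn.op == "ret":  # select the outbound interface and stop
            if not 0 <= insn.arg < len(interfaces):
                raise ValueError("no such interface")
            return interfaces[insn.arg]
        else:
            raise ValueError("unknown opcode: " + insn.op)
    raise ValueError("handler ended without choosing an interface")


FlowId = Tuple[str, str, int, int]  # (src host, dst host, src port, dst port)


class Router:
    """Toy router that stores one handler per flow, as in part 2."""

    def __init__(self, interfaces: Sequence[str]) -> None:
        self.interfaces = list(interfaces)
        self.handlers: Dict[FlowId, Sequence[Insn]] = {}

    def set_handler(self, flow_id: FlowId, program: Sequence[Insn]) -> None:
        if not has_only_forward_jumps(program):
            raise ValueError("backwards jump rejected")
        self.handlers[flow_id] = program

    def forward(self, flow_id: FlowId, packet: bytes) -> str:
        return run(self.handlers[flow_id], packet, self.interfaces)


# Example handler: send packets whose first byte is 4 out the second interface.
EXAMPLE = [
    Insn("ldb", arg=0),
    Insn("jeq", arg=4, target=3),
    Insn("ret", arg=0),
    Insn("ret", arg=1),
]
```

With a router built as Router(["eth0", "eth1"]) and EXAMPLE installed for some flow, a packet whose first byte is 4 is forwarded via "eth1" and any other non-empty packet via "eth0". A program such as [Insn("jmp", target=0)] is rejected by set_handler.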

Don't forget: In each part, you should list security threats, and also propose a way that those threats could be addressed (e.g., propose a fix).