CS 261 Homework 1 Solutions

Instructions

Questions are in black. Suggested answers are in red. Other possible answers are in green. (For a perfect score, it suffices to give only the answers in red. The answers in green are optional alternatives, answers suggested by other students, or comments from me.)

Question 1

Which of the vulnerabilities described below would likely have been avoided if the system had been implemented in Lampson's message-passing model?

The tractorbeaming bug in wu-ftpd.
Avoided. Unix processes have global state (the euid) that determines their privilege level, but Lampson's processes don't.
Note that Lampson's model has no concept of privileged vs. unprivileged state for a process (this confused a few people).
Also, there are no preemptive asynchronous signals in Lampson's model.
The password-checking bug in Tenex that arose from a bad interaction with the virtual memory subsystem.
Avoided. Communication is done using pass-by-value, not pass-by-reference, hence the recipient automatically receives a fresh copy that cannot be tampered with by the sender.
Incomplete mediation risks in Java.
Probably not avoided. In Lampson's model, the needed mediation will still be spread throughout many processes.
In general, a number of people seemed confused about Lampson's model. Note that Lampson's bare message-passing model has no access control matrix (that's only present in an extension presented later).
In general, access control can be enforced in one of two ways: by the OS (e.g., by having an ACL on a file), or by the application (e.g., the application might stat() the file to get its uid, then decide whether to allow a request based on the process's euid and the file's uid). Anything you can enforce in the OS you can enforce in the application (so long as the OS exports all information needed to make the access control decision). Lampson's bare message-passing model is an example of an extreme case where all access control is enforced by applications. The machine and OS don't enforce any access control whatsoever; they merely provide enough information (i.e., the ID of the sender of the message) to the app so that the app can make the access control decision.
In general, where there is a choice, it is often better to enforce access control in the OS. The OS is time-tested (hence less likely to have bugs) and centralizes such decisions (hence reduces the risk of incomplete mediation). Once you realize this, you see that the incomplete mediation risks in Java arise exactly from the fact that mediation is enforced by the app, not by the OS, and hence Lampson's model doesn't help at all.
The ftpd/tar hole.
Unclear. On the one hand, in Unix tar inherits privileges from its caller (ftpd), while in Lampson's model there is no process inheritance, so the bug might be avoided. On the other hand, the fundamental source of the problem is that ftpd trusted tar, and when more features were added to tar, this introduced unexpected interactions, so the bug might not have been avoided.
Either answer would be acceptable, if accompanied by a justification. (I personally prefer the latter.)
Filename canonicalization bugs in webservers.
Probably not avoided. Webservers already communicate using only messages. If there is a single process acting as the representative for the hard disk, files will probably still be designated by pathnames, leading to the same problems.
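
To make the discussion of Lampson's bare model more concrete, here is a minimal sketch of the two properties the answers above rely on: messages are delivered by value, and the system only stamps each message with the sender's ID, leaving every access control decision to the receiving application. The Kernel class, the process IDs, and the password_server function are all hypothetical names invented for this illustration; they are not taken from Lampson's paper.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender_id: int   # stamped by the (hypothetical) kernel, never by the sender
    payload: dict


class Kernel:
    """Toy message-passing kernel: copies payloads and records who sent them."""

    def __init__(self):
        self.mailboxes = {}

    def register(self, pid):
        self.mailboxes[pid] = []

    def send(self, sender_id, recipient_id, payload):
        # Pass-by-value: the recipient gets a fresh copy, so the sender
        # cannot change the data after it has been handed over.
        msg = Message(sender_id, copy.deepcopy(payload))
        self.mailboxes[recipient_id].append(msg)

    def receive(self, pid):
        return self.mailboxes[pid].pop(0)


def password_server(msg, allowed_senders, stored_password):
    """Application-level access control; the kernel enforces nothing."""
    if msg.sender_id not in allowed_senders:
        return "denied"
    if msg.payload["password"] == stored_password:
        return "ok"
    return "bad password"


kernel = Kernel()
kernel.register(1)   # client
kernel.register(2)   # password checker

guess = {"password": "ossifrage"}
kernel.send(1, 2, guess)
guess["password"] = "tampered"   # too late: the checker holds its own copy
received = kernel.receive(2)
```

Because the checker works on its own copy, the sender has no shared buffer to manipulate while the check is in progress, which is the property the Tenex answer depends on. The allowed_senders test, on the other hand, lives entirely in application code, so it is exactly as complete as the application author made it.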

Question 2

Malicious Max has planted a malicious Trojan horse in your favorite web browser, unbeknownst to you. This is unfortunate, because you frequently use your web browser to access your bank account online, and Max would like nothing better than to steal your bank balance. Max has already set up the backdoor to silently capture your account number and password; now he needs to find a way to get this information off your computer and across the network back to himself.

List at least three covert channels that Malicious Max could use to leak information about your banking secrets to a colluder at large somewhere in the network. Can you find one that cannot be easily detected by a knowledgeable defender who can passively sniff on all network traffic?

In general, I will assume that the colluder can sniff on HTTP GET requests, or can arrange to host a website that you will visit. With that, here are some possible attacks:

Invisibly download extra URLs containing the password (e.g., www.hackers.org/the/secret/password/is/ossifrage). (Easy for defender to detect.)

Hide bits in cookies returned upon request of a website. (Not quite as easy for defender to detect: defender must observe and remember all preceding set-cookies, which could be arbitrarily far back in the past.)

Vary the order or timing of requests for inline images to communicate information. For instance, the low order bit of the time at which each inline image is requested might communicate one bit of information. (Defender has little hope of detection, particularly if the bits Max is transmitting are encrypted and hence look random.)
When the user types a URL, select among several equivalent variations on it. For instance, for each character in the hostname one might select between lowercase ("a") and uppercase ("A") to communicate one bit per character. Or, one might select between ordinary coding ("A") and extended coding ("%41") for each character in the filename. (Easy for defender to detect.)
Set up a domain name that resolves to two IP addresses. Normal clients would pick one at random and use it for outgoing HTTP requests. The Trojan horse might wait for the user to visit the web server at the selected domain name and then communicate one bit of information according to its choice between these two addresses. (Defender might be able to detect this.)
Set up an SSL-encrypted session to a website under the colluder's control, and then pass the secret information down this channel. (Passive defender has no hope of detecting this.)
Put up a screen saying "Your machine crashed; write down this error code and call IBM at 1-800-555-1212". Of course, that phone number might actually go to the attacker, not to IBM, and when the user calls and gives his error code, the secret information is leaked. (Defender has no hope of detection, as this is out of band.)
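
As an illustration of the "equivalent variations" channel from the list above, the following sketch hides one bit in the case of each letter of a hostname and shows why a defender can spot it easily. The function names and the example hostname are assumptions made for this sketch only.

```python
def encode_bits_in_hostname(hostname, bits):
    """Uppercase a letter to send 1, lowercase it to send 0.

    Letters beyond the end of `bits` are sent as 0 (lowercase).
    Non-letters such as '.' carry no information and are left alone.
    """
    remaining = iter(bits)
    out = []
    for ch in hostname:
        if ch.isalpha():
            bit = next(remaining, 0)
            out.append(ch.upper() if bit else ch.lower())
        else:
            out.append(ch)
    return "".join(out)


def decode_bits_from_hostname(hostname):
    """Colluder's side: read one bit back from each letter."""
    return [1 if ch.isupper() else 0 for ch in hostname if ch.isalpha()]


def looks_suspicious(hostname):
    """Defender's side: a normal browser would send an all-lowercase name."""
    return hostname != hostname.lower()


leaky = encode_bits_in_hostname("www.example.com", [1, 0, 1, 1])
```

The detection check is a single comparison against the canonical form, which is why this channel is rated easy to detect: the honest browser's behavior here is deterministic, so any deviation stands out.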

In general, any non-deterministic behavior of a normal browser often leads to an undetectable covert channel: the Trojan can select among the non-deterministic choices in a way calculated to leak information. Hence, non-deterministic or randomized algorithms are bad for bit confinement, whereas for deterministic algorithms there is at least in principle a hope that the defender might be able to detect misbehavior. Sadly, in practice there are many sources of non-determinism, and fully deterministic systems are extremely difficult to build.
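
By contrast, the timing channel mentioned in the list of attacks exploits behavior that is not deterministic in an honest browser. A minimal sketch, assuming the colluder can observe request times at millisecond resolution and that requests are normally spaced by some arbitrary gap (both are assumptions of this example, as are the function names):

```python
def next_send_time(now_ms, bit):
    """Earliest millisecond timestamp >= now_ms whose low-order bit equals `bit`."""
    return now_ms if (now_ms & 1) == bit else now_ms + 1


def schedule_requests(start_ms, bits, gap_ms=40):
    """Pick a send time for each inline-image request, one hidden bit apiece."""
    times = []
    t = start_ms
    for bit in bits:
        t = next_send_time(t, bit)
        times.append(t)
        t += gap_ms   # hypothetical spacing between consecutive requests
    return times


def recover_bits(times):
    """Colluder's side: the low-order bit of each observed time is the message."""
    return [t & 1 for t in times]


times = schedule_requests(1000, [1, 0, 1])
```

Each request is delayed by at most one millisecond, well within ordinary jitter, so there is no canonical behavior for the defender to compare against.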

Question 3

You're the dev lead on a networked multiplayer game, and it's still early in the design process. Two different architectures have been proposed (see below). Which one is likely to have better security properties?

In the first proposal, the software is split up into three components: a network shim (a very small piece of code that opens a socket and translates between network packets and internal data formats), a game engine (a large chunk of code that manages game strategy and evolution), and a renderer (a large chunk of code that prepares gorgeous graphics).

In the second proposal, the software is split up differently: there's a game core (like the game engine, but it talks directly to the Internet rather than going through the network shim), a rendering engine (most of the code of the renderer, which figures out what to draw), and a graphics card shim (a very small piece of code which translates the rendering engine's output into a format understood by the graphics card).

In the above, large boxes represent hundreds of thousands of lines of code; small boxes represent thousands of lines of code; and lines represent connections between components. Because the game listens on a low-numbered port, opening a socket will require special privilege, and so the piece of code that talks to the network (network shim or game core) runs with root/Administrator privileges; the remainder runs under the user's account.

The first proposal is probably better: there is less code running with privilege, which reduces the likelihood of a vulnerability that can expose the entire system. (For instance, buffer overruns in the game engine or renderer will only affect the user's account, but won't give away root/Administrator privileges to an attacker.)

In addition, the first proposal would make it easy to build a network firewall and to filter incoming packets.
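
To give a sense of how little code the privileged part of the first proposal needs, here is a sketch of the translation and filtering half of a network shim. The wire format (a 1-byte type and a 2-byte length followed by the payload), the packet types, and the size limit are all invented for this example; the question does not specify them, and the socket handling itself is omitted.

```python
import struct
from dataclasses import dataclass

HEADER = struct.Struct("!BH")          # hypothetical header: type, payload length
MAX_PAYLOAD = 512                      # hypothetical upper bound on payload size
KNOWN_TYPES = {1: "move", 2: "chat"}   # hypothetical packet types


@dataclass(frozen=True)
class GameEvent:
    kind: str
    data: bytes


def parse_packet(packet):
    """Translate a raw packet into a GameEvent, or return None if malformed."""
    if len(packet) < HEADER.size:
        return None
    ptype, length = HEADER.unpack_from(packet)
    if ptype not in KNOWN_TYPES or length > MAX_PAYLOAD:
        return None
    payload = packet[HEADER.size:]
    if len(payload) != length:
        return None   # declared length disagrees with actual length
    return GameEvent(KNOWN_TYPES[ptype], payload)


good = parse_packet(struct.pack("!BH", 1, 3) + b"abc")
```

Everything the game engine receives has already passed these checks, so the shim doubles as the natural place to filter incoming packets, and it is small enough to audit line by line.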

Question 4

HTML can be viewed as a crude programming language, with security-relevant features like references to resources (e.g., A HREF links). How well does HTML follow capability discipline?

(Hint: If you're stuck, you could think about the scenario where the user is sitting behind a firewall and viewing HTML content generated by an untrusted server on the outside. Also, keep in mind that following a URL is an action that might have a side-effect: consider, e.g., web interfaces to databases.)

Not very well. Malicious Molly can create a web page with an A HREF link naming some resource that Molly doesn't have access to. This is a violation of capability discipline (designation without authority), and it can lead to confused deputy problems.

Here is a scenario where this could pose a problem. Company X has a firewall. On the intranet behind the firewall, there is a web front-end to their sales database. Anyone on the intranet can connect to the web front-end and issue SQL commands by filling out a form (which is submitted by a GET method, i.e., when you click the "submit" button, your browser constructs a special URL and sends it via an HTTP GET request to the internal webserver). Malicious Molly, an outsider, constructs a URL that encodes a SQL command to delete the entire sales database. Molly can't request that URL, because the firewall doesn't let outsiders like her connect to the internal web front-end. Instead, Molly creates an external web site with an enticing link ("click here for free Britney MP3's") naming her URL, and she waits for someone inside the company to click on the link. Or, if she is especially clever, she spams the entire company with an HTML email containing an IMG SRC tag naming the URL; if the recipient's mailer understands HTML email, it will try to grab the inline image when the user reads this email, thereby inadvertently erasing the company's sales database. In either case, the database can't tell whether this request originates with Molly or with an authorized company employee, and this leads to confused deputy problems. In short, HTML is inherently susceptible to confused deputy attacks due to its failure to follow capability discipline.
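
The scenario can be sketched in a few lines. The internal hostname, the "sql" form field, and the table name are hypothetical; the point is only that the URL Molly writes and the URL an employee's browser would build from the form are the same string, so the front-end has nothing to distinguish them by.

```python
from html import escape
from urllib.parse import parse_qs, urlencode, urlsplit

FRONT_END = "http://sales.intranet.example/query"   # hypothetical internal URL


def form_submission_url(sql):
    """The URL a browser builds when the form is submitted via GET."""
    return FRONT_END + "?" + urlencode({"sql": sql})


def command_seen_by_front_end(url):
    """All the front-end learns from the request: the command itself."""
    return parse_qs(urlsplit(url).query)["sql"][0]


# Molly, outside the firewall, can write down the URL without being able to use it.
molly_url = form_submission_url("DELETE FROM sales")

# She hands the designation to someone who does have the authority to use it.
html_email = '<img src="{}">'.format(escape(molly_url, quote=True))

# An employee submitting the same command through the real form.
employee_url = form_submission_url("DELETE FROM sales")
```

Molly can designate the resource without holding any authority over it, and the employee's mailer supplies the authority (a position inside the firewall) without knowing what it is being used for.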

A few people had some trouble with this question, and might want to re-visit their understanding of modern capability systems a little bit.