CS261 Projects

General information

Your term project should address a research issue in computer security. The main goal of the project is to do original research on a problem of interest in computer security. At the end of the semester, you will write a conference-style paper on your work.

You should work in a small group; I expect that teams of 2--3 people will be appropriate for most projects. If you have trouble finding a project partner, I can help you get matched up with someone else by maintaining a list of people seeking teammates.

I expect that most projects will fall (more or less) into one of two categories:

  1. Design. Design projects will usually attempt to solve some interesting problem by proposing a design; implementing a prototype; and using the implementation as a basis for evaluating the proposed system architecture.
  2. Analysis. Analysis projects might, for example, study some previously-proposed implementation technique, existing system, or class of systems; evaluate its security properties; find flaws, or strengths, in it; and provide new insight into how to build secure systems.

The research should be relevant to computer security, but I encourage you to find topics of interest to you. You're welcome to pick a topic that is connected to your current research.

If you're at a loss for a project topic, I've prepared a list of sample project topics below. You're also welcome to come discuss possible project ideas with me; I'm happy to make myself available.

The process

You will write a concise (approximately 1 page) project proposal. It should have three sections:

The project proposal is due Monday, October 10, at 8pm.

Here's how to submit your proposal. You should put together a web page for your project; currently all it needs to contain is the project members, their email addresses, title of your project, and the project proposal. Then just email the URL for your project web page to daw@cs.berkeley.edu by Monday, October 10, at 8pm.

In mid-November I might ask you to write a concise status report so I can make sure the projects are on-track. I am always available to meet with any groups who would like to discuss their project, request additional resources, or ask for advice.

The poster session will be held Friday, December 9, 2--4pm.

Finally, the project report will be due Tuesday, December 13, at 6pm.

The final report

Write a technical paper, in the style of a conference submission, on the research you have done. State the problem you're solving, motivate why it is an important or interesting problem, present your research thoroughly and clearly, compare to any related work that may exist, summarize your research contributions, and draw whatever conclusions may be appropriate. There is no page limit (either minimum or maximum), and reports will be evaluated on technical content (not on length), but I expect most project reports will be 7--15 pages long.

Here are some good resources on writing conference-style papers:

You may submit your project report electronically or on paper. I prefer electronic submission of a PDF. In either case, the deadline is the same.

Example ideas for project topics

If you are interested in any of the project topics below, feel free to talk to me about it; I may be able to make some more concrete suggestions.

Analysis and attacks

Security review of published schemes
Pick any recently published paper that proposes a new security mechanism or scheme. Ask the authors for the code. Perform a careful security review of the paper's scheme; does it meet the claims made for it? To find recent papers, you could peruse recent proceedings of Usenix Security, IEEE Security & Privacy, ACM CCS, ISOC Network and Distributed System Security, or other security-related conferences.

Building blocks

Better decision procedures for strings
SMT solvers or decision procedures are a crucial component of symbolic execution frameworks, which have recently found numerous applications in software and hardware security. Symbolic execution techniques are heavily limited by the speed, expressiveness and accuracy of the underlying SMT solvers. One recent direction towards making SMT solvers more expressive is augmenting them with the "theory of strings". This project entails designing a new architecture for a string decision procedure to achieve better speed and correctness. Specifically, the project involves combining SAT-based string solvers (like Kaluza, which uses an encoding to bitvector solvers) and automata-theoretic solvers (such as DPRLE) to achieve a fast and correct string solver for an expressive string theory.

Most notably, there is a recent push towards an SMTCOMP competition for string solvers. If the project succeeds, your work can result in a submission to this worldwide competition! Feel free to contact Prateek Saxena for more ideas or collaboration.

Precise Type Inference for JavaScript
JavaScript applications are growing in complexity and now mirror the complexity of several desktop applications. This project asks you to develop a precise and scalable type inference infrastructure for JavaScript applications, which is challenging due to JavaScript's dynamic type system and its functional features for higher-order functions and code evaluation. Types can capture various security properties about programs. For instance, taint types can be used to separate trusted content from untrusted data and to enforce that untrusted or tainted data is sanitized properly to prevent code injection attacks like SQL injection and XSS. Can you develop a type inference engine that can detect taint-style XSS vulnerabilities in 10,000 LOC with low false positives (less than 10%)?
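
To make the taint idea concrete, here is a minimal sketch. It is written in Python rather than JavaScript, and it tracks taint at run time instead of inferring types statically, so it only illustrates the property a type inference engine would try to establish ahead of time. The names Tainted, TaintError, sanitize_html, and render are all made up for this illustration.

```python
import html


class TaintError(Exception):
    """Raised when untrusted data reaches an output sink unsanitized."""


class Tainted(str):
    """A string that came from an untrusted source (hypothetical marker type)."""

    def __add__(self, other):
        # Concatenating onto tainted data yields tainted data.
        return Tainted(str.__add__(self, other))

    def __radd__(self, other):
        # Handles the case of a plain str on the left: "abc" + Tainted("x").
        return Tainted(str.__add__(str(other), self))


def sanitize_html(value: str) -> str:
    """Escape HTML metacharacters and return a plain (untainted) string."""
    return str(html.escape(value))


def render(fragment: str) -> str:
    """Output sink: refuses any value that still carries the taint marker."""
    if isinstance(fragment, Tainted):
        raise TaintError("unsanitized data reached the HTML sink")
    return "<p>" + fragment + "</p>"
```

For example, render("Hello, " + Tainted(user_input)) raises TaintError, while render("Hello, " + sanitize_html(Tainted(user_input))) goes through. A static analysis would aim to report the first call without ever running the program.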

Feel free to contact Prateek Saxena for more ideas or collaboration.

Web security

Defending legacy web apps
Recent work has studied how to protect legacy web applications against authentication/authorization bypass attacks. A system called CLAMP has pioneered a fascinating approach for retrofitting defenses onto a legacy system, based upon ensuring that web application code can only access those parts of the database that should be accessible to the current logged-in user. However, CLAMP introduces a significant performance overhead, due to its use of virtual machines. Can you make these ideas perform and scale better, perhaps by using some other mechanism for isolation, such as SELinux, OS process isolation, or some sandboxing scheme, instead of virtual machines?
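
As a toy illustration of the underlying idea (application code only ever sees the current user's data), here is a sketch in Python using sqlite3. This is not how CLAMP works; it simply copies one user's rows into a separate in-memory database and hands only that connection to the application code. The table layout and the function names (make_master_db, open_scoped_db, legacy_handler) are assumptions made up for the example.

```python
import sqlite3


def make_master_db() -> sqlite3.Connection:
    """Build a toy 'full' database holding every user's rows."""
    master = sqlite3.connect(":memory:")
    master.execute("CREATE TABLE messages (owner TEXT, body TEXT)")
    master.executemany(
        "INSERT INTO messages VALUES (?, ?)",
        [("alice", "hi bob"), ("bob", "hi alice"), ("alice", "lunch?")],
    )
    master.commit()
    return master


def open_scoped_db(master: sqlite3.Connection, user: str) -> sqlite3.Connection:
    """Return a fresh database containing only the rows owned by `user`."""
    scoped = sqlite3.connect(":memory:")
    scoped.execute("CREATE TABLE messages (owner TEXT, body TEXT)")
    rows = master.execute(
        "SELECT owner, body FROM messages WHERE owner = ?", (user,)
    ).fetchall()
    scoped.executemany("INSERT INTO messages VALUES (?, ?)", rows)
    scoped.commit()
    return scoped


def legacy_handler(db: sqlite3.Connection) -> list:
    """Stand-in for unmodified application code with an over-broad query."""
    return [body for (body,) in db.execute("SELECT body FROM messages ORDER BY body")]
```

Even though legacy_handler forgets to filter by owner, running it against open_scoped_db(master, "alice") can only return Alice's messages. Copying data per request is obviously too slow for real use; the project question is what cheaper isolation mechanism could give the same guarantee.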

Software security

Evaluation of tools
There are now a number of static and dynamic analysis tools for finding security vulnerabilities and reliability bugs in programs, including Coverity, Fortify, Klee, CREST, BuzzFuzz, SmartFuzz, and zzuf. Devise and carry out a set of experiments to evaluate their effectiveness and probe their relative strengths and weaknesses. Can you characterize their effectiveness quantitatively?
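
One possible way to quantify effectiveness, sketched below in Python, is to compare each tool's reports against a set of known bugs and compute precision and recall. This assumes you have a benchmark with ground-truth bug locations and that reports can be matched to bugs by a simple identifier such as "file:line"; both are assumptions, and the names Score and score_tool are made up for the example.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Score:
    true_positives: int
    false_positives: int
    false_negatives: int

    @property
    def precision(self) -> float:
        """Fraction of the tool's reports that are real bugs."""
        reported = self.true_positives + self.false_positives
        return self.true_positives / reported if reported else 0.0

    @property
    def recall(self) -> float:
        """Fraction of the known bugs that the tool found."""
        known = self.true_positives + self.false_negatives
        return self.true_positives / known if known else 0.0


def score_tool(reported: Set[str], known_bugs: Set[str]) -> Score:
    """Compare one tool's reports against the ground-truth bug set."""
    return Score(
        true_positives=len(reported & known_bugs),
        false_positives=len(reported - known_bugs),
        false_negatives=len(known_bugs - reported),
    )
```

Running score_tool once per tool over the same benchmark gives numbers you can compare directly. Deciding what counts as "the same bug" across tools is likely to be the hard part of the experiment design.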

Inferring security annotations for C/C++ programs
Microsoft has proposed SAL, a set of annotations for C and C++ code intended to help avoid buffer overrun and similar vulnerabilities: the programmer annotates their code with information about buffer lengths and the like, and a static analysis tool checks those annotations to detect possible bugs. See also Deputy, an open-source system from Berkeley with similar goals. Writing these annotations can get a bit tedious. Can you design a dynamic analysis tool that observes code as it runs and infers SAL/Deputy annotations from how the code is used? For instance, if you run the program on 1000 inputs, and in every case, function f() is only called with null-terminated strings, you might infer a SAL/Deputy annotation asserting that the argument to f() is always null-terminated. (See also Daikon, though Daikon is a general-purpose tool; you can probably do a lot better by focusing specifically on the kinds of properties that SAL/Deputy are designed for.) Tools like Valgrind, CIL, ltrace, etc. could be good building blocks for this project.
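
The inference step in the f() example might look roughly like the Python sketch below. It assumes some instrumentation layer (not shown) already records the buffer passed at each call; the class AnnotationInferrer and its methods are hypothetical names, and the output is a list of candidate (function, argument index) pairs rather than actual SAL or Deputy syntax.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def is_null_terminated(buf: bytes) -> bool:
    """True if the observed buffer contains a terminating zero byte."""
    return b"\x00" in buf


class AnnotationInferrer:
    """Collects observations per (function, argument) and proposes invariants."""

    def __init__(self) -> None:
        # (function name, argument index) -> one bool per observed call
        self._seen: Dict[Tuple[str, int], List[bool]] = defaultdict(list)

    def observe(self, function: str, arg_index: int, buf: bytes) -> None:
        """Record one call; would be invoked by the instrumentation layer."""
        self._seen[(function, arg_index)].append(is_null_terminated(buf))

    def infer(self, min_samples: int = 1) -> List[Tuple[str, int]]:
        """Arguments that were null-terminated in every observed call."""
        return sorted(
            key
            for key, results in self._seen.items()
            if len(results) >= min_samples and all(results)
        )
```

Anything inferred this way is only a guess supported by the inputs you happened to run, so the candidates would still need to be confirmed by the static checker (or by a human) before being trusted.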

Software verification tools for security
Researchers have recently made dramatic improvements in tools for software verification. See, e.g., ESC/Java2 and JML (verification tools for Java) and Spec# (a verification tool for C#). These tools allow programmers to verify properties, such as that the program will never throw a NullPointerException, ArrayIndexOutOfBoundsException, or other runtime exception, and that the program will never use uninitialized memory. Since unexpected exceptions can cause surprising behavior (which is dangerous in a security-critical program), these could be useful for secure programming. You might study how useful and expensive these are for security. For instance, pick one or two security-critical applications written in Java or C# and attempt to verify that they satisfy some useful property (e.g., free of runtime exceptions). How many annotations did you have to add? How much time or effort did it take? Did the effort reveal any vulnerabilities? Or, you might try to see if you can express any of the security requirements for those applications in the JML/Spec# modelling language and see whether it is possible to verify that those requirements are met by the code. Are the JML/Spec# annotation languages rich enough to specify important security policies? If not, what kinds of extensions would be useful for security? Are the existing tools powerful enough to verify that code meets those requirements?

Measurement

Use of Web security features
Today's web browsers and web application development frameworks provide many features for improving security. Perform a measurement of web sites in the wild to understand what features are being used by web sites.
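
As a starting point, the tallying side of such a measurement could be sketched in Python as below. It assumes you have already crawled a set of sites and saved each response's headers as a dictionary; the crawling itself is not shown. The particular header list is just an example of features you might look for, and the function names are made up for the illustration.

```python
from collections import Counter
from typing import Dict, List

# Example security-related response headers to look for (an assumed list;
# extend it with whatever features you decide to measure).
SECURITY_HEADERS = (
    "content-security-policy",
    "strict-transport-security",
    "x-frame-options",
    "x-content-type-options",
)


def tally_security_headers(responses: List[Dict[str, str]]) -> Counter:
    """Count how many responses carry each header (names are case-insensitive)."""
    counts: Counter = Counter()
    for headers in responses:
        present = {name.lower() for name in headers}
        for header in SECURITY_HEADERS:
            if header in present:
                counts[header] += 1
    return counts


def adoption_rates(responses: List[Dict[str, str]]) -> Dict[str, float]:
    """Fraction of responses that carry each header."""
    total = len(responses)
    counts = tally_security_headers(responses)
    return {h: (counts[h] / total if total else 0.0) for h in SECURITY_HEADERS}
```

Header presence is only the crudest signal; a real study would also want to look at whether the policies being sent are actually meaningful.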

Other

Automated signature verification
Banks, election administrators, and others use automated tools to verify your (ink) signature, to check that your signature matches the one they have on file. It would be interesting to survey the field and analyze the security of the state-of-the-art algorithms for this; how hard is it to forge a signature that will pass the automated verification process? Alternatively, you could study new algorithms for signature verification that are hard to fool.