This assignment deals with representation and inference in Bayesian networks. You will use the code provided with AIMA to build and exercise a simple network. You will also measure the performance of two stochastic sampling algorithms and implement a third for comparison. In A6 you will build on the material in A5 to develop learning algorithms for Bayes nets.
Be sure to use the latest version from ~cs188.
There are some ready-made Bayes nets in the uncertainty/domains directory. For example, try
>> (setq burglary-net (load-bn "uncertainty/domains/burglary.bn"))
#S(BN :NODES (#S(TABULATED-BNODE :NAME EARTHQUAKE :INDEX 0 :VALUE-NAMES # :ARITY 2 :PARENTS NIL :CHILDREN # :CPT-FN # :SAMPLE-FN # :LIKELIHOOD-FN # :CPT #0A #)
etc.
The raw data structure for a Bayes net is circular and very ugly to look at. You can display a Bayes net as follows:
>> (display-bayes-net burglary-net)
Node: EARTHQUAKE Parents: NIL Type: TABULATED-BNODE
P(EARTHQUAKE|NIL)
EARTHQUAKE= |   TRUE      FALSE
-------------------------------
            | 0.002000  0.998000
etc.

Another very useful function, especially for debugging, is (bnode-by-name name bn), which returns the actual node with the given name from bn.
Once you have a Bayes net, you can create evidence for
it by creating a data structure called an event.
Do the following:
>> (setq be1 (create-event burglary-net))
Name of node to set (nil if none)? JohnCalls
Value to set it to? true
Name of node to set (nil if none)? MaryCalls
Value to set it to? true
Name of node to set (nil if none)? nil
#(NIL NIL NIL 0 0)

Notice that an event is actually a vector of integers (or NILs) representing the values of all the nodes. (The integers correspond to the values defined for the node, with 0 being the first value -- in this case, "true".) Once you have evidence, you can ask for the probability of a variable given the evidence. The simplest exact inference algorithm is (enumeration-ask X e bn), which takes a variable name X and an event e and returns a distribution over X:
>> (enumeration-ask 'Burglary be1 burglary-net)
((TRUE . 0.2841718348510418d0) (FALSE . 0.7158281651489582d0))

Enumeration works by enumerating all entries in the joint distribution. The variable elimination algorithm is usually more efficient:
>> (elimination-ask 'Burglary be1 burglary-net)
((TRUE . 0.2841718348510418d0) (FALSE . 0.7158281651489582d0))

The two sampling algorithms that are provided take an extra argument indicating the number of samples to generate (defaults to 1000):
>> (rejection-sampling-ask 'burglary be1 burglary-net 100000)
((TRUE . 0.3160377358490566d0) (FALSE . 0.6839622641509434d0))
>> (likelihood-weighting-ask 'burglary be1 burglary-net 100000)
((TRUE . 0.3070544034594543d0) (FALSE . 0.6929455965405458d0))

Notice that even with 100000 samples, the sampling algorithms do not do very well: both estimates are off by more than 0.02 from the exact value of about 0.284.
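To make concrete what "enumerating all entries in the joint distribution" means for the Burglary query, here is a minimal self-contained sketch that does not use the AIMA code. The CPT values are the textbook burglary-network values and are assumed to match burglary.bn (only P(Earthquake) = 0.002 is visible in the display above); the function names prob-of, p-alarm, joint, and burglary-given-calls are hypothetical.

```lisp
;; Sketch only: hard-coded textbook CPTs, assumed to match burglary.bn.
(defun prob-of (p-true value)
  "Probability that a Boolean variable has VALUE when P(true) = P-TRUE."
  (if value p-true (- 1 p-true)))

(defun p-alarm (b e)
  "P(Alarm=true | Burglary=B, Earthquake=E)."
  (cond ((and b e) 0.95d0)
        (b 0.94d0)
        (e 0.29d0)
        (t 0.001d0)))

(defun joint (b e a j m)
  "One entry of the joint distribution: the product of one CPT entry per node."
  (* (prob-of 0.001d0 b)
     (prob-of 0.002d0 e)
     (prob-of (p-alarm b e) a)
     (prob-of (if a 0.90d0 0.05d0) j)
     (prob-of (if a 0.70d0 0.01d0) m)))

(defun burglary-given-calls ()
  "P(Burglary=true | JohnCalls=true, MaryCalls=true): sum out the hidden
variables Earthquake and Alarm, then normalize over Burglary."
  (flet ((mass (b)
           (loop for e in '(t nil)
                 sum (loop for a in '(t nil)
                           sum (joint b e a t t)))))
    (let ((yes (mass t))
          (no (mass nil)))
      (/ yes (+ yes no)))))

;; (burglary-given-calls) evaluates to about 0.284
```

With these CPTs the result agrees with the enumeration-ask transcript above, which is a useful sanity check when you later debug your own networks by hand.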
A Bayes net can be constructed using the interactive function
create-bayes-net. For example, the following transcript
shows the construction of a two-node network; you should try this yourself:
>> (display-bayes-net (setq ab-net (create-bayes-net)))
*****************Creating New Node*******************
What is the node's name (nil if none)? A
Variable type? One of tabulated, deterministic, linear-gaussian, probit. tabulated
Enter list of values (true false)
*****************Creating New Node*******************
What is the node's name (nil if none)? B
Variable type? One of tabulated, deterministic, linear-gaussian, probit. tabulated
Enter list of values (true false)
*****************Creating New Node*******************
What is the node's name (nil if none)? nil
************Creating bayes net arcs*****************
Enter list of parent node names for B (A)
Enter list of parent node names for A nil
****Creating distribution for node A of type TABULATED-BNODE****
Enter probabilities for A = (TRUE FALSE) (0.6 0.4)
****Creating distribution for node B of type TABULATED-BNODE****
Given A = TRUE
Enter probabilities for B = (TRUE FALSE) (0.9 0.1)
Given A = FALSE
Enter probabilities for B = (TRUE FALSE) (0.1 0.9)
Node: A Parents: NIL Type: TABULATED-BNODE
etc.

The interactive entry process is somewhat error-prone, so you may choose to use some of the functions called by create-bayes-net to create and save intermediate stages in the process. For example, you might create the network structure (with variables and values) first, then create and edit a file of lisp commands to set the CPT entries. In general, it's a good idea to draw out your Bayes net and CPTs on paper first. Once a network has been created, you can save it to a file; for example, do
>> (save-bn ab-net "uncertainty/domains/ab.bn")
Let Pass be a variable which has value true if a given student passes a given class, and false otherwise. Let us consider some of the factors useful for predicting if a student will pass, as seen from the point of view of the professor. The observable variables are the GPA in all previous classes (high, medium, or low), whether or not the student has taken the PreReqs (true or false), and whether or not the student is Asleep in class (true or false). There are also some relevant unobserved variables: whether the student is Smart, whether the student is Studious, whether the student pulls an AllNiter, and the student's WorkLoad. (It is up to you to choose the possible values of these unobserved variables.)
Question 1 (20 pts). Begin by using the procedure in Chapter 14 to design the topology of the network. Choose an ordering for the variables that reflects the causal processes in the domain, and try to avoid having nodes with too many parents. Then use the function create-bayes-net to create the Bayes net itself. Use the function display-bayes-net to display it readably. Explain your choice of topology. (You may also want to use save-bn to save the network to a file.)
Question 2 (10 pts). There are twelve possible cases for the three evidence variables in Q1. Compute the exact probability of Pass in each case, using your network. Check to see that these values look reasonable. If not, you may want to alter some of your conditional probabilities. You can do this using edit-cpt-row for each row you want to change. Include the final results and Bayes net with your writeup.
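One way to make sure none of the twelve cases is missed is to generate the evidence combinations programmatically. The following is a minimal sketch under stated assumptions: evidence-cases is a hypothetical helper (not part of the AIMA code), and turning each case into an event for your network is left to you.

```lisp
;; Sketch only: EVIDENCE-CASES is a hypothetical helper, not an AIMA function.
(defun evidence-cases (domains)
  "DOMAINS is a list of (name value1 value2 ...) entries.
Return every combination as an alist of (name . value) pairs."
  (if (null domains)
      (list nil)
      (let ((name (car (first domains)))
            (vals (cdr (first domains)))
            (rest-cases (evidence-cases (rest domains))))
        (loop for v in vals
              append (loop for c in rest-cases
                           collect (cons (cons name v) c))))))

;; The three observable variables from Q1 give 3 * 2 * 2 = 12 cases.
(defparameter *observables*
  '((GPA high medium low)
    (PreReqs true false)
    (Asleep true false)))

;; (length (evidence-cases *observables*)) evaluates to 12
```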
Imagine you are an oil industry analyst trying to predict whether LilOilCo will go bankrupt. Shown here is a Bayes network for assessing the financial situation of the company. LilOilCo has one prospect with an unknown amount of oil, OilAmt. There is a geologist's report GeolRep that estimates the amount of oil. Whether the company goes Bankrupt depends on the Profit it makes, which in turn depends on its Revenue and the Total$$ it spends producing the oil. Revenue depends on the amount of oil and its price next year (Y+1Price). Total cost depends on the cost to drill and the production cost; the latter depends on the amount of oil and the production cost per barrel.
Question 3 (2 pts). Load the oil network (assuming your current directory is the code directory) using
(setq oil (load-bn "uncertainty/domains/oil.bn"))
You may want to call display-bayes-net to look at the CPTs. Compute the exact probability of bankruptcy given that GeolRep=low (using either of the exact algorithms).
Question 4 (10 pts). Suppose that in addition to the geologist's estimate, we can also look at a long, boring article in the Atlantic Monthly assessing whether the current Israel-Palestine situation will be peacefully resolved in the near future. This might be useful to know because if peace breaks out between now and next year that will dramatically affect next year's oil price. To take this into account, we will need two new variables: AMSaysPeace (which can be true or false), and Y+1War (which can be true or false). Explain where these nodes would go in the network, what arcs you would need, and what new and modified CPTs are needed (propose some reasonable values for these).
Question 5 (8 pts). If you run one of the sampling algorithms to answer the query in Q3, you will see that the answer is quite accurate even with 1000 samples. Explain why the sampling algorithms work well for this query, but work poorly for the query about Burglary given JohnCalls and MaryCalls. Are there queries on the oil network such that the sampling algorithms perform much worse? If so, demonstrate this; if not, explain why not.
Question 6 (20 pts).
To get a more quantitative idea of how well various sampling algorithms work,
we will need to instrument them to enable measurement of the
error with respect to the true probability
distribution as a function of the number of samples.
Write a modified version of rejection-sampling-ask called
(recording-RS-ask Xname E bn iterations interval runs true-distribution file)
and similarly recording-LW-ask. The idea is that after every interval iterations (but not after 0 iterations) the current distribution for X is compared to the true distribution. Accumulate an association list, each element of which is a cons pair of the number of iterations and the squared error (see probability.lisp for squared-error). Repeat the process several times, as specified by the runs argument, and average the results. At the end, you should have an alist with the mean squared error at each interval point, sorted by number of iterations. Call plot-alist to write the data to a file. Do as many runs and iterations as you can reasonably manage. [Hint: beware of dividing by zero in rejection-sampling! Consider how to define the error when no samples survive the rejection process.] Use your functions to plot the squared error for Bankrupt given GeolRep=low, writing the output to q6-rs-oil.data and q6-lw-oil.data. Do the same for Burglary given JohnCalls and MaryCalls, writing the output to q6-rs-burglary.data and q6-lw-burglary.data. These files will be in a format suitable to be processed by gnuplot. There is a gnuplot command file called q6.gnuplot. Copy this to your directory, and run it using the Unix command
/usr/sww/bin/gnuplot q6.gnuplot
The results should appear on the screen and then will be written as q6-oil.ps and q6-burglary.ps, which should be included in your submission after you have checked to make sure they look right. Comment briefly on your results in the writeup.
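The bookkeeping for averaging over runs can be kept separate from the sampling itself. The sketch below is illustrative only: sketch-squared-error assumes the error is the sum of squared differences between two distributions given as alists (the definition of squared-error in probability.lisp may differ), and average-error-alists is a hypothetical helper that assumes every run recorded the same iteration counts.

```lisp
;; Sketch only: both functions are hypothetical, not part of the AIMA code.
(defun sketch-squared-error (estimate truth)
  "Sum of squared differences between two distributions over the same
values, each given as an alist such as ((TRUE . 0.3) (FALSE . 0.7))."
  (loop for (value . p) in truth
        sum (expt (- (cdr (assoc value estimate)) p) 2)))

(defun average-error-alists (runs)
  "RUNS is a list of alists ((iterations . error) ...), one per run.
Return an alist of the mean error per iteration count, sorted by iterations."
  (let ((n (length runs))
        (totals (make-hash-table)))
    (dolist (run runs)
      (dolist (pair run)
        (incf (gethash (car pair) totals 0) (cdr pair))))
    (sort (loop for k being the hash-keys of totals using (hash-value v)
                collect (cons k (/ v n)))
          #'< :key #'car)))
```

Keeping the averaging in its own function lets you reuse it unchanged for rejection sampling, likelihood weighting, and the algorithm in Question 7.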
Question 7 (30 pts). Implement the MCMC algorithm on page 517 of AIMA2e; for this question, you can assume that all nodes are tabulated nodes. It's a good idea to write a separate function MB-distribution that computes the distribution of a given variable given the current values of the variables in its Markov blanket, using Equation 14.11. You can debug this by comparing the results to exact inference on the variable given evidence in all its Markov blanket variables. Instrument the MCMC algorithm, run it on the oil and burglary queries to produce files called q7-mcmc-oil.data and q7-mcmc-burglary.data, and run q7.gnuplot to produce q7-oil.ps and q7-burglary.ps, which should be included in your submission after you have checked to make sure they look right. Comment briefly on your results in the writeup.
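As a small worked instance of a Markov blanket computation, consider the two-node ab-net built earlier, where the Markov blanket of A is just its child B. The distribution of A given its blanket is proportional to P(a) times P(b | a). The sketch below hard-codes the CPTs entered in the transcript; mb-distribution-for-a and normalize-alist are hypothetical names, and this is not the general MB-distribution the question asks for.

```lisp
;; Sketch only: hard-coded CPTs for the two-node ab-net from the transcript.
(defparameter *p-a* '((true . 0.6) (false . 0.4)))

;; Outer key is the value of A, inner alist is the distribution of B.
(defparameter *p-b-given-a*
  '((true  . ((true . 0.9) (false . 0.1)))
    (false . ((true . 0.1) (false . 0.9)))))

(defun normalize-alist (alist)
  "Scale the weights in ALIST so that they sum to 1."
  (let ((total (reduce #'+ alist :key #'cdr)))
    (mapcar (lambda (pair) (cons (car pair) (/ (cdr pair) total)))
            alist)))

(defun mb-distribution-for-a (b-value)
  "Distribution of A given B = B-VALUE: normalize P(a) * P(b | a)."
  (normalize-alist
   (loop for (a . p-a) in *p-a*
         collect (cons a (* p-a
                            (cdr (assoc b-value
                                        (cdr (assoc a *p-b-given-a*)))))))))

;; (mb-distribution-for-a 'true) gives TRUE about 0.931 and FALSE about 0.069,
;; since 0.6 * 0.9 = 0.54 and 0.4 * 0.1 = 0.04 before normalizing.
```

Comparing this hand computation with exact inference on ab-net is one instance of the debugging check suggested in the question.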