Practice Midterm Solutions

1. a) P(3 or more spots) = 4/6
   b) P(heads or 3 or more spots) = P(heads) + P(3 or more spots) - P(heads and 3 or more spots) = (1/2) + (4/6) - ((1/2)(4/6)) = 5/6 (the coin flip and the die roll are independent, so the joint probability is the product)

2. Engine failures may not be independent. For example, things that cause an engine to fail (birds, running out of fuel, etc.) may cause multiple engines to fail at once. Thus, increasing the number of engines does not necessarily improve reliability.

3. Deans of Schools and Colleges are likely older than regular faculty when they are initially appointed to the position; someone early in their career would not even be considered for a Dean position. Given that they have already reached an older age by the time they are appointed, deans are expected to live longer on average than regular faculty.

4. This is an example of an ecological fallacy. Correlations at the group level do not always translate to the individual level. We need individual-level data to make any claims about associations between race and the likelihood of voting Republican.
More detail: For instance, suppose that on average African-Americans tend to prefer to live in states that also happen (for other reasons) to lean Republican, but African-Americans are not themselves more likely to vote Republican. This is one possible way that states with more African-Americans could tend to be more likely to vote Republican.

5. P(bus on time) = 90%
P(app report on time | bus on time) = 80%
P(app report on time | bus late) = 10%
P(bus on time | app report on time)
= P(bus on time and app report on time) / P(app report on time)
= P(bus on time and app report on time) / [P(bus on time and app report on time) + P(bus late and app report on time)]
= (.9*.8)/(.9*.8 + .1*.1) ≈ 0.986

6. No. I know that I am a clumsy student who likely uses a smartphone much more than a randomly selected survey participant, so the news article is not convincing. To make a better decision, I might try to estimate the probability that I damage my phone.
I'll also need the cost of insurance and the value of my phone.
More detail: I use my phone a lot (on the bus, walking around campus, etc.), and there are many opportunities for it to get damaged. Thus, I am likely to purchase the insurance as long as it is not prohibitively expensive. If I wanted to make a careful, informed decision, I might try to estimate the probability that I will damage my phone, and compare the expected value of the insurance (a rough estimate is the probability that I destroy my phone times the value of the phone) to the cost of the insurance.

7. Confounding factors (variables that are not controlled for and may explain the observed association) and reverse causation (instead of A causing B, B might be causing A) are two possible issues with observational studies. There can be many explanations for correlations found in observational studies. Correlation does not imply causation.
Controlled experiments that are designed and implemented well randomly assign study participants to different conditions (usually treatment and control). This creates comparison groups where the only systematic difference is the treatment. Therefore, confounding factors are not an issue, and one can examine the causal effect of the treatment or intervention on study participants. Because the treatment/intervention is randomly assigned, it is not systematically correlated with any other variable. Therefore, there are no reverse causation issues to worry about.

8. The representativeness heuristic is used when determining probabilities under uncertainty. For example, to figure out the probability that A comes from B, people think about the degree to which A resembles (or is representative of) B. Two judgement biases that result from this heuristic are insensitivity to prior probability and misconceptions of regression.
The representativeness heuristic accounts for insensitivity to prior probability because representativeness leads people to think only about how similar A and B are. Thus, they can easily forget to take base rates into account when determining probabilities. This leads to judgement errors. The example here is that you are given a paragraph describing a person and must determine the likelihood that the paragraph describes a doctor or a nuclear physicist. The paragraph might seem representative of a nuclear physicist, but nuclear physicists make up a very small portion of the population, and you should take that into account as well.
The representativeness heuristic accounts for misconceptions of regression because people often do not expect regression to the mean. They often expect the second measurement to be as extreme as the first, thinking that the first measurement is representative of the second. This leads to judgement errors and biases. For example, people might expect a softball player who has a spectacular season in her rookie year to have an equally spectacular season in her second year. When she does not do as well in her second year, people might come up with alternative explanations, such as a sophomore slump, when in fact all that could be happening is regression to the mean.

9. Expected earnings in country A = .9*20000 + .1*6000 = 18600
Expected earnings in country B = .3*(salary in country B - 5000) + .7*(-5000)
I'm indifferent when expected earnings in country A = expected earnings in country B:
.9*20000 + .1*6000 = .3*(salary in country B - 5000) + .7*(-5000)
salary in country B = ((.9*20000 + .1*6000 - .7*(-5000))/.3) + 5000 ≈ 78,667

10. This is an example of the ecological fallacy and shows why you have to be careful when looking at correlations at the group level. Wealthy counties in California (e.g., Los Angeles, San Francisco) are also urban and contain many of the low-income voting precincts that lean Democratic, so a county-level association between wealth and Democratic voting need not hold for individual voters.
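
As a rough check on the arithmetic in answers 1(b), 5, and 9, the calculations can be reproduced with a short Python sketch. The variable names are illustrative and not part of the original solutions; the numbers are the ones stated in those answers, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Answer 1(b): inclusion-exclusion, treating the coin and die as independent.
p_heads = Fraction(1, 2)
p_three_plus = Fraction(4, 6)
p_heads_or_three_plus = p_heads + p_three_plus - p_heads * p_three_plus

# Answer 5: Bayes' rule for P(bus on time | app reports on time).
p_on_time = Fraction(9, 10)
p_report_given_on_time = Fraction(8, 10)
p_report_given_late = Fraction(1, 10)
joint_on_time = p_on_time * p_report_given_on_time
joint_late = (1 - p_on_time) * p_report_given_late
p_on_time_given_report = joint_on_time / (joint_on_time + joint_late)

# Answer 9: salary in country B that equalizes expected earnings.
expected_a = Fraction(9, 10) * 20000 + Fraction(1, 10) * 6000
salary_b = (expected_a - Fraction(7, 10) * (-5000)) / Fraction(3, 10) + 5000
# Plug the salary back in to confirm the two expectations match.
expected_b = Fraction(3, 10) * (salary_b - 5000) + Fraction(7, 10) * (-5000)

print("1(b):", p_heads_or_three_plus)
print("5:", p_on_time_given_report, "~", round(float(p_on_time_given_report), 3))
print("9: expected A =", expected_a, "| salary B ~", round(float(salary_b), 2))
```

Substituting the computed salary back into the country B expression is a simple way to confirm that the algebra in answer 9 was rearranged correctly.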