(10 points) Conditional probabilities
My neighbor has two pets, each of which is either a cat or a dog. Assuming that the type of each pet is like a fair coin flip, the most likely outcome a priori is that my neighbor has one cat and one dog, with probability 1/2. The other possibilities, two cats or two dogs, each have probability 1/4.
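The prior probabilities stated above can be checked by enumerating the four equally likely ordered outcomes. The following is a minimal sketch, assuming the two pets are independent fair coin flips; the names `outcomes` and `prior` are illustrative only and not part of the assignment.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: each pet is independently a cat or a dog with probability 1/2,
# so the four ordered outcomes (first pet, second pet) are equally likely.
outcomes = list(product(["cat", "dog"], repeat=2))

# Collapse ordered outcomes into unordered combinations,
# e.g. (cat, dog) and (dog, cat) both count as "one cat and one dog".
counts = Counter(tuple(sorted(outcome)) for outcome in outcomes)

# Prior probability of each unordered combination, as an exact fraction.
prior = {combo: Fraction(n, len(outcomes)) for combo, n in counts.items()}

for combo, p in sorted(prior.items()):
    print(combo, p)
```

Conditioning on additional evidence then amounts to filtering `outcomes` before counting.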
(15 points) Bayes' rule for medical diagnosis
After your yearly checkup, the doctor has bad news and good news. The bad news is that you tested positive for a serious disease, and that the test is 98% accurate (i.e., the probability of testing positive given that you have the disease is 0.98, as is the probability of testing negative given that you don’t have the disease). The good news is that this is a rare disease, striking only one in 20,000 people. What are the chances that you actually have the disease? (Show your calculations as well as giving the final result.)
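A hand calculation for this kind of problem can be cross-checked numerically. The sketch below assumes the usual reading of the setup: the prior is the disease prevalence, and "98% accurate" means both the sensitivity and the specificity equal 0.98. The function name `posterior_given_positive` is hypothetical, chosen for illustration.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """Return P(disease | positive test) via Bayes' rule.

    prior:       P(disease)
    sensitivity: P(positive | disease)
    specificity: P(negative | no disease)
    """
    # Joint probability of having the disease and testing positive.
    true_positive = sensitivity * prior
    # Joint probability of not having the disease and testing positive.
    false_positive = (1.0 - specificity) * (1.0 - prior)
    # Normalize by the total probability of a positive test.
    return true_positive / (true_positive + false_positive)


posterior = posterior_given_positive(
    prior=1 / 20000, sensitivity=0.98, specificity=0.98
)
print(f"P(disease | positive) = {posterior:.6f}")
```

This only checks the arithmetic; the written solution should still show each step of Bayes' rule.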
(20 points) Fisher's Linear Discriminant
Show that the formula on slide 40 of the slides on Linear Classification is correct, i.e., prove that $${(m_1 - m_2)^2 \over (s_1^2 + s_2^2)} = {\mathbf{w}^T\mathbf{S}_B\mathbf{w} \over \mathbf{w}^T\mathbf{S}_W\mathbf{w}}$$
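Before attempting the proof, it can help to confirm the identity numerically on random data. The sketch below is not a proof, and it assumes the standard textbook definitions, which may differ in notation from the slides: \(m_k = \mathbf{w}^T\mathbf{m}_k\) is the projected class mean, \(s_k^2\) is the unnormalized sum of squared deviations of the projected points in class \(k\), \(\mathbf{S}_B = (\mathbf{m}_2 - \mathbf{m}_1)(\mathbf{m}_2 - \mathbf{m}_1)^T\), and \(\mathbf{S}_W\) is the sum of the two within-class scatter matrices. All helper names are illustrative.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def outer(u, v):
    return [[a * b for b in v] for a in u]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def quad_form(w, S):
    """Compute w^T S w."""
    return dot(w, [dot(row, w) for row in S])


def scatter(points, mu):
    """Unnormalized scatter matrix: sum of (x - mu)(x - mu)^T over the points."""
    d = len(mu)
    S = [[0.0] * d for _ in range(d)]
    for p in points:
        diff = [a - b for a, b in zip(p, mu)]
        S = mat_add(S, outer(diff, diff))
    return S


# Synthetic two-class data in 2D and an arbitrary projection direction w.
random.seed(0)
class1 = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(50)]
class2 = [[random.gauss(2, 1), random.gauss(1, 1)] for _ in range(60)]
w = [0.7, -0.3]

mu1, mu2 = mean(class1), mean(class2)

# Left-hand side: quantities computed in the projected 1D space.
m1, m2 = dot(w, mu1), dot(w, mu2)
s1_sq = sum((dot(w, p) - m1) ** 2 for p in class1)
s2_sq = sum((dot(w, p) - m2) ** 2 for p in class2)
lhs = (m1 - m2) ** 2 / (s1_sq + s2_sq)

# Right-hand side: quadratic forms in the original space.
mean_diff = [a - b for a, b in zip(mu2, mu1)]
S_B = outer(mean_diff, mean_diff)
S_W = mat_add(scatter(class1, mu1), scatter(class2, mu2))
rhs = quad_form(w, S_B) / quad_form(w, S_W)

print(lhs, rhs)
```

The two printed values should agree up to floating-point rounding for any choice of `w` and data, which is what the proof has to establish in general.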
(25 points) Gaussian Class-Conditional Densities
In the case of a logistic sigmoid (as defined on slides 67/68 of our Linear Classification lecture), show that \(a\) simplifies to a linear function of \(\mathbf{x}\) under the assumption that the class-conditional densities \( p(\mathbf{x}|C_k)\) are Gaussian and share the same covariance matrix \(\Sigma\): $$ a = \ln {p(\mathbf{x}|C_1)p(C_1) \over p(\mathbf{x}|C_2)p(C_2)} = \mathbf{w}^T\mathbf{x}+w_0 $$ (see slide 74 of our Linear Classification lecture).
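The linearity claim can be sanity-checked numerically in the one-dimensional special case, where the shared covariance reduces to a single variance. The sketch below evaluates \(a(x)\) directly from the Gaussian log densities and checks that its second differences vanish and its slope is constant, as expected for a linear function. The means, variance, and prior are arbitrary illustrative values, not taken from the lecture.

```python
import math


def log_gauss(x, mu, var):
    """Log of the univariate Gaussian density N(x | mu, var)."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)


def log_odds(x, mu1, mu2, var, prior1):
    """a(x) = ln[p(x|C1) p(C1)] - ln[p(x|C2) p(C2)] with a shared variance."""
    return (log_gauss(x, mu1, var) + math.log(prior1)) - (
        log_gauss(x, mu2, var) + math.log(1.0 - prior1)
    )


# Illustrative parameters (assumptions): class means, shared variance, P(C1).
MU1, MU2, VAR, PRIOR1 = 1.0, -2.0, 1.5, 0.3
H = 0.5  # grid spacing used for the finite differences


def a(x):
    return log_odds(x, MU1, MU2, VAR, PRIOR1)


grid = [-3.0, -1.0, 0.0, 2.0, 4.0]

# For a linear function, second differences are zero and slopes are constant.
second_diffs = [a(x - H) - 2.0 * a(x) + a(x + H) for x in grid]
slopes = [(a(x + H) - a(x)) / H for x in grid]

print("second differences:", second_diffs)
print("slopes:", slopes)
```

If the two classes were given different variances, the quadratic terms would no longer cancel and the second differences would be nonzero; the shared covariance is what makes \(a\) linear.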
(30 points) Log likelihood
Consider a classification problem in which each observation \(x_n\) is known to belong to one of four classes, corresponding to \(t = 0\), \(t = 1\), \(t = 2\), and \(t = 3\), and suppose that the procedure for collecting training data is imperfect, so that training points are sometimes mislabelled. For every data point \(x_n\), instead of a value \(t\) for the class label, we have values \(\pi_{n0}\), \(\pi_{n1}\), \(\pi_{n2}\), and \(\pi_{n3}\) representing the probabilities that \(t_n = 0\), \(t_n = 1\), \(t_n = 2\), and \(t_n = 3\), respectively. Note that \(\pi_{n0} + \pi_{n1} + \pi_{n2} + \pi_{n3} = 1\). Given probabilistic models \(p(t = 0 | \phi)\), \(p(t = 1 | \phi)\), \(p(t = 2 | \phi)\), and \(p(t = 3 | \phi)\), write down the log likelihood function appropriate to such a data set.
Please submit a PDF document with your answers to Moodle. Use the following naming scheme for your submission: “lastname_matrikelnumber_A2.pdf”. The file name is important: if you do not follow the submission instructions, you will receive a grade of 0 for the assignment.