Midterm Study Guide -- CSCI567 -- Fall 2008

Topics to know for the midterm:

Sample questions

  1. You are given a data set where it looks like the underlying data from each class come from a Gaussian distribution. Which of the methods we have talked about (perceptron, logistic regression, LDA, decision trees, ...) would you try first? Why?
  2. You are an expert machine learning consultant. A customer comes to you with a problem that is a good fit for machine learning, and you ask what costs are associated with false positives and false negatives. The customer does not know and believes that these costs may change often. Knowing that learning a model will take a long time and cannot be done often, what kinds of classification models are likely to work better here, and why?
  3. You want to apply gradient descent to minimize the error of a hypothesis of the following form: y=w1+w2*x. What are the update rules for w1 and w2?
  4. What is the difference between discriminative and generative models?
  5. You notice that a customer is using a neural network on a problem domain where you are fairly sure that a hyperplane can separate the positives and the negatives. Is a full neural network needed? What other methods might you suggest to the customer?
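
For practice with question 3, the following is a minimal sketch of one way to check a derived answer numerically. It assumes a mean squared error loss, (1/(2n)) * sum((w1 + w2*x - y)^2), which the question itself does not specify; a different loss gives different update rules. The names gradient_step, fit_line, and lr (the learning rate), and the default values for lr and steps, are hypothetical and chosen only for illustration.

```python
def gradient_step(w1, w2, xs, ys, lr):
    """One batch gradient-descent step for y = w1 + w2*x.

    Assumes the loss is (1/(2n)) * sum((w1 + w2*x - y)^2); this choice
    of loss is an assumption, not something stated in the question.
    """
    n = len(xs)
    # Residual for each example: prediction minus target.
    residuals = [(w1 + w2 * x) - y for x, y in zip(xs, ys)]
    # Partial derivatives of the assumed loss with respect to w1 and w2.
    grad_w1 = sum(residuals) / n
    grad_w2 = sum(r * x for r, x in zip(residuals, xs)) / n
    # Move each weight against its gradient, scaled by the learning rate.
    return w1 - lr * grad_w1, w2 - lr * grad_w2


def fit_line(xs, ys, lr=0.05, steps=5000):
    """Repeat the update from a zero initialization (illustrative defaults)."""
    w1, w2 = 0.0, 0.0
    for _ in range(steps):
        w1, w2 = gradient_step(w1, w2, xs, ys, lr)
    return w1, w2
```

Calling fit_line([0, 1, 2, 3], [1, 3, 5, 7]) on points generated from y = 1 + 2x should move w1 toward 1 and w2 toward 2, provided the learning rate is small enough for the updates to converge.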