Testwiki:Reference desk/Archives/Mathematics/2018 July 1



{| width = "100%"

|- ! colspan="3" align="center" | Mathematics desk |- ! width="20%" align="left" | < June 30 ! width="25%" align="center"|<< Jun | July | Aug >> ! width="20%" align="right" |Current desk > |}

Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


July 1

"If you disagree, you're probably both wrong in the same direction" in ensemble learning

I've tried to ask this question before, but I don't think I worded it clearly, so I'm trying again. What ensemble learning models, if any, could make inferences that would translate into English as statements like the following?

  • "Model A says you probably voted for Donald Trump, and Model B says you voted for Hillary Clinton. But if you were a Trump voter or a Clinton voter, then the training data says both models would almost certainly agree about that; and most of the voters whom A and B disagree about in our training data, actually voted for Gary Johnson."
  • "Estimator A says X is 50 ± 2. Estimator B says X is 60 ± 3. But when their estimates are incompatible, they're usually both too low, and in this case the ensemble estimate is 75 ± 10."

NeonMerlin 00:17, 1 July 2018 (UTC)
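As one purely illustrative reading of the first bullet, the sketch below shows a naive "when A and B disagree, the truth is usually a third option" rule learned from held-out data. The counts, labels, and the combine rule are all invented toy material, not a specific named ensemble method.

<syntaxhighlight lang="python">
from collections import Counter

# Toy held-out data: (prediction of model A, prediction of model B, true label).
# All values here are invented for illustration only.
holdout = [
    ("Trump", "Trump", "Trump"),
    ("Clinton", "Clinton", "Clinton"),
    ("Trump", "Clinton", "Johnson"),
    ("Clinton", "Trump", "Johnson"),
    ("Trump", "Clinton", "Johnson"),
    ("Trump", "Clinton", "Trump"),
]

# Tally which true label occurs most often when A and B disagree.
disagreement_truths = Counter(truth for a, b, truth in holdout if a != b)
fallback_label = disagreement_truths.most_common(1)[0][0]

def combine(pred_a, pred_b):
    """Agree -> trust the shared answer; disagree -> use the learned fallback."""
    if pred_a == pred_b:
        return pred_a
    return fallback_label

print(combine("Trump", "Trump"))    # -> Trump
print(combine("Trump", "Clinton"))  # -> Johnson, learned from the disagreement tallies
</syntaxhighlight>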

Good question. I'm not too sure of the answer.
You could assign a prior probability to each of models A and B, P(X) where X denotes A or B.
Then update the probabilities using Bayes's theorem: P(X|data) = P(data|X)P(X) / [P(data|A)P(A) + P(data|B)P(B)].
Then calculate the probability that you voted for Johnson (J), say, weighting each model's prediction by that model's posterior probability: P(J) = P(J|A)P(A|data) + P(J|B)P(B|data). This could be regarded as an "ensemble model".
Now suppose, for example, P(T|A) > P(J|A) > P(C|A) and P(C|B) > P(J|B) > P(T|B). Neither model predicts voting for Johnson. But it's possible the ensemble model does... maybe. I don't know. I'm not sure how the ensemble model would converge as you gather more data points; it's worth investigating at some point. I apologise if my answer is useless. PeterPresent (talk) 06:03, 2 July 2018 (UTC)
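A minimal numerical sketch of the weighting just described. The priors, likelihoods, and per-model vote probabilities below are invented for illustration (the "data" is left abstract); the numbers are chosen to satisfy P(T|A) > P(J|A) > P(C|A) and P(C|B) > P(J|B) > P(T|B), and they show one case where Johnson tops the ensemble even though neither model ranks him first.

<syntaxhighlight lang="python">
# Invented numbers purely to illustrate the weighting described above.
prior = {"A": 0.5, "B": 0.5}        # P(A), P(B)
likelihood = {"A": 0.6, "B": 0.4}   # P(data|A), P(data|B); the data is left abstract

# Bayes's theorem: P(X|data) = P(data|X)P(X) / [P(data|A)P(A) + P(data|B)P(B)]
evidence = sum(likelihood[m] * prior[m] for m in prior)
posterior = {m: likelihood[m] * prior[m] / evidence for m in prior}

# Per-model vote probabilities, chosen so that P(T|A) > P(J|A) > P(C|A)
# and P(C|B) > P(J|B) > P(T|B), as in the example above.
p_vote = {
    "A": {"Trump": 0.45, "Johnson": 0.40, "Clinton": 0.15},
    "B": {"Trump": 0.15, "Johnson": 0.40, "Clinton": 0.45},
}

# Ensemble: P(candidate) = P(candidate|A)P(A|data) + P(candidate|B)P(B|data)
ensemble = {
    cand: sum(p_vote[m][cand] * posterior[m] for m in posterior)
    for cand in ("Trump", "Johnson", "Clinton")
}
print(ensemble)
# -> {'Trump': 0.33, 'Johnson': 0.40, 'Clinton': 0.27}
# Johnson gets the largest ensemble probability even though neither model ranks him first.
</syntaxhighlight>

With these made-up numbers the posterior weights come out as P(A|data) = 0.6 and P(B|data) = 0.4, so the "ensemble model" can indeed favour a candidate that neither base model predicts.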