
/sci/ - Science & Math



File: bayesrulesequential.png (236 KB, 1998x1040)
No.10448122

Why is normalizing the posterior with the normalizing constant after updating the prior on multiple data points equivalent to normalizing the posterior at every data point (sequentially)?

I can reproduce this computationally (pic related), but can't prove it to myself analytically.
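For reference, a minimal sketch of the kind of comparison described (not the code from the screenshot; the prior and likelihood values are placeholders):

[code]
import numpy as np

# hypothetical two-hypothesis setup: uniform prior, made-up likelihoods
prior = np.array([0.5, 0.5])
likelihood = np.array([0.75, 0.5])   # P(datum | h) for each hypothesis
n_obs = 3                            # three identical observations

# (a) update on all the data, normalize once at the end
batch = prior * likelihood ** n_obs
batch /= batch.sum()

# (b) renormalize after every data point
seq = prior.copy()
for _ in range(n_obs):
    seq = seq * likelihood
    seq /= seq.sum()

print(batch, seq)   # identical up to floating point
[/code]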

>> No.10448128

This is from "Think Bayes":
Suppose there are two bowls of cookies. Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies. Bowl 2 contains 20 of each. Now suppose you choose one of the bowls at random and, without looking, select a cookie at random. The cookie is vanilla. What is the probability that it came from Bowl 1?
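Worked out with those numbers (uniform prior over the bowls, 30/40 vanilla in Bowl 1, 20/40 in Bowl 2):

[eqn]P(B_1 \mid V) = \frac{P(B_1)\,P(V \mid B_1)}{P(B_1)\,P(V \mid B_1) + P(B_2)\,P(V \mid B_2)} = \frac{0.5 \cdot \frac{30}{40}}{0.5 \cdot \frac{30}{40} + 0.5 \cdot \frac{20}{40}} = 0.6[/eqn]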

>> No.10448422

bump

>> No.10448440

First of all, what's up with the V's in your for loops?

Second, likelihood works on RELATIVE probability and doesn't care whether you normalize. You can normalize at any point, or not at all, and the ratio between any two likelihoods stays the same.

Incidentally, why not do this in logarithms? They're easier to work with and more efficient computationally.
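A minimal sketch of the log-space version being suggested, assuming the cookie-problem likelihoods and using scipy.special.logsumexp for the normalizer:

[code]
import numpy as np
from scipy.special import logsumexp

log_prior = np.log([0.5, 0.5])
log_like = np.log([30 / 40, 20 / 40])   # log P(vanilla | bowl) for Bowls 1 and 2
n_draws = 3                             # three vanilla draws in a row

# accumulate log-likelihoods, then subtract the log normalizing constant once
log_post = log_prior + n_draws * log_like
log_post -= logsumexp(log_post)

print(np.exp(log_post))                 # posterior over the two bowls
[/code]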

>> No.10448910

V is from >>10448128, where it's a vanilla cookie drawn from the bowl. I get intuitively that p(data) at each step is independent of the previous step, but I really want to see a proof.

>> No.10448977

>>10448910
>zoomer entitlement
If you want to see a proof, sketch one yourself.
If you get stuck with your proof, ask for help and give sufficient detail.

>> No.10449209

>>10448977 (you) kys
I tried to work it out given two data points, d1, d2
[math]
P(h_1|d_1,d_2)
= \frac{P(h_1)P(d_1,d_2|h_1)}{P(d_1,d_2)}
= \frac{P(h_1)P(d_1|h_1)P(d_2|h_1)}{P(d_1)P(d_2)}
= P(h_1)\cdot\frac{P(d_1|h_1)}{P(d_1)}\cdot\frac{P(d_2|h_1)}{P(d_2)}
[/math]
I think this makes sense: the second expression normalizes after all data points, the last one after each.

>> No.10449215

take 2, used to /g/ not /sci/
[math] P(h_1|d_1,d_2) = \frac{P(h_1)P(d_1,d_2|h_1)}{P(d_1,d_2)} = \frac{P(h_1)P(d_1|h_1)P(d_2|h_1)}{P(d_1)P(d_2)} = P(h_1)\cdot\frac{P(d_1|h_1)}{P(d_1)}\cdot\frac{P(d_2|h_1)}{P(d_2)} [/math]

>> No.10449225
File: bayes.png (28 KB, 1172x148)

K, just posting a picture. What I'm confused about: in the last equation, isn't the denominator P(d_2), when expanded, defined by the previous fraction? That would make it recursively defined?

>> No.10449315

>>10449225
First kys, then realize you are a Monty Hall Goatlet. You do it by likelihoods in this way:

[eqn]\frac{p(v \mid 1)^n}{p(v \mid 1)^n+p(v \mid 2)^n}=\frac{\frac{p(v \mid 1)^{n-1}}{p(v \mid 1)^{n-1}+p(v \mid 2)^{n-1}}\,p(v \mid 1)}{\frac{p(v \mid 1)^{n-1}}{p(v \mid 1)^{n-1}+p(v \mid 2)^{n-1}}\,p(v \mid 1)+\frac{p(v \mid 2)^{n-1}}{p(v \mid 1)^{n-1}+p(v \mid 2)^{n-1}}\,p(v \mid 2)}[/eqn]

The step-(n-1) normalizer is common to the numerator and the denominator, so it cancels. Renormalization in both of your approaches there: normalize once at the end or after every draw, and you land on the same posterior.
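A quick numeric check of that cancellation, assuming the cookie-problem likelihoods p(v|1) = 30/40 and p(v|2) = 20/40:

[code]
a, b = 30 / 40, 20 / 40   # p(v|1), p(v|2)
n = 5                     # five vanilla draws

# normalize once at the end
batch = a**n / (a**n + b**n)

# renormalize after every draw
post = 0.5                # uniform prior on Bowl 1
for _ in range(n):
    post = post * a / (post * a + (1 - post) * b)

print(batch, post)        # agree: the step-wise normalizers cancel
[/code]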

>> No.10449320

>>10449315
damn...

>> No.10449552
File: goatlet.jpg (6 KB, 215x234)

>>10449315