Hello, stats question please. Confidence intervals.

Let's say I have 5,000 daily values in a time series, and that the sample standard deviation of those 5,000 values is 10, with a mean of, say, 0.

I would say that, with 90% confidence, the next day's value will be between -10 and 10, right?

But I had this discussion with someone who says that you need to divide the SD by sqrt(5000-1), -> (-.14, .14)

I of course know the formula that they are using, but I forget why and when it's used; it seems wrong here. What's going on?

Yes, I know that. Now please, let's discuss something that isn't completely obvious.

You're both wrong, but your friend is closer. When determining the margin of error, you divide the standard deviation by the square root of the sample size, not of the sample size minus 1. You then multiply that by the z-score that corresponds to your given confidence level. For 90%, that's 1.645. Then you add and subtract that product from the mean.
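For anyone who wants to check the arithmetic, here is a minimal Python sketch of the recipe above, using the numbers from the original question (n = 5000, SD = 10, mean = 0). It assumes the normal approximation is acceptable, and the variable names are just for illustration.

```python
from math import sqrt
from statistics import NormalDist

n = 5000           # sample size from the original question
sample_mean = 0.0  # sample mean from the original question
sample_sd = 10.0   # sample standard deviation from the original question
level = 0.90       # two-sided confidence level

# A two-sided 90% interval leaves 5% in each tail, so take the 0.95 quantile.
z = NormalDist().inv_cdf(1 - (1 - level) / 2)

# Standard error of the mean: SD divided by sqrt(n).
standard_error = sample_sd / sqrt(n)

# Margin of error, then add and subtract it from the mean.
margin = z * standard_error
mean_ci = (sample_mean - margin, sample_mean + margin)

print(f"z = {z:.3f}, margin = {margin:.3f}")
print(f"90% CI for the mean: ({mean_ci[0]:.3f}, {mean_ci[1]:.3f})")
```

Note that this is an interval for the mean of the series, not for any single day's value.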

If the values are normally distributed, then the 90% interval for the next value (strictly speaking a prediction interval, not a confidence interval) will be ±10*Z_0.95 = (-16.45, +16.45).
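A rough sketch of that calculation in Python, assuming the daily values are independent and normally distributed (a real time series may be autocorrelated, in which case this does not apply as is). The sqrt(1 + 1/n) factor accounts for the uncertainty in the estimated mean; with n = 5000 it barely changes anything.

```python
from math import sqrt
from statistics import NormalDist

n = 5000
sample_mean = 0.0
sample_sd = 10.0

z = NormalDist().inv_cdf(0.95)  # 90% two-sided

# Spread of one new value, inflated slightly because the mean is itself estimated.
pred_sd = sample_sd * sqrt(1 + 1 / n)

half_width = z * pred_sd
prediction_interval = (sample_mean - half_width, sample_mean + half_width)

print(f"90% interval for the next value: "
      f"({prediction_interval[0]:.2f}, {prediction_interval[1]:.2f})")
```

Using the normal quantile instead of a t quantile is an assumption that is harmless at this sample size.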

It's (-.233, .233).


Well, for real though, what is the interpretation of mean +/- z*sd/sqrt(n-1) when we are looking at drawing one additional datapoint? Maybe the interpretation is something like: if we draw n additional datapoints, then their average would be within that interval 90% of the time or whatever. But if the true standard deviation is 10 (mean 0), then for 90%, z*10 -> (-16.45, 16.45) makes sense to me. I don't see when you would divide the SD by sqrt(5000) or whatever. However, that's a mainstream formula...

Okay I get it now. Thank you apples, you are my greatest ally. Basically I'm right. Yeah, with 5000 observations, a sample mean of zero and an SD of 10, we would estimate the true mean to be within what you said ( >>11194951 ) with 90% confidence.

Okay that makes sense.

But what I am talking about is: if we are to draw one more piece of data, what is a 90% interval for its value? Okay, and that is (-16.45, 16.45)? That seems reasonable too.
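The difference between the two intervals can be checked with a quick simulation. This is only a sketch under assumed conditions: the "true" process is taken to be independent normal draws with mean 0 and SD 10, and the seed and number of new draws are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

rng = random.Random(0)          # fixed seed so the run is repeatable
true_mean, true_sd = 0.0, 10.0  # assumed "true" values for this simulation
n = 5000

# Pretend these are the 5,000 observed daily values.
sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
m, s = mean(sample), stdev(sample)
z = NormalDist().inv_cdf(0.95)

narrow = z * s / sqrt(n)  # half-width of the interval for the mean
wide = z * s              # half-width of the interval for one new value

# Draw many "next day" values and count how often each interval contains them.
new_values = [rng.gauss(true_mean, true_sd) for _ in range(20000)]
cover_narrow = sum(abs(x - m) <= narrow for x in new_values) / len(new_values)
cover_wide = sum(abs(x - m) <= wide for x in new_values) / len(new_values)

print(f"narrow interval +/-{narrow:.3f} contains {cover_narrow:.1%} of new values")
print(f"wide interval   +/-{wide:.2f} contains {cover_wide:.1%} of new values")
```

Under these assumptions the wide interval should contain roughly 90% of new values, while the narrow one should contain only a few percent of them, because it describes where the mean is, not where individual values fall.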

Okay, pack it up, I think we're done. Thanks for the entertainment.

Where are you even getting that first form of it? For a confidence interval on the mean, you always divide by the square root of the sample size, because using a larger sample size allows you to be more accurate. Imagine using a sample size of 3: your interval would become (-9.497, 9.497). It's still a 90% confidence interval of a normally distributed population with a standard deviation of 10, but because the sample is so much smaller, the interval is wider, meaning less precision.

No, you would use the same formula but with a sample size of 5001.

The sample standard deviation is already divided by the square root of the sample size.

No it's not. It's the square root of the sum of the squared deviations from the mean over the sample size.

same thing tho unironically?

It helps if you know what standard deviation represents, as well as how to calculate it. You find the difference between each datum and the mean, square each difference to make them all positive, add them up, then divide by the sample size (or by the sample size minus 1 for the sample version). This gives you the variance, which represents the overall spread of your sample. Taking the square root of the variance gives you the standard deviation, which is roughly the typical distance of any datum from the mean.
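Those steps can be written out directly. A small sketch in Python with made-up data (the eight values below are purely illustrative), checked against the standard library's `statistics` functions:

```python
from math import sqrt
from statistics import pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values for illustration
n = len(data)
data_mean = sum(data) / n

# Difference from the mean, squared so every term is non-negative.
squared_devs = [(x - data_mean) ** 2 for x in data]
sum_sq = sum(squared_devs)

population_variance = sum_sq / n    # divide by n
sample_variance = sum_sq / (n - 1)  # divide by n - 1 (Bessel's correction)

population_sd = sqrt(population_variance)
sample_sd = sqrt(sample_variance)

print(f"mean = {data_mean}, population SD = {population_sd}, sample SD = {sample_sd:.3f}")
print(f"library check: pstdev = {pstdev(data)}, stdev = {stdev(data):.3f}")
```

Note that nothing here divides by sqrt(n) after the square root is taken; that extra division only appears when you move from the SD of the data to the standard error of the mean.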

Going back to confidence intervals, you further divide the standard deviation by the square root of the sample size before multiplying by the z-score. Why? Because you can arrive at the same standard deviation with two samples of wildly different sizes. The larger your sample size, the more precisely you can pin down the mean, so the narrower the 90% CI. See >>11194981
>Imagine using a sample size of 3: your interval would become (-9.497, 9.497). It's still a 90% confidence interval of a normally distributed population with a standard deviation of 10, but because the sample is so much smaller, the interval is wider, meaning less precision.
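To see where the sqrt(n) comes from, one can simulate many samples of a given size and look at how much their means vary. This is a sketch that assumes independent normal draws with SD 10; the sample sizes, seed, and number of trials are arbitrary, and `spread_of_means` is a hypothetical helper name.

```python
import random
from math import sqrt
from statistics import pstdev

rng = random.Random(1)  # fixed seed so the run is repeatable
true_sd = 10.0          # assumed population SD
trials = 2000           # number of repeated samples per sample size

def spread_of_means(n):
    """Standard deviation of `trials` sample means, each computed from n draws."""
    means = [sum(rng.gauss(0.0, true_sd) for _ in range(n)) / n
             for _ in range(trials)]
    return pstdev(means)

results = {n: spread_of_means(n) for n in (3, 30, 300)}

for n, observed in results.items():
    print(f"n = {n:>3}: observed spread of means = {observed:.3f}, "
          f"sd/sqrt(n) = {true_sd / sqrt(n):.3f}")
```

The individual values always have an SD of about 10, but the means of larger samples cluster more tightly, which is why the interval for the mean shrinks with n while the interval for a single new value does not.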

