/sci/ - Science & Math


File: 671 KB, 997x900, cover2.png
No.11908323

I've never been good with statistics, nor believed them very much. It began when all the big organizations that run opinion polls in my country (which has 200 million inhabitants) only asked 2000 people and yet claimed they only had "a margin of error of 2 points". To me it sounded like BS and i was always very loud about it.
Until someone in a comment section told me it was normal and gave me the link to a college level book explaining the formula for that. I didn't understand that, but i thought it was normal, as i had not yet learned statistics in college.
Fast forward to this semester when i finally had statistics in college (I do CompSci), and apart from the first test (which involved simple stuff such as averages, variance and standard deviation) i did terribly and did not understand anything, so much so that i didn't even bother to take the final test.
Pretty much we were just given mathematical formulas without being told why they were that way or how to derive them.

So i asked for help in a FB group i'm in and some guy gave me a link to an article that was about "pedagogically explaining Pearson's Correlation Coefficient".
It didn't help much as, again, for most of the article i was just being thrown formulas without knowing the why. The worst instance of that was when, on the same point, he started talking about how measurement units do not matter and then suddenly said, out of nowhere, that z = (x - μ)/σ.
I completely gave up on it at the part where it was talking about "outliers and lurking variables" and the author himself said that "not always the results obtained from correlation tables are informative about the relationship between the variables". For me that's like him admitting that it was all BS.
For social issues i already prefer settling things on ethics instead of googling random statistics and graphs of what "gives the best results", since both sides can always google something that agrees with them.
Same thing with economics, which is why i prefer praxeology.

>> No.11908330

>>11908323
Youre just bad at stats

>> No.11908333

>>11908330
Yes, i know, i admitted it on my first phrase.

>> No.11908337

Me fail statics ? That's unprobable.

>> No.11908338

lmfao
>too retarded to learn math
>is statistics bullshit?
Yes, it's a great big conspiracy you fucking retarded faggot

>> No.11908343

>>11908323
statistics isn't bullshit.

if you have a problem with certain polls, then it is more likely that the people doing/analyzing the polls are bullshit -- as opposed to the field of statistics. i think Mark Twain said something about how there are lies, damned lies, and statistics.

pollsters are often not even good statisticians. it's like the shit job they get after failing out of industry, which is itself a fallback for academia dropouts. statistics are easy to manipulate and pollsters are pretty bad at that. they can manipulate the data after they have it, or they can collect their data in a way that makes it skewed from the beginning.

however, this criticism is a classic MAGAtard thing to repeat. but unfortunately it won't save him this time. because he's a fucking retard and no degree of skewing the polls either away or toward him will make any difference.

>> No.11908347

>>11908338
I'm actually good at math, i have always been. Not only that, i had fun doing math, the reasons i picked what i went for college was that i wanted to learn calculus.
>>11908343
It's just that it can't get through my head on how you can have only 2% of error margin while only asking 0,001% of the population, and random formulas didn't answer my question.

>> No.11908355

>>11908323
>I've never been good with statistics, nor believed them very much.
That's the problem.
Math is not magic, math is not god, math is not faith, math is just a shortcut for describing the world.
All statistics can tell you are the odds of something happening.
It can't predict the future, it can't know the unknowable, it can just tell you what is likely to happen.

It's very easy to just take statistic results and make predictions and conclusions but anything derived is just an assumption like anything else.
Only the scientific method can actually prove things and that has major limits.

tl;dr
There is a lot of bullshit, yes, but no: unblemished statistics is a legitimate way of determining probability.

>> No.11908375

>>11908347
>the reasons i picked what i went for college was that i wanted to learn calculus.
Calculus is a high school subject anon

>> No.11908378

>>11908375
No it isn't, wtf.
Calculus I, where you learn limits, derivatives and integrals, is a college subject.

>> No.11908380

>>11908378
anon, I....

>> No.11908384

>>11908380
Dude, i went through highschool, top 3 school in the city, no calculus.
I went through college, there i had calculus.

>> No.11908385

>>11908378
>Amerishart education system

>> No.11908387

>>11908385
I'm not even american, since when does the US have 200 million inhabitants?

>> No.11908388

>>11908384
>i went through highschool, top 3 school in the city, no calculus.
does that city happen to be Little Rock or Birmingham or Jackson?

>> No.11908389

>>11908384
>City education
>Real education
Pick one

>> No.11908391

>>11908387
Wherever you're from, in the first world calculus is a high school subject.

>> No.11908396

>>11908391
Then what do you learn about math in college?

>> No.11908407

>>11908396
math

>> No.11908408

>>11908396
First year is usually things like group theory, analysis, topology and infinitesimal

>> No.11908418

>>11908338

Statistics isn't math.

>> No.11908451

>>11908347
>It's just that it can't get through my head on how you can have only 2% of error margin while only asking 0,001% of the population
because the chance that the 2000 people you picked all conspire to be biased is really low
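
The claim can be checked with a quick simulation. This is only a sketch under stated assumptions: the true share (50%), the number of repeated polls (1000), and the function name `simulate_polls` are made up for illustration, and each respondent is drawn independently, which approximates a simple random sample from a very large population.

```python
import math
import random

def simulate_polls(true_share=0.5, sample_size=2000, num_polls=1000, seed=0):
    """Run many hypothetical polls and report how often each lands within the margin of error."""
    rng = random.Random(seed)
    # 95% margin of error for a proportion from a simple random sample
    margin = 1.96 * math.sqrt(true_share * (1 - true_share) / sample_size)
    hits = 0
    for _ in range(num_polls):
        # Each respondent independently supports the candidate with probability true_share
        supporters = sum(rng.random() < true_share for _ in range(sample_size))
        estimate = supporters / sample_size
        if abs(estimate - true_share) <= margin:
            hits += 1
    return margin, hits / num_polls

margin, coverage = simulate_polls()
print(f"margin of error: {margin:.4f}")           # about 0.0219, i.e. roughly 2 points
print(f"share of polls within the margin: {coverage:.3f}")
```

The population size never appears in the code. What drives the margin is the sample size, as long as the sample is random.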

>> No.11908456

>>11908408
>First year is usually things like group theory, analysis, topology and infinitesimal
haha no

>> No.11908460

>>11908451
if you assume it is a truly random sample. since OP is talking about political polls, at least in 2016 they were heavily biased in favor of people who owned land lines who were mostly rich boomers
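
A minimal sketch of why a skewed sampling frame matters more than sample size. Every number here is hypothetical (30% landline owners, 60% vs 40% support, landline owners four times as likely to be reached), and `poll` is an illustrative helper, not anything a real pollster uses.

```python
import random

def poll(sample_size, landline_weight, seed=0):
    """Estimate support when landline owners are over-represented by landline_weight."""
    rng = random.Random(seed)
    # Hypothetical population: 30% own landlines and support the candidate at 60%,
    # the other 70% support the candidate at 40%. True support = 0.3*0.6 + 0.7*0.4 = 0.46.
    landline_share, landline_support, other_support = 0.3, 0.6, 0.4
    # Probability that a sampled respondent is a landline owner under the biased frame
    w = landline_share * landline_weight
    p_landline = w / (w + (1 - landline_share))
    supporters = 0
    for _ in range(sample_size):
        support = landline_support if rng.random() < p_landline else other_support
        supporters += rng.random() < support
    return supporters / sample_size

TRUE_SUPPORT = 0.3 * 0.6 + 0.7 * 0.4
for n in (2_000, 20_000, 200_000):
    # Columns: sample size, unbiased estimate, estimate with landlines oversampled 4x
    print(n, round(poll(n, landline_weight=1.0), 3), round(poll(n, landline_weight=4.0), 3))
```

With an unbiased frame the estimate settles near 0.46 as the sample grows. With the biased frame it settles near 0.53 instead, and asking more people does not fix it.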

>> No.11908463

>>11908451
It's because they're not biased that they will represent reality.
It's only 0,001% of the population.
Besides
>the chance [...] is really low
You're using statistics to prove statistics

>> No.11908484

>>11908378
Took calc in HS, not all of us are smooth brained

>> No.11908519

>>11908484
you are if you think the basic mechanical rules you learn in hs counts as calc

>> No.11908520

>>11908463
>You're using statistics to prove statistics
Not him, but correct.
This is how people use statistics for their own motives.

Statistics can only tell you a coin flip has 50/50 odds. It can't tell you what the outcome will be.

With even odds like 49/51 you can finagle wording to suggest that the outcome will be the 51 option. Then claim that statistics have proven that the 51 option is going to happen.

>> No.11908538

>>11908323
I think it's more of a "crap in, crap out" problem rather than the science/math itself. The equations are sound, but which to use, when, how and why are often muddled. That and I think most people are shit at picking sample populations.
Who knows, I could just have a shit sample population myself in which I base my statistic theories..

>> No.11908564

>>11908323
Seems like the problem is the sources you've consulted not explaining the concepts properly. If you are really curious about this you can download All of Statistics by Wasserman from https://link.springer.com/book/10.1007%2F978-0-387-21736-9 for free.

Also, people misusing/misrepresenting statistics does not invalidate statistical methods, only their conclusions.

>> No.11908567

>>11908520
It's redundant, can't serve as a proof.
If "statistics" is correct, yes it can be proven with statistics, as it is correct.
If "statistics" is incorrect, you can't prove it is correct with statistics, as it is incorrect.

But that's the problem, here we have pretty much a tautology, how do you prove statistics is correct without using statistics?

>> No.11908570

>>11908567
>how do you prove statistics is correct without using statistics?
It doesn't matter.
None of that matters because people are dumb and just believe whatever is published.

>> No.11908572

Unbelievably good bait, OP. Honestly impressed

>> No.11908573

>>11908323

Statistics is an incredibly powerful tool. Like any powerful tool, it will be abused by foolish people hungry for control. You are right to doubt the statistics of opinion polls, for many reasons.

I would also say the vast majority of people do NOT understand statistics, certainly not journalists, but they will absolutely exploit the ignorance of people in order to spin a narrative.

>> No.11908579
File: 102 KB, 662x350, ClintonElectionFraud.png

>>11908343
>however, this criticism is a classic MAGAtard thing to repeat.

What makes you say that? The most rigged polls were the Democratic Primary against Bernie (Check the Michigan Primary).

>> No.11908583

>>11908567
>here we have pretty much a tautology

That's what mathematics is. Related, relevant tautologies.

>> No.11908594
File: 77 KB, 847x349, garbage-in-garbage-out.jpg

>>11908323
stats are only as good as the data that goes in. garbage in garbage out. doesn't invalidate the field.

>> No.11908616

>>11908583
What i meant to say is: How can we know if something in statistics is valid and not just a gibberish formula?
>well, you compare it to reality
And how do we analyze reality
>use statistics
See? Circular logic

>>11908573
I still cannot comprehend on how you can have only 2% of error while interviewing only 0.001% of the population
>oh, it's based on a mathematical formula
OK, and that formula is based on what?
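
For reference, a sketch of where the usual formula comes from, assuming a simple random sample of n people from a population of N, a true share p, and the normal approximation. The symbols X_i and \hat{p} are introduced here for illustration.

```latex
% X_i = 1 if respondent i supports the candidate, 0 otherwise
\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\mathbb{E}[\hat{p}] = p,
\qquad
\operatorname{Var}(\hat{p}) = \frac{p(1-p)}{n}\cdot\frac{N-n}{N-1}

% For N = 2\times 10^{8} and n = 2000 the correction factor (N-n)/(N-1) is about 0.99999,
% so the population size effectively drops out.

% By the central limit theorem \hat{p} is approximately normal, so with probability about 0.95
|\hat{p} - p| \le 1.96\sqrt{\frac{p(1-p)}{n}}
\le 1.96\sqrt{\frac{0.25}{2000}} \approx 0.022
```

The last inequality uses p(1-p) ≤ 1/4, which is why pollsters can quote "about 2 points" for 2000 respondents without knowing p in advance.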

>> No.11908633

>>11908616
Samples are similar to populations, thats the fucking point
you dont need to get half of a fucking population in the doors to tell what theyre like
This is the entire fucking point of RANDOMNESS
a random sample is evenly spread throughout the population
So it behaves the same as the population, but its smaller

>> No.11908643
File: 40 KB, 331x132, 1541083825712.png

>>11908633
>a random sample is evenly spread throughout the population
>So it behaves the same as the population, but its smaller

>> No.11908649

>>11908643
>statistics is bullshit
>dude, trust me

>> No.11908648

>>11908643
jesus fucking christ your retarded

>> No.11908654

>>11908643
it is not evenly spread
and the probability that it will be biased is exactly what this error OP keeps sperging out about is measuring

>> No.11908656

>>11908648
Because how do you expect me to believe that i can, within 2% of error, know how an entire population will behave by only analyzing 0.001% of it when you only claim it, but don't prove it or bring any backup for it?

>> No.11908657

>>11908633
>you dont need to get half of a fucking population in the doors to tell what theyre like
>because I say so

>a random sample is evenly spread throughout the population
>because I say so

>> No.11908661

>>11908657
See? This guy gets it
>>11908654
And that's the problem, how did i get this formula?

>> No.11908669

>>11908643
>>11908657
what the fuck do you think RANDOM means then

>> No.11908674

>>11908669
How does being random guarantee that 0.001% of the population represents 100% of it?

>> No.11908705

>>11908616
>See? Circular logic

How do we know physics is valid and not just gibberish formulae?

>Compare it to reality

And how do we analyze reality?

>Use Physics

That argument still applies. Ultimately there are different parts of statistics. You can test to see if your 2,000 person sample was decent statistics by conducting a 20,000 person sample and seeing if they are comparable.

Most pollsters don't like doing that.

>OK, and that formula is based on what?

Well, as another poster states, you make some assumptions (like perfect random sampling, which is basically magic) and you can find a tautology.

Of course, as these people have their heads up in the clouds, they don't consider the fact they are not sampling randomly, nor do they care. They are paid to provide a desired result.
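
The 2,000 versus 20,000 comparison suggested above can be sketched as follows, assuming both polls are simple random samples. The helper names `margin_of_error` and `polls_agree` are hypothetical.

```python
import math

def margin_of_error(sample_size, share=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(share * (1 - share) / sample_size)

def polls_agree(share_a, n_a, share_b, n_b, z=1.96):
    """Check whether two poll results differ by no more than sampling noise would explain."""
    # Standard error of the difference between two independent sample proportions
    se_diff = math.sqrt(share_a * (1 - share_a) / n_a + share_b * (1 - share_b) / n_b)
    return abs(share_a - share_b) <= z * se_diff

small = margin_of_error(2_000)    # about 0.022
large = margin_of_error(20_000)   # about 0.007

print(polls_agree(0.25, 2_000, 0.26, 20_000))  # one point apart: within noise
print(polls_agree(0.25, 2_000, 0.30, 20_000))  # five points apart: not explained by noise
```

Ten times the respondents only shrinks the margin by a factor of about 3.2 (the square root of 10), which is one reason the larger poll is rarely run.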

>> No.11908720

>>11908705
Because we can analyze reality through our senses, that's how we came up with physics, not the opposite.
>you make some assumptions (like perfect random sampling, which is basically magic)
So you make fake assumptions and somehow the results reflect reality?

>> No.11908904

>>11908720
>So you make fake assumptions and somehow the results reflect reality?

This actually happens in physics, too. See: spherical cows.

>> No.11908913

>>11908904
>A spherical cow is a humorous metaphor for highly simplified scientific models of complex real life phenomena.[2][3] The implication is that theoretical physicists will often reduce a problem to the simplest form they can imagine in order to make calculations more feasible, even though such simplification may hinder the model's application to reality.

I mean, it kind of admits it hinders its application to reality.

>> No.11908923

>>11908913

Right, and any honest statistician understands how much of a hindrance assuming 'perfectly random' really is. That's why good ones will put a lot of work into getting a representative group. They can use accepted statistics like the Census (supposedly sampling anyone they can) to base their models on.

The statisticians you are mainly concerned about are the ones that predicted Hillary had a 98% chance of winning because they aren't actually mathematicians but ideologues.

>> No.11908941

>>11908923
I don't live in the US.
I'm talking about the Brazilian "DataFolha", which always uses 2000 people.

>> No.11909097

>>11908941

I apologize for my US-centric view. We talk in terms of what we know.

But I hope you still got some of what I meant about useful statistics versus pop-journalism statistics.

>> No.11909371
File: 42 KB, 314x499, 8E4CB646-AC86-490C-9087-A6EDA2E9A109.jpg

Not the best book for statistics but seemed good enough for me.

The biggest problem I have with statistics is that it admits to a faulty system where they can never truly collect a very accurate set of data, especially when corporate lobbyists have a large influence on the overall message that the study intends to find. My recommendation is to study the interpretive fallacies of statistics (e.g. one in 100 vs 1% bias) and you can filter out a large portion of fake studies (ex: Thinking, Fast and Slow)

>> No.11909412

>>11908643
Taking 2000 simultaneous extractions out of a jar with 2'000'000 balls will give you a rough estimate of the ratio of balls in the entire jar, yes.
That's because a single extraction follows the same ratio as the entire jar. If there's 1% of yellow balls in the jar, you'd expect around 20 yellow balls in your extraction, +/- the uncertainty.
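
The jar example can be worked out directly. This sketch assumes the draw is uniformly random without replacement, so the yellow count follows a hypergeometric distribution. The constant names and the single simulated draw are illustrative only.

```python
import math
import random

JAR_SIZE = 2_000_000
YELLOW = 20_000        # 1% of the jar
DRAWS = 2_000

# Expected yellow count and its standard deviation (hypergeometric distribution)
share = YELLOW / JAR_SIZE
expected = DRAWS * share
std_dev = math.sqrt(DRAWS * share * (1 - share) * (JAR_SIZE - DRAWS) / (JAR_SIZE - 1))

# One simulated extraction: balls numbered below YELLOW count as yellow
rng = random.Random(0)
drawn = rng.sample(range(JAR_SIZE), DRAWS)
yellow_drawn = sum(ball < YELLOW for ball in drawn)

print(f"expected {expected:.0f} +/- {std_dev:.2f}, drew {yellow_drawn}")
```

The expected count is 20 with a standard deviation of about 4.4, so most extractions land somewhere around 11 to 29 yellow balls.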

>> No.11909437

>>11908323
>Are statistics bullshit?
What a poorly formulated question.

>> No.11909442

>>11908323
>pedagogically explaining
that sounds like a terrible direction.
>facebook group
ah, that's why. mate go on fucking youtube and type in the question and watch any video that has like a million views and mostly thumbs up.

>> No.11909451

The science (or mathematics) of statistics is not bullshit. However its applications in the real world are often bullshit because they rely on false assumptions. Many studies ignore various important confounding factors, and make baseless and usually implicit assumptions about the distribution they're sampling. These problems are in many cases insoluble and people simply ignore them, but they're still there. Combined with the difficulty of evaluating the statistical methods of others, these problems render a lot of studies quite meaningless. This is compounded by various degrees of dishonesty and outright lying.

>> No.11910086

>>11909371
this post reads like nonsense

>> No.11910100
File: 11 KB, 226x224, quant1.jpg

>>11908323
You don't understand statistics because you don't understand mathematical probability. Basically you're too retarded to think critically, upper-year math will rip you to pieces.

>> No.11910119

>>11908323
You need to stfu and work through undergrad prob and stats handbook. DeGroot Morris is good. Thats it, either you accept formulas given to you or you put the work in yourself. You are supposed to learn a lot by yourself and use classes as a guide.

>> No.11910152

>>11910100

This. Statistical sampling is pretty much applied probability.

>> No.11910158

>>11908656
Pick up a proper mathematical statistics book and work through it then. We're not going to fucking handfeed it to you. If that's what you want, you may as well accept what you're told. You have to start with the axioms of probability and understand why they're the correct axioms (basically, you're dealing with probability measures). Then you want to go and understand computing moments and covariance and the like. Understand all of it. Then, you're ready to start understanding distributions, and the analysis therein, before finally moving on to these actually very complicated questions you're asking. What you took was a bullshit level science major stats course. You don't learn anything in those courses except how to do monkey work.

>> No.11911297

>>11908323
mathematically, no. the problem is when retards (i.e. 99% of those who make statistics) make them. Still the worst bunch of maths and probably the cause for scientific stagnation

>> No.11911300

>>11908633
too bad no polls at all get actual random samples. that would only be true if all the selected were forced to answer

>> No.11911990
File: 1.52 MB, 1812x1242, 65799431-5A06-4308-9310-9EB0C7917D9C.jpg

There is a problem with mainstream statistics, but it's more subtle than most people have a grasp of: Frequentist vs Bayesian methods.

Mainstream stats is frequentist, which means that statements like >>11908643 are actually axioms rather than theorems. It is taken as a given that an infinite sized population will have the same composition as indicated by the distribution. All other concepts are made with reference to this. So if a frequentist statistician says "The margin of error is 1%", they're really saying "If we repeated this experiment an infinite number of times, 1 out of every 100 of these countably infinite samples would give us a different answer." But since it's impossible to actually do anything an infinite number of times, strictly speaking the application of this statement to any concrete example would be unfalsifiable.

Bayesian statistics takes a different approach. Probabilities are treated as subjective or objective levels of confidence. It extends classical logic to fractional truth values. Then each equation of statistics is actually a way of combining sources of partial information in a logically coherent way. Arguably, this is a much better framework to start from. The population of the US will only vote one way in 2020, but the "randomness" is actually an expression of your ignorance about precisely how they will vote.
In Bayesian statistics, every statement of probability rests on a layer cake of prior information. When you have none, you go with the model that makes the least assumptions. For example, if all you know are the names on the ballot, the Bayesian guess would be they each are most likely to get an even share of the vote, since that encodes the maximum amount of uncertainty possible.
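
A minimal sketch of the Bayesian version for a two-candidate race, assuming a uniform Beta(1, 1) prior on one candidate's share and a hypothetical poll in which 500 of 2000 respondents back them. The function name and the poll numbers are made up for illustration.

```python
import random

def posterior_summary(supporters, sample_size, num_draws=20_000, seed=0):
    """Posterior mean and 95% credible interval for a share, starting from a uniform prior."""
    # Uniform prior Beta(1, 1) updated with the poll gives Beta(1 + k, 1 + n - k)
    alpha = 1 + supporters
    beta = 1 + sample_size - supporters
    mean = alpha / (alpha + beta)
    # Approximate the interval by sampling from the posterior
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(alpha, beta) for _ in range(num_draws))
    lower = draws[int(0.025 * num_draws)]
    upper = draws[int(0.975 * num_draws)]
    return mean, lower, upper

prior_mean, _, _ = posterior_summary(0, 0)         # no data: even split, 0.5
mean, lower, upper = posterior_summary(500, 2000)  # 500 of 2000 respondents in favour
print(f"prior mean {prior_mean}, posterior mean {mean:.4f}, interval ({lower:.3f}, {upper:.3f})")
```

With no data the guess is an even split. After the poll the posterior concentrates near 25%, and with this much data the interval comes out close to the frequentist one.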

>> No.11912449

>>11908451
What about the pollsters themselves being biased and/or falsifying data just like the Chinese govt does?

>> No.11912510

amateur statisticians in science and finance caused the replication and financial crises respectively

>> No.11912533

>>11911990
> frequentist statistician says "The margin of error is 1%", they're really saying "If we repeated this experiment an infinite number of times, 1 out of every 100 of these countably infinite samples would give us a different answer."

You’re on the right track, but what OP is talking about is usually polls that give us percentages per candidate (25% of the votes for some politician, let’s say), then they say it's 2% max error and they provide the confidence level but hide it in a corner somewhere, which is usually 95% because 99% would make their margin of error higher.

Above numbers translate to:
>If we repeated the experiment infinite number of times, the range 23-27% would contain the true politician popularity 95% of the time
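
The numbers above can be reproduced with the normal approximation. This is a sketch assuming a simple random sample of 2000 (the sample size from the opening post) and the standard z-values for 95% and 99% confidence. `confidence_interval` is an illustrative helper.

```python
import math

def confidence_interval(share, sample_size, z):
    """Normal-approximation confidence interval for a polled proportion."""
    margin = z * math.sqrt(share * (1 - share) / sample_size)
    return share - margin, share + margin

# z-values for two-sided 95% and 99% confidence levels
Z_95, Z_99 = 1.96, 2.576

low_95, high_95 = confidence_interval(0.25, 2000, Z_95)  # roughly 0.231 to 0.269
low_99, high_99 = confidence_interval(0.25, 2000, Z_99)  # roughly 0.225 to 0.275
print(f"95%: {low_95:.3f} to {high_95:.3f}")
print(f"99%: {low_99:.3f} to {high_99:.3f}")
```

Moving from 95% to 99% confidence widens the interval from about ±1.9 points to about ±2.5 points.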

>> No.11912983

OP look up analysis of co-variance ANCOVA and ANOVA tables

>> No.11913006

>>11908323
You are supposed to have taken at least an intermediate probability course before taking statistics. Those courses teach you about the distribution of certain common transformations applied to common random variables.

The formulas you talk about are derived by knowing the distribution of a certain transformation so that you can solve an inequality in terms of your observations and quantiles

>> No.11913010

>>11913006
Also, knowing how to get a sample as representative of a population as possible is not a subject of inferential statistics, look for that in another course

>> No.11914493

>>11908456
Actually yes. I come from a third world shithole where the top university throws all of those at you at your first year, which is overkill since we do not even see Calculus in high school.

>> No.11914501

>>11908378
Lol where do you live moron ? Here we learn calculus in high school.

>> No.11914556

Yeah statistics is bullshit. Psychohistory master race.

>> No.11914681

>>11908337
So underrated

>> No.11914832
File: 48 KB, 712x960, 75241192_167918761408071_4834424501929851621_n.jpg

>>11908378
I finished my entire Calculus sequence in High School. All I had to take in University was DiffEQs and LinAlg.

>> No.11915809

is sigma supposed to mean sum or something else?
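
Assuming the question refers to the z = (x - μ)/σ formula from the opening post: lowercase σ is the standard deviation, while uppercase Σ is the summation sign. A small sketch with made-up height and weight data shows both, and also why measurement units do not matter for the correlation coefficient.

```python
import math

def z_scores(values):
    """Standardize values: subtract the mean (mu), divide by the standard deviation (sigma)."""
    n = len(values)
    mu = sum(values) / n                                       # uppercase Sigma: a sum
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)  # lowercase sigma: spread
    return [(x - mu) / sigma for x in values]

def pearson_r(xs, ys):
    """Pearson correlation as the average product of paired z-scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)

# Hypothetical data, for illustration only
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weights_kg = [52.0, 58.0, 69.0, 77.0, 90.0]
heights_in = [h / 2.54 for h in heights_cm]

r_cm = pearson_r(heights_cm, weights_kg)
r_in = pearson_r(heights_in, weights_kg)
print(f"r with cm: {r_cm:.4f}, r with inches: {r_in:.4f}")
```

Dividing by σ cancels the unit, so z-scores are plain numbers and the correlation is the same whether heights are in centimetres or inches.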

>> No.11915925

>>11908355
>scientific method
>prove things
Major yikes.