[ 3 / biz / cgl / ck / diy / fa / ic / jp / lit / sci / vr / vt ] [ index / top / reports ] [ become a patron ] [ status ]
2023-11: Warosu is now out of extended maintenance.

/sci/ - Science & Math


View post   

File: 828 KB, 1114x1214, 1559589237126.png [View same] [iqdb] [saucenao] [google]
10979527 No.10979527 [Reply] [Original]

This somewhat simple exercise is up on /g/ right now. It seems to present some difficulty for the average /g/ user. I have slightly rewritten it and present it for your amusement, you should certainly be able to solve this.

suppose you have a number generator that has the discrete uniform distribution on {0,1}

1. describe an algorithm for a number generator with the discrete uniform distribution on {0,1,2,3,4,5,6}

2. repeat (1) but your algorithm must terminate, or prove no such algorithm exists

>> No.10979531

just roll the first one three times. and of course it will terminate there's only like three steps. what the fuck does that last part even mean?

>> No.10979536

Step N, for 0 <= N <= 6. Output N
Step 7. Goto step 0.

If you wanted a random distribution, you should have said so.

>> No.10979553

>>10979531
Roll the first one three times and then...? You're going to have to do something with those three numbers.
>>10979536
Wrong, it seems like you don't understand what a discrete uniform distribution is. I've never heard of this so called "random distribution", why don't you define it for me.

>> No.10979565

>>10979553
>don't understand what a discrete uniform distribution

Are you saying it's not discrete, not uniform, or not a distribution?

>> No.10979566

>>10979527
Flip the coin 3 times (8 possibilities) and output the corresponding number from binary. If 6 or 7, repeat.

>> No.10979567

>>10979566
but that doesn't terminate

>> No.10979571

Assign to each value 0 through six a unique set of three binary values, such as

0 = (0,0,0)
1 = (0,0,1)
2 = (0,1,0), etc.

Generate three values from the first distribution, order them as generated, and assign to this list one of the seven initial values.

In the case of getting the unassigned set, such as (1,1,1) in this case, either reiterate the process (not sure if this would strictly be considered a terminating process), or return an error value (shitty option).

I'm sure it could be done better, but fuck it.

>> No.10979580

>>10979565
of course its all three. I suspect the problem is you don't know what a distribution is. you should crack open a statistics textbook. stop replying to my posts and acting like you know what you're talking about, we both know you don't.
>>10979566
>>10979571
nicely done this solves 1. indeed they do not terminate though. how about 2?

>> No.10979585

>>10979580
is this basically proving there is no bijection between {0,1)^N and {0, 1, 2, 3, 4, 5, 6}

>> No.10979590

>>10979527
>do my cryptography 101 homework for me
it's trivial, the solution has been left as an exercise for the faggot OP

>> No.10979594

>>10979527
Any combination of results from the {0,1} generator will have a x/2^y chance of occurring where x and y are positive integers.

1/7 = x/2^y
2^y = 7x

No solution.

>> No.10979599

>>10979585
haha yes well done this is starting to get to the heart of the exercise. however the question is not whether there is a bijection between them.

>> No.10979600

>>10979599
See >>10979594

>> No.10979607

>>10979594
haven't posted in this thread and have 0 training in combinatorics and discrete mathematics, but can you show me the theory or background to this? genuinely curious.

>> No.10979615

>>10979600
>>10979594
Nice! here is mine. We were looking for a function from {0,1}^N to {0, 1, 2, 3, 4, 5, 6}, where every element in the codomain had the same number of elements in the domain mapping to it. however as pointed out 7 does not divide 2^N. so there is no such function.
>>10979607
consider my functional description of the solution. can you understand it?

>> No.10979620

>>10979615
he's asking for a proof that 7x=2^y doesn't have a solution?

>> No.10979624

>>10979607
I'll try to explain in some terms. Since the algorithm must terminate we can't have results with probabilities that take the limit as the number of generators goes to infinity. So we can only combine a finite number of generator results. You can either combine results in sequence, meaning you multiply probabilities, or together, meaning you sum probabilities. Any multiplication and summing of 1/2 results in x/2^y, which can never give us the 1/7 chance we need.

>> No.10979632

>>10979620
LOL I doubt anyone would fail to understand that.

>> No.10979645

>>10979620
indeed it does have solutions but none of them are integers.

>> No.10979648

>>10979615
Yep!

I can see how your explanation works to dismiss that kind of algorithm since it can map to 7, but how do you know if there's no function of that family that maps to 7, there's no other?

Intuitively it makes sense. I'm a bit tired and incoherent at the moment!

>> No.10979658

>>10979624
That does make sense, thanks.

>> No.10979706

>>10979648
consider what things we can input into our function. we can use the first number generator any number of times to get an input. So the input is will be a string of 0's and 1's. for example 1011. basically its a binary number. and we can make it any length we want. now you should consider how many different binary numbers there are of a given length n, and convince yourself there are 2^n, and that each one is equally likely to be produced by the given generator. call that probability p.

now convince yourself if we map x of these binary numbers to 5, the probability we get 5 is x*p. thus we need to map the same number of binary numbers to each element in {0,1,2,3,4,5,6} so that they will each have the same probability. however there are 2^n binary numbers, we cannot divide that evenly up into 7 parts! that is to say for there is no n and q so that 7*q=(2^n), if there were we could divide take the binary numbers of length n and divide them into 7 groups of q and map each group to one value in {0,1,2,3,4,5,6} and we would be done. You say it like this "for all n 7 does not divide 2^n", and you would write this [math] \forall n, 7\nmid 2^n[/math]. actually this fact is not completely trivial.

>> No.10979709

>>10979527
Roll 3 times and skip 7.

>> No.10979718

>>10979709
And 2? No looking through the thread.

>> No.10979723

>>10979527
I think all you need to do is show that 7 doesn't divide into 2^n for all n.

>> No.10979742

>>10979723
yes, well done. however the other solutions posted in the thread have actually skipped this part, its not trivial though! can you do it?

>> No.10979784

>>10979632
multiple people have stopped at this part, perhaps they knew how, or maybe they just thought it was intuitive. However the proof has not been produced in this thread. it is indeed trivial if you are equipped with the right theorem, do you know which one? i invite you to produce a proof.

>> No.10979811

>>10979709
that's not uniform

>> No.10979813
File: 995 KB, 490x275, 1557818081773.gif [View same] [iqdb] [saucenao] [google]
10979813

>>10979742
>>10979784
well I have to go to sleep so i'll post the solution. It turns out because 7 is prime, 7 divides a*b => 7 divides a or 7 divides b. this is called euclids lemma, which i invite you to look up the proof of, its not a difficult proof. and so by using induction you can see that if 7 divides 2^n then 7 divides 2 which is a contradiction.
>>10979811
he probably means he would interpret the three rolls as binary

>> No.10980658

>>10979813
>if 7 divides 2^n then 7 divides 2
not intuitive at all

>> No.10980723

>>10979784
It's easier than you're making it. 2^n is a prime decomposition. Since equal numbers have the same prime decomposition and 7 is prime but missing from the prime decomposition, the two sides cannot be the same number.

>> No.10980871

>>10980723
thank you.

>> No.10980982

>>10980658
well it is directly implied by euclid's lemma. its about as simple as proofs by induction get. if you are not familiar with them i encourage you to spend a few minutes studying them as anyone with at least a passing interesting in math should know how to do them. spend an hour or two looking at simple proofs by induction then try to prove:
if a divides b*c implies a divides b or a divides c, then a divides b_1*b_b*...*b_k implies a divides b_i for some 0<i<k+1.
>>10980723
>>10980871
yes well done of course this works as well because the prime factorization of 7*x is 7*(prime factorization of x) and is clearly not equal to 2^n. the existence and uniqueness of prime factorization(fundamental theorem of arithmetic) is indeed a very famous result! perhaps for this reason you feel it is very simple and intuitive to use. HOWEVER it is actually a more complicated theorem then euclid's lemma, it will certainly be easier to see exactly why euclid's lemma is true than why the fundamental theorem of arithmetic is true. in fact euclid's lemma is generally the key point in proving the fundamental theorem of arithmetic, if you understand why the prime factorization exists and is unique you will almost certainly understanding why euclid's lemma is true, but the reverse is definitely not the case!

>> No.10981020

>>10979527
Do >>10979566 but introduce an arbitrarily large recursion/iteration limit where at the last iteration you return a random 0 or 1
As n->infinity the distribution becomes truly uniform, so good enough(tm)

>> No.10981034

>>10979527
sure, measure CPU voltage and output the last digit mod 7.

>> No.10981047

>>10981034
sorry a cpu isn't given ;)
>>10981020
ok! but as long as you put a limit on the recursion you wont quite get the discrete uniform distribution. is there some clever way that you can actually get it? actually, there isn't, proving this is the point of (2).

>> No.10981090
File: 6 KB, 250x205, 1566195059346.jpg [View same] [iqdb] [saucenao] [google]
10981090

>>10981034
Now show that its uniform
>protip: you can't

>> No.10981105

>>10981047
What is given then? ;)

>> No.10981129

>>10981105
a number generator that has the discrete uniform distribution on {0,1}, of course you can feel free to use whatever basic facts you know about sets, arithmetic, functions, and so on. you aren't expected to know or stick to a particular set of axioms. i may also have to lock you in a dark room so you aren't led astray by trying to use physical processes.

>> No.10981161

Easy. Using the {0,1} distribution we can generate an arbitrarily long string of bits. This bit string describes an arbitrarily good approximation of a random real number in the interval [0,1]. Using this approximation of a uniform continuous distribution, we can generate the required seven discrete values by dividing the interval [0,1] into seven subintervals and observing into which of those the generated random number ends up in. This algorithm always terminates in constant time and it gives an arbitrarily good approximation of a discrete uniform distribution of seven values.

>> No.10981195

>>10981161
>by dividing the interval [0,1] into seven subintervals
Into seven equally sized intervals, I meant to say.

>> No.10981202

>>10981161
there's nothing wrong with approximations but this problem doesn't make for very interesting practice in making approximations does it it? that's why you are asked for the discrete uniform distribution and not an approximation of it. i encourage you to try to solve the exercise as stated.

>> No.10981426

>>10979527
uh, literally just read out ceil(log2(M choices)) = N bits from the given number generator and modulo by M

>> No.10981467

>>10981426
nope, that's skewed

>> No.10981494

>>10979566
This was the /g/ answer.

>> No.10981514

>>10981426
instead, forgo the binary generator. given an ceil(log2(M choices)) = N bits register
0) clear register to zero
1) output binary number, interpret as index into set
2) increment
3) modulo by M
4) repeat from 1

this both has a uniform distribution (although it is hardly random), and is an algorithm in that it stops

>> No.10981859

The absolute state of /sci/, even /g/ did better than this, apparently no one here can even read.

>> No.10981862

>>10981859
That's because we're all high spacial, low verbal autists, except for all the schizo posters.

>> No.10981873

>>10981859
yeah i tried to make the problem more clear but perhaps no matter what most people would rather make up their own easier problems to answer instead.

>> No.10982277

>>10981202
I thought it was pretty neat and I wanted to do something different than the "well just re-roll if it's no good" -approach. As the bit string length tends to infinity, the limit will be the desired uniform distribution.

My hunch is that there's no constant time algorithm for the accurate distribution following from the fact that the binary representation of 1/7 (and its multiples) is infinitely long. That's why you have to have an exact algorithm with unbounded amount of steps or an approximate algorithm with exact amount of steps.

>> No.10982423

>>10982277
in all your posts i feel you are describing your algorithms interaction with the limit too generously or something. either your n is finite and the distribution is wrong or its not finite and it certainly never terminates. the only thing the limit lets you do is make your approximation as good as you want, you can immediately stumble upon solutions with this same behavior but are more elegant. just let the binary number get really big then mod 7 should do this.

I haven't thought to hard about this connection you are trying to make but i don't think its quite right. you mean the binary representation of the right hand side of the decimal in 1/7 is not finite causes the problem? i dont think so. try to do the same problem but now you want the discrete uniform on {0,1,2,3,4}. note 1/5=.2 is finite, can you solve it now?

>> No.10982431

>>10982423
>>10982277
also given the nature of your solution just saying the binary number describes an arbitrarily good approximation of a real number is very sloopy, what is this description? you should actually say how you are mapping the binary number to [0,1]. if there is a canonical way of doing this that just always comes up again and again and everyone knows about it i am not aware of it but if there is you can ignore this.

>> No.10982450

>>10982431
It's the infinte sum of b_i * 2^(-i) where i goes from 1 to infinity and b_i is the i'th binary number from the string.

>> No.10982468

>>10982423
1/5 is also infinite in binary, I think. If it's not, could you write it down?

The problem with non-terminating binary numbers is that you can't really decide whether two different numbers are equal or not in finite time. You can just say that they are as close to each other as you want.

My inspiration came from the analysis of computable real numbers, which are very nice from an algorithmic point of view.

Sorry for not answering the original question. I thought that it'd be interesting to loosen some constrains and to see what other solutions there might be.

>> No.10982984

>>10982468
ok i see that function does seem to work fine and 1/5 and 1/7 are infinite. because 1/y=x/2^n implies 2^n=xy but 5 and 7 do not divide 2^n, i was thinking of some other function. but then is your hunch correct? after all there are other functions that do map finite binary numbers to 1/7 yet they will all also not work. the issue is that you are getting 2^n different real numbers in [0,1] but they can not be divided evenly into 7 intervals so each interval will not have the same probability and so the distribution is not uniform. of course they can not be divided into evenly into 7 intervals because 7 does not divide 2^n. so indeed there is a connection, as your infinite representation also arises from the fact that 7 does not divide 2^n. actually 1/y infinite iff y does not divide 2, so you can go the other way. but the key point does not seem to be that the representation is infinite, but that 7 does not divide 2^n, though this may have been exactly what you were thinking of.

>> No.10983088

>>10982468
also i'll mention a few things about the question. its fine to try to come up with some other solutions, however if you stop there you won't get much out of the exercise. your solution indeed doesn't seem to have notably different behavior from many simpler methods like taking a larger binary number then doing mod 7 that i will call the naive approach. consider the domain and codomain, f:{0,1}^n ->{0,1,2,3,4,5,6}, all we are doing is taking 2^n values and dividing them up. Convince yourself that the only thing that matters for our application is how many values in the domain get mapped to each value in the codomain. Then we can actually completely ignore which values actually get mapped to which all that matters is trying to map an equal number of values to each element in the codomain. most naive approaches to dividing up 2^n values should extend naturally with a larger n, so every naive approach including yours is essentially equivalent and display the behavior you talked about as n gets large, you just picked a particularly convoluted one.

its actually not simple at all to make the jump that we should "try again" if for certain values in the domain, why would we come up with this? the reason it presents itself somewhat naturally is because many people will quickly think of taking n=4 and converting binary to decimal, then everything will almost work out perfectly except if we get 1111, there are only 7 values so we have to map it to something that already has one, ruining the distribution. from here its not that difficult at all to say ok lets just try again if we get 1111 until we get something we like better.

>> No.10983089

>>10983088
this is notable exactly because it has different behavior to the naive approaches. we actually get to have the uniform distribution but the problem is we don't know for sure if it will ever return a result. with this class of solutions as n goes to infinity the probability the function returns after 1 call goes to 100%. for example there are 64 binary strings of length 5 and we can divide 63 of them evenly among the 7 values then try again only if we get the one we leave out.

the purpose of part (2) of the problem is to force you to not just regurgitate or stumble upon the second class of solutions but to understand or at least think about how it doesn't matter which type of function you use, all you are doing is dividing 2^n into 7 parts and because 7 does not divide 2^n it will never work. and to have a better understanding of the "try again" approach.

>> No.10985043

>>10982984
>>10983088
>>10983089
Thanks for the comment. You've given this problem a lot of thought. Yes, now that I think about it, it is not at all trivial to come up with some "try again" method. Was there a proof on the non-existence of the algorithm presented here in this thread or on /g/?

If I modify my approach so it'll be a "try again" approach, then I would say that it needs fewer random bits than the naive "try again". Let me explain: Instead of deciding beforehand how many bits we generate to our string, we observe the bit string and see if it is in one of the seven bins for certain. This can be done by comparing the bit string to the binary expansions of 1/7, 2/7 and so forth. If the generated bit string is in one of the intervals, then we stop. If not, we generate another random bit and compare the bit string again.

Now if we need to "try again", then we don't need to generate a whole new bit string, but we can just append to the existing one.

>> No.10985068
File: 92 KB, 504x500, 1559212987305.jpg [View same] [iqdb] [saucenao] [google]
10985068

>>10985043
ARE YOU FUCKING RETARDED?! OF COURSE THERE FUCKING IS. HOLY SHIT.

FUCK. ME. DEAD.

1 = IFF INTEGER THEN BASE_CASE ELSE STORE
2 = IFF N THEN MAP XOR FILTER
3 = IFF N => 3 THEN GOTO 2
4 = IFF N < 3 THEN GOTO 1

STORE = binary bits or string

>Man, I love it when 4chan gets me raging.

>> No.10985155

>>10985043
yes i have posted it here. actually the problem that i described about your algorithm in >>10982984 that makes it so the distribution isn't uniform is the same problem every function will have. its in a very terse form >>10979594 by someone else, and >>10979615 by me. i will do it slightly more formally here.

proof:
i will use some more specialized language quickly here so anyone still lurking this thread can know that taking a probability measure on one set and "pushing it forward" to get a probability measure on a different one is actually very rigorously defined, its not just some fly by night bs. we have a probability triple ({0,1}^n,power set B, uniform probability measure), now we we will use a function f to push forward this probability measure onto another measurable space, of course this is ({0,1,2,3,4,5,6}, and its power set A). this is called a push forward measure. and its defined like this: probability(a in A)=probability of the preimage of a under f.

easy to read again: of course for each element in {0,1,2,3,4,5,6}, say for {1}, this just means its probability is the same as probability that any one of the binary strings that map to 1 happens. because each binary string has probability 1/(2^n) the probability of {1} is (the number of binary strings that are mapped to it by the function we choose)/(2^n). so if we want probabilty(1)=probability(2)=... then we need (a_1)/(2^n)=(a_2)/(2^n)=(a_3)/(2^n)=..., where a_x is the number of binary strings that map to x. now, there are 2^n binary strings and 7 values we are mapping binary strings to, but 7 does not divide 2^n so every a_i can not be equal to each other. DONE, sorry for not using latex lol.

>> No.10985159

>>10985155
now onto your function:
the real number you generate will actually for sure be in one of the evenly spaced intervals you described. it should never be 1/7, 2/7 and so on because as you said the binary number that you map to that in your method is infinite, actually even if that wasn't the case it doesn't really matter. the problem is NOT that we might get a real not in your intervals! the problems is that some of your intervals have more binary numbers that map to them then others because 2^n does not divide evenly into 7 parts. so those intervals are more likely.

of course that doesn't mean your function can't be modified to work with the try again method, you just have to look at which intervals have more binary numbers that go to them and make binary numbers "start over numbers" until you have the same in number if binary numbers in each interval. note if when you get a "start over number" you just add one more generated binary number to the end you only push the corresponding real number up by nothing or pushes it up 2^(-n-1) which means that binary number end up in one of two intervals, the same one as before, or a new one determined by the size of n and the binary number you choose to be a "start over". this is not the same as starting over at all and will not work to even the numbers out as it counts for half in each of the two intervals it might end up in. your adjustment as stated is essentially just f: {0,1}^(n+1)->{0,1,2,3,4,5,6}.

>> No.10985182
File: 642 KB, 445x875, 1564570461586.png [View same] [iqdb] [saucenao] [google]
10985182

>>10985159
>>10985155
Or, ya know, Integer Pivot Programming to be included as a programming schema for 'interesting base cases for consumer of software/function output'.

>> No.10986969

Why does an algorithm have to be a stateless pure function? Surely we could just do some basic seeding to change what happens to the 8th case out -- then as long as the original seed is random and unknown, the resulting sequence would be uniformly distributed as the sequence grows long.

>> No.10987010

>>10986969
I take this back -- even a machine with memory of previous calls will always have the same problems as a memoryless algorithm, if the probabilities have to be independent of its previous results, as at each step the previous random bits are effectively inputs to the next step, so the previous proofs apply.

>> No.10987392

>>10979527
1) generate 3 bits
2) interpret the 3 bits as an integer in [0,7]
3) if the integer is 7 goto 1)
4) return the integer

The expected number of iterations is 7/8 + 2*7/8^2 + 3*7/8^3 + ... = (7/8)/(1 - (7/8))^2 = 56
The probability that it doesn't terminate before the nth iteration is 8^(1-n)

Part 2
Does there exist a function f from {0,1}^N to {0,1,2,3,4,5,6} such that the size of the pre-image of each element of the range is the same?
Let k be the size of the pre-image.
7k = 2^N has no solutions.
No such function exists.

>> No.10987419

>>10987010
yes that seems to be the case to me. however don't forget none of the algorithms described are functions. after all what we start with is the number generator, neither of the methods can be said to be a function f:number generator -> {0,1,2,3,4,5,6}. the function is what we use to define the push forward measure. the try again method has more non-functional steps after that of course. so how do we know you can't do more non function stuff and somehow get what we want? because the only other thing for us to do is generate more bits and do more push forward measures, and if we know for certain we will get to stop after generating a finite number more bits then we could have just generated that number of bits in the first place and stopped after using a push forward measure. we don't know that we will stop after a certain number of bits generated in the try again method, so the only way to stuff that behavior farther back into the algorithm and stop after taking the push forward measure once is by generating an infinite number of bits at the beginning. so you can see the relationship between the distribution going to the uniform as n goes to infinity in the naive algorithms and the probability of getting to stop after each iteration going to 100% as n goes to infinity in the try again method.
>>10987392
perfect :)

>> No.10988757

>>10987419
Its true that we don't have a function number generator -> [0,6], but since the only external data comes from the generator, what we have is a function from the generator data {0,1}^N -> [0,6], for some N. Since all the procedure does is take exactly this data and give a single output, its a pure function.

There is a technicality in that for the proofs above, you must show that there is a single finite such N, but that isn't all that tricky:
Assume an algorithm can call the generator an unbounded number of times. Since the decision on whether to call again can only be based on the previous bits, there must be a sequence of inputs that never halts.
The contradiction implies that any algorithm has a upper bound N for how many times it calls the generator. This algorithm is equivalent with always calling the generator exactly N times and just ignoring the tail bits based on whatever criteria the original algorithm used to decide to stop calling. This algo maps {0,1}^N -> [0,6].

>> No.10989379

>>10988757
yes that's true for algorithms we can say for sure terminate. we can always just generate all the bits at the start then do one function. my point is just to say you were right in the first place to be wary of thinking that we are limited to pure functions, if we got stuck thinking the entire problem could be summed up by investigating the functions [math]\{0,1\}^N \to [0,6][/math] we might never find the try again method. we would think any algorithm that gives the uniform could be described completely by [math]\{0,1\}^{\infty} \to [0,6][/math], which is actually almost true! that is the form the try again method has as a pure function. but we would mistakenly think getting the uniform will certainly require an infinite number of bits, but actually the try again method does not ALWAYS require an infinite number of bits.

>> No.10989399

>>10979594
based

>> No.10989570

>>10989399
Thanks.

>> No.10989583

>>10979527
any modern programming language has a random number generator you brainlet. you don't even need to do anything, I could literally write this code in 5 minutes in python

>> No.10989595 [DELETED] 

>>10979527
>suppose you have a number generator that has the discrete uniform distribution on {0,1}

>>10979527
>1. describe an algorithm for a number generator with the discrete uniform distribution on {0,1,2,3,4,5,6}


floor(X*7)

what's hard about that?

>> No.10989606

>>10989583
nice pajeet reply but no you can't mr code monkey. first i did not give you a computer to write code on, second no programming language has a random number generator. luckily i have supplied one. from now on im going to have to strap you down on a table in a dark empty room so you don't get led astray again.

>> No.10989621 [DELETED] 

draw an 3-bit binary number, retry if it's a 7 or 8. not guaranteed to terminate, but it solves 1 and will almost always terminate in a reasonable amount of time.

>> No.10989649

tricky. i guess you'd just have to draw three bits and retry on a 7, as others have said.

>> No.10989652

>>10979527
2^x = 7y
Impossible

>> No.10989769

3 uses of the generator gives you 3 uniformly distributed bits, the concatenation of which into a binary integer is able to represent 2^3=8 states 0 (000) to 7 (111).

Since 7 is not desired, whenever we encounter 7 (111), we discard all three bits and try again with a new three bits.

We only retry with P(y=7=(111))=1/8 probability. Since each retry is independent, the probability of n retries is 1/(8^n). The algorithm will almost certainly terminate within practical time due to the retry probability being O(8^(-n)).

>> No.10989884

>>10989652
haha quite terse indeed but yes!
>>10989649
>>10989769
yup but it would be better if there was one that we knew for sure would terminate. can you find one or prove there isn't one?