[ 3 / biz / cgl / ck / diy / fa / ic / jp / lit / sci / vr / vt ] [ index / top / reports ] [ become a patron ] [ status ]
2023-11: Warosu is now out of extended maintenance.

/sci/ - Science & Math


View post   

File: 27 KB, 266x189, image.jpg [View same] [iqdb] [saucenao] [google]
6781302 No.6781302 [Reply] [Original]

The element distinctness problem is this: given a list of inputs d1, . . . , dM ∈ D, determine whether or not there are any duplicates in the list. Show how to use universal hash functions to solve the element distinctness problem in expected time O(M). You may assume that for every t > 0, there exists N ∈ [t,2t] and a 1/N-universal family of hash functions {hk}k∈K, with hk : D → [0..N); moreover, assume that choosing a random hash key, evaluating a hash function, and comparing two elements in D are each constant time operations.

>> No.6781315

Hash them all into the table and then read the table once over to check that each row only has one entry. Takes 2M ops which is O(M).

>> No.6781326

I feel retarded reading this. Thanks for lowering my self esteem.

>> No.6781359

>>6781315
Can you explain this is more detail?

>> No.6781361

>>6781315
Can you explain this in more detail? Sorry about the typo before

>> No.6781365

>>6781361
Do you know what a hash table is and how it works?

If not watch lecture 8 and also look the notes over
http://courses.csail.mit.edu/6.006/fall11/notes.shtml

>> No.6781432

>>6781365
A hash table is a data structure that stores info along with a key which is determined randomly by a hash function. Sometimes more than one item can map to the same key which causes collisions. This can somewhat be fixed by using a linked list. Is this somewhat an accurate idea of what a hash table is?

>> No.6781439

>>6781315
>>6781359
>>6781361
http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

>> No.6781442

>>6781365
>http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
>>6781432
>>6781439
http://en.wikipedia.org/wiki/Lambert_W_function#Asymptotic_expansions

>> No.6781443

>>6781432
Somewhat accurate, yeah. Just think of it as an array of linked lists where the index of the array you stick an item into is given by hash(that item).

The big motivator for a hash table is that it's constant time to search for any element, as opposed to O(lg N) for sorted arrays and O(N) for linked lists.

>> No.6781447
File: 197 KB, 800x833, 800px-Buckminsterfullerene-perspective-3D-balls[1].png [View same] [iqdb] [saucenao] [google]
6781447

>>6781442
>>6781443

>> No.6781455

Careful. This asks explicitly for universal hashing. The question is much harder than it seems, because you have to deal with knowing whether the collision is a collision or if it's the same item. Perfect hash functions are stronger than universal hashes, and while doing this with perfect hashes is trivial, with universal hashes it isn't.

>> No.6781464

>>6781455
>comparing two elements in D are constant time operations

If a row has more than one element just compare them.

>> No.6781473 [DELETED] 

>>6781455
>>6781464
Oh wait, would that make it O(M + n!), where n is the number of elements in the same row? Not sure how that simplifies.

>> No.6781475

>>6781464
What if you have M collisions? How do you compare them? What about M/2? It's not linear if you're naive.

>> No.6781479
File: 589 B, 79x41, bbb1b74b511f868d0882a668de23c111[1].png [View same] [iqdb] [saucenao] [google]
6781479

>>6781464
>>6781473

>> No.6781480

>>6781475
Op, I'm really curious, where is this problem from? I don't know the answer, and I feel like I don't know enough about randomized structures to solve this.

>> No.6781502

Wake the fuck up /sci/, why isn't anyone trying to solve this?

>>6781315
this is naive and wrong by the way

>> No.6781549

>>6781480
This problem is for an Algorithms class. By far the hardest class for any computer science major