
/biz/ - Business & Finance


>> No.24454986
File: world_models_1990_feedback.jpg (126 KB, 1679x1587)

>>24454647
Are you kidding? How much data do you think they have? How much do you think is necessary? There's fucking petabytes of data on human conversations at every big tech company.

You can sit there and throw literal supercomputers' worth of effort at as much data as you want, and your system is still going to be brittle as fuck until we figure out how to get these things to generalize properly to unseen examples. Look at the DotA 2 bot, for example: they threw absolutely insane amounts of compute at it, with training batch sizes in the MILLIONS, and yet it could only play 1v1, with a single fixed character, and it fucking broke as soon as someone tried an unorthodox strategy against it.

It might just be that we need bigger networks. They seem to follow scaling laws: more parameters get you lower loss on less data, better feature disentanglement, and more robustness to out-of-distribution (o.o.d.) tasks.
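
To put a number on "scaling laws": the curves people report are roughly power laws in parameter count. Quick Python sketch of the shape; the constant and exponent below are placeholders I picked for illustration, not fitted values from any actual measurement.

    # Illustrative power-law scaling: loss falls as a power of parameter count.
    # N_C and ALPHA are made-up placeholder constants, not fitted values.
    N_C = 8.8e13   # assumed scale constant
    ALPHA = 0.076  # assumed exponent

    def predicted_loss(n_params: float) -> float:
        """Loss under an assumed power law L(N) = (N_C / N) ** ALPHA."""
        return (N_C / n_params) ** ALPHA

    for n in (1e8, 1e9, 1e10, 1e11):
        print(f"{n:.0e} params -> loss ~ {predicted_loss(n):.3f}")

The point being: with an exponent that small, every 10x in parameters only shaves the loss by a constant factor of roughly 15-20%, so you need a LOT of 10x's.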

But our current hardware is not equipped to handle the kind of computation we need for that. It's built to be good at what we are not: quick, precise, general computation. But it's sluggish and burns a ton of power (yes, literal electrical power; it matters when we want to put these things on edge devices) to do the absurdly enormous matrix ops big nets need. Lots of people are working on this from different angles, and I expect it to be solved. But it's not going to happen for a while yet, because it's a hard, multidisciplinary problem.
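
For a sense of why the power bill matters, here's back-of-envelope arithmetic for a single dense layer's matrix multiply. The picojoules-per-MAC number is a ballpark assumption for illustration, not a datasheet figure.

    # Rough energy estimate for one (m x k) @ (k x n) matrix multiply.
    # The pJ-per-MAC value is an assumed ballpark figure, not a datasheet number.
    PJ_PER_MAC_EDGE = 5.0  # assumed: modest edge accelerator

    def matmul_energy_mj(m: int, k: int, n: int, pj_per_mac: float) -> float:
        """Energy in millijoules, counting m*k*n multiply-accumulate ops."""
        macs = m * k * n                  # multiply-accumulate operations
        return macs * pj_per_mac * 1e-9   # 1 pJ = 1e-9 mJ

    # One token through a single 4096x4096 layer:
    per_layer = matmul_energy_mj(1, 4096, 4096, PJ_PER_MAC_EDGE)
    print(f"{per_layer:.3f} mJ per layer")  # ~0.084 mJ

Multiply that by ~100 layers and tens of tokens per second and you're into real wattage, on a chip that also has to fetch all those weights from memory, which typically costs more energy than the arithmetic itself.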
