
/sci/ - Science & Math



File: 426 KB, 1647x1406, 75357357.png
No.15747047

so what's the point of all this calculus bing bing wahoos? why can't we just train neural networks with random values? it will take less computation power
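
(For a concrete sense of what "random values" buys you, here is a minimal sketch in numpy; the tiny one-hidden-layer net, the toy sine data, and all the dimensions are made up for illustration. It does work, but note how many full forward passes it spends for how little progress, which is the "it doesn't take less computation" point made further down the thread.)

import numpy as np

# Random-search "training": sample fresh random weights, keep the best set seen.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 1))
y = np.sin(3 * X)                                  # toy regression target

def random_params():
    return [rng.normal(0, 1, (1, 16)), rng.normal(0, 1, 16),
            rng.normal(0, 1, (16, 1)), rng.normal(0, 1, 1)]

def loss(params):
    W1, b1, W2, b2 = params
    pred = np.tanh(X @ W1 + b1) @ W2 + b2          # one-hidden-layer net
    return np.mean((pred - y) ** 2)

best = random_params()
best_loss = loss(best)
for _ in range(5000):                              # 5000 full forward passes...
    cand = random_params()
    cand_loss = loss(cand)
    if cand_loss < best_loss:
        best, best_loss = cand, cand_loss
print(best_loss)                                   # ...for a fit gradient descent reaches in far fewer steps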

>> No.15747165

The point is to have exact solutions that can be manipulated symbolically.

>> No.15747170
File: 778 KB, 1000x1080, static-noise-bad-signal-tv-screen-seamless-pattern-vector-28081806.jpg

>>15747047
k. here's your random neural network, bro

>> No.15747178

>>15747047
Now you're thinking like an evolutionist.

>> No.15747284

>>15747047
You don't have to use gradient descent, it's just the most robust way we know so far for larger models. You can also do weight updates with statistical methods, genetic algorithms, and more.
Look up the Forward-Forward training algorithm; if it's shown to scale well it will replace backprop - if not now, then when we move to neuromorphic hardware.
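
(To make "genetic algorithms and more" concrete, a rough sketch in numpy of one such update scheme: a population of weight vectors, truncation selection, and Gaussian mutation, no crossover. The toy problem, population size, and mutation scale are invented for illustration; this is not Forward-Forward, which needs its own layer-local goodness objective.)

import numpy as np

# Genetic-algorithm-style weight updates on a toy problem: fit y = 2x - 1
# with a single linear unit, no gradients anywhere.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 100)
y = 2 * X - 1

def fitness(w):                                   # higher is better
    return -np.mean((w[0] * X + w[1] - y) ** 2)

pop = rng.normal(0, 1, (32, 2))                   # 32 candidate (weight, bias) pairs
for generation in range(200):
    scores = np.array([fitness(w) for w in pop])
    elite = pop[np.argsort(scores)[-8:]]          # truncation selection: keep the top 8
    # each surviving parent spawns 4 mutated children to refill the population
    pop = np.repeat(elite, 4, axis=0) + rng.normal(0, 0.1, (32, 2))

best = pop[np.argmax([fitness(w) for w in pop])]
print(best)                                       # should end up near [2, -1]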

>> No.15747424
File: 896 KB, 1616x927, 87468.png

>>15747170
yes

>> No.15747430

>>15747284
>it's just the most robust way we know so far for larger models
No, it isn't. Gradient descent is notorious for getting stuck in local minima and being finicky to tune. OP's method is significantly more robust if you have the time to wait for it to finish.

>> No.15747435

oops, i just realized that line 10 is printing the wrong variable

>> No.15747499

>>15747284
>Forward-Forward training algorithm
thanks, i'll try working with it. i was thinking of training a network by iteratively adjusting each parameter and seeing which direction it needs to be adjusted for the loss to go down, and storing a -1 or a +1 for each parameter to update them without needing to do a backwards pass. is there a name for this method already?
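
(Roughly what that looks like on a toy problem, sketched in numpy; the single linear unit, step size, and probe size are invented for illustration. Each parameter gets probed in both directions and only the sign of the better direction is kept, which amounts to a finite-difference estimate of the gradient's sign.)

import numpy as np

# Per-parameter sign probing: nudge each parameter up and down, keep a +1/-1
# for whichever direction lowered the loss, then step every parameter by lr * sign.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 100)
y = 2 * X - 1

w = rng.normal(0, 1, 2)                           # [weight, bias]

def loss(w):
    return np.mean((w[0] * X + w[1] - y) ** 2)

eps, lr = 1e-3, 0.05
for step in range(200):
    signs = np.zeros_like(w)
    for i in range(len(w)):                       # two forward passes per parameter...
        trial = w.copy()
        trial[i] += eps
        up = loss(trial)
        trial[i] -= 2 * eps
        down = loss(trial)
        signs[i] = 1.0 if up < down else -1.0     # direction that reduced the loss
    w += lr * signs                               # ...vs one backward pass for backprop
print(w, loss(w))                                 # w should approach [2, -1]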

>> No.15747513

>>15747284
>they're falling for the "biologically inspired" meme all over again
I thought computer science folks had figured out by now that neuroscience is mostly fake.

>> No.15747537

>>15747499
>i was thinking of training a network by iteratively adjusting each parameter and seeing which direction it needs to be adjusted for the loss to go down
You adjust one parameter. It changes. Maybe even the sign flips. Now you have to recompute everything that depends on it to know how to adjust those other parameters. You may have also made things worse by greedily and shortsightedly optimizing only one parameter without regard to the rest.

>> No.15747571

>>15747047
Lmao, you can do both. If you're working with highly nonlinear regression it's ALREADY the norm to do a factorial or Latin Hypercube design for the starting values of the optimization routine.

What you're basically doing is the NEXT STEP of any experiment (and fitting a model to data IS a computer experiment), which is trying to design it so you get an optimal solution.

>> No.15747579

>>15747571
>>15747047
Btw, there are more effective ways to explore a space than just random fucking draws. Look up Design of Computer Experiments (DoE in general, too) to get an idea of how you actually go about efficiently exploring input (model parameters) vs output (whatever measure you're using for optimization) space.
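
(A small sketch in numpy of the Latin Hypercube part, with an invented 3-parameter search space and bounds: each parameter's range is cut into n strata and every stratum is used exactly once per dimension, so n starting points cover the space much more evenly than n independent random draws.)

import numpy as np

def latin_hypercube(n, bounds, rng):
    """One starting point per row; each dimension's range is stratified into n bins."""
    d = len(bounds)
    u = np.empty((n, d))
    for j in range(d):
        # one uniform draw inside each of the n strata, in shuffled order
        u[:, j] = (rng.permutation(n) + rng.random(n)) / n
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    return lo + u * (hi - lo)

rng = np.random.default_rng(0)
# hypothetical model with 3 parameters, each bounded in [-5, 5]
starts = latin_hypercube(10, [(-5, 5)] * 3, rng)
# run the optimizer (gradient-based or not) once from each row of `starts`
# and keep whichever run ends with the lowest loss
print(starts)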

>> No.15747614

>>15747537
>Now you have to recompute everything that depends on it to know how to adjust those other parameters.
oh, yeah it's not as simple as that. i'll still experiment with it, but i will have to instead store the -1/+1 values for every neuron and brute force the weights in that regard

>> No.15747719

>>15747047
1) it doesn't take less computational power
2) it's a generalization of linear regression
Your toy problem there has one exact solution (i.e., it takes one step to solve), which is just solving linearly for the weight y/x. This is fundamentally an extension of the linear algebraic problem of regression, which has an EXACT analytic solution.

The calculus comes, fundamentally, from extending the regression problem to different non-linear functions (all a neural network is, is a chain of non-linear functions with known derivatives) under different loss (i.e., cost) functions.

There are other methods, but a regression problem is best solved in a fundamentally calculus-oriented way because it is, at its root, an extension of a linear algebraic problem.

Further, there ARE problems that are best solved by employing more randomness: genetic algorithms and simulated annealing methods are what get used in complicated combinatorial optimizations because it's just easier (for MANY reasons).
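
(To make the "one step to solve" point concrete, a sketch in numpy with made-up data: for y = w*x plus noise under squared loss the minimizer has a closed form, and the multi-feature case is just the normal equations, which np.linalg.lstsq solves directly.)

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = 3 * x + rng.normal(0, 0.05, 100)              # toy data generated with w = 3

# Setting the derivative of the squared loss to zero gives the weight in one step:
w = np.sum(x * y) / np.sum(x * x)                 # exact minimizer, no iterations
print(w)                                          # close to 3

# The general linear case is the same idea: the least-squares problem X @ w = y
# has an exact analytic solution (the normal equations).
X = np.column_stack([x, np.ones_like(x)])         # add a bias column
w_full, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_full)                                     # close to [3, 0]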

>> No.15747761

>>15747719
>i.e., it takes one step to solve
redpill me on finding the correct weight in a single step

>> No.15748008

>>15747614
If you manage to fix the problems with it, you'll just end up with an inferior version of backpropagation where you use finite differences to figure out the signs of the gradients.

>> No.15748039

>>15747499
Yes, it's called retarded imprecise backprop

>> No.15749160
File: 22 KB, 559x548, 1670164281487520.jpg

>>15747499
hmmmm so kinda like backpropagation but much worse and more retarded

>> No.15749163

>>15748039
aka line search

>> No.15749268

>>15749160
yeah but at least it is easier than learning SGD

>> No.15749327

>>15749268
How is SGD hard? It's conceptually simple AND you should never actually implement it because there are a dozen libraries that do it faster and better. It works behind the scenes. The libraries all let you define whatever network you want and they take care of the rest.

>> No.15749329

>>15749327
>you should never actually implement
>t. can't implement

>> No.15749334

>>15749329
>t. >>15749268

>> No.15749335

>>15749327
abstractions are why no one is learning anything

>> No.15749342

>>15749334
I'm a different poster. I just don't see why you'd try to discourage someone from implementing SGD. If you had ever implemented it yourself, you'd know it's a perfectly reasonable exercise in terms of how much you learn vs. how much effort it takes.

>> No.15749343

>>15749335
I'm not saying you shouldn't learn SGD or implement it yourself (for educational purposes). I'm saying SGD is not very difficult, so you should avoid making up your own retarded version of it. If you can't understand SGD you should perhaps review basic linear algebra and calculus.
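
(For reference, a from-scratch SGD loop really is short. A sketch in numpy: single linear unit, squared loss, gradient written out by hand; the toy data and learning rate are made up for illustration.)

import numpy as np

# Minimal stochastic gradient descent: one sample at a time, hand-derived gradient.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 1000)
y = 2 * X - 1 + rng.normal(0, 0.05, 1000)

w, b, lr = 0.0, 0.0, 0.1
for epoch in range(20):
    for i in rng.permutation(len(X)):             # shuffle each epoch
        err = (w * X[i] + b) - y[i]               # prediction error on one sample
        w -= lr * 2 * err * X[i]                  # d(err^2)/dw
        b -= lr * 2 * err                         # d(err^2)/db
print(w, b)                                       # close to 2 and -1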

>> No.15749345

>>15749342
That's not what I'm saying. I mean implementing it yourself for an actual application is retarded. Of course you should implement it yourself for educational purposes if you so desire.

>> No.15749348

>>15749343
a lot of neural network techniques only got popularized in the last ~10 years (even things like dropout), so i think trying out "retarded" things can be what lets me discover something interesting

>> No.15749349

>>15749348
OK, and? That is completely irrelevant to what I'm saying.

>> No.15749351

>>15749349
if i implement it myself, then it's easier to edit the algorithm. using some giant bloated library for it will never teach me anything

>> No.15749352

>>15749351
Yes, obviously

>> No.15749355

>>15749352
glad you get it now

>> No.15749357
File: 25 KB, 128x128, 1692432002836424.png

>>15749355
Get what?

>> No.15749360

>>15749345
Ok. "You should never" sounded pretty categorical. Whatever.

>> No.15749361
File: 9 KB, 176x176, ut.jpg

>>15749357
the thing you agreed with

>> No.15749363

>>15749361
I never disagreed with that
>>15749360
Try examining the context next time

>> No.15749368

>>15749363
>Try examining the context next time
You are truly a vile animal.

>> No.15749373
File: 19 KB, 269x283, 1685690717577340.jpg

>>15749368
What gave it away?

>> No.15749375

>>15749373
The way ludicrous lying is pure mindless reflex to you.

>> No.15749398

>>15749375
Now now, being glib about your niggling pedantry doesn't count as ludicrous lying

>> No.15749403

>>15749398
Do you ever ponder why nobody in real life respects you or wants anything to do with you?

>> No.15749404

why can't we just train one layer at a time, and then add a new layer when it finishes? that would take way less compute

>> No.15749406

>>15749404
Train one layer at a time to do what?

>> No.15749409

>>15749406
freeze the previous ones when you add a new one to train, do it up until you have as many as you want

>> No.15749415

>>15749409
If you train one hidden layer you get a one-hidden-layer network. That layer isn't tuned to receive the outputs from another layer.

>> No.15749419

>>15749403
Don't worry. I don't behave like this in real life.

>> No.15749420

>>15749419
You do, though. That's part of the reason why you have no friends. Only part of it, though. I think a bigger part is that you're ugly and you've always been the weird autistic kid.

>> No.15749422

>>15749415
>That layer isn't tuned to receive the outputs from another layer.
no, i mean to add layers to the network you want to build. hidden layers are added and then trained
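
(A sketch of that greedy layer-wise scheme, assuming PyTorch is acceptable here; the layer sizes, toy sine data, and training budget are all invented for illustration. Each stage trains a new hidden layer plus a fresh readout head while everything below it stays frozen.)

import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.rand(256, 1) * 2 - 1                    # toy inputs in [-1, 1]
y = torch.sin(3 * X)                              # toy regression target

def train(params, model, steps=500, lr=1e-2):
    opt = torch.optim.SGD(params, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X), y)
        loss.backward()
        opt.step()
    return loss.item()

# Stage 1: one hidden layer plus a readout head, trained together.
hidden1 = nn.Sequential(nn.Linear(1, 16), nn.Tanh())
head1 = nn.Linear(16, 1)
train(list(hidden1.parameters()) + list(head1.parameters()),
      nn.Sequential(hidden1, head1))

# Stage 2: freeze the first hidden layer, discard its head, stack a new hidden
# layer and a new head on top, and train only the new pieces.
for p in hidden1.parameters():
    p.requires_grad_(False)
hidden2 = nn.Sequential(nn.Linear(16, 16), nn.Tanh())
head2 = nn.Linear(16, 1)
print(train(list(hidden2.parameters()) + list(head2.parameters()),
            nn.Sequential(hidden1, hidden2, head2)))

(The catch, as pointed out above, is that the stage-1 layer was tuned against its own throwaway head, not as a feature extractor for the layers stacked on top of it later.)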

>> No.15749428
File: 364 KB, 646x595, 1684525373883585.png

>>15749420
Wtf how do you know all of this?

>> No.15749438

>>15749428
Who else would be posting low energy frogs in the current year of our lord?

>> No.15749444
File: 90 KB, 957x621, 1693417079114042.jpg

>>15749438

>> No.15749585

>>15747430
>if you have time to wait for it to finish
by the time it finishes, I'll most likely be dead
Using backprop with simulated annealing is good enough

>> No.15749993

>>15749585
>by the time it finishes, I'll most likely be dead
Yes, but it's pretty fucking stable. It'll finish eventually. Just leave my machine plugged in.

>> No.15750049
File: 604 KB, 1646x1656, 86575335.png

okay it's truly non-linear now. does anyone know the simplest way of making it fit onto a dataset? i am guessing that i need to iterate over the dataset and then average the loss for all the outputs, then test my weight adjustments against that
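
(One plain way to do what's described, sketched in numpy with invented shapes and step sizes: compute the mean loss over the whole dataset for each candidate weight adjustment and keep the adjustment only if that average drops, i.e., hill climbing on the full-batch loss.)

import numpy as np

# Hill climbing on the full-dataset average loss: propose a small random change
# to all weights, keep it only if the mean loss over every sample goes down.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 1))
y = np.sin(3 * X)                                 # toy nonlinear target

params = [rng.normal(0, 1, (1, 16)), np.zeros(16),
          rng.normal(0, 1, (16, 1)), np.zeros(1)]

def mean_loss(params):
    W1, b1, W2, b2 = params
    pred = np.tanh(X @ W1 + b1) @ W2 + b2
    return np.mean((pred - y) ** 2)               # averaged over the whole dataset

best = mean_loss(params)
for step in range(20000):
    cand = [p + rng.normal(0, 0.02, p.shape) for p in params]
    cand_loss = mean_loss(cand)
    if cand_loss < best:                          # accept only improvements
        params, best = cand, cand_loss
print(best)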

>> No.15750060

>>15750049
Anon, 4chan is 18+.

>> No.15750067

>>15750060
>only kids are allowed to do math
don't think like that. adults are allowed to calculate too

>> No.15750072

>>15750067
I'm not one to discourage people from trying to reinvent the square wheel badly, but that's fundamentally a solitary activity that you shouldn't pester others with. Who the hell has the interest or even patience to track your trajectory through bad solutions to solved problems and understand what you're on about at every given point in your unique journey?

>> No.15750099

>>15750072
do you write this comment in your reviews of research papers for alternative training algorithms?

>> No.15750116

>>15750099
"Alternative training algorithms"... kiddie, 4chan is 18+. You can start thinking about "alternative training algorithms" when you understand the basic and well-established ones, because you keep asking dumb questions and making dumb propositions you wouldn't be making if you actually understood GD.

>> No.15750172
File: 130 KB, 1131x905, 8648357.png

>>15750116
>you wouldn't be making if you actually understood GD.
i understand it just fine, enough to know that it's time to find a better one

>> No.15750182

>>15750172
>i understand it just fine
Not judging by the stuff you've been posting ITT.

>> No.15750197

>>15750182
you're probably confusing me with someone else

>> No.15750216

>>15750197
You are OP. Every single post of yours can be summarized as "why wouldn't this retarded idea work?" and the answers to all of your questions are trivial to someone who understands normal GD.

>> No.15750222

>>15750216
i don't think it's a retarded idea, considering the points that some others have made about GD getting stuck for some weights if the learning rate is too low

>> No.15750229

>>15750222
4chan is 18+.

>> No.15750232

>>15750229
you said the problem was already solved, even though it isn't

>> No.15750236

>>15750232
The problems you're running into are already solved. You're very slowly inching towards a shitty version of GD.

>> No.15750244
File: 268 KB, 1700x2200, digits.jpg

My favourite book