
/sci/ - Science & Math



File: 217 KB, 933x368, 0_pEekt4mGzq-4LDJC.png
No.11416540

What does research in machine learning look like? Do you just take a random ML algorithm, apply it to a random problem, and then publish?

>> No.11416581

>>11416540
It depends on what part of the CS department you're working in. If you're in pure ML, a paper is usually a discussion of the problem, the method applied, the results, and maybe some discussion of what this says about the robustness / properties of the method. That said, these types of researchers are more concerned with 'tweaking' and stretching the use of known methods. The best of these researchers are in computational medicine and vision research, and the worst of them are typically found in financial applications, oddly enough lol. That, and there's a LOT of garbage research in ML these days.

There's the other side, which is computational learning theory (CLT). These people are theorists, so they're usually in the TCS part of the department. They actually try to understand what's going on and / or develop new methods and an underlying theory of learning. They don't publish as quickly, but that's mostly a result of the rigor and process: these people are writing mathematics papers, so their research follows an observation -> definition -> theorem -> proposition -> lemma -> proof style.

>> No.11416617

>>11416581
how is CLT different from stat learning theory?

>> No.11416667

>>11416617
SLT usually focuses on making current methods more accurate, while CLT focuses on learning, learnability, and studying what that entails. They have nonempty intersection though

>> No.11416687

>>11416617
>>11416667
To give an example, in CLT you would ask whether we can actually solve the problem at all, how many training samples a learner needs to output a good hypothesis whp, etc., using PAC as a general framework. In SLT, you generally aim to give error bounds, e.g. on generalization or on the misclassification rate during training, etc.

Does it sound like these things go together? That's because they're two sides of the same coin: you need both to understand the theory and practicality of learning as a statistical, computational process. There are schools that teach both foundationally in one course
http://ttic.uchicago.edu/~nati/Teaching/TTIC31120/2015/
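
To make the PAC side concrete, here's a minimal sketch (my own illustration, not from the course above) of the standard realizable-case bound m >= (1/eps)(ln|H| + ln(1/delta)) for a finite hypothesis class:

import math

def pac_sample_complexity(hypothesis_count, epsilon, delta):
    # Realizable-case PAC bound for a finite hypothesis class H:
    # m >= (1/eps) * (ln|H| + ln(1/delta)) samples suffice for any
    # consistent learner to output a hypothesis with error <= eps
    # with probability >= 1 - delta.
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)

# e.g. |H| = 2**20 hypotheses, 5% error tolerance, 99% confidence:
print(pac_sample_complexity(2**20, epsilon=0.05, delta=0.01))  # -> 370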

>> No.11416701

>>11416667
>>11416687

thanks for the explanation anon

>> No.11417542

idk

>> No.11417587

>>11416540
yes

>> No.11417603

You either use an existing tool to fix a problem or make a tool for fixing problems. But these days people mostly just try to do the same thing faster and republish the same ideas into oblivion.

>> No.11417710

>>11416687
Every single theoretical result is published in ML venues, not COLT venues, though. All I'm saying is you're basically inventing a distinction that doesn't exist.

>> No.11417714

Just download 50 comp-sci journals and put them into GPT-2 lmao.

>> No.11417725

>>11417603
This is true.
Theoretical work is advancing because of this though; it used to be permanently outpaced by practice, but it no longer is.
Maybe this will lead to cool stuff.
>>11416540
The general idea is similar to how stats research works. Either you have a problem you want to model, so you collect data and set up a model which you believe might be able to express the information in the data that you're interested in, or you're trying to figure out methods to make the process work better: new tests, new models, new optimization methods, etc. However, there are many weird mysteries in deep learning, such as how it works so much better than any theoretical bound predicts, or how it can work so well despite the loss surface being so non-convex while gradient descent only comes with guarantees in the convex setting. This opens another axis, which is the attempt to explain these phenomena, as described (very poorly) in >>11416581
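
To illustrate the convexity point with a toy I made up (not from any paper): plain gradient descent carries no global-optimality guarantee on a non-convex loss, yet it still settles into a local minimum, which is a one-dimensional cartoon of that mystery.

import numpy as np

# Toy non-convex loss: sin(3w) + 0.5*w**2. Gradient descent has no
# global guarantee here, yet it still lands in a local minimum.
def grad(w):
    return 3 * np.cos(3 * w) + w   # d/dw of sin(3*w) + 0.5*w**2

w, lr = 2.0, 0.05
for _ in range(200):
    w -= lr * grad(w)

print(w)  # a stationary point, not necessarily the global minimum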

>> No.11417730

>>11416540
That is the bulk of it, but there are also people developing new algorithms.

>> No.11417823
File: 50 KB, 1400x788, TurningNGL_Model__1400x788.png

>>11417714
>GPT-2
pffft

>> No.11417854

>>11416540
new architectures, regularization methods, optimization methods, etc. that beat the current state of the art on standard datasets, or novel ways to deal with new problems. or advances in theory supported by experiments

just applying a known method to a new problem will not get you published at a machine learning conference, but it can often be published in whatever subfield of science/engineering it belongs to if it leads to novel results
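
for a flavor of what a "regularization method" tweak looks like, a minimal sketch (pytorch is my choice here; the penalty is plain L2 / weight decay, nothing novel, and lam is a made-up default):

import torch.nn.functional as F

# Bolt an explicit L2 penalty over all parameters onto a standard
# classification loss -- the shape of many incremental ML-paper tweaks.
def regularized_loss(model, inputs, targets, lam=1e-4):
    data_loss = F.cross_entropy(model(inputs), targets)
    l2 = sum(p.pow(2).sum() for p in model.parameters())
    return data_loss + lam * l2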

>> No.11417913

>>11417823
Meanwhile a bog-standard bilstm plus a few fully-connected layers does just as well or even better. All of that shit is pure memetics and PR bullshit.