
/sci/ - Science & Math



File: 217 KB, 933x368, 0_pEekt4mGzq-4LDJC.png
No.11416540

What does research in machine learning look like? Do you just take a random ML algorithm, apply it to a random problem, and then publish?

>> No.11416581

>>11416540
It depends on what part of the CS department you're working in. If you're in pure ML, a paper is usually a discussion of the problem, the method applied, the results, and maybe some discussion of what this says about the robustness / properties of the method. That said, these types of researchers are more concerned with 'tweaking' and stretching the use of known methods. The best of these researchers are in computational medicine and vision research, and the worst of them are typically found in financial applications, oddly enough lol. That, and there's a LOT of garbage research in ML these days.

There's the other side, which is computational learning theory (CLT). These people are theorists, so they're usually in the TCS part of the department. They actually try to understand what's going on and / or develop new methods and an underlying theory of learning. They don't publish as quickly, but that's mostly a result of the rigor and process: these people are writing mathematics papers, so their research follows an observation -> definition -> theorem -> proposition -> lemma -> proof style.

>> No.11416617

>>11416581
how is CLT different from stat learning theory?

>> No.11416667

>>11416617
SLT usually focuses on making current methods more accurate, while CLT focuses on learning, learnability, and studying what that entails. They have nonempty intersection though

>> No.11416687

>>11416617
>>11416667
To give an example, in CLT you would ask whether we can actually solve the problem at all, how many training samples a learner needs to output a good hypothesis whp, etc., using PAC as a general framework. In SLT, you generally aim to give error bounds, e.g. on generalization or on the misclassification rate during training, etc.

Does it sound like these things go together? That's because they're two sides of the same coin: you need both to understand the theory and practicality of learning as a statistical, computational process. There are schools that teach both foundationally in one course
http://ttic.uchicago.edu/~nati/Teaching/TTIC31120/2015/
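
To make the PAC side concrete, here's a minimal sketch (my own illustration, not from the course above) of the standard realizable-case bound m >= (1/eps)(ln|H| + ln(1/delta)) for a finite hypothesis class:

import math

def pac_sample_complexity(hypothesis_count, epsilon, delta):
    # Realizable-case PAC bound for a finite hypothesis class H:
    # m >= (1/eps) * (ln|H| + ln(1/delta)) samples suffice for any
    # consistent learner to output a hypothesis with error <= eps
    # with probability >= 1 - delta.
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)

# e.g. |H| = 2**20 hypotheses, 5% error tolerance, 99% confidence:
print(pac_sample_complexity(2**20, epsilon=0.05, delta=0.01))  # -> 370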

>> No.11416701

>>11416667
>>11416687

thanks for the explanation anon

>> No.11417542

idk

>> No.11417587

>>11416540
yes

>> No.11417603

You either use an existing tool to fix a problem or make a tool for fixing problems. But these days people mostly just try to do the same thing faster and republish the same ideas into oblivion.

>> No.11417710

>>11416687
Every single theoretical result is published in ML venues, not COLT venues, though. All I'm saying is you're basically inventing a distinction that doesn't exist.

>> No.11417714

Just download 50 comp-sci journals and put them into GPT-2 lmao.

>> No.11417725

>>11417603
This is true.
Theoretical work is advancing because of this though; it used to be permanently outpaced by practice, but it no longer is.
Maybe this will lead to cool stuff.
>>11416540
The general idea is similar to how stats research works. Either you have a problem you want to model, so you collect data and set up a model which you believe might be able to express the information in the data that you're interested in, or you're trying to figure out methods to make the process work better: new tests, new models, new optimization methods, etc. However, there are many weird mysteries in deep learning, such as how it works so much better than any theoretical bound predicts, or how it can work so well despite the loss surface being so non-convex while gradient descent only comes with guarantees in the convex setting. This opens another axis, which is the attempt to explain these phenomena, as described (very poorly) in >>11416581
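
To illustrate the convexity point with a toy I made up (not from any paper): plain gradient descent carries no global-optimality guarantee on a non-convex loss, yet it still settles into a local minimum, which is a one-dimensional cartoon of that mystery.

import numpy as np

# Toy non-convex loss: sin(3w) + 0.5*w**2. Gradient descent has no
# global guarantee here, yet it still lands in a local minimum.
def grad(w):
    return 3 * np.cos(3 * w) + w   # d/dw of sin(3*w) + 0.5*w**2

w, lr = 2.0, 0.05
for _ in range(200):
    w -= lr * grad(w)

print(w)  # a stationary point, not necessarily the global minimum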

>> No.11417730

>>11416540
That is the bulk of it, but there are also people developing new algorithms.

>> No.11417823
File: 50 KB, 1400x788, TurningNGL_Model__1400x788.png

>>11417714
>GPT-2
pffft

>> No.11417854

>>11416540
new architectures, regularization methods, optimization methods, etc. that beat the current state of the art on standard datasets, or novel ways to deal with new problems. or advances in theory supported by experiments

just applying a known method to a new problem will not get you published at a machine learning conference, but it can often be published in whatever subfield of science/engineering it belongs to if it leads to novel results
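
for a flavor of what a "regularization method" tweak looks like, a minimal sketch (pytorch is my choice here; the penalty is plain L2 / weight decay, nothing novel, and lam is a made-up default):

import torch.nn.functional as F

# Bolt an explicit L2 penalty over all parameters onto a standard
# classification loss -- the shape of many incremental ML-paper tweaks.
def regularized_loss(model, inputs, targets, lam=1e-4):
    data_loss = F.cross_entropy(model(inputs), targets)
    l2 = sum(p.pow(2).sum() for p in model.parameters())
    return data_loss + lam * l2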

>> No.11417913

>>11417823
Meanwhile a bog-standard bilstm plus a few fully-connected layers does just as well or even better. All of that shit is pure memetics and PR bullshit.