
/vt/ - Virtual Youtubers


>> No.38135892
File: 158 KB, 780x742, Screenshot_16 (2).png

>>38119852
False. I already went over this for retards like you in an earlier thread. Increasing model size has diminishing returns, and Google hit them hard.
The fine-tuning paper mentions three models: 2B, 8B, and 137B parameters. The first runs on an ordinary graphics card, the second on a top-of-the-line graphics card, and the third needs a modest bitcoin farm of several dozen GPUs (still not a military supercomputer).
The crowdworkers were asked to rate these models' responses twice: before and after fine-tuning. Google's own raters scored the fine-tuned 2B model's responses higher than the untuned 137B model's responses on pretty much every metric. Putting in additional work beats simply throwing more GPU processing power at the problem, and massively so: the 137B model is 68 times larger than the 2B one (rough hardware math in the sketch below).
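Back-of-the-envelope math for the hardware claims above; a rough sketch assuming fp16 weights (2 bytes per parameter) and ignoring activations and KV cache:

def vram_gb(params_billion, bytes_per_param=2):
    # Weight memory only: parameters * bytes each, converted to GiB.
    return params_billion * 1e9 * bytes_per_param / 1024**3

for size in (2, 8, 137):
    print(f"{size:>4}B params -> ~{vram_gb(size):.0f} GB of weights")
# ~4 GB fits a consumer GPU, ~15 GB fits a top-end consumer card,
# ~255 GB needs a multi-GPU rig. And 137 / 2 = 68.5, hence "68 times larger".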
We know Character.AI is using a LaMDA model, or something derived from it, because the CAI model is fine-tuned very similarly, if not identically, so the same conclusions apply here. The same effect shows up in other AI, like InstructGPT:
https://openai.com/blog/instruction-following/
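For anyone who wants to see what "putting in additional work" means mechanically, here's a minimal sketch of instruction fine-tuning in the InstructGPT style. This is the generic technique, not CAI's or Google's actual pipeline; "gpt2" is just a stand-in base model and the training pairs are made up (needs: pip install torch transformers):

from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = AdamW(model.parameters(), lr=5e-5)

# Hypothetical (prompt, preferred response) pairs of the kind raters approve.
pairs = [
    ("User: What is 2+2?\nAssistant:", " 4."),
    ("User: Name a primary color.\nAssistant:", " Red."),
]

model.train()
for prompt, response in pairs:
    # Standard causal-LM loss on prompt + desired response; a real pipeline
    # would mask the prompt tokens so only the response is penalized.
    enc = tokenizer(prompt + response, return_tensors="pt")
    loss = model(**enc, labels=enc["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()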

>> No.37320400
File: 158 KB, 780x742, Screenshot_16.png

>>37319917
There is a correlation, but you're still a retard. Their graphs and the abstract itself emphasize that the additional work (pretraining, plus giving it a calculator and Google) did a lot on its own, raising response quality on most metrics more than throwing extra GPUs at it did. A toy sketch of the tool-routing idea follows.
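The gist of "giving it a calculator and Google" is that the model's draft output gets routed through external tools before the final reply. This is only an illustration of that dispatch idea, not the paper's actual system: the CALC:/SEARCH: prefixes and both tool functions are invented stand-ins.

def calculator(expression):
    # eval() on untrusted input is unsafe; acceptable only in a toy example.
    return str(eval(expression, {"__builtins__": {}}))

def retrieve(query):
    # Stand-in for an information-retrieval call (e.g. a search API).
    return f"[top result for {query!r}]"

def answer(draft):
    # A real system has the model itself emit tool calls; here a simple
    # prefix convention stands in for that, purely for illustration.
    if draft.startswith("CALC:"):
        return calculator(draft[len("CALC:"):].strip())
    if draft.startswith("SEARCH:"):
        return retrieve(draft[len("SEARCH:"):].strip())
    return draft

print(answer("CALC: 137 / 2"))              # 68.5
print(answer("SEARCH: LaMDA fine-tuning"))  # [top result for 'LaMDA fine-tuning']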
