
/vt/ - Virtual Youtubers


>> No.55092444
File: 40 KB, 665x63, I Can't Believe It's Not Claude!.png

>>55059173
>>55059575
>>55059748
>>55079951
>>55085762
Thank you for the compliments!
Alright, alright, before people get too hyped: this model isn't about to release. We've just rolled it out to some testers to see if the dataset is good. Metharme 2 (More Weeks) isn't entirely how we want it yet: we didn't have enough compute, so we had to train it at a reduced context size (4k -> 2k) and, because of that, with a significantly reduced dataset (originally 700 million tokens, cut down to 300 million tokens for this particular run). And it still has the faults associated with 7B models: sometimes repeating itself, going incoherent at times, et cetera. But our testers have provided good feedback, and it seems like the model has a lot of potential. We've recently acquired some funding and H100s, so we're gonna try to get a full-power 7B working and, afterwards, a 13B. Let's hope it goes well!

There's our progress update. Our death was greatly exaggerated, kek. Let me know if you have any questions and I'll do my best to answer them later. In the meantime: keep your smile. Keep on smiling.
