
/vt/ - Virtual Youtubers


>> No.83906815
File: 1.11 MB, 896x1280, file.png

>>83904644
Alright, thanks. In that case...
>>83903591
So I've been trying to train SDXL models to use v-prediction and so far I've kinda succeeded in converting ArtiWaifu https://civitai.com/models/615476 (kinda legacy) and Pony https://civitai.com/models/684052 (v0.1 is quite an old version at this point desu) using a really small, but hand-picked and hopefully diverse dataset, at around 2k pictures. Now I've expanded it to 5.5k with a goal of 10-30 pictures per artist which is currently in training (pic).
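For anyone who wants to try the same conversion: the v-pred target is just a reweighting of noise and latents per timestep. A minimal sketch of what the loss trains against (function and argument names are mine, not from any particular trainer):

```python
import torch

def v_prediction_target(x0, noise, alphas_cumprod, timesteps):
    # v-prediction target (Salimans & Ho, "Progressive Distillation"):
    #   v = sqrt(alpha_bar_t) * eps - sqrt(1 - alpha_bar_t) * x0
    # x0: clean latents, noise: the eps that was added, both (B, C, H, W)
    abar = alphas_cumprod[timesteps].view(-1, 1, 1, 1)
    return abar.sqrt() * noise - (1.0 - abar).sqrt() * x0
```

The MSE between the unet output and this target replaces the usual eps-prediction loss; everything else in the training loop stays the same.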
1. Why did you stick to using a lyco and not a full unet finetune? From my experiments so far, the "true" finetuning is the only viable option on a larger scale, while lyco tends to deepfry some random layers and freeze the others. Yes, I'm aware there's a "full" algo in lycoris, but it just doesn't work as well and suffers from these issues all the same.
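To be concrete about what I mean by "true" finetuning: no adapters injected anywhere, just hand every unet weight straight to the optimizer. A rough sketch (names and hyperparameters are mine, pick your own):

```python
import torch

def full_finetune_optimizer(unet, lr=1e-6, weight_decay=1e-2):
    # Full unet finetune: every trainable weight gets updated directly,
    # instead of training low-rank adapters injected into a subset of
    # layers, so no single layer gets disproportionately "deepfried".
    params = [p for p in unet.parameters() if p.requires_grad]
    return torch.optim.AdamW(params, lr=lr, weight_decay=weight_decay)
```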
2. Did you try freezing the text encoder? I've heard that if the text encoder is not completely broken, then there's not much point in finetuning it, and my experiments *kinda* align with that, although this would be hard to test on SDXL given I only have a 3090 and make test samples on a 3060.
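Freezing the TE is a two-liner in PyTorch, for reference (sketch, assuming a standard module; the helper name is mine):

```python
import torch.nn as nn

def freeze_text_encoder(text_encoder: nn.Module) -> nn.Module:
    # Stop gradients so only the unet is updated; eval() also disables
    # dropout so the conditioning is deterministic during training.
    text_encoder.requires_grad_(False)
    text_encoder.eval()
    return text_encoder
```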
3. Why didn't you try baking for SDXL, specifically for Pony? Is it just that it's too compute-intensive and/or you couldn't make it work, or was Animagine good enough for you?
