/vt/ - Virtual Youtubers

>> No.67016708
File: 3.81 MB, 1152x2520, ComfyUI_03697_.png

>>67015605
Tiled VAE also exists in auto, and it's basically just a way to make VAEs more performant at larger resolutions, though I don't remember it literally being more than twice as fast.
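To make the tiling idea concrete, here's a rough sketch of how an image could be cut into overlapping tiles so that only one tile has to be processed at a time. This is not the actual auto or ComfyUI code: `tile_boxes`, the 512 tile size and the 64 overlap are all made up for illustration, and the real nodes work on latents and blend the seams.

```python
def tile_boxes(height, width, tile=512, overlap=64):
    """Return (top, left, bottom, right) boxes covering a height x width image.

    Hypothetical helper for illustration only; tile and overlap are assumed values.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap

    def starts(size):
        # One tile is enough if the whole side fits inside it.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)  # last tile sits flush with the edge
        return positions

    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in starts(height)
        for left in starts(width)
    ]


# Dimensions of the image attached to this post (1152 wide, 2520 tall).
boxes = tile_boxes(height=2520, width=1152)
print(len(boxes), "tiles, first:", boxes[0], "last:", boxes[-1])
```

The point is that each pass only ever sees a 512x512 chunk instead of the whole 1152x2520 image, so peak memory stays small. Whether that ends up faster or slower overall depends on the card.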
The TextEncode is, as far as I understand, a new way for SDXL to handle prompts. There's a G CLIP, which is supposedly for "concepts", "complicated things", etc., and an L CLIP for the actual contents of the image, think characters etc. I've interpreted that as "G CLIP for abstract stuff, L CLIP for physical stuff".
There's also height, width, targetHeight, targetWidth, though I have no idea what exactly they do. They seem to act almost like a mask, in that the prompt is focused on that area, and making it smaller, for example, can create zones where the prompt is less relevant (check this for example, which was a 1girl prompt: https://files.catbox.moe/3akbgq.png). Overall there are like three Reddit comments where anything about this is discussed, and otherwise it's just been people confused for the past 6 months.

>>67015678
Check above for my experience with PromptEncodeSDXL. I honestly just think nobody has any clue, probably not even the people behind it all.
Hm, for me it's the other way around for the tiled encoder. Literally more than double the speed (for the encoding, not overall; overall a 2-pass 1152x2016 goes from around 200 secs to around 120-140 secs), as well as the psychological benefit of seeing a progress bar. But I'm also an AMDlet, maybe it just can't handle normal VAEs.
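For what the overall numbers work out to, here's the arithmetic as a quick sketch. The times are just the rough ones quoted above, and `speedup` is a throwaway helper, not anything from ComfyUI.

```python
def speedup(before_s, after_s):
    """Ratio of old runtime to new runtime (higher means faster)."""
    return before_s / after_s


before_s = 200                 # rough 2-pass 1152x2016 time without the tiled node
after_range_s = (120, 140)     # rough time with it

for after_s in after_range_s:
    saved = before_s - after_s
    print(f"{before_s}s -> {after_s}s: {speedup(before_s, after_s):.2f}x overall, {saved}s saved")
```

So overall it's somewhere around 1.4x to 1.7x, even though the encoding step on its own is more than 2x.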
