
/vt/ - Virtual Youtubers



File: 97 KB, 850x604, __tokino_sora_and_a_chan_hololive_drawn_by_funi_mu9__sample-b26f4a25e663693b9f19520ca730e94c.jpg
No.45061611

Sup schizos and shitpos

Just wanted to put something I've been trying of late out there: I've been using OpenAI's Whisper sequence-to-sequence neural net model (specifically the large variant) to try and generate AI subs for Hololive VODs. As an example of a result I got today, here are some subs for Subaru's latest VOD (the Resi Remake playthrough) that I generated over the course of about 15 minutes.

It does get stuck in places (this one shows the same line for three minutes at the start until Subaru starts talking, for instance), and since this is an AI that doesn't understand context and Japanese is a language comprised of 70% guesswork, stuff can sound janky - so I wouldn't advise using these without at least rudimentary Japanese knowledge, but it can still be an excellent tool to make JP VODs more watchable for those who aren't fluent in nihongonese.

You can find the SRT file I generated here:
https://files.catbox.moe/0f8abp.srt

And if you don't want to download the VOD and apply it manually you could use this extension or something like it to apply the subs directly to the youtube VOD:

https://chrome.google.com/webstore/detail/subtitles-for-youtube/oanhbddbfkjaphdibnebkklpplclomal?hl=en

>> No.45061723

Okay turns out the subs are mistimed because I'm a retard and used the wrong timestamp, lemme just redo this real quick

>> No.45061790

In the meantime here's a proper example using part 1 of Fubuki's Dead Space playthrough:

https://files.catbox.moe/5b6tzz.srt

>> No.45062198

>>45061790
Be aware it takes a little bit to kick in, it starts subtitling at about 30 seconds in.

>> No.45062271

>>45061611
Thanks for your hard work Anon!

>> No.45062308

Couldn't you at least link the streams you're talking about

>> No.45062424

>>45062308
Sure thing anon, here's the Fubuki stream you can use the non borked subs with:

https://www.youtube.com/watch?v=pAT_AZCNTck&

>> No.45063874

>>45062424
Seems to work decently well, at least for well-formed sentences that she reads off the game and short simple reactions that she has
But I'm more interested in seeing how it reacts to difficult content

>> No.45064047

>>45061611
Here's a fixed version of the subs:
https://files.catbox.moe/plicl4.srt

For this VOD:
https://www.youtube.com/watch?v=fp6l92a0oXQ

>> No.45064093

>>45063874
Any suggestions anon? I've gotta go sort out some paperwork so I can throw my workstation at em in the meantime and post the results when I get back

>> No.45065726

>>45064093
Something really difficult would probably be this Koyo and Noel bathing ASMR
https://www.youtube.com/watch?v=pAT_AZCNTck
>multiple speakers
>constant background noise
>difficult to hear voices
>innuendos
Or one of those big collabs with many people talking over each other, I don't realistically expect it to get much at all out of those
https://www.youtube.com/watch?v=xfon1W9BCVs
Or a holo who mumbles a lot, Aqua maybe?

>> No.45065944

>>45065726
I'll throw it at KoyoNoel but I can tell you right now it would implode with one of the big chaotic group collabs, might feed it through anyway as a test though

>> No.45066018

>>45065944
As for Aqua I don't think it'll struggle too much, but when stuff is mumbled to a degree where you have to infer what word it's meant to be it'll probably become much less accurate (though it does have a limited ability to do guesswork)

>> No.45067536

>>45065726
Yeah that bathing ASMR is too brutal for it with all the noise and the low speech volume. While it did better than I expected it's not even worth posting the results, and the same thing would happen with one of the chaotic large collabs I'd wager

>> No.45068881

can i get subs for the recent minecraft vod from pegora? thanks

>> No.45068944

doing god's work, anon
bless

>> No.45069300

>>45068881
Sure thing anon, give us about half an hour and I'll throw em up

>> No.45069368

>>45061611
impressive stuff, anon
with the gpt-4 stuff from today, it seems like a universal translator is possible within the year.

>> No.45069414

>>45061611
/jp/ on suicide watch. Imagine wasting 10 years of your life learning a dead language when machines are close to translating it.

>> No.45070204

>>45068944
Stream downloaded - network has been kinda slow so it took a bit more time than I anticipated. Throwing the AI at it now.

>> No.45070248

>>45069414
i know this is difficult to understand for monolinguals, but learning a language does not mean you translate everything back to your native language
translation is a different skill and will never give the same result

>> No.45070343

>>45067536
I figured, thanks for trying though

>> No.45070472
File: 181 KB, 400x400, 1623017241427.png

>>45070248

>> No.45070557

>>45070472
One day AI will surpass all human ability, and we'll learn to appreciate imperfection instead of always scrutinising art by its technical quality.

>> No.45071612

>>45061611
>>45069300
source code/how can I build this myself? thanks

>> No.45071701

>>45071612
https://github.com/openai/whisper

Instructions for installation can be found here - be aware that if you don't have a tensor-core-equipped GPU it's gonna be slooow
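
The short version of their setup, if memory serves (double-check the repo README in case it's changed, and note you also need ffmpeg somewhere on your PATH):

pip install -U openai-whisper

or, if you want the bleeding-edge version straight from the repo:

pip install git+https://github.com/openai/whisper.git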

>> No.45071764

>>45071612
As far as arguments are concerned I use --language Japanese --model large --device cuda --task translate --output_format srt
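
So the full invocation ends up looking something like this (the filename here is just a placeholder for whatever audio file you're feeding it):

whisper subaru_vod.m4a --language Japanese --model large --device cuda --task translate --output_format srt

That should drop a subaru_vod.srt next to wherever you ran it from.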

>> No.45071823

>>45068881
DONE!

AI Subs:
https://files.catbox.moe/wro0im.srt

VOD:
https://www.youtube.com/watch?v=xq6lCaie4Ag

>> No.45071884

japanese class was really that hard? lol

>> No.45071960

>>45071884
If you're not using every resource the tech you worked for provides you're cucking yourself anon

>> No.45072017

>>45071764
yeah I found it, thanks. is --task translate better than piping to other software though? skimming the paper now

>> No.45072146

>>45072017
Whisper performed better than the SOTA in zero-shot translation tests, and in my experience it beats out DeepL by a fair bit in terms of making subtitles legible.

It does make a fair few mistakes but if you know even basic Japanese vocab the mistakes it makes are a lot easier to ignore than the nonsense DeepL outputs half the time when fed with a transcript.

>> No.45072238

>>45072146
their paper claims Maestro is better (for the given datasets ofc) for X -> EN on the largest models (that you're using). it appears closed source though or at least I can't find it

>> No.45072287

>>45072238
Yeah sadly that model ain't open source so it can't be built and readily used by your regular joe schmoe like Whisper can

>> No.45072433
File: 976 KB, 1920x1080, image-min.png

>>45072287
The fact it can translate what Pekora says at 4:36 this accurately is pretty insane considering she's putting on a chuuba voice and speaking crazy fast

>> No.45072463

>>45072287
now I wonder if you could make this live. pipe chunks of audio + rollback window (for improving accuracy). thoughts?

>> No.45072596

>>45072463
You definitely could but you'd need bloody fast hardware to do it. I have an RTX 4080 (got it discounted) and even that probably wouldn't be able to keep up with a live feed.

>> No.45072707

>>45072596
Actually, turns out someone's already had a crack at it:
https://github.com/fortypercnt/stream-translator

>> No.45072718

>>45072596
15 minutes for a couple hour stream is pretty good though, is the issue non-amortized starting cost?

>> No.45072791

>>45072707
I'll take a look, thanks

>> No.45072922

>>45072718
I stand corrected, considering the existence of the repo I posted above and the faster fork of whisper you can definitely do this live if you have high end hardware. Exciting stuff!

>> No.45073002

>>45064093
Houshou Marine zatsudan. She speaks really fast.

>> No.45073100

>>45073002
The final boss of nihongonese, shoot us a zatsudan VOD of your choice and I'll feed it through anon

>> No.45073381

>>45072707
There's this as well actually
https://github.com/Awexander/audioWhisper
God I love FOSS shit

>> No.45073457

>>45072922
faster fork? you'd have to modify their thing though then right

>> No.45073477

>>45073381
The demo video used in the repo is even a Subaru clip, guess I've been beaten to the punch!

>> No.45073581

>>45073457
https://github.com/guillaumekln/faster-whisper#installation

Apparently it's up to 4x faster without losing any accuracy

>> No.45073849

>>45073581
oh I read the doc, it's compatible with the live translator. I'll try it out and see if my build can take it, I have worse specs than you

>> No.45074357

Considering the existence of that repo that outputs live TL to the command line, all we need is to pipe that into an overlay and if it works well we have live subs

>> No.45076505

bumping good thread

>> No.45076653
File: 40 KB, 220x220, 1678700478695693.gif

>>45071884
>paying to learn a language

>> No.45076800

>>45064093
some say polka is really difficult to understand, please try generating subs for some of her talking streams like
https://www.youtube.com/live/L1ldmdqxV8A?feature=share

>> No.45077507

>>45076800
Sure thing, I'll give it a shot

>> No.45078720

Sorry, this thread is too smart for me and I don't understand anything that is going on. Have you guys figured out a way to translate live streams for dumb fucks like me? or do I have to wait longer for that to happen?

>> No.45078870

>>45078720
Theoretically?
https://github.com/fortypercnt/stream-translator
https://github.com/Awexander/audioWhisper
I'm trying to install the first project above on minimal requirements right now. If it works out reasonably well I might make a rentry for literal retards similar to what they have over in the AI threads.

>> No.45079050

>>45078720
VODs yes, live streams - almost

>> No.45079240

>>45078870
>>45079050
That's amazing. Looking forward to those guides!

>> No.45079461

I love you autists so much

>> No.45079583

>>45076800
Done! Output below:
https://files.catbox.moe/c2u91o.srt

>> No.45079631

>>45079583
Seems it had trouble with Polka because it got stuck for like the first half hour

>> No.45080813

Forgot to mention, to use it with your GPU's CUDA cores you need to do the following:

pip3 uninstall torch
pip cache purge
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
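
Quick sanity check that the swap actually took (run it from the same venv/prompt you'll be running whisper from):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

If it prints a version ending in +cu117 and True you're set; if you see +cpu or False you're still on the CPU build.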

>> No.45080854

>>45079583
thank you checking it out rn

>> No.45081086

>>45080854
I should warn you, polka seemed to be beyond its capabilities

>> No.45081629

>>45073100
I don't have one on hand, sorry. She's done loads though, so you could probably pick any.

Also, Luna and Miko because people think they're hard to understand, and Korone because apparently she speaks in a different accent

>> No.45081913

OP here, signing off for the night as it's getting incredibly late - if someone gets the live translation repo working be sure to report your findings here, I'm excited to see where we can take this!

>> No.45081984

>>45081913
Gets it working while I'm gone, that is - I'll install it myself come morning

>> No.45082009

>>45080813
I redid torch and the dependency installation from scratch (for stream-translator), and also made sure the CUDA version matched the torch version downloaded, but torch.cuda.is_available() continues to return False.
Thoughts? I'm continuing without it for now.

>> No.45082820

>>45082009
What GPU are you running?

>> No.45082917

>>45082820
RTX 2070

>> No.45083031

>>45082917
Hm, that is odd. Are you running the install in a virtual environment or your regular windows environment?

>> No.45083129

>>45083031
virtual environment, idk if that affects anything

>> No.45083367

>>45083129
Right, which Pytorch version did you install after purging the one packaged with whisper?

>> No.45083461

>>45083367
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

>> No.45083640

>>45083461
Are you certain you installed that package within your virtual environment and not just in Windows?

>> No.45083874

>>45083640
I think torch exists locally, I'll try a clean build I guess. Should CUDA be installed on Windows or locally somehow? I just downloaded an installer and used it.
also as an update I can sort of get it to work with the small model. Lots (but not maxed) of CPU usage and it misses a lot of sentences/produces some nonsense, but a couple things are in common with what's actually said

>> No.45084243

>>45083874
I wouldn't worry about your CUDA toolkit install as long as you installed it through the official NVIDIA installer

>> No.45084529

>>45083874
As far as torch is concerned, if you're working in that virtual environment without using the regular openai whisper build, just let requirements.txt take care of things for you and don't worry about uninstalling torch. Failing that, run through the purge process from the post above but install the cu117 build of torch.

>> No.45084677

Worst case scenario just draft some 5head autists from /g/

>> No.45084696

>>45084529
hmm torch.__version__ gives a cpu version. I'll try further. Feel free to go to sleep if it's late for you

>> No.45085616

>>45084696
Yeah I think I'll do that if that's okay, it's too late for me to think clearly enough to troubleshoot over text

>> No.45085722

>>45085616
Looks like the cu113 install might be incompatible/phased out, I can get torch '1.13.1+cu116' in.
Thanks for the help, I'll record additional observations as I go.

>> No.45086540

>>45085722
torch.cuda.is_available() reports True now

>> No.45087581

>>45079050
what's the retard-proof way of translating vods? (generating english subs)

>> No.45088115

>>45086540
Doesn't use my GPU still for some reason. Medium model appears to run reasonably competently. Large model leads to
>torch.cuda.OutOfMemoryError
Despite apparently enough memory existing. Trying to fix that.
>>45087581
Build is here
https://github.com/openai/whisper
You'll need a reasonable GPU or it will be slow using CPU.

>> No.45090063

>>45088115
My GPU says python is using it and this disappears upon closing the program. So, perhaps the model is not using as many GPU resources as I expected (it uses a bit but seemingly more for the graphics of the cmd interface itself).

>> No.45090882

>>45090063
So apparently:
large model needs ~10GB of VRAM. RTX 2070 only has 8GB
https://github.com/openai/whisper/discussions/895
parameters for different sizes:
https://github.com/openai/whisper#available-models-and-languages
Going to try to set up faster-whisper and see if that helps
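
For anyone else following along, my understanding of the setup is roughly this - the exact converter flags might have changed, so treat it as a sketch and check the faster-whisper README:

pip install faster-whisper transformers[torch]
ct2-transformers-converter --model openai/whisper-large-v2 --output_dir whisper-large-v2-ct2 --quantization float16

The second command converts the original Whisper weights into the CTranslate2 format faster-whisper runs on, and float16 is supposedly what lets the large model squeeze into less VRAM.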

>> No.45091963

>>45090882
faster-whisper can handle the large model in <8GB VRAM. Seems to use less resources overall. Either way though it doesn't appear to be using a lot of my GPU on defaults, going to fiddle with parameters now.

>> No.45092412

Imagine when live TL is possible how hard /jp/cels will seethe now that they cannot gatekeep lel

>> No.45093033

>>45092412
>Imagine
I'm literally running it on my computer right now, anon.

>> No.45093176

>>45091963
Looks like it can randomly have anywhere from a couple seconds of latency to almost half a minute. Plus the memory loops that were mentioned above. This is with
>--use_faster_whisper --model large --language en --interval 2 --history_buffer_size 20
on an English stream so I can easily validate it.

>> No.45093989

>>45093033
Can you hook it onto a livestream and not just feed it vods?

>> No.45094063

>>45093989
Yes, that is what I was referring to. I'm still doing some tests but my notes about it are above. Currently doing some checks with VOD translation.

>> No.45095593

>>45093176
update: trying to get cuDNN

>> No.45097660
File: 282 KB, 1200x857, 1675486859884534.jpg

>> No.45097799

>>45095593
not sure if it did anything. bumping while I create a makeshift instructions file for retards

>> No.45097822

holy shit, had no idea things were this far along. gonna clone this repo and give it a shot with the chinchilla stream tonight + 4090, will report results

>> No.45100033

>>45097799
tempbump

>> No.45100666

>>45061611
>>45062271
>>45071612
>>45092412
>>45093989
Rentry for stream-translator >>45072707 live TL
I am stupid and bad at software so I probably wrote it in a stupid way
rentry co live-tl

>> No.45101001

>>45097822
>>45100666
thanks anon, that's a nice write-up.
naturally I pick the stream to try this where the streamer sets up translation on her side lmao. should have known better than to try this on the latest versions with Windows native. faster_whisper does not like it and nvidia says it's unsupported. will have to try again later with either downgrades as in the guide, WSL, or Linux native.

>> No.45101069

>>45101001
You can compare the output to her translations. What are your specs btw, and what's the exact command you're trying to run?

>> No.45101801

This sounds super neat. Always wanted to watch the JP girls but I'm already trying to balance learning two other languages

>> No.45102023

>>45101069
yeah I did without faster_whisper... it was pretty usable most of the time, def enough to explore using it more. anywhere from 5-60 seconds behind, seemed to get stuck occasionally. her autotranslate clearly isn't super good either, even with my shitty Japanese hers was pretty off often so tough to compare.
just running the default options atm, no tweaks and no faster_whisper, so like this:
> python translator.py https://www.youtube.com/watch?v=IjWCTun0K1M
on i9-13900k 64gb ddr5 6000mhz rtx4090
not convinced gpu acceleration was working right even on the default model, cpu was getting pretty hot. pulled in the cuda deps with winget so I got latest versions which nvidia claims doesn't support gpu acceleration on windows anymore, so it probably screwed up someplace. been using this box mostly for gaming and not really a great windows dev, so I'll give it another shot tomorrow in either WSL or finally get around to throwing some linux on the other nvme

>> No.45102196

>>45102023
rtx4090 should be way more than enough. check the bottom of the rentry for a couple updates on how to check if your gpu is being utilized
also use --model large since you can support it, it should be noticeably better.

>> No.45102231

>>45102023
also I checked the stream and she is playing an english dub. try adding "--language ja" as well, might help since it doesn't have to decide what/how to translate as much?

>> No.45102340

>>45102231
>>45102196
yup gpu was off, for some reason pip brought in torch+cpu. hang on, I'll monkey with pip and see if I can get the gpu version for the default model tonight at least.

>> No.45102406

>>45102340
try
pip uninstall torch
pip install torch torchvision torchaudio --no-cache-dir --extra-index-url https://download.pytorch.org/whl/cu116
(inside your venv). Replace 116 by the version of your CUDA (11.6 or 11.7).
The default, and what is contained in requirements.txt for stream-translator, is 113, which did not work for me and caused this issue.

>> No.45102512

>>45102406
yeah pretty much, but I'll need 121 I think since winget grabbed the latest.
> nvcc --version
> nvcc: NVIDIA (R) Cuda compiler driver
> Copyright (c) 2005-2023 NVIDIA Corporation
> Built on Wed_Feb__8_05:53:42_Coordinated_Universal_Time_2023
> Cuda compilation tools, release 12.1, V12.1.66
> Build cuda_12.1.r12.1/compiler.32415258_0

>> No.45102574

>>45102512
be aware if you do that you'll have to build pytorch yourself for the time being
https://discuss.pytorch.org/t/install-pytorch-with-cuda-12-1/174294
You can also have multiple CUDAs downloaded with no issue (and I think the latest version doesn't override the older ones but not sure since I uninstalled 12.1)

>> No.45102850

>>45102574
ugh and that means getting msvc and all that jazz set up on this install. Ok path of least resistance: I'll try the downgrade real quick. anything wrong with 11.8? Or should I go 11.7?

>> No.45102887

>>45102850
I think 118 should work, I'd be interested to know if it does easily. 117 and 116 definitely work based on me and the other anon.

>> No.45103697

>>45102887
ok downgraded to 11.8, readded cuDNN and readded torch to the venv as described. gpu is working properly now.
gave it another try as below (I already converted the models earlier):
> python translator.py https://www.youtube.com/watch?v=IjWCTun0K1M --use_faster_whisper --language ja --model large
and holy shit it's completely live. I had to refresh my stream window because the translations were ahead of what was being said. system is much happier, no noticeable heat increases or system lag. super impressive.
thanks for the help anon, nice writeup. let me know if there's anything else you wanna test

>> No.45103899

>>45103697
you're welcome!
I'll add 118 to the rentry. for now nothing. I think the other guy is better at this stuff anyway. I really should sleep in time for work, will maybe try faster-whisper VOD translation tomorrow
Also I'll read any post on /vt/ containing the string/word "rentrylivetl" in the body if anyone notices any errors/updates in the writeup

>> No.45104223

>>45103899
Yeah me too, already gonna be light on sleep. I'll try to keep an eye on these threads and that repo. it's working pretty fantastic as is but would be nice if it exited a bit cleaner when the stream ends, had to spam control-c to get my console back. I'll try to take a shot at fixing it, PR if I get something working.
thanks again!

>> No.45104375

>>45104223
it takes a while to fully shut down after ctrl+C is registered, not fully sure why but it's probably unloading the model. if it fails to load the model but stays in limbo it exits instantly, and smaller models seem faster

>> No.45104565

I've always wanted to see anki decks for specific JP vtubers' most common spoken words. Hopefully we're one step closer to that.

>> No.45104622
File: 681 KB, 1300x1190, 1639030462201.jpg

you could sell this to Cover and become a multimillionaire

>> No.45105368

>>45104375
you'd def know better than me, I'm a complete hack compared to the all-star who got this working.
for what it's worth though, on my system it seems to hang indefinitely until I spam interrupts, then I get an exception from the ffmpeg read on line 131. makes me think maybe it's waiting on the thread for some reason... maybe tweaking the shutdown could make it faster? I could be totally off though and it's already great so don't waste a second on it. I'll poke around and if I find any way to improve it (unlikely) I'll post / PR it.
again great work on all this anon, it's truly fantastic. star trek level stuff

>> No.45106859

Is there a way to use Stream Translator for Twitter Spaces? Streamlink doesn't seem to have a plugin for it.

>> No.45107639

>>45106859
You could try using audiowhisper to translate your desktop audio instead of pulling from a stream source

>> No.45107680
File: 147 KB, 300x300, 1677678276373506.png

>>45104622
And then lose it all getting sued by OpenAI because it's a non-commercial licence

>> No.45107918

>>45107639
audiowhisper doesn't work with faster-whisper right? Default whisper doesn't run on my 3070 TI for some reason.

>> No.45108422

>>45107918
Default whisper ought to work provided you don't use the large model, which needs 10+ gigs of VRAM.

>> No.45110156
File: 53 KB, 739x1024, 1671145514209769m.jpg

Imagine the amount of time some of these /jp/ gigasperg fags spent learning Japanese for the sole purpose of watching internet anime girls and we are now able to have the same experience for zero effort. Skynetfc wins again lol lmao get dunked on meatbags.

>> No.45112803

>>45110156
learning japanese is fun

>> No.45113544

Testing the live translator now, RTX 4080 FP16 - wish me luck

>> No.45113779

>>45113544
A few seconds behind live, seems slightly less accurate than the stock whisper implementation with VODs but that's to be expected really

>> No.45113927

>>45113779
Interesting experiment but for the time being I think I'd recommend just using vods over the live translator

>> No.45114190

Trying Live TL as well, RTX 3080. Speed of tl is within 6 seconds. Accuracy is very spotty.

>> No.45114348

>>45110156
>same experience
lol
lmao

>> No.45114570

>>45114190
The interval flag for calls to the language model is set to 5 seconds by default so that'll be why it's behind by that much. Of course, dropping it could adversely affect the already subpar accuracy of this jerry-rigged live implementation
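
If anyone wants to fiddle with it, it'd be something along these lines (URL is a placeholder, flags are the ones from the repo/earlier in the thread):

python translator.py https://www.youtube.com/watch?v=XXXXXXXXXXX --use_faster_whisper --model large --language ja --interval 3 --history_buffer_size 20

Lower --interval means the model gets called more often, so less delay but also less audio context per call, which is probably where any extra accuracy loss would come from.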

>> No.45115593

>>45114348
Even if it's not there perfectly now it will be within 6 months or less, keep coping - unless you want to tell us how you're a master at Japanese dialects and nuanced jokes?

>> No.45116329

>>45114570
Nahone
>>45115593
It's currently struggling with very basic Japanese. Jokes, names, abbreviations, slurred speech, and sometimes even loan words it either doesn't even attempt to translate or turns into complete gibberish. Shit, it even throws in random Cyrillic symbols for some fucking reason. Nonetheless I'm impressed with it, even in its current state. Also learn English you fucking faggot

>> No.45116349

>>45065726
https://www.youtube.com/watch?v=qh9bG3SnGrA
Try KanaMari's latest one, they use a lot of slang and talk fast, so that'd be a good test

>> No.45116377

>>45073100
try this anon >>45116349

>> No.45116702
File: 16 KB, 891x184, image_2023-03-15_222241187.png

>>45061611
I have been playing around with Whisper too and I found this GitHub repo that uses Whisper to translate audio from livestreams in real time. I can't get it to work on my end, perhaps you should try it.

https://github.com/fortypercnt/stream-translator

>> No.45116844

>>45116702
anon...
>>45072707
>>45073381
>>45073581

>> No.45116847

>>45116702
Mentioned further up the thread anon, we got it to work but it's not all there yet unless we discover some tweaks that can refine its accuracy

>> No.45116997

If even Google can't get good machine translation, I don't think some random anon with free software will be able to crack it.

>> No.45117021

>>45116844
>>45116847
Oops sorry my bad

>> No.45117057

>>45072707
Wish it could do live subtitling while playing back the video on VLC (since this just uses streamlink and whisper together).

>> No.45117105

>>45116997
We got "good machine translation", it obviously doesn't match human translation but it mogs Google.

>> No.45117151

>>45116997
Google is jobbing hard anon they're losing the AI war to Microsoft.

>> No.45117162

>>45117057
For that you want AudioWhisper, which uses stereo mix to translate desktop audio. If you're running video playback on VLC though you're better off just using stock whisper to generate an SRT file for the video.

>> No.45117219

>>45117151
And it's worth noting that this model is developed by Microsoft's AI partner

>> No.45117474

>>45116997
Whisper is from OpenAI which is now owned by Microsoft

>> No.45118261

Back for a bit before work.
>>45117219
>>45117474
microsoft is just a shareholder, not controlling, I think.
>>45104565
You don't have to translate, you can ask it to transcribe instead (default for whisper on VODs, option on stream-translator). Then filtering this could give a list of kanji easily, and maybe a list of words using an appropriate grammar regex? Might look into it later, any particular people you want to try it on?
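
A rough sketch of what I'm picturing, assuming a grep build with PCRE support (-P) and a made-up filename:

whisper zatsudan.m4a --language Japanese --model large --task transcribe --output_format txt
grep -oP '\p{Han}' zatsudan.txt | sort | uniq -c | sort -rn | head -n 50

First line transcribes instead of translating, second counts how often each kanji shows up and spits out the top 50. Proper word frequency would need an actual tokenizer since Japanese doesn't put spaces between words, hence the hand-waving about grammar regexes.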

>> No.45118336

>>45118261
Correct, hence why I said partner and not subsidiary

>> No.45118347

>Machine translating vods
Because otakmori just wasn't bad enough already

>> No.45118451

>>45118347
If Otakmori released whole ass vods they wouldn't be so bad

>> No.45118514

>>45118451
This is also better than Otakmori half the time

>> No.45118606
File: 189 KB, 1158x1637, borger.jpg

>>45118451
Yes they would be so bad

>> No.45118687

>>45118347
it was already possible to machine-translate clips into your native language with DeepL, but you had to transcribe them accurately first. definitely seen that before too.
Anyway, I would only use this for personal use. Technically you can get sued for monetizing the output of this

>> No.45119419

>>45118606
Okay yeah fair enough

>> No.45119460

>>45118687
Yeah, OP here - I really wouldn't use this for non-private content/SRT releases like what I did, as it's obvious that you're using an AI and OpenAI could crack down on you hard

>> No.45119643

Their paper claims 36.2 BLEU on high. Supposedly 30-40 is "understandable to good". However, it feels more in the 10-30 range depending.

>> No.45119739

>>45119643
Depends on the solution used. I find that downloading a VOD's audio stream at the highest quality possible then running it through the large model gives pretty good results - janky sure, but if you have basic knowledge of Japanese already it's basically just there to fill in gaps in your knowledge and you can catch the mistakes it makes easily

>> No.45120168

>>45119739
So you download the whole video, not just best-audio?
Also, what params are you using/what specs?

>> No.45120260

>>45119643
also BLEU is a pretty anal, corpus-dependent metric which explains a few things

>> No.45120476

>>45120168
Sorry, that comment was scuffed as fuck. What I meant to say was that I download the whole video, then strip out the video and keep just the audio using ffmpeg to save on space, then run it through stock whisper with the following parameters:

--model large --language Japanese --device cuda --task translate --output_format srt

As far as specs are concerned the only relevant one is the GPU I run it on, that being an RTX 4080
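
In concrete terms the pipeline is something like this (filenames are placeholders - the -acodec copy assumes the download has an AAC audio track like an mp4 does, otherwise drop the copy and let ffmpeg re-encode):

ffmpeg -i vod.mp4 -vn -acodec copy vod_audio.m4a
whisper vod_audio.m4a --model large --language Japanese --device cuda --task translate --output_format srt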

>> No.45120884

>>45120476
I would imagine just downloading best quality audio is sufficient?
Also, I did roughly this and got okay results, maybe a bit worse than what you said in places. Did not force --device cuda but that should be the default if it's available based on the code. RTX 2070

>> No.45120929

>>45120884
oh but I only did medium due to VRAM issues which might explain it. I'll do a trial on faster-whisper large when I get it to work later.

>> No.45120981

>>45120884
Yeah, using the -x option in youtube-dl would be sufficient, dunno why I wasn't doing that - that still downloads the video though, it just automates the step I was doing with ffmpeg.
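
i.e. something like this (URL's a placeholder):

yt-dlp -x --audio-format m4a "https://www.youtube.com/watch?v=XXXXXXXXXXX"

-x / --extract-audio just has it run ffmpeg for you after the download to throw away the video track, same step I was doing by hand.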

>> No.45121360

>>45120981
idk if that's necessary at all
https://github.com/openai/whisper/discussions/41#discussioncomment-3713140

>> No.45121448

>>45121360
I just do it to save HDD space when I store archives to try with different parameters, not because there's any functional advantage to it.

>> No.45122685

>>45121448
have you used faster-whisper yet? I don't see a native cmd util for it annoyingly and was putting off doing it myself

>> No.45122843

>>45122685
Faster whisper is as the name implies, faster whisper. Not much variance there but I do find it's a tad less accurate than the stock implementation, albeit much faster and with far less memory usage

>> No.45123081

>>45122843
How do you use it from cmd? Can only do it in stream-translator, need to copy some code or smth

>> No.45123995

>>45123081
I just used a disposable python script then dumped it when I didn't see much benefit over the stock whisper model for my purposes, faster whisper doesn't come with its own command line tools.

>> No.45124498

>>45123995
I'll make one myself then. Sadly my VRAM is not big enough to support large models without faster-whisper.

>> No.45126698

>>45061611
This is amazing anon, thank you so much for your work, managed to get the pego stream to work and it's really impressive. What are you gonna do with this? I feel a program or extension like what LiveTL did would be super popular (but i know jackshit about this kind of stuff so it's probably super hard to make lol)

anyways, nice work!

>> No.45127174

>>45126698
Could potentially make a centralised repository of AI generated subs for holo streams or do AI threads where I (and other folks with capable hardware) pump em out on request.

>> No.45127652

>>45127174
I would eternally kneel to you, anon. I honestly can't believe we are at this point with technology where this is possible. I wonder if live translating is possible in the future.

>> No.45127883

>>45126698
>>45127652
(other anon)
For an extension, it would be somewhat problematic since it would need to run locally on your GPU, so you'd need to do a bunch of setup listed in "rentry co live-tl" anyway, and it would just be calling a script from Chrome. Not sure how much benefit there is, and I'm personally not familiar with coding browser extensions.
For repositories, it seems more manageable. Probably just have something like stream links/metadata alongside the .srt/text files generated along with some data like build/parameters used.
Plus you can follow a model similar to how they distribute anime subtitle files.

>> No.45127955

>>45127883
We could potentially have an autoloader for pre-generated community accepted subs from said central repository, maybe even slightly tweaked and manually edited ones, that would be a bit more feasible as an extension.

>> No.45128035

>>45127883 (me)
to get around the GPU+setup issue, you could have a single person computing it for everyone.
Then obviously you could piggyback off of LiveTL by using its functionality and actually sending automated messages in chat, but I don't think anyone would appreciate that and it would be shut down. But maybe a third-party messaging system that you can opt-in to? IDK

>> No.45128066

you guys think an AMD RX6600 could run this without being painfully slow? i'm kicking myself for buying an amd when I had the chance to get a 3060...

>> No.45128089

>>45127955
I was thinking about extensions for live streams in particular as opposed to VODs, but you're right about that. What you're saying is basically just re-implementing youtube community subs in a sense, and piping in AI-generated stuff, right?

>> No.45128263

>>45128066
I think using faster-whisper and maybe only a medium sized model you should be able to achieve live translation. For VODs, faster-whisper and large should be a reasonable pace but not super fast. (I'm just basing this off of online comparison metrics for GPUs.)

The bigger annoyance is setup for AMD systems. My understanding is that pytorch with AMD GPUs only currently exists for linux and you have to do some non-standard build processes:
https://github.com/openai/whisper/discussions/105
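
Going off that discussion it looks like the Linux route swaps the CUDA wheel index for a ROCm one, something like the below - I haven't touched an AMD card myself, so treat this as a pointer rather than a recipe, and whether the RX 6600 is even on the supported list for that ROCm release is another question:

pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.4.2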

>> No.45128296

>>45128089
Yeah essentially, livestreams are a bit of a tough nut to crack but VODs are a lot more doable

>> No.45128398

>>45128035
Yeah, livestreams are a tough nut to crack but VODs are a lot more doable.

Actually, the extension mentioned at the top of this thread has support for subtitle searching using OpenSubtitles.org and Amara.org, so the hard work may be done for us there

>> No.45128456

>>45128263
Problem is you get worse translation results with the medium model in my opinion, it's fine for transcription but there's less room to manoeuvre for the kind of guesswork needed to translate when you have fewer parameters like that.

>> No.45128545

>>45128035
That person would need a hell of a workstation to make that possible, we're talking enterprise level shit or paid cloud computing

>> No.45128648

>>45128456
you can try with the large, idk if it'll be quite fast enough.
>>45128545
i'm imagining people can sign up, not just 1 for all.
plus, I'm pretty sure modern GPU systems could support a few streams live simultaneously, maybe using faster-whisper. There's only so many holoJPs.

>> No.45129099

I'll be out for a bit but hoping to come back with some further documentation

>> No.45129729

Seems to me that: https://chrome.google.com/webstore/detail/subtitles-for-youtube/oanhbddbfkjaphdibnebkklpplclomal?hl=en
In conjunction with: https://amara.org/

As a source for searchable, community-driven, AI-generated and manually editable subs may be our best bet here; we could make a bespoke repo and extension but really all the work's already been done for us between those two.

>> No.45130834

Would a 3060 be good enough to handle just translating from downloaded stream VODs?

>> No.45131310

>>45130834
The 3060 12GB is one of the few cards on the low-mid end capable of running the large model owing to its VRAM capacity, so yeah totally - though it will be considerably slower than higher-end options, it can definitely handle it.

>> No.45131711

>>45131310
what about the 3060ti? I was thinking of buying that one but if the 3060 can handle this better than the ti version I don't see much reason to get it, especially when the 3060 is cheaper where i live lol

>> No.45132639

>>45131711
The 3060ti would be unable to handle the large model when it comes to regular whisper, though it could do faster whisper. That being said the TI is leagues better than the 3060 for gaming, so go with that if that's your priority (which I presume it is).

>> No.45132998

>>45132639
Some gaming yeah, but I also wanted to check out Stable Diffusion. Could be useful for reference images for drawing.

>> No.45133776

>>45132998
If productivity is a concern the 3060 12GB would ironically be the better pick owing to its greater memory capacity, just be aware it isn't really suitable for gaming above 1080p these days, and ray tracing is out of the question.

>> No.45136105

>>45132998
get the 12gig version of 3060. TI only has 8 which is going to be a bottleneck/limit for all sorts of large models unless someone has created efficient versions

>> No.45137575

>>45129729
amara seems a bit corpo to me, although it would definitely work for publishing subs. trying to estimate how much data subs in en+ja would require when summed over all holo vtubers over all time, and the viability of alternative storage/distribution

>> No.45138776

>>45137575
Holo is a corpo so I think it's a bit of a moot point here really

>> No.45141995

edit coming soon

>> No.45142479

>>45097660
No no Fubuki is furendo!

>> No.45142565

You madlads

>> No.45143370

Updated
>rentry co live-tl
with VOD translation instructions. Maybe will add the following at some point:
>VOD translation with faster-whisper (only useful if you really need efficiency. e.g. if you do not have enough VRAM for large models)
>modifying whisper so that it saves partial data along the way, in case you need to terminate or restart or whatever

>> No.45144076

>>45143370
Doing god's work anon o7

>> No.45144346

>>45141995
nice work anon. some ideas if you want 'em:
might want to clarify a bit on the python versions, I initially tried with 3.12 and pip couldn't resolve dependencies for faster_whisper. needed to downgrade to 3.10.9 to get it to work (needed <3.11 according to pip, don't remember which dep). could just be something weird in my setup or overly cautious requirements in the deps if you have it working tho.
also might want to add a bit about pulling dependencies with winget, it's available by default in W11 and should be around in W10 v1809+. got everything from there except cuDNN.

>> No.45144588

I'm stupid and dumb, what's rentry co live-tl?

>> No.45144862

can someone else check the faster-whisper code for stream-translator and give a second opinion on whether it is (currently) compatible with history_buffer_size or not? My reading tells me their code doesn't use history for faster-whisper but I can't fully understand what arguments are being passed to model.transcribe
>>45144346
Another anon had issues pulling the right version of CUDA using winget. Anyway I didn't use it but can add a sentence that one can use it near the beginning. If you have any usage tips drop them here and I can add it.
I'll add the Python warning
>>45144588
put a dot between rentry and co and a slash between co and live-tl then put it in your browser. 4chan thinks it's spam sometimes hence the weird format

>> No.45146169

>>45144862
I feel like line 143 of translator.py should be
>if not faster_whisper_args
and otherwise the code should throw errors. Probably being dumb.

>> No.45147384

>>45144862
heh same anon actually, yeah nvidia doesn't have tags set up the same way as the other repos but it was still there.
you can install all the needed deps for the basic version by running these commands:
> winget install Git.Git
> winget install Python.Python.3.10
> winget install Gyan.FFmpeg
> winget install Nvidia.CUDA --version 11.8
things you MIGHT still need to do manually (winget usually handles adding stuff to path automatically but for me ffmpeg didn't work) :
> add the ffmpeg bin folder to PATH (as of current version it will be '%LOCALAPPDATA%\Microsoft\WinGet\Packages\Gyan.FFmpeg_Microsoft.Winget.Source_8wekyb3d8bbwe\ffmpeg-6.0-full_build\bin', if they update just find the new release in the packages folder). could probably do this as a command with something like 'setx path "%PATH%;%LOCALAPPDATA%\Microsoft\WinGet\Packages\Gyan.FFmpeg_Microsoft.Winget.Source_8wekyb3d8bbwe\ffmpeg-6.0-full_build\bin' but this is risky without knowing everything going on with your system's path, probably best to just do it from the GUI if winget didn't do it for you.
and finally for faster_whisper:
> download 'cuDNN v8.8.0 (February 7th, 2023), for CUDA 11.x' and extract contents of libs, include, and bin to the corresponding folders in '%PROGRAMFILES%\NVIDIA GPU Computing Toolkit\CUDA\v11.8'
> download 'Zlib 1.2.3' and extract zlibwapi.dll to '%PROGRAMFILES%\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin'
after that you should be good to go starting with item 2 in the rentry

>> No.45149571

>>45144862 (me)
I think faster-whisper for stream-translator doesn't use history but I don't see a good way to do it so I'm not going to mess with internals idk about
>>45147384
thanks will add

>> No.45152823

adding faster-whisper VOD soon but rebuilding some stuff

>> No.45154856

>>45152823
updated! plus some extra debugging techniques like cudnn recognition

>> No.45156094

>>45154856
gonna sign off. again, include "rentrylivetl" in a post complaining about the instruction document.
maybe will think about mass translation repositories and live stream coverage etc. as discussed with anons at a later date

>> No.45157263
File: 260 KB, 1424x1768, 1670683285713112.jpg

>> No.45160054
File: 276 KB, 1305x2048, 1657230282199987.jpg

Good thread don't die

>> No.45161104

>>45160054
do you have any content you want to see?

>> No.45163301

>>45160054
this, finally some good shit on the catalog

>> No.45165353

Bump

>> No.45167331

Do people still use ytarchive to download livestreams or is yt-dlp now better?

>> No.45167610

>>45167331
yt-dlp is finicky if you don't initiate at the beginning of the stream but want to download it live. --live-from-start did some weird stuff as of some months ago, haven't updated and tried the newest stuff since I don't need to archive live stuff halfway through often
In all other respects though, yt-dlp is superior.

>> No.45167959

>>45156094
hey anon, I've been playing with the source to try to fix the hanging problem I saw.
to recap the issue is the script hanging indefinitely when the stream ends or a single KeyboardInterrupt is received. have to spam interrupts to cause a hard exit in order to quit. not a huge deal really but will be an issue probably if there is ever an effort to crowdsource translations with daemons.
beware this could be a brainlet take, so if I'm dumb feel free to tell me or ignore
I think the KeyboardInterrupt or stream ending kills the streamlink process, which closes but leaves the ffmpeg process awaiting input; then the read for in_bytes in the main loop blocks waiting for bytes. everything sits until we spam enough interrupts to kill the ffmpeg process, then the model finally shuts down and it quits. might be a windows-specific problem since some of the process signaling stuff is missing there.
to fix, I edited your writer function to make sure the pipe gets closed down when either process dies. then everything works as expected and it shuts down within a couple seconds instead of hanging indefinitely. also added an except block so you don't get a traceback when quitting via keyboard interrupt. git diff -p:
> diff --git a/translator.py b/translator.py
> index 95247e6..d291b0b 100644
> --- a/translator.py
> +++ b/translator.py
> @@ -85,6 +85,8 @@ def open_stream(stream, direct_url, preferred_quality):
> ffmpeg_proc.stdin.write(chunk)
> except (BrokenPipeError, OSError):
> pass
> + if not ffmpeg_proc.stdin.closed:
> + ffmpeg_proc.stdin.close()
>
> cmd = ['streamlink', stream, option, "-O"]
> streamlink_process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
> @@ -182,6 +184,8 @@ def main(url, model="small", language=None, interval=5, history_buffer_size=0, p
> print(f'{datetime.now().strftime("%H:%M:%S")} {decoded_language} {decoded_text}')
>
> print("Stream ended")
> + except KeyboardInterrupt:
> + print("Quitting from interrupt")
> finally:
> ffmpeg_process.kill()
> if streamlink_process:
I can PR it if you want but it's super tiny, you might have your own way you want to do this, or I might just be crazy and the only one seeing this behavior.
thanks again for the fantastic script. def post if you wanna get a crowdsourced liveTL system going. if you're sticking with python, might be cool to handle the backend with fastapi + starlette websockets for vods / live.

>> No.45168691

>>45167959
it's not my code. I figured that it was something to do with the streamlink though just based on how it behaved. You can definitely pull request the dude if you want. I think there are other poor choices in the code but I'm not a dev so not really sure.

>> No.45171409

>>45168691
Yeah this thing needs some serious tweaking to really be viable

>> No.45173209

oi not yet

>> No.45176705

Bump

>> No.45179478

>>45061611
>>And if you don't want to download the VOD and apply it manually you could use this extension or something like it to apply the subs directly to the youtube VOD:
thanks

>> No.45179512

bump

>> No.45180169

Oh. I've been doing this a while with downloaded streams. The easiest ones to translate are ASMR, which can also be funny imo. I'll have to look at the live stuff, but I'm happy with VODs usually. Hope you get it working well!

>> No.45183307

>>45180169
it works OK. I don't have specs to run the large non-efficientized version though so what other anons might say is more reliable.
also, do you have any wrappers/useful utils or are you just directly applying whisper's native functions?

>> No.45184827

>>45183307
I'm not doing anything special, just a basic script to automate the process of running things.
