
/vt/ - Virtual Youtubers


>> No.30566244
File: 59 KB, 1073x1086, 1629503426541.png

>>30565313
Serious, I'm still working on the project. Since there are very few resources out there for this specific application, I figured it wouldn't cost much to at least share it here in case anyone had something to say.
Still, I'm aware it can fail at multiple steps right now. I need to clean up my mess before sharing it, and getting interesting results may take a lot of tweaking too.


To explain it simply, I often compare it to how Minecraft generates random terrain indefinitely while always following many rules, so that you get a coherent landscape and not just random blocks everywhere. Same idea, with music.
As for how it actually works: I have a lexical database with weighted links between words. It serves as the missing link between the songs used to train the model (manually labelled) and the other inputs like face tracking. External inputs get translated into words like Happy, Sad, Fearful, Angry, Surprised, Disgusted, Sleeping and their combos; visual inputs are handled the same way.
Input -> language -> labels -> set of song for the model
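
To make the "input -> language" step a bit more concrete, here is a minimal Python sketch of how detected emotion words could be expanded through weighted links into a semantic vector. Everything in it is an assumption for illustration: the LEXICON contents, the link strengths, and the to_semantic_vector helper are hypothetical and not the actual database.

```python
from collections import defaultdict

# Hypothetical lexical database: word -> {linked word: link strength}.
# These entries and weights are made up for the example.
LEXICON = {
    "Happy": {"joyful": 0.9, "bright": 0.6},
    "Surprised": {"sudden": 0.8, "bright": 0.3},
    "Sad": {"melancholic": 0.9, "slow": 0.5},
}

def to_semantic_vector(detected):
    """Expand detected words (word -> score) into a weighted word vector."""
    vector = defaultdict(float)
    for word, score in detected.items():
        vector[word] += score  # keep the detected word itself
        for linked, strength in LEXICON.get(word, {}).items():
            # A linked word inherits the score, scaled by the link strength.
            vector[linked] += score * strength
    return dict(vector)

# Example: face tracking reports mostly Happy with a bit of Surprised.
semantic_vector = to_semantic_vector({"Happy": 0.7, "Surprised": 0.3})
```

Words reachable from several detected emotions (here "bright") accumulate weight from each of them, which is one simple way combos could be handled.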

Once the input is identified as a set of weighted words (a vector), each track in the base model is weighted according to the cosine distance between the input's semantic vector and the track's semantic vector. Then the model is used to construct the stochastic matrix, which constructs the melody.
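
A rough Python sketch of the track weighting step, under some assumptions: vectors are plain word -> weight dicts, cosine similarity (1 minus cosine distance) is used so that closer tracks get more weight, negative similarities are clipped to zero, and the weights are normalized to sum to 1. The function names and the clipping/normalization choices are mine, for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse word -> weight vectors."""
    dot = sum(w * b.get(word, 0.0) for word, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)

def track_weights(input_vec, tracks):
    """Weight each track by its similarity to the input, normalized to sum to 1."""
    sims = {name: max(cosine_similarity(input_vec, vec), 0.0)
            for name, vec in tracks.items()}
    total = sum(sims.values())
    if total == 0.0:
        # Nothing matches: fall back to uniform weights.
        return {name: 1.0 / len(tracks) for name in tracks}
    return {name: s / total for name, s in sims.items()}

# Hypothetical labelled tracks and input vector.
tracks = {"upbeat": {"Happy": 1.0, "bright": 0.5}, "gloomy": {"Sad": 1.0}}
weights = track_weights({"Happy": 0.8, "Surprised": 0.2}, tracks)
```

With this toy data the "upbeat" track takes all the weight, since the input shares no words with "gloomy".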

Very simplified example: when on an F note, knowing the previous one was A, you have a 40% chance to go to C, 25% to go back to A, 15% to G, etc. (depending on tonality). This part is the trickiest to get working well; there are many other things than just the note to consider (its duration, harmonies, etc.). For now I'm going for chiptune style since it's pretty consistent and "simple" and can easily be generated by computer.
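
The note example above is basically a second-order Markov chain (next note depends on the previous two). Here is a minimal Python sketch of that idea, assuming melodies are plain lists of note names and each track contributes to the transition table in proportion to its weight. It ignores duration, harmony and tonality, and the names build_transitions and generate are hypothetical.

```python
import random
from collections import defaultdict

def build_transitions(melodies, weights):
    """Build P(next note | previous two notes), weighting each track's counts."""
    counts = defaultdict(lambda: defaultdict(float))
    for name, notes in melodies.items():
        w = weights.get(name, 0.0)
        for prev, cur, nxt in zip(notes, notes[1:], notes[2:]):
            counts[(prev, cur)][nxt] += w
    table = {}
    for state, nexts in counts.items():
        total = sum(nexts.values())
        if total > 0:
            # Normalize so each row of the stochastic matrix sums to 1.
            table[state] = {n: c / total for n, c in nexts.items()}
    return table

def generate(table, start, length, rng):
    """Sample a melody of up to `length` notes, starting from two given notes."""
    melody = list(start)
    while len(melody) < length:
        options = table.get((melody[-2], melody[-1]))
        if not options:
            break  # unseen state: stop instead of guessing
        notes = list(options)
        probs = [options[n] for n in notes]
        melody.append(rng.choices(notes, weights=probs, k=1)[0])
    return melody

# Toy data: two hypothetical tracks and their weights.
melodies = {"a": ["A", "F", "C", "A", "F", "C"],
            "b": ["A", "F", "G", "A", "F", "G"]}
table = build_transitions(melodies, {"a": 0.75, "b": 0.25})
melody = generate(table, ("A", "F"), 8, random.Random(0))
```

With these toy tracks, after A then F the table gives 75% to C and 25% to G, so changing the track weights directly shifts which melodies are likely.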

I thought that instead of using random seeds, using data like pictures and face tracking would be a new angle that has never been properly explored.

I don't have any identity I'm ready to attach to this right now, but it's a topic I've been researching for a while and will keep making progress on. This paper is a good start for understanding music prediction: https://www.ehu.eus/cs-ikerbasque/conklin/papers/jnmr95.pdf
