
/sci/ - Science & Math



File: 131 KB, 520x671, 03C04183-9C12-4A9D-ACA5-9A48F5840046.jpg
No.15325842

Based on what I can find on Google, regression on images is quite difficult.

I have a set of a couple thousand images that for all intents and purposes are equivalent to spectrogram data.

I’m trying to add a couple of layers to an ImageNet-pretrained network and have them spit out a regression value, but I’m getting terrible accuracy. Is image regression just not doable? Or very difficult? Any good papers or tutorials on how to make it work?

>> No.15325863

>>15325842
It's called Midjourney.

>> No.15325877
File: 91 KB, 513x900, E806C8CA-2121-4D31-8686-5F75408E5562.jpg

>>15325863
Maybe I should be more clear: I want something that will take in an image and spit out a value between 0 and 1

>> No.15325913

>>15325877
Do you want frame interpolation or image inpainting?

>> No.15325924

>>15325842
You should be searching for convolutional neural networks...

>> No.15325931
File: 33 KB, 474x266, 1680627097766950.jpg

>>15325913
He wants classification so neither. Regression means expressing some output value in terms of some input values as sums and products.

In this case it looks like he wants a value between 0 and 1 giving the probability that whatever he's looking at belongs to some class, e.g. that a patient has cancer.

>> No.15325940
File: 2 KB, 105x125, 1680626272491446s.jpg

>>15325842
If you have a spectrogram then you have the Fourier coefficients so just set up a fully connected network and use gradient descent. You're overcomplicating this by using existing neural network models because they're not designed to work with frequency domain representations.
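
A minimal sketch of that suggestion, assuming NumPy is available: a one-hidden-layer fully connected network trained by plain gradient descent on flattened inputs, with a sigmoid output so predictions stay in (0, 1). The data, layer sizes, and learning rate are made up for illustration; nothing here comes from the actual scans.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data: 200 flattened 8x8 "scans" with a target in (0, 1).
X = rng.normal(size=(200, 64))
true_w = rng.normal(size=64) / 8.0
y = 1.0 / (1.0 + np.exp(-(X @ true_w)))  # synthetic labels, not real data

# One hidden layer of 16 tanh units, then a single sigmoid output.
W1 = rng.normal(scale=0.1, size=(64, 16))
b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=16)
b2 = 0.0
lr = 0.2  # learning rate, chosen by hand for this toy problem


def forward(inputs):
    """Return hidden activations and the sigmoid output for a batch."""
    h = np.tanh(inputs @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return h, out


losses = []
for step in range(500):
    h, out = forward(X)
    err = out - y
    losses.append(float(np.mean(err ** 2)))

    # Backpropagate the mean squared error by hand.
    d_out = 2.0 * err / len(y) * out * (1.0 - out)  # gradient at the output pre-activation
    grad_W2 = h.T @ d_out
    grad_b2 = d_out.sum()
    d_h = np.outer(d_out, W2) * (1.0 - h ** 2)      # gradient at the hidden pre-activation
    grad_W1 = X.T @ d_h
    grad_b1 = d_h.sum(axis=0)

    # Plain full-batch gradient descent update.
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

print(losses[0], losses[-1])
```

On real data the training loss alone says little; the same loss would need to be tracked on held-out scans.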

>> No.15325974

>>15325931
Close. Classification is easy. I don’t want a 0 or 1, I want to know the exact value between 0 and 1. This is apparently still a hard problem, which is surprising to me.

I’m dealing with a bunch of material science data, and I don’t want to get too specific about what it is because I think only the small company I work for deals with it. But an analogy would be: I have spectrographic data. I want to know how loud it is on a scale from 0 to 1, not just “is it loud, 0 or 1?”
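
A minimal sketch of that distinction, using only the standard library. The raw outputs and targets below are made-up numbers, and `sigmoid` and `mse` are illustrative helpers, not part of any particular framework.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def mse(predictions, targets):
    """Mean squared error, the usual loss for a regression target."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical raw network outputs and the true continuous labels.
raw_outputs = [-2.0, 0.0, 1.5]
targets = [0.10, 0.45, 0.80]

# Regression: keep the continuous value and score it with MSE.
regressed = [sigmoid(z) for z in raw_outputs]

# Classification: threshold at 0.5, which throws the magnitude away.
classified = [1 if p >= 0.5 else 0 for p in regressed]

print(regressed)
print(classified)
print(mse(regressed, targets))
```

The network body can be the same in both cases; what changes is that the output is kept as a number and trained against a regression loss instead of being thresholded.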

>> No.15325979

>>15325924
I tried using a CNN, that’s what image net is. It didn’t work very well, which could be because I’m retarded, but I can’t find anyone who seems to be able to make it work.

>> No.15325985

>>15325940
Sorry it’s not actually spectrogram data, just looks like it. It’s material science data, but it’s on a 2d plot like that. Just wanted to clarify that it’s not traditional images. Don’t want to get too specific about the data since I think it’s a very niche thing that only the company I work for does, but it’s basically responses to different voltages and frequencies of inputs.

I could try taking some kind of Fourier transform of the data and seeing if that gives me anything that might be simpler to train with.
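
One way that idea might look, assuming NumPy and treating each scan as a plain 2D array: take the 2D FFT and keep a small block of low-frequency log-magnitudes as features. `fft_features` and the `keep` parameter are hypothetical names, and whether low frequencies carry the useful signal is an assumption that would need checking on the real data.

```python
import numpy as np


def fft_features(scan: np.ndarray, keep: int = 4) -> np.ndarray:
    """Return log-magnitudes of a keep x keep block of low spatial frequencies."""
    # fftshift moves the zero-frequency (DC) term to the centre of the array.
    spectrum = np.fft.fftshift(np.fft.fft2(scan))
    magnitude = np.log1p(np.abs(spectrum))
    rows, cols = magnitude.shape
    r0, c0 = rows // 2, cols // 2
    half = keep // 2
    # Slice a small block around the DC term.
    block = magnitude[r0 - half:r0 + half, c0 - half:c0 + half]
    return block.ravel()


rng = np.random.default_rng(0)
scan = rng.random((32, 32))  # placeholder for one 2D scan
features = fft_features(scan)
print(features.shape)  # (16,)
```

A short feature vector like this could then be fed to a small regressor instead of the full image.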

>> No.15325997

>>15325974
The industry standard is convolutional networks with bells and whistles like dropout and whatnot, but you probably don't have enough data, so you will have to do some manual feature engineering or worry about overfitting.

There is no simple answer anyone can give you that will work for your specific use case. You'll need to figure out what operations you think will be useful and then connect them in some topology and then you can run gradient descent to figure out the parameters.

If that doesn't work then you can also try using evolutionary algorithms to perform a search over a wider class of functions but you'll again have to do some manual engineering. There is no off the shelf solution for what you're trying to do.

> https://evotorch.ai/
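
For a feel of what an evolutionary search does, here is a minimal (1+lambda) evolution strategy in plain Python. It does not use EvoTorch's API; `fitness`, `evolve`, the toy linear model, and all the settings are assumptions made for illustration.

```python
import random


def fitness(params, data):
    """Negative mean squared error of the model a*x + b (higher is better)."""
    a, b = params
    return -sum((a * x + b - y) ** 2 for x, y in data) / len(data)


def evolve(data, generations=200, children=20, sigma=0.1, seed=0):
    """(1+lambda) strategy: mutate the parent, keep the best child if it improves."""
    rng = random.Random(seed)
    parent = [0.0, 0.0]
    best = fitness(parent, data)
    for _ in range(generations):
        # All children are Gaussian mutations of the same parent.
        offspring = [
            [p + rng.gauss(0.0, sigma) for p in parent]
            for _ in range(children)
        ]
        candidate = max(offspring, key=lambda c: fitness(c, data))
        score = fitness(candidate, data)
        # Elitist selection: the parent survives unless a child beats it.
        if score > best:
            parent, best = candidate, score
    return parent, best


# Toy data generated from y = 0.3*x + 0.2.
data = [(i / 10, 0.3 * (i / 10) + 0.2) for i in range(11)]
start = fitness([0.0, 0.0], data)
params, best = evolve(data)
print(params, best)
```

No gradients are needed, so the same loop works when the model contains operations that are not differentiable, at the cost of many more fitness evaluations.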

>> No.15326016

>>15325997
Ok good to know, thanks. I will try messing around with my data some more. I think there are probably some easy preprocessing steps I can do to make it better.

Roughly how much data do I need before overfitting stops being a concern? I have around 3.5k scans, but I think I can gather more if I need to. Probably not twice as many though.

>> No.15326044

>>15326016
i don't know, i'm not a statistician but in typical neural network training scenarios the sample sizes are much larger (probably by at least an order of magnitude).
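
Whatever the sample size, the usual practical check is to hold some scans out and compare training error with validation error. A minimal sketch of the split using only the standard library; `train_val_split` and the 20% fraction are illustrative choices, not a recommendation from the thread.

```python
import random


def train_val_split(n_samples, val_fraction=0.2, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into train and validation lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_val = int(n_samples * val_fraction)
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = train_val_split(3500)
print(len(train_idx), len(val_idx))  # 2800 700
```

If the error on `train_idx` keeps falling while the error on `val_idx` stalls or rises, the model is overfitting at that dataset size.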

>> No.15326062

>>15326044
Ok thanks. I might be able to get away with some data augmentation to reach that.
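
A minimal sketch of what such augmentation could look like, assuming NumPy and 2D array scans. `augment`, the noise level, and the shift range are hypothetical; whether a shifted or noisy scan still deserves the same label depends on the physics of the measurement, so each transform would need checking.

```python
import numpy as np


def augment(scan, rng, noise_std=0.01, max_shift=2):
    """Return a noisy, slightly shifted copy of a 2D scan (label assumed unchanged)."""
    # Random integer shift in [-max_shift, max_shift]; the upper bound is exclusive.
    shift = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.roll(scan, shift, axis=1)  # wrap-around shift along the columns
    return shifted + rng.normal(0.0, noise_std, size=scan.shape)


rng = np.random.default_rng(0)
scan = rng.random((16, 16))  # placeholder for one real scan
augmented = [augment(scan, rng) for _ in range(5)]
print(len(augmented), augmented[0].shape)
```

Augmented copies are correlated with their originals, so they should stay on the same side of any train/validation split as the scan they came from.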

>> No.15326127

>>15325842
So you have images of data showing some kind of intensity? Is it interpolation you want, where you have one image with the lowest intensity and one with the highest, and all the other images are somewhere in between? Or do you not know what the min and max intensity could be, so you want to extrapolate using the images you already have? If it's a visual task, can you just count the number of pixels of some intensity in each image and compare that against the images with the min or max counts at that intensity?
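
The pixel-counting idea can be sketched in a few lines of plain Python. `bright_fraction` and the threshold are hypothetical; the image is assumed to be a list of rows of intensity values.

```python
def bright_fraction(image, threshold):
    """Fraction of pixels whose intensity is at or above `threshold`."""
    pixels = [value for row in image for value in row]
    return sum(1 for value in pixels if value >= threshold) / len(pixels)


example = [[0.1, 0.9], [0.8, 0.2]]
print(bright_fraction(example, 0.5))  # 0.5
```

A handful of such counts at different thresholds gives a cheap hand-made feature vector to compare against the images with known minimum and maximum values.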

>> No.15326191

>>15326127
he does not want interpolation. the data is a grid of numbers aka a 2D matrix and he wants to extract information about magnitude/intensity of the sound waves as perceived by the human ear. 1 is louder than 0 which i guess would be silence.

this is not an interpolation problem, he's doing regression

>> No.15326356
File: 83 KB, 934x786, 64B358B2-9B17-4964-AFB1-822EC6816C1E.jpg

>>15326191
Exactly. In this case it is actually how many small defects are in a material based on its response to certain currents, but it’s the same idea. 1 in this case would be some theoretical “all defects” material, although the real dataset only has up to ~0.5, and 0 is no defects.

>> No.15326560

>>15326356
This is very similar to tomography where they use Radon transforms

> https://en.wikipedia.org/wiki/Radon_transform
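
For reference, the standard definition for a 2D function f, where theta is the projection angle and s is the signed distance of the line from the origin (the symbols are the usual textbook notation, not anything specific to this dataset):

```latex
(\mathcal{R}f)(\theta, s) = \int_{-\infty}^{\infty}
  f\bigl(s\cos\theta - t\sin\theta,\; s\sin\theta + t\cos\theta\bigr)\, dt
```

Each value is a line integral through the 2D image, so the transform takes a 2D function to a 2D sinogram indexed by angle and offset.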

>> No.15326562
File: 1.02 MB, 500x200, Radon_transform_sinogram.gif

>>15326560

>> No.15326799

>>15326560
Interesting, but it seems to be very specific to tomography because they are projecting 3d information onto a 2d plane. I don’t think that is occurring in my dataset.

>> No.15327303

t-sne

>> No.15327407

>>15325877
It's called Support Vector Machines or, if you are gigacoped, autoencoders.