
/jp/ - Otaku Culture



File: 161 KB, 907x747, 1daa9bff.png
No.24033684

Article: https://pctool.net/2020/03/whether-axfc-closed-or-not
Management Notice: http://www.axfc.net/koubo.html

Axfc Uploader, a Japanese file-sharing site, is in danger of shutting down.
The webmaster is trying to find someone else to host the site, but nobody has shown any interest so far.
For the entirety of April it was impossible to upload new files to the site.

Because of all this, I think we need to back up as much data as possible from the site, as a lot of it is one-of-a-kind.

The types of content Japanese users upload to this site include:
Hentai
Indie Games
Sheet Music / MIDI files
Utaite MP3s
MMD Assets and Motion data
StepMania songs
MUGEN characters
MAD, otoMAD resources
...and more.

It's very hard to run a scraper across the whole site for two reasons: a captcha on every download, and password-protected files.

Thus, I'm asking /jp/ if anyone wants to help out with this task.
If we can compile lists of links/passcodes to media (from NND mylist pages of uploads, various Japanese websites, etc.) and mirror everything, we might be able to get most of the important data before the site shuts down for good.
Anyone interested?

>> No.24033711

>>24033684
Why don't you just offer to host the site?

>> No.24033729

>>24033711
The site uses 160TB in a ZFS RAIDZ2 array.
Hosting a server with a database like that, not even counting the acquisition price, is already out of my budget.

>> No.24033780

>Anyone interested?
I am. I only have a few TBs to spare personally, but I could assist in the manual aspect.

The best first step would be to get into contact with Japanese archivists, who both have a much hotter flame under their butt, and who probably have actual contact with the owners or those in the know. My Japanese is weak, but enough that I could follow such efforts.

If anyone here can get into contact with them, or direct them to this thread, that would be of great value.

>> No.24033816

>>24033780
Great to hear. I can also communicate in Japanese, so if you know any data hoarders in Japan, that would be helpful too.

Currently I'm compiling a list of AXFC links found on Nicovideo for various subjects. Should I make a collaborative Google Doc for it? Or just pastebin what I have once I've done a good amount?

>> No.24034062 [DELETED] 

>>>/g/

>>>/g/

trannishitfucks tranni shit mods

shitposting /g/ shit

rangeban these fucking reddit cunts

truly retarded shit moderation

>> No.24034456

where's the heidi dump spaghetti promised?

>> No.24034834

I could lend a TB or two but I don't have any skills to offer.

>> No.24035187

>>24034834
The main help would just be to download any axfc link you find and make a mirror of it with another cloud site.

I started a download list.
Most of these mylists have comments attached for axfc links, or have downloads inside the video itself.
I scraped these by typing site:https://www.nicovideo.jp/ "axfc" into google, then sorting the playlists/videos into categories.
https://pastebin.com/kEdJf8SZ
Please feel free to add new sections to the list for other sites, and add more videos/lists to the categories.
It's fine to have some overlap. The next step will be to convert these lists/videos into link+password+description/source format lists, and finally to download.
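
A minimal sketch of that conversion step, assuming the comments or descriptions have already been saved as plain text. The `Entry` fields mirror the link+password+description/source format mentioned above; the regex and the function name are illustrative assumptions, and passwords still have to be filled in by hand since their wording varies from page to page.

```python
import re
from dataclasses import dataclass

# Matches any http(s) URL on axfc.net or one of its subdomains.
AXFC_URL = re.compile(r"https?://(?:[\w-]+\.)*axfc\.net/[^\s\"'<>]+", re.IGNORECASE)


@dataclass(frozen=True)
class Entry:
    link: str
    password: str     # empty when unknown
    description: str
    source: str       # page where the link was found


def extract_entries(text, source, description="", password=""):
    """Return one Entry per distinct axfc.net URL in text, in first-seen order."""
    seen = set()
    entries = []
    for match in AXFC_URL.finditer(text):
        # Drop punctuation that commonly trails a URL inside a sentence.
        link = match.group(0).rstrip(".,)")
        if link not in seen:
            seen.add(link)
            entries.append(Entry(link, password, description, source))
    return entries
```

Running it over each saved comment and concatenating the results gives a list where overlap between categories collapses naturally once the links are deduplicated.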

>> No.24035248

>>24033684
Why do the nips use this sort of shit anyway

Like actually? This site sucks and I've always hated using it.

>> No.24035257

>>24035248
looks nice and simple to me. much better than the sites that the rest of the world uses. material design, html5 and javascript were a huge mistake

>> No.24035264

>>24035248
Because it was essentially free file sharing indefinitely, and it was around well before Google Drive, MEGA, and Dropbox were the norm.

Why do people here use pomf.se clones or catbox.moe?

>> No.24035271

>>24035264
>>Why do people here use pomf.se clones or catbox.moe?
I don't! I never have.

And even if something has been around for longer, if it isn't better then I don't see why you'd use it. I'm not even saying put all your eggs in one basket, because there are actually many indefinite file hosts and there have been for quite a while.

>> No.24035288

>>24035271
i bet that you are one of those assholes who posts mega links

>> No.24035293

>>24035257
I mean there's nothing particularly fancy about 4shared, zippyshare, or mediafire but you can use them more easily than this site.

>> No.24035297

>>24035288
I usually post zippyshare links, which are not indefinite but carry less culpability since all of the uploads are anonymous.

>> No.24035305

so how much original and rare content does that site have? no one is going to host 100tb of shit but maybe someone could save the important bits of it

>> No.24035352

>>24035305
Most niconico users used it for sharing resources that their video had. So MIDI files, UTAU databanks, sheet music, MMD models and motion data, MAD video resources and sound assets.
I think the best thing to do is just to download the ones that are linked from nicovideo/youtube and independent creators' websites. No need to rip files of random personal stuff.

>> No.24035412

>>24035305
>no one is going to host 100tb of shit
Doesn't archive.org do precisely that?

>> No.24035484

>>24035412
quite useless for any copyrighted stuff

>> No.24035552

>>24035484
Depends, actually:
https://help.archive.org/hc/en-us/articles/360014759692-Rights
https://archive.org/details/softwarelibrary_msdos_games

They don't directly say it, but they try to keep content of indefinite copyright status for non-commercial uses, as long as copyright holders aren't going out of their way to shut it down.

>> No.24035797

>>24033684
This is a testament to how shitty japanese software architecture is. Can never trust them to run a stable website

If you don't believe me, there is a significant amount of data on AXFC over the years that has been lost due to sysadmin incompetence

Here's a list of all the gachimuchi-related AXFC links I could find, where a lot of them simply return a 404. I know that most of them were not intentionally deleted because if they were intentionally deleted, I would expect to see a lot more consistency with deleted links per video
https://gachi.bepis.io/browse/gachi%20index.xlsx

To anyone wanting to archive the content, make sure to change the final CDN links to point to the "gemini.axfc.net" subdomain, as it seems to be the most likely CDN domain to work
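
A rough sketch of that host swap, assuming the final links follow the `/d/...` CDN path layout described later in the thread. The function name is made up for illustration, and whether gemini.axfc.net actually serves a given file is not something this code can check.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_CDN = "gemini.axfc.net"  # subdomain suggested in the post above


def use_preferred_cdn(url, host=PREFERRED_CDN):
    """Point a final axfc CDN link at the preferred host; leave other URLs alone."""
    parts = urlsplit(url)
    hostname = parts.hostname or ""
    # Only touch axfc.net subdomains whose path looks like a CDN download path.
    if not hostname.endswith(".axfc.net") or not parts.path.startswith("/d/"):
        return url
    return urlunsplit(parts._replace(netloc=host))
```

Links on other sites and ordinary axfc page links pass through unchanged, so the function can be mapped over a mixed link list.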

As for purchasing the site, I am willing to drop $10k or $20k on it, however I am not sure if they would be willing to accept that

A lot of shit like the linux images, site software and migration/tech support I do not want as I would be throwing it away anyway. The biggest bottleneck would be hosting that amount of data
I only have 72TB in tape storage for example

Plus there's also the issue of the unknown minimum bid price, and them requiring a japanese legal entity

Might be too much effort for what it's worth

>> No.24035965

>>24035797
Thanks for the contribution.
From the links I've checked so far (right now I'm focusing on MIDI files), there's roughly a 60/40 chance that a file linked in the old format returns a 404, and the CDN is more likely to fail the older the upload is.

You could say that the majority of the damage was already done long before any shutdown, because of these file losses.

It might not hurt to try emailing them anyway; the webmaster looks desperate enough given all the site's troubles. Even if it means only a partial archive can be made from the data that is still available.

>> No.24037379

Anyone who browses 5ch or other Japanese BBS/forum style sites:

If you know if there are public efforts on 5ch or other sites to archive the site, linking them here or vice versa would be extremely helpful.

>> No.24038874

>>24037379
There's a 5ch thread about it here: http://leia.5ch.net/test/read.cgi/poverty/1590820156/

Using the search function I was able to scrape 1,600 MIDI files.
https://cloud.netcavy.net/s/9yRSPxFwMt7LZne
At least three times that amount is still up there, but those files are all password-protected, or in zip files bundled with mp3/wav.
They also IP ban you from downloading after a while if you do too much too quickly, so you need to get a proxy list.
It also seems to be a smoother experience when using a Japanese VPN.
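
One way to avoid tripping the limit in the first place is simply to pace requests. This is a minimal sketch of a pacing generator, not anything the site documents: the real threshold is unknown, so `min_interval` is a guess you would have to tune, and the `clock`/`sleep` parameters exist only so the pacing can be tested without waiting.

```python
import time


def paced(items, min_interval, clock=time.monotonic, sleep=time.sleep):
    """Yield items no faster than one per min_interval seconds."""
    last = None
    for item in items:
        if last is not None:
            # Wait out whatever remains of the interval since the previous item.
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item
```

Usage would look like `for link in paced(links, 30): download(link)`, where `download` is whatever fetch routine you already have.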

If anyone wants to do random .zip files, about 120,000 of them can be downloaded without the need for hunting on other websites for passwords. Maybe check the descriptions to determine if it's junk or not:
https://www.axfc.net/u/search.pl?search_str=&id_start=&id_end=&extv=zip&size_min=&size_min_si=2&size_max=&size_max_si=2&dl_min=&dl_max=&date_start=&date_end=&num=999&sort=1&sort_type=uid&sort_m=DESC&md5=&sha1=&key=1
Only large files require CAPTCHA.
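
The search link above can be reused for other file types by changing its query parameters. The sketch below only rewrites the URL; what each parameter means is inferred from its name, and whether the site accepts other `extv` values such as `mid` is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The zip search link from the post above.
SEARCH_URL = (
    "https://www.axfc.net/u/search.pl?search_str=&id_start=&id_end=&extv=zip"
    "&size_min=&size_min_si=2&size_max=&size_max_si=2&dl_min=&dl_max="
    "&date_start=&date_end=&num=999&sort=1&sort_type=uid&sort_m=DESC"
    "&md5=&sha1=&key=1"
)


def with_params(url, **overrides):
    """Return url with the given query parameters replaced or added."""
    parts = urlsplit(url)
    # keep_blank_values preserves the empty fields the search form sends.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params.update(overrides)
    return urlunsplit(parts._replace(query=urlencode(params)))


midi_search = with_params(SEARCH_URL, extv="mid")
```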

>> No.24043820

You can post this on the datahoarder subreddit but I don't think they will be interested

>> No.24045237

>>24033729
Is the owner looking for somebody with the hardware and knowledge he can give the website to, or just a buyer? I'm not knowledgeable on this topic, but can't he just put the first 8TB in a torrent and cycle it every x days so people can archive the site in parts, and then let the thing die?

>> No.24046081

>>24045237
there's probably a legal issue with personal information if he just freely distributes all the data, since the majority of the content is behind passwords. So I'm guessing he wants a buyer specifically, judging by how they were marketing the advertising potential

>> No.24049272 [DELETED] 

Bump.

>> No.24049285

>>24049272
Why?

>> No.24052365

>>24049285
Can't tell if retard from /v/ or actually concerned OP trying to keep the thread up due to the lack of interest

>> No.24055481

>>24045237
>>24046081
It also could be that the maintainer simply doesn't have the motivation to repackage it for archiving, or would encounter technical difficulties, if the issue with CDNs/broken links is to be believed.

Coming as someone who has been in a similar position, where a site I was running was basically unmaintained from my own apathy, there is a much greater appeal to it being someone else's problem entirely.

>> No.24055527

>>24038874
>They also IP ban you from downloading after a while if you do it too much too quickly, so you need to get a proxy list.
hahaha these dumb mother fuckers and everyone that uploaded to them exclusively deserve to burn

>> No.24055546

>>24055527
It's meant to be limited, not free access to download anything and everything.

>> No.24056263

>>24055481
This sounds so much more complicated and frustrating than sad panda. But if archives are protected by password who cares if someone downloads them? I mean you could brute-force them in theory but who has the motivation and hardware to perform the task?
Someone pointed out how difficult it would be to write a script to autodownload stuff, but people have done it for every other site that locks content behind accounts and captchas (see jdownloader). If I remember correctly, jdownloader even has a script that restarts your router as soon as you hit your IP limit, so this could indeed be scriptable if somebody provided a database with file names, file sizes and URLs. No need to have 160TB free on your system if you organize different groups to archive different files about different topics.
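
A sketch of how such a database could be divided up. Note the swap: the post suggests splitting by topic, while this splits by size, so that no group needs far more disk than another. The record layout `(name, size_bytes, url)` is an assumption about what the database would contain.

```python
import heapq


def split_by_size(files, groups):
    """Assign (name, size_bytes, url) records to groups, balancing total bytes.

    Greedy approach: take files largest first and give each one to the
    group that currently holds the fewest bytes.
    """
    buckets = [[] for _ in range(groups)]
    heap = [(0, i) for i in range(groups)]  # (bytes assigned so far, group index)
    heapq.heapify(heap)
    for record in sorted(files, key=lambda r: r[1], reverse=True):
        total, i = heapq.heappop(heap)
        buckets[i].append(record)
        heapq.heappush(heap, (total + record[1], i))
    return buckets
```

The result is not guaranteed to be the optimal split, but it is usually close and works in one pass over the sorted list.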

>> No.24058231

>>24056263
You can download whatever you want from this site easily once you get the unique code generated by the site's verification script.
The code can be seen in the url after the "dr=" section once you generate a download link.
The CDN format is https://[cdn server].axfc.net/d/[Code]/[Filename.ext]
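
A small sketch of the URL handling described above. It assumes "dr" is an ordinary query parameter on the generated download link and that the CDN expects percent-encoded filenames; both are guesses, and the function names are made up for illustration.

```python
from urllib.parse import parse_qs, quote, urlsplit


def extract_code(download_link):
    """Return the value of the 'dr' query parameter, or None if it is absent."""
    values = parse_qs(urlsplit(download_link).query).get("dr")
    return values[0] if values else None


def cdn_url(server, code, filename):
    """Build https://[cdn server].axfc.net/d/[Code]/[Filename.ext]."""
    return f"https://{server}.axfc.net/d/{quote(code, safe='')}/{quote(filename)}"
```

The server name is passed in as a parameter because the thread mentions more than one CDN subdomain.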

>>24052365
I didn't bump so probably the former

>> No.24064962

> but nobody has had any interest so far
>the site uses 160TB
Are you telling me the whole Japanese net can't be assed to spare 160TB?

>> No.24065660

>>24064962
There might be a cultural thing as people from the west care more about preservation and hoarding than japanese people
See for example all artists deleting their stuff and getting mad at anons archiving on sad panda

>> No.24071836

>>24064962
>>24065660
Yeah, the only reactions on the Japanese side fall into these main categories:
"Axfc? thats nostalgic"
"Ah, it can't be helped. It's old after all"
"Upload your MMD models and UTAU banks to multiple filehosts when sharing resources from now on!"

Only the last one leads to anything useful, and even that only helps from now on; it doesn't recover what was uploaded in the past.

>> No.24073278

Oh, what a shame. I've gotten a lot of nice things out of Axfc uploader. I don't even remember what most of them are, but I know I have them. At least my personal archives will be preserved.

>> No.24081864

>>24056263
>This sounds so much more complicated and frustrating than sad panda.
Actually the main dev of Sad Panda, tenboro, seems to have been in a similar situation: he didn't wish to deal with potentially migrating to a new country to continue the sad panda part of the website, and didn't want to deal with a turnover to new administration. In the end, though, either the latter happened or he eventually became willing to do the migration. Good on him either way for giving the site a continued lease on life.

In either case, technical debt was a big issue, but it is almost always an issue that can be fixed with willing volunteers. An owner who is unable or unwilling to either organize people for that task, or let someone organize for them, is a much more difficult problem.
Doesn't matter who can fix up the kingdom if they don't hold the keys, and the gates are kept closed.

>> No.24081891

>>24081864
Oh, and to add on, the latter is a much more difficult problem not only because the owner is being uncooperative, but because archiving at that point can involve dumping, limitation/lock evasions, and other actions which the admins might be less than appreciative of (especially if hosting costs were a motive for the shutdown), and might outright hamper.

>> No.24085196
File: 27 KB, 485x270, wisdom_of_the_ancients.png

>>24073278
>At least my personal archives will be preserved.

>> No.24085214

>>24085196
nvm fixed it :)

>> No.24096881
File: 1.88 MB, 1908x2042, rms_fa.jpg

This sounds like a job for archive.org. They preserve lots of old data collections.

>> No.24097386

>>24096881
People could probably get it on archive.org once the data is available. It's up to them to see how much legal pressure they get from hosting it, which would probably be small for a Japanese file upload site.

The main issue is that there is currently no means to obtain the archive in the first place, unless either the current owners or new owners provide it.

>> No.24098930

>>24096881
No it doesn’t.
You have to archive it yourself.

>> No.24102162

>>24096881
>>24097386
>>24098930
Didn't someone try asking them to save sad panda, and they refused because they prefer 3d lolis over 2d? Or am I mixing up hoarder associations?

>> No.24108296

>>24071836
>Ah, it can't be helped. It's old after all
And here I thought that Japanese are about respecting past and preserving it. Do they really don't give a damn? What a bunch of lazy NEETs.

>> No.24117944

>>24102162
I tried to look it up, but found no evidence of such an attempt.

However, an archive of lolicon porn on USA servers would be contentious regardless.

>> No.24127964

>>24108296
>Do they really don't give a damn?
When it comes to online archiving, the vast, vast majority of people simply don't value the content enough to think of backing it up preemptively, and of those people only a minority has the technical means to back things up in a consistent and redundant manner.

When most people back up, what they're really doing is something like >>24073278 , where they only back up what they consider to be directly relevant to them. This is actually how most works are lost throughout history: not because of people going out of their way to burn books, but because of people being uninterested in copying a currently extant work. When that work is destroyed, there is usually no one to even mourn for it.

Individual Japanese online archivists are around, but they also are not the norm.

>> No.24142057

Why preserve a bunch of archives no one has the password to?

>> No.24151837 [DELETED] 

>>24033684

>> No.24152094 [DELETED] 
File: 8 KB, 608x115, 1567933380486.png

>>24151837
How is this not reportable

>> No.24153551 [DELETED] 

>>24033684

>> No.24153961 [DELETED] 

>>24033684

>> No.24154541

>>24153961
what?

>>24142057
this basically sums up that we won't be able to archive the site except for whatever things people personally downloaded over the years. I've downloaded and archived the things I needed the most, so should others. Not much else to do.
