
/sci/ - Science & Math



File: 653 KB, 1150x1500, __makise_kurisu_steins_gate_and_1_more_drawn_by_cheshikk__aaf24647f99ac6d4c65e809c2be27191.jpg
No.16185485

Can anyone recommend good books for learning bioinformatics/genomics/data science?

>> No.16186040

>>16185485
I'm not sure you'll find any. I suspect 90% of useful information is gay shit like how to deal with a broken PDB file or use some kind of MATLAB for genetics. Maybe some professor blogs?

>> No.16186047

The field emerged in the post book era. Best you can find is the online documentation for various tools.

>> No.16186214

>>16185485
Go to libgen and type in 'bioinformatics.' I'm not sure which specific book to recommend you, but I can't imagine there's a massive difference between them, based on the subject matter.

>> No.16186554

>>16185485
This is extremely broad, what is the question you want to answer or what do you want to learn?
In case this thread dies I'll leave you with this,
For genomics research, look into going from reads to genome to annotation. The usegalaxy site has a course you can follow for this.

Bioinformatics Algorithms: An Active-Learning Approach by Phillip Compeau & Pavel Pevzner is nice with Rosalind as the authors developed Rosalind.

If you don't like coding or running things on a server, you can look into the Biostar Handbook.
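
For a sense of what the first Rosalind problems look like, here is a minimal sketch of two of the opening exercises: counting nucleotides and taking a reverse complement. The function names are my own, not from the book or the site, and it assumes plain DNA strings over A, C, G, T.

```python
from collections import Counter

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_nucleotides(dna):
    """Return the counts of A, C, G and T, in that order."""
    counts = Counter(dna.upper())
    return counts["A"], counts["C"], counts["G"], counts["T"]


def reverse_complement(dna):
    """Complement every base, then reverse the strand."""
    return dna.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    seq = "AAAACCCGGT"
    print(count_nucleotides(seq))
    print(reverse_complement(seq))
```

If this feels trivial, the later problems get into alignment, assembly and phylogeny, which is where the book earns its keep.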

>> No.16187534
File: 82 KB, 1024x823, 1716170333759050m.jpg

Bump.

>> No.16187926

>>16186554
Sorry for the vague post. I will be doing proteomic and genomic research this fall at university.

>> No.16188310
File: 1.42 MB, 855x1096, pevsner.png

>>16185485
This one was used in our courses back when I was doing my bachelor's degree. PhD student now

>> No.16188318

>>16185485
Also, are you going to do your analyses in R or Python?

>> No.16188438

>>16186214
Sad to see that libgen mirrors are dying out. Only 2 remaining

>> No.16188516

>>16188438
I've saved so much money through buying an iPad and using libgen. That's unfortunate.

>> No.16188689

>>16188318
Python

>> No.16189030

>>16188310
Thank you based anon

>> No.16189181

>>16186040
Those are trivial issues

>> No.16189185

>>16185485
>Can anyone recommend good books for learning bioinformatics/genomics/data science?
The fact that you asked this as an OP on 4chan with a weeb image proves that you're too stupid to understand those fields anyway.
You're a waste of food.

>> No.16189191

>>16185485
https://rosalind.info/problems/locations/
https://www.biostarhandbook.com/
http://www.bioinformatics.org/wiki/Books#Biology_with_computers

BLAST off, kid o7

>> No.16189209

>>16189185
You need to be at least 18 years old to post

>> No.16189429

>>16185485
Ngl, they’re pretty interdisciplinary, but my course in high school had us work with BLAST, R, QIIME2, FASTA, and some other tools for a semester-long population genomics project recreated from a paper, while concurrently reading through a selection of landmark papers in the field to get an understanding of the biology. I never took any of the bioinfo courses in college, but I suspect they’re somewhat similar.

Really, beyond an intro college course understanding of biology, textbooks get kind of useless for bioinfo. The approach is closer to programming.
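
To illustrate the "closer to programming" point: FASTA is just a text format, and a lot of day-to-day work starts with reading it. Below is a minimal sketch of a FASTA reader plus a GC-content helper, using only the standard library. `parse_fasta` and `gc_content` are hypothetical names for this example; in real projects a library parser is the usual choice.

```python
def parse_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {record id: sequence}.

    The record id is the first whitespace-delimited token after '>'.
    """
    records = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)  # sequence may span several lines
    return {key: "".join(parts) for key, parts in records.items()}


def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    text = ">seq1 demo record\nACGT\nGGCC\n>seq2\nATAT\n"
    for rec_id, seq in parse_fasta(text.splitlines()).items():
        print(rec_id, len(seq), gc_content(seq))
```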

>> No.16189544

>>16189181
>Those are trivial issues
Behold, the cry of the incompetent professor.
>you just gotta find yourself a good grad student, amirite? ;)

>> No.16189848

>>16189544
>just one more experiment
>could you run this analysis for our collaborators
>it'll benefit your career

>> No.16189958

Pevsner is kind of a big name in the field, like >>16186554 and >>16188310 suggest. Otherwise, you'll learn as you do it.

>> No.16189994

>>16188310
>2015
Is this book still relevant? I only ask because of the rate of change in technology.

>> No.16190064

>>16189994
It's 1000 pages. Goes through biological background, theory behind the analyses and some R examples. It hasn't aged that much.

>> No.16190072

>>16190064
I'm a newbie. Is R or Python more utilized in this field? All I used at my first run of proteomics was NCBI, phylogeny.fr, some protein modeling sites and MetaCyc.

>> No.16190075

>>16190072
R is the de facto standard. Python is gaining popularity, but if you can think of it, R has a bioinformatics package for it.

>> No.16190082

>>16190075
Thanks for the help, anon! Having a little data science under my belt will really help me in elucidating natural antimicrobial compounds that can be synergistically used with common drugs.

>> No.16190477

>>16189191
Ayy I used the Biostars myself and it's honestly not bad as beginner material.

>> No.16190537
File: 125 KB, 1000x1231, 61J7L39NBpL._AC_SL1500_.jpg

>> No.16190579

>>16186047
>the post book era.
Tell me you're a nigger without telling me you're a nigger.

>> No.16190672

>>16189209
>You need to be at least 18 years old to post
wtf is even that argument you're trying to make?
The OP is posting children's media and still wants to be taken seriously.
>>16188516
>I've saved so much money through buying an iPad and using libgen. That's unfortunate.
>>16188438
>Sad to see that libgen mirrors are dying out. Only 2 remaining
There are larger and better sites out there anons. I'm just not mentioning them here because 4chan is full of feds who have a habit of shutting down piracy sites. Everything you say on 4chan IS going to be read by a fed.
>>16186047
>The field emerged in the post book era
No such thing anon.
I've found wonderful stuff in second hand book shops that you can't get online. Your lack of awareness is sad.

>> No.16190722

>>16190537
>academic press
Yikes

>> No.16191059

>>16189191
Nice links, thanks

>> No.16191164

>>16186047
Bioinformatics has been around for decades

>> No.16191354

>>16188310
Any book that's like this but geared towards python?

>> No.16191398

>>16191354
Nothing comes to mind right away. You could of course try to implement the book's algorithms independently in Python, or search for corresponding bioinformatics packages. I'm sure someone has written those and shared them as packages, at least for the most common ones.

Or you could master both Python and R. I personally don't write Python that much, but that's probably because my uni teaches R for biologists.
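
As an example of implementing a textbook algorithm yourself, here is a minimal sketch of the Needleman-Wunsch global alignment score in plain Python. It assumes pairwise alignment is the kind of algorithm meant above; the scoring values (match +1, mismatch -1, gap -2) are arbitrary choices for illustration, not taken from the book.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of a and b with a linear gap penalty.

    Only the previous row of the dynamic-programming table is kept,
    so memory use is proportional to len(b).
    """
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty string
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("GATTACA", "GATTACA"))
    print(needleman_wunsch_score("ACGT", "AGT"))
```

This returns only the score. Recovering the alignment itself needs the full table and a traceback, which is a good next exercise.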

>> No.16191656

>>16188438
F

>> No.16192143

>>16186047
not in the post-book era, most of my most important shit involving this field has been stuck in .pdf files, most of them actually free online (mostly, gotta have a VPN for a couple of these or at least change your DNS to avoid link being stopped midway)
btw I'm a bio masters, not coming in from the CS side, so the order of these is from my experience
>Pevzner's Computational MolBio, BioComp Algorithms
HHMI Professor teaching their flavor of biocomp, has a whole homework track in Rosalind where you can both compare efficiency and test yourself for the most basic level
>Compeau, Biological Modeling: A Short Tour
A co-author on algorithms made a nice modeling tour, I worked through a digital copy he has on his site while taking a course for the next entry
>MBoC
Essential reading, anyone serious on molecular biology really should take a course on this at the grad level, even a chapter is good background on what tech is available to observe specific regions of the cell. It's more essential to me in developmental biology analysis than a fucking DevoEvo course. IDC if it's not directly in comp bio, it's essential after fully moving into the field and seeing people rely on basic ass shit for multiomics.
Pevzner's nucleotide assembly algo goes well with entering Graph ML and multiomics applications, but finding a good textbook is kinda hard, since ideally you'd have experience with language models (attention mechanisms are VERY similar) and everything gets abstracted, so I'm going to just branch off some more specialized textbooks.
>Goodfellow's deep learning (I hated the math for this, but w/e)
>Efron/Hastie's Computer Age Statistical Inference (combined with Zar as reference you can write a paper without a stats nerd as a co-author)
These two (and Zar, but Zar was a lab copy we share) helped me work on my thesis, but I really wish there was a better resource for representation graph learning other than "take a course in it" but it's certainly helpful.
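
For anyone wondering why the assembly material leads so naturally into graph methods, here is a minimal sketch of building a de Bruijn graph from reads. This assumes the "nucleotide assembly algo" above refers to the de Bruijn graph approach; `de_bruijn_graph` is a name made up for this example, and real assemblers also handle sequencing errors, reverse complements and coverage, none of which is shown.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph as {(k-1)-mer: [successor (k-1)-mers]}.

    Every k-mer in every read contributes one edge from its prefix
    (first k-1 bases) to its suffix (last k-1 bases).
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    for node, successors in de_bruijn_graph(["ACGTAC"], 3).items():
        print(node, "->", successors)
```

In the idealized error-free case, a path that uses every edge exactly once spells out a candidate assembly, which is what makes this a graph problem rather than a string problem.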

>> No.16192149

>>16192143
also here's a course you can follow pretty well on Graph ML
https://www.youtube.com/playlist?list=PLoROMvodv4rPLKxIpqhjhPgdQy7imNkDn
just note that there are tons of advances in a lot of the methods he mentions in the last couple of years, esp with attention and temporal systems

>> No.16192222

>>16192143
Based anon

>> No.16192330

Surprisingly good thread for /sci/

>> No.16192694

>>16192143
Thanks for the info anon. I'm pre-med, but pursuing research with the school in microbial ecology to learn more about antimicrobial proteins (those in wormwood extract in my case) is on tap for this fall, so I have a lot of interaction with microbe research alongside medical micro (we have a BSL4 lab as well).

>> No.16193124

>>16186040
>how to deal with a broken pdb file
Burn it

>> No.16193665

>>16190537
This looks quite concise and the code examples are written in Python.

>> No.16193786

This is now /big/ - Bioinformatics General

Post more bioinformatics related content in this thread. Share your work and tips, post good resources, ask your questions related to bioinformatics and computational biology

>> No.16193922

>>16192694
Best of luck to you!

>> No.16194044

>>16185485
What about learning from open source code?

>> No.16194084

>>16194044
Can work. But if the code has no comments, badly named variables and no proper documentation, it will be a nightmare to trudge through.

>> No.16194214

>>16193786
Don't ruin the thread. We're not doing that.

>> No.16194472

>>16185485
what is it like kissing kurisu on the lips? scientifically speaking.

>> No.16194681

>>16194472
Soft, exciting and nervous. You'll probably reduce the nervousness by showing her your bioinformatics books though

>> No.16195715

>>16194681
what if i don't have any?

>> No.16195978

>>16188310
Would you recommend this book plus perhaps a Python book to learn both R and Python?

>> No.16196142

>>16195978
Pevsner is certainly worth it because it is very thorough. I don't know your skill level in Python, but I'd say take some practical book like Automate the Boring Stuff with Python as a starter and then Numerical Methods with Python. NumPy, SciPy and Matplotlib documentation will become very familiar to you.

>> No.16196534

>>16190537
Is this good?

>> No.16196813

>>16185485
Data science is a meme at this point. Either become a machine learning engineer if you care about the ML side or become a statistician if you care about the analytics side.

>> No.16196817

>>16196813
You're oblivious to genomics, aren't you?

>> No.16196826

>>16196817
Again, another application of statistics with the prerequisite of biology.
https://ocw.mit.edu/courses/hst-508-genomics-and-computational-biology-fall-2002/resources/02inov12n1/

>> No.16196829

>>16196817
You must be someone who thinks methods like principal component analysis are data science and were never used before in stats.

>> No.16196851

Good thread before this fag came >>16196813
Thanks for the info guys

>> No.16196884
File: 100 KB, 611x353, 1705676217058972.png

>>16196813
ML in -omics uses fundamentally different data structures from other tasks (hell, even GO term labels aren't particularly intuitive for prediction). It sits somewhere between financial transaction or social media temporal networks (where both use a fundamentally different definition of how time is represented) and NLP with far higher dimensionality.
You learn the underlying data structures/algos with compbio.
You need to learn the statistics and underlying molecular biology to actually understand what's being answered (this goes into being able to discern what conclusions are appropriate depending on the experimental design).
These would be essential to develop the base genomic tools that wet lab scientists rely on, and they fit in undergrad schedules. Building on top of that can get you into epidemiology/med, working in hackathons for drug discovery or other techbro memes, a guaranteed paper name as the resident biostatistician, or learning how to graph and vectorize sequencing data for graphML, which is still a developing field for even natural network data.
Is GWAS bioinformatics fucking stupid? Yes, and you should not make that a career goal, but there's a massive field in computational bio (and chem) that requires degree specialization and planning. You gotta find a niche, not "just take ML or Stats." You do that, and you become one of the 5000 others who post absolute crap on MDPI and shitjournals that wetlabs fucking hate. There is palpable disgust if you include the term multiomics in a presentation at physiology conferences because of the sheer volume of crap from uninformed ML students in Tongji or Tianjin who think they can model cancer.
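
On the "vectorize sequencing data" part: one of the simplest ways to turn a sequence into a fixed-length numeric vector is counting k-mers. The sketch below is only that baseline, under the assumption of a plain A/C/G/T alphabet. It is not the representation used in any particular graph ML pipeline, and `kmer_vector` is a made-up name.

```python
from itertools import product


def kmer_vector(seq, k=2, alphabet="ACGT"):
    """Count every possible k-mer over the alphabet, in a fixed order.

    The result always has len(alphabet) ** k entries, so sequences of
    different lengths map to vectors of the same size. Windows that
    contain characters outside the alphabet (e.g. N) are skipped.
    """
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    vec = [0] * len(index)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:
            vec[pos] += 1
    return vec


if __name__ == "__main__":
    print(kmer_vector("AAAC", k=2))
```

Vectors like this can serve as node features once you decide what the nodes and edges of your graph are, which is the hard, domain-specific part the post is talking about.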

>> No.16196893

>>16185485
Scientifically speaking how do we make anime real so I can finally make love with my Kurisu?

>> No.16196894

>>16196893
very carefully

>> No.16196896

>>16196894
It can't happen quickly enough

>> No.16197110
File: 26 KB, 332x500, yau.jpg

>>16196851
Not the asshole, but I don't see how the Kurisu coomer is related to that post.

On to the topic, anyone have experience reading this book?

>> No.16198246
File: 828 KB, 490x714, 1697377451066306.png

>>16185485
I love my wife

>> No.16198273

>>16188310
Do you find that the information in this book is still relevant 9 years after it was published?

>> No.16198753

>>16198273
Only thing that is a bit outdated is the lack of machine and deep learning in that book. Otherwise it is still relevant. 95% of the statistical methods relevant to a bioinformatician were invented decades ago.