Notes on A Biology Primer for Computer Scientists

Since most of my education has taken place above the organism level, and since my current job concerns sub-organism processes, I want to get more familiar with those sub-organism things. So I’m taking notes here on my blog about the stuff I’m reading. First off is the PDF compiled by Franco Preparata from Brown University called “A Biology Primer for Computer Scientists” at https://web.stanford.edu/class/cs173/papers/bioprimer.pdf

section 1

- life is defined as being able to replicate, and that’s possible with DNA

section 2 chemical composition

- the chemical makeup of life is largely carbon, hydrogen, nitrogen and oxygen
- molecules are held together by many types of bonds, one of which is the covalent bond
- bonds differ in strength, so they take different amounts of energy to break
- an agent that aids a chemical reaction is a catalyst; a biological catalyst is an enzyme
- an enzyme itself is a molecule; an enzyme has a specific shape that matches the specific reagent whose reaction it catalyzes
- building blocks of living organisms are biomolecules; basic ones are sugars, fatty acids, amino acids, and nucleotides
- two important types of sugars are ribose and deoxyribose
- amino acids are particularly important, of which there are 20 types
- amino acids make up proteins; sugars make up polysaccharides (large carbohydrates)

section 3 nucleic acids

- building blocks of nucleic acids are nucleotides
- a nucleotide is made up of three components: a base B, a sugar S and a phosphoric acid P
- there are 8 different nucleotide types: 4 in DNA (bases G, A, C, T) and 4 in RNA (bases G, A, C, U)
- the sugar S is ribose in RNA and deoxyribose in DNA
- nucleotides polymerize as nucleic acids, either DNA or RNA
- DNA is double stranded; RNA is usually single stranded
- in DNA the two strands run in opposite directions: one goes 5’ to 3’ and the other goes 3’ to 5’
- in DNA, each base on one strand binds to the complementary base on the other strand (A with T, G with C)

section 4 fundamental cell processes

- three major processes occur in the cell: DNA replication, DNA-RNA transcription and RNA-protein translation

section 5 DNA replication

- DNA replication is the process by which a double-stranded DNA sequence produces two double-stranded sequences identical to the original one
- synthesis of new DNA always proceeds from the 5’ end to the 3’ end, so it runs in opposite directions on the two template strands (the leading strand is made continuously; the lagging strand is made in short pieces, the Okazaki fragments)
- DNA polymerases facilitate the replication
- for the leading strand a string of about 200 bases indicates where to start
- for the lagging strand each Okazaki fragment is initiated by an RNA string (a primer) synthesized by a specific enzyme (primase)

section 6 DNA-RNA transcription

- transcription only uses the so called “genomic” strand, the one that goes from 5’ to 3’
- the machinery that does transcription is called RNA-polymerase. it separates the two DNA strands along a short area, transcription occurs along the short exposed strand, and the DNA strands rejoin as the process proceeds
- in DNA replication, DNA is replicated in its entirety, whereas transcription is selective both in space (only certain substrings of DNA are transcribed) and time (depending on environment)
- different types of RNA:
  - mRNA (messenger; involved in RNA-protein translation)
  - rRNA (ribosomal; participate in the structure of the ribosome)
  - tRNA (transfer; assume a rigid 3d configuration acting as linkages between mRNA and protein chains)
  - snRNA (small nuclear; excision of introns and splicing of exons)

section 7 RNA-protein translation aka protein synthesis

- the mRNA is read in triplets of nucleotides; each triplet == codon; each codon is individually translated into an amino acid
- there are 64 codons
- there are 20 amino acids (so most amino acids are encoded by more than 1 codon)
- translation occurs in the ribosome. the ribosome can be compared to a tape reader (the input tape is an mRNA sequence) that also produces an output tape (a protein)
- tRNA is also required (and each is specific to a codon/amino acid pair?)

section 8 protein structure

- proteins are polymers (polypeptides)
- the amino acid sequence fully specifies a protein, BUT it is its spatial arrangement that determines its function
- how polypeptides fold is not fully understood
- protein structure levels:
  - primary: the linear sequence of amino acids
  - secondary: local folding patterns such as alpha helices and beta sheets
  - tertiary: complete 3D shape of a single polypeptide chain
  - quaternary: arrangement of multiple polypeptide subunits within a protein complex
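Since this is a primer for computer scientists, the notes on complementary strands, transcription and translation can be sketched in a few lines of base R. This is only a toy illustration, not anything from the primer: the function names (reverse_complement, transcribe, translate) and the example sequence are made up, the input is assumed to be the strand whose sequence matches the mRNA, and the amino acid string is the standard genetic code in one-letter codes with the bases ordered U, C, A, G.

```r
# Standard genetic code; codons ordered with bases U, C, A, G
# (first base varies slowest, third base fastest). "*" marks a stop codon.
bases <- c("U", "C", "A", "G")
grid <- expand.grid(third = bases, second = bases, first = bases,
                    stringsAsFactors = FALSE)
codons <- paste0(grid$first, grid$second, grid$third)
amino <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", ""
)[[1]]
codon_table <- setNames(amino, codons)

# The other DNA strand, written 5' to 3': complement each base, then reverse
reverse_complement <- function(dna) {
  comp <- chartr("ACGT", "TGCA", dna)
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

# Transcription (hypothetical helper): same sequence, with U in place of T
transcribe <- function(dna) chartr("T", "U", dna)

# Translation: read the mRNA three bases at a time and look up each codon.
# Trailing bases that do not fill a codon are ignored, and this sketch
# does not stop reading at a stop codon.
translate <- function(mrna) {
  n <- nchar(mrna) %/% 3
  starts <- 3 * seq_len(n) - 2
  triplets <- substring(mrna, starts, starts + 2)
  paste(codon_table[triplets], collapse = "")
}

dna <- "ATGGCCTAA"
reverse_complement(dna)  # "TTAGGCCAT"
mrna <- transcribe(dna)  # "AUGGCCUAA"
translate(mrna)          # "MA*"

length(codon_table)                     # 64 codons
length(setdiff(unique(amino), "*"))     # 20 amino acids
```

The last two lines are the 64 codons vs 20 amino acids point from section 7: several codons map to the same amino acid, and three of them are stop signals.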

January 7, 2025 · 4 min · Scott Chamberlain

cowsay v1

cowsay is a command line program written in Perl. The original version had its final release in 2016 (that’s the version behind many installed cowsay programs) and there are a number of forks of that release in Perl. There are also many many versions of cowsay in other programming languages, like the one I maintain written in R, unimaginatively called cowsay. I wrote about cowsay here back in 2014. I didn’t think this would ever be 300+ stars popular, but here we are. Given that people seem to actually use it - or at least star it - it seems worth putting some more time into it. ...

December 9, 2024 · 4 min · Scott Chamberlain

Software rules and Quarto

At work I’ve been using Quarto quite a bit for websites and books. One of the projects I’ve been working on lately that uses Quarto is the WILDS Contributor Guide (WILDS = Workflows Integrating Large Data and Software). This guide (a Quarto book) is mostly for our own immediate team members, but aims to a) be a guide for any contributors to our open source software work, and b) demonstrate good open source software practices for the greater Fred Hutch community where we work. ...

September 26, 2024 · 4 min · Scott Chamberlain

Refactoring notes

I worked on a refactor of an R package at work the other day. Here are some notes about that after doing the work. This IS NOT a best practices post - it’s just a collection of thoughts. For context, the package is an API client. It made sense to break the work for any given exported function into the following components, as applicable depending on the endpoint being handled (some endpoints needed just a few lines of code, so those functions were left unchanged): ...

March 20, 2024 · 3 min · Scott Chamberlain

Moved to Hugo

This blog is now using Hugo. Important - if you subscribe to the RSS for this blog you likely have to delete/remove the old one and add the new RSS link. It is: https://recology.info/index.xml

March 20, 2024 · 1 min · Scott Chamberlain

Shiny file inputs

I wrote the other day about overcoming an issue with Shiny. Another issue I ran into concurrently was about file inputs. The issue was that file inputs (i.e., shiny::fileInput) were difficult to clear. That is, after a user uploads a file, it was easy to get some of the various parts cleared/cleaned up, but not others:

- (Not Easy) The UI components of fileInput (the text of the file name, the loading display)
- (Not Easy) The data behind the fileInput handler
- (Easy) Displaying some feedback in the UI after handling file input

Load libraries ...

March 8, 2024 · 3 min · Scott Chamberlain

Shiny button weirdness

I’ve been working on a Shiny app at work for the past few months. One of the many frustrating things about Shiny lately has been around buttons. Well, it wasn’t really about buttons, but that’s where it started.

Load libraries

library(shiny)
library(bslib)
library(crul)

Helper function, returns a random UUID from an httpbin server

httpbin_uuid <- function(...) {
  con <- crul::HttpClient$new("https://hb.opencpu.org")
  res <- con$get("uuid")
  jsonlite::fromJSON(res$parse("UTF-8"))$uuid
}

A bslib ui component

ui <- page_sidebar(
  title = "My dashboard",
  sidebar = list(
    actionButton("submit", "Submit"),
    actionButton("reset", "Reset")
  ),
  textInput(inputId = "name", "Your name"),
  textOutput("uuid")
)

Here’s the server part that was giving me trouble. As I said this was an inherited repo, and the server side handling for many buttons was done with eventReactive as below. Using eventReactive meant that button clicks only sometimes triggered the server side code. ...
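One way a button can end up only sometimes triggering code is that eventReactive is lazy: its body runs only when something reads its value, whereas observeEvent runs its body on every event. The snippet below is a minimal sketch of that difference and is not the code from the inherited repo. The counts environment and the submitted name are made up for illustration; the submit and reset input ids match the ui above.

```r
library(shiny)

server <- function(input, output, session) {
  # Plain environment used only to count how often each handler runs
  counts <- new.env()
  counts$lazy <- 0
  counts$eager <- 0

  # Lazy: this body runs only if something reads submitted()
  submitted <- eventReactive(input$submit, {
    counts$lazy <- counts$lazy + 1
    paste("submit click", input$submit)
  })

  # Eager: this body runs every time the reset button is clicked
  observeEvent(input$reset, {
    counts$eager <- counts$eager + 1
  })
}

# Simulate one click on each button without launching the app
testServer(server, {
  session$setInputs(submit = 1, reset = 1)
  print(counts$eager)  # the observeEvent body has run
  print(counts$lazy)   # the eventReactive body has not run yet
  submitted()          # reading the value is what runs the body
  print(counts$lazy)
})
```

If the side effect is the point of the click (writing a file, calling an API), observeEvent is usually the safer fit; eventReactive is for computing a value that something else reads.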

March 4, 2024 · 2 min

Avoiding the word footgun(s)

I recently opened an issue in a repository for a package I’m working on to think about potential footguns and how to avoid them. That word “footguns” got me thinking; does using phrases/metaphors for a certain topic in a way lend credibility to it? For example, we use a lot of sports metaphors in the US, especially baseball (swing for the fences, anything related to bases, curveballs, heavy hitter, etc.), and that says something about the place of baseball in our culture. ...

March 3, 2024 · 2 min

Weird thoughts

My parents just found this email they had printed out from me from May 19, 2006, when I was 26. I chatted about some family stuff, then had this rambling string of weird thoughts below. I thought others might appreciate a good cringe - or cringy laugh - at my expense. It’s especially funny because I’m most def an atheist. I don’t know, those Tucson sunsets really are transformative. … Actually, some deep thoughts: ...

October 12, 2023 · 2 min · Scott Chamberlain

Job searching notes

notes to self for next job hunt (some of which may be generally useful):

- don’t apply to general tech companies anymore for many reasons.
  - heard back from very very few but that may be b/c I don’t know many people at general tech companies
  - never been able to get through interviews; they’re presumably looking for computer science grads (not me)
  - most of their missions are probably not stuff I’d be happy about at the end of the day. despite missions of doing xyz, it’s probably really about $$
- don’t apply to pharma companies any more. there’s lots of good software jobs in that sector, but I’ve struck out 3 times, and so that’s a clear pattern: my background/whatever isn’t something they want
- next time only apply where I have a connection that can refer me or dig around for a referral. it’s super easy to apply for jobs, especially if you don’t write a cover letter; however, the less time I spend surely the less likely I am to hear back
- make sure (and I’ll probably fail to do it again this time, ugh) to write down what questions I was asked, how I answered, and how to improve on that answer. then study and reference those questions and answers for the next interview
- it’s good to have multiple offers at the same time, but then deciding is harder - & I don’t love negotiating - so maybe don’t worry about multiple offers at the same time next time around
- I have relatively low expectations in any interview b/c I don’t do technical interviews well - I also try to seek out orgs that do not have crazy technical interview processes - e.g., Roche had a whiteboard technical interview that I totally bombed, which was unsurprising in hindsight since the interviewer was an ex-Googler. I’m more of a thinker than a quick responder, making it hard to do well in very fast paced (for me) tech interviews. Though I know I have been a good software engineer where I’ve worked, so these fast paced tech interviews are probably selecting for a certain kind of brain function I guess?
- seek out orgs with interview processes that have take home assignments - or at least timed coding tests on something like hackerrank - instead of live whiteboard/zoom tech interviews
  - my last job Deck had a take home test
  - Axiom DS has a take home test approach
  - AdHoc uses a take home test approach
  - Invitae had a hackerrank test, not a take home but better than a live coding test
- cover letters? I still don’t know whether these are worth doing or not. the advice seems to be mixed. they sure take a lot of time, so I hope they’re not necessary for most hiring managers; given my bullet above about spending more time on fewer applications, I could find time for a cover letter on every application if there’s not that many

some data about this last job search: ...

October 9, 2023 · 3 min · Scott Chamberlain