Archive for the ‘Scholarly Thoughts’ Category

Book Battle: Fathers and Sons vs Eve Green

I recently read Eve Green, by Susan Fletcher, on the strength of it garnering glowing reviews and having won a major award. I was utterly underwhelmed, and looked to Amazon to see what real people thought of it. Opinion is divided: it’s either a beautiful, mysterious evocation of Wales or a dull trudge through an unlikeable, self-involved character’s tedious past. I fall squarely in the latter camp, but I wondered if that was partly because I was so smitten with Fathers and Sons, by Ivan Turgenev, which I read immediately before.

So, I’m pitting the two books against each other, but I decided that it wasn’t fair to choose the battleground myself; I want a good clean fight, here. Being the geek that I am, I had no trouble knocking up a simple random word generator, to decide on the categories on which each book shall be judged. I used the ‘All Adjectives’ word list to generate 5 random categories.

Round 1: Clear
A tricky start for Eve Green, as it comes out flailing with some oblique references and flowery language; for some people these are its strengths, but Fathers and Sons lands a stinging blow with a title that tells you exactly what to expect, followed up by a flurry of descriptive passages that are the epitome of clarity. Verdict: Fathers and Sons.

Round 2: Immense
Interpreting ‘immense’ literally, both books move on the defensive, as neither will break a toe if dropped on a foot. A few cagey jabs later, and this damp squib of a round is over. Verdict: Draw.

Round 3: Fluffy
Fathers and Sons is on the ropes, reeling from an unprecedented attack by a cuddly toy dog from Eve Green, but it rallies towards the end of the round as Eve Green’s darker heart asserts itself. The spectre of death haunts both of these distinctly un-fluffy novels, and it’s another tied round. Verdict: Draw.

Round 4: Curious
After a quiet couple of rounds Fathers and Sons gradually builds up a strong sequence of curious punches: smack – inter-generational dynamics; smack – our place in the universe; smack – frustrated desire. Eve Green is curious about human nature on a smaller scale, and counters with a few hits of loneliness and the nature of evil, but now looks like a broken book. Verdict: Fathers and Sons.

Round 5: Wandering
The episodic nature of Fathers and Sons comes out swinging in this round, but its attack weakens as it becomes clear that the trajectory of Bazarov’s fate has been far from aimless. Eve Green takes advantage with a few time-travelling blows, finishing with a pointless and devastating granny’s-dead-Cornish-sailor of an uppercut. Verdict: Eve Green.

The winner: Fathers and Sons. A victory for both literature and websites with random word generators.

CEB Journal Club: Andam et al. (2010)

Members of the Computational and Evolutionary Biology (CEB) group at the University of Manchester participate in a monthly journal club, where a paper of broad interest is discussed. Here, I briefly describe the paper and its context, and summarize our conclusions about the methodology and results presented. (I have attempted to represent the discussion and consensus of the group, but any inaccuracies are my own.) For your reading convenience, this post is available as a PDF.

Biased gene transfer mimics patterns created through shared ancestry. Cheryl P. Andam, David Williams, and J. Peter Gogarten (2010) PNAS 107: 23, 10679-10684. PubMed: 20495090
(Presented by James Allen at Jabez Clegg, 28th July 2010)

The paper in a sentence: The authors describe a specific case of a gene that makes a bacterial enzyme, which has been horizontally transferred between species in a biased manner, such that the molecular evidence resembles that of a gene transferred by descent from parent to offspring.

Background: Until relatively recently, genetic information was thought largely to have been transferred from parent to offspring, analogous to a branching tree structure. The applicability of this analogy for all forms of life is under debate, however, given the discovery of the extent of other mechanisms for gene transfer in bacteria and other single-celled organisms. Horizontal gene transfer (HGT) refers to the process where genetic data from one organism is transferred to another which is not necessarily related, nor even necessarily the same species; the prevalence of HGT calls into question not only the ‘tree of life’ metaphor (suggesting, perhaps, that a network analogy is more appropriate), but also the (already rather labile) concept of species.

The paper in detail: The authors present one key result, which is supplemented by evidence from three other sources which would not be convincing in isolation, but here provide valuable circumstantial support. The results are based on a particular enzyme, which has the important property (for this analysis) that it has two distinct types. The main result is that the tree in figure 1 in the paper, generated by looking solely at this enzyme, has two distinct sub-trees, representing each of the two types. Each one of these sub-trees closely resembles the tree that most likely characterizes the vertical inheritance of genetic data, i.e. the ‘species tree’ in figure 2. It is not easy to quantify whether one tree structure resembles another, particularly with the number of species used here; the authors look at the distances along the tree branches that separate all pairs of species, which discards information about some of the tree structure, but does not prevent them from convincingly demonstrating that the sub-trees for each type resemble the species tree. Moreover, in the species tree, the species with the same type of enzyme are grouped together within broader groupings at the phylum or class level; i.e. there are patches of red and green branches (representing the two types) in figure 2. This is evidence for biased HGT because it shows that HGT occurs not in a random fashion, but more often between more closely related species.

Another line of evidence presented is that a scenario of gene gain and loss that would explain the trees is far less likely than one where some degree of HGT occurs; the authors gloss over the fact that this demonstrates that HGT, rather than biased HGT, has most likely occurred. Additionally, the genes that surround the enzyme’s gene are found to be similar for both types, which would not be the case if the genes were being repeatedly gained and lost; again, this is evidence for HGT, not necessarily biased HGT.

The final piece of supporting evidence comes via simulations of biased and unbiased HGT, which result in data that resembles the real data. Some of the choices for the simulations are questionable, in particular the modelling of reciprocal transfer events, meaning that genes from two species are swapped. This does not reflect the biological reality, where the transfer generally happens in one direction only. Also, an extreme bias is modelled, using an exponential function, so that transfers are likely to occur between only the most closely related species – this may well be realistic, but the use of this particular model is not justified by the authors. Finally, the unbiased and biased transfers are simulated sequentially, which was perhaps done as it is often easier to show that something is changing, rather than staying the same, but is an uncommon approach that makes it difficult to interpret the results.
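To make the idea of an exponential bias concrete, here is a minimal sketch of how distance could be turned into transfer probabilities. This is not the authors' model: the function name, the rate parameter and the example distances are all assumptions made for illustration, and the only thing taken from the paper is the general notion that the chance of a transfer falls off exponentially with evolutionary distance.

```javascript
// Illustrative only: weight each potential recipient by exp(-rate * distance),
// then normalise so the weights sum to 1. 'rate' is a hypothetical parameter;
// a larger rate means a more extreme bias towards close relatives.
function transferProbabilities(distances, rate) {
  const weights = distances.map(function (d) {
    return Math.exp(-rate * d);
  });
  const total = weights.reduce(function (sum, w) {
    return sum + w;
  }, 0);
  return weights.map(function (w) {
    return w / total;
  });
}

// Hypothetical distances from one donor to three recipients.
const exampleDistances = [0.1, 0.5, 2.0];
const biased = transferProbabilities(exampleDistances, 5);   // strongly favours the closest
const unbiased = transferProbabilities(exampleDistances, 0); // every recipient equally likely
```

With a rate of zero every recipient is equally likely, which corresponds to the unbiased case; as the rate grows, almost all of the probability ends up on the nearest relatives.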

Journal club conclusion: While not wholly convinced by some of the evidence presented, particularly the approach to simulation, we believe that the main conclusions of the paper are valid: in the case of this particular enzyme, the horizontal gene transfer is biased, such that transfer is more likely between more similar species, and thus the molecular data provides the same signal as transmission through vertical inheritance. It remains to be shown how widespread this phenomenon is; if HGT generally reinforces, rather than contradicts, vertical inheritance of genetic material, then the tree of life analogy may well be useful for practical purposes, even if it does not reflect the true evolutionary history.

T1DBase: type 1 diabetes, and my part in its downfall

Apropos of a new T1DBase publication (Burren et al. 2011) (in which I am kindly acknowledged), I thought I’d write a bit about some of the work I did there (Hulbert et al. 2007). I envisage this being the first of maybe three instalments, so before going into detail about the specific projects that I worked on, I’ll explain what T1DBase actually is, and why I’m proud to have worked on the project. For your reading convenience, this post is available as a PDF.

T1DBase is a resource for the type 1 diabetes (T1D) research community, and it has strong ties to the JDRF/WT Diabetes and Inflammation Laboratory (DIL) in Cambridge, which is headed up by John Todd. (When I worked at the DIL we collaborated with the ISB and a group at UPenn, but this is no longer the case.) Type 1 diabetes is an auto-immune disease that primarily manifests in childhood, and so was formerly known as juvenile diabetes. The symptoms are similar to those of type 2 diabetes, but the aetiology is quite different (Todd 2010), and type 1 diabetes is genetically more similar to diseases like rheumatoid arthritis and coeliac disease (Smyth et al. 2008).

I worked on T1DBase for three years, from Jan 2006 to Dec 2008, which was a period of massive change in our understanding of the genetics of type 1 diabetes, primarily due to the emergence of genome-wide association studies (GWAS). The DIL was heavily involved in one of the first landmark studies (Todd et al. 2007; Wellcome Trust Case Control Consortium 2007), as part of the WTCCC (Wellcome Trust Case Control Consortium; don’t worry, I think that’s the last of the acronyms). Results from that and subsequent GWAS (e.g. Cooper et al. 2008; Barrett et al. 2009) generated a host of new T1D susceptibility regions, and a better (although still far-from-complete) appreciation of the genetics of this complex disease. (I’ve cited GWAS publications that I was involved in, or that were written by colleagues at the DIL, but T1DBase also gets data from a range of other sources; see the website for more information.)

The people behind T1DBase curate the GWAS results, and make them available as raw data and, more usefully, as region summaries that tie to analyses of genes and variants (i.e. SNPs), as well as cross-referencing with mouse and rat data. It sounds so simple when you write a sentence like that, but there are, of course, very many challenges involved, both in terms of making sense of a huge amount of biological data, and in working out how to effectively present the results. And that’s not to mention the day-to-day work of maintaining a website, and programming collaboratively and efficiently. I very much enjoyed working on the T1DBase project; I learnt loads, both about disease genetics and programming, and it was always a fun environment to work in (with regular tea breaks, too…) And it was nice to be in a job where, in some small way, I was able to constructively contribute to important and useful research into type 1 diabetes.


  • Barrett JC, Clayton DG, Concannon P, Akolkar B, Cooper JD, Erlich HA, Julier C, Morahan G, Nerup J, Nierras C et al. 2009. Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nature Genetics 41(6): 703-707. PubMed: 19430480
  • Burren OS, Adlem EC, Achuthan P, Christensen M, Coulson RMR, Todd JA. 2011. T1DBase: update 2011, organization and presentation of large-scale data sets for type 1 diabetes research. Nucleic Acids Research 39(Database issue): D997-D1001. PubMed: 20937630
  • Cooper JD, Smyth DJ, Smiles AM, Plagnol V, Walker NM, Allen JE, Downes K, Barrett JC, Healy BC, Mychaleckyj JC et al. 2008. Meta-analysis of genome-wide association study data identifies additional type 1 diabetes risk loci. Nature Genetics 40(12): 1399-1401. PubMed: 18978792
  • Hulbert EM, Smink LJ, Adlem EC, Allen JE, Burdick DB, Burren OS, Cassen VM, Cavnor CC, Dolman GE, Flamez D et al. 2007. T1DBase: integration and presentation of complex data for type 1 diabetes research. Nucleic Acids Research 35(Database issue): D742-D746. PubMed: 17169983
  • Smyth DJ, Plagnol V, Walker NM, Cooper JD, Downes K, Yang JHM, Howson JMM, Stevens H, McManus R, Wijmenga C et al. 2008. Shared and distinct genetic variants in type 1 diabetes and celiac disease. The New England Journal of Medicine 359(26): 2767-2777. PubMed: 19073967
  • Todd JA. 2010. Etiology of type 1 diabetes. Immunity 32(4): 457-467. PubMed: 20412756
  • Todd JA, Walker NM, Cooper JD, Smyth DJ, Downes K, Plagnol V, Bailey R, Nejentsev S, Field SF, Payne F et al. 2007. Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes. Nature Genetics 39(7): 857-864. PubMed: 17554260
  • Wellcome Trust Case Control Consortium. 2007. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447(7145): 661-678. PubMed: 17554300

A most excellent black-and-white bear

On the day that the Giant Panda genome is released, I was surprised to discover that the animal was unknown in Europe until 1869. The best bit is the quote by Armand David, the zoologist priest who first scientifically recorded its existence, describing it as a “most excellent black-and-white bear”.

Countdown Timer – JavaScript

The JavaScript demonstrated here (countdown.js) can count down to a particular date. If the date is annually recurring (e.g. a birthday), then it’ll count down to the next occurrence; if it’s a specific year (e.g. a holiday, in the vacation sense of the word), when the date has passed it’ll show the time elapsed since that date. It counts down to a fraction of a second after midnight on the given day (JavaScript gets its time information from the client’s clock).

This isn’t particularly novel, and much of the code is copied, merged, and adapted from other similar scripts on the web (none of which did exactly what I wanted). I haven’t seen another script that takes a date and automatically works out whether it’s annually recurring, and if not, whether to count down or up, but I daresay it’s been done many times over. The script is heavily dependent on giving it a date in the right format, “Month Day_of_Month[, Year]”, where Month should be specified as text to avoid any confusion about the order of date components. Not an issue with a list box, but if you let users enter data in text boxes you’ll have to do a bunch of checking and formatting, which is altogether too tedious for me to have bothered with.
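For illustration, the date handling described above might look something like the sketch below. It is not the original countdown.js: the function name resolveTarget and the fields it returns are made up for this example, and it assumes English month names written out in full.

```javascript
// Month names as text, matching the "Month Day_of_Month[, Year]" format.
const MONTHS = ['January', 'February', 'March', 'April', 'May', 'June',
                'July', 'August', 'September', 'October', 'November', 'December'];

// Hypothetical helper: work out the target date, whether it recurs annually,
// and whether to count down to it or up from it, relative to 'now'.
function resolveTarget(dateText, now) {
  const match = /^([A-Za-z]+) (\d{1,2})(?:, (\d{4}))?$/.exec(dateText.trim());
  if (!match) {
    throw new Error('Expected "Month Day[, Year]" but got: ' + dateText);
  }
  const month = MONTHS.indexOf(match[1]);
  if (month === -1) {
    throw new Error('Unknown month: ' + match[1]);
  }
  const day = Number(match[2]);

  if (match[3] !== undefined) {
    // A specific year: count down before the date, count up once it has passed.
    const fixed = new Date(Number(match[3]), month, day);
    return { target: fixed, recurring: false, direction: fixed > now ? 'down' : 'up' };
  }

  // No year given: treat as annually recurring and use the next occurrence.
  let next = new Date(now.getFullYear(), month, day);
  if (next <= now) {
    next = new Date(now.getFullYear() + 1, month, day);
  }
  return { target: next, recurring: true, direction: 'down' };
}
```

Calling resolveTarget('March 14', new Date()) gives the next 14th of March, while resolveTarget('January 1, 2009', new Date()) gives a fixed date with direction 'up' once that date is in the past. Dates are built at local midnight, since the time comes from the client’s clock.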

I don’t think the script needs much commentary – it works out when the day is in relation to today and does some simple maths to display that information in a human-readable format. JavaScript works in milliseconds, which is why we divide by 1000 in various places. The script is actually pretty wordy because I find JavaScript counter-intuitive, and tend towards clarity rather than brevity; but if you prefer the latter it’d be easy to condense it.
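As a rough sketch of the "simple maths" (again, not the original script, and with function names invented for the example), the difference between two dates can be broken down like this. It assumes every day is exactly 24 hours long, so it ignores clock changes.

```javascript
const MS_PER_SECOND = 1000;
const SECONDS_PER_MINUTE = 60;
const SECONDS_PER_HOUR = 60 * SECONDS_PER_MINUTE;
const SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR;

// Subtracting two Date objects gives milliseconds, hence the division by 1000.
// Math.abs means the same function serves for counting down or counting up.
function splitDuration(fromDate, toDate) {
  let seconds = Math.floor(Math.abs(toDate - fromDate) / MS_PER_SECOND);
  const days = Math.floor(seconds / SECONDS_PER_DAY);
  seconds -= days * SECONDS_PER_DAY;
  const hours = Math.floor(seconds / SECONDS_PER_HOUR);
  seconds -= hours * SECONDS_PER_HOUR;
  const minutes = Math.floor(seconds / SECONDS_PER_MINUTE);
  seconds -= minutes * SECONDS_PER_MINUTE;
  return { days: days, hours: hours, minutes: minutes, seconds: seconds };
}

// Turn the parts into a human-readable string.
function describeDuration(parts) {
  return parts.days + ' days, ' + parts.hours + ' hours, ' +
         parts.minutes + ' minutes, ' + parts.seconds + ' seconds';
}
```

To keep a display ticking over, something like this could be called once a second via setInterval, passing new Date() and the target date each time.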

It’s easy to add a little pizazz to the countdown by displaying a picture relevant to the date selected: just change the src of the image when the user selects from the list box.
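A minimal sketch of the picture swap, assuming a mapping from list box values to image files. The file names and function names here are hypothetical, not part of countdown.js.

```javascript
// Hypothetical mapping from list box values to image files.
const PICTURES = {
  'December 25': 'images/christmas.png',
  'October 31': 'images/halloween.png'
};
const DEFAULT_PICTURE = 'images/clock.png';

// Look up the picture for a date, falling back to a default.
function pictureFor(dateText) {
  return Object.prototype.hasOwnProperty.call(PICTURES, dateText)
    ? PICTURES[dateText]
    : DEFAULT_PICTURE;
}

// Change the image's src whenever the list box selection changes.
function attachPictureSwap(selectElement, imageElement) {
  selectElement.addEventListener('change', function () {
    imageElement.src = pictureFor(selectElement.value);
  });
}
```

In a page this would be wired up once, passing in the list box and the image (for example via document.getElementById, using whatever ids the page actually has).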

A Selective (Professional) Biography

Hello world,

I think it’s nice to give the code I write a bit of context, by including some autobiographical detail.

I have a BSc in Maths and Artificial Intelligence from the University of Sussex, which taught me plenty of theoretical, pure maths (which has, somewhat perversely, been rather useful in practice), and how to program (in a range of uncommon, not very useful, languages, although I hear Lisp is making a comeback).

I then worked for a few years developing Microsoft Excel and Access stuff, so I’m a dab hand with a spreadsheet, and the colour schemes of my databases have been widely admired. I also did some SQL Server work and some website development, which back then involved writing HTML, JavaScript and CSS. Now, I write in XHTML, and regard JavaScript as a necessary evil – I think it’s a horribly counter-intuitive language.

Seeking a new, more interesting direction, I did an MRes in Bioinformatics at Birkbeck College, which I very much enjoyed, to the extent that I am now halfway through a PhD in the subject, at the University of Manchester. In between these two degrees I worked for 3 years in a lab at Cambridge University. I’ll describe the work I’ve done in bioinformatics in greater detail in subsequent blog posts; here I’ll just say that I’ve done lots of Perl development, and that I am very good at building and querying databases (MySQL and Postgres) and rifling through massive datasets.

That’ll do for now, I’m getting a bit bored writing this, so you’ve done well to make it this far…