Method for fixing incorrect Java versions on Linux using update-alternatives


I just came across this and needed to post it somewhere for posterity. Nothing fancy, but a handy tip.

After spending a while trying to work out where my newly updated Java version had gone (officially installed through the Ubuntu app store), I came across this old thread, which made me aware of the “update-alternatives” command, which, as you can see, totally sorts the problem. I don’t know how far it reaches with other programs/environments, but it seems good for versioning. I need to bear it in mind next time a problem like this comes up.
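If you would rather script the check than eyeball it, here is a rough sketch in Python. It assumes that `update-alternatives --list java` prints one installed path per line. The helper names (`parse_alternatives`, `set_command`) and the sample paths are made up for illustration, and the script only prints the command rather than running it.

```python
import shlex


def parse_alternatives(listing: str) -> list:
    """Split the text printed by `update-alternatives --list java` into paths."""
    return [line.strip() for line in listing.splitlines() if line.strip()]


def set_command(name: str, path: str) -> str:
    """Build (but do not run) the command that points `name` at `path`."""
    return "sudo update-alternatives --set " + shlex.quote(name) + " " + shlex.quote(path)


# Hypothetical sample output. On a real machine you could capture it with:
# subprocess.run(["update-alternatives", "--list", "java"],
#                capture_output=True, text=True).stdout
sample = """
/usr/lib/jvm/java-6-openjdk/jre/bin/java
/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
"""

paths = parse_alternatives(sample)
# Pick the last listed path purely as an example choice.
print(set_command("java", paths[-1]))
```

For a one-off fix, running `sudo update-alternatives --config java` and picking from the interactive menu does the same job by hand.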

java JRE version control

Getting Started in Bioinformatics coding – Python


Biologists often ask me about where to start learning how to code, and where I learned myself (however limited that may be!). Around six and a half years ago, I first began working with Perl by editing/personalising a transcriptome pipeline which, like a major proportion of bioinformatics from back then, was probably written for a single purpose without much thought to implementation by other users. After that, I picked up an O’Reilly book (Google O’Reilly + your chosen language) and eventually did a Coursera course which taught me some better practices than what I picked up from other people’s code and Google.

A significant complaint that I hear is that a lot of the tutorials that you can find online, in print, and in person (Codecademy, Coursera, O’Reilly books etc.) are targeted at learning the core principles, and as soon as you have finished, it all quickly slips out of your brain. This is why I now recommend, to the Undergraduate and Masters classes or anyone else who asks, a set of courses that comes in either book or online form. They teach all of the core concepts in a beginner and advanced course, but are tailored to the kind of tasks you are probably considering taking up the art for. What’s more, the exercises which you perform are real-world applicable, which means that a few weeks down the line when you want to do some computational analysis, you already have something saved on your personal computer that likely does half of what you want to do, and that you can refresh your memory on.

I’ve just looked at the website and there seems to have been a whole host of new material added since I ran through it a couple of years ago, so it’s definitely worth a look!

New Random Forest machine-learning-based 16S classifier looks to improve upon RDP accuracy


I haven’t had a chance to check it out yet, but looking at the numbers I definitely will give it a run. Unfortunately I didn’t see a comparison against other classifying methods, but it looks interesting. People need to be more imaginative with their software names though!

16S Classifier: A Tool for Fast and Accurate Taxonomic Classification of 16S rRNA Hypervariable Regions in Metagenomic Datasets

PLoS One Link

The effect of anthropogenic arsenic contamination on the earthworm microbiome.


The typeset manuscript has been uploaded to Environmental Microbiology now. Check it out at their site or read it here, and feel free to send me an email if you want to discuss it!


Earthworms are globally distributed and perform essential roles for soil health and microbial structure. We have investigated the effect of an anthropogenic contamination gradient on the bacterial community of the keystone ecological species Lumbricus rubellus through utilizing 16S rRNA pyrosequencing for the first time to establish the microbiome of the host and surrounding soil.

The earthworm-associated microbiome differs from the surrounding environment which appears to be a result of both filtering and stimulation likely linked to the altered environment associated with the gut micro-habitat (neutral pH, anoxia and increased carbon substrates). We identified a core earthworm community comprising Proteobacteria (∼50%) and Actinobacteria (∼30%), with lower abundances of Bacteroidetes (∼6%) and Acidobacteria (∼3%). In addition to the known earthworm symbiont (Verminephrobacter sp.), we identified a potential host-associated Gammaproteobacteria species (Serratia sp.) that was absent from soil yet observed in most earthworms.

Although a distinct bacterial community defines these earthworms, clear family- and species-level modification were observed along an arsenic and iron contamination gradient. Several taxa observed in uncontaminated control microbiomes are suppressed by metal/metalloid field exposure, including eradication of the hereto ubiquitously associated Verminephrobacter symbiont, which raises implications to its functional role in the earthworm microbiome.


A great set of tutorials on de novo genome assembly by @lexnederbragt


Check out the documents for the class he runs at the University of Oslo here. Great place to start on learning about it.

It covers:

  • 01_Assembly_using_velvet
  • 02_Mapping_reads_to_an_assembly
  • 03_Assembly_using_celera
  • 04_Evaluating_assemblies_with_Quast
  • 05_Evaluating_assemblies_with_FRCbam
  • 06_Assembly_using_SPADES
  • 07_Assembly_using_newbler
  • 08_Assembly_improvement_using_REAPR

BOSC 2011


This week I was fortunate enough to trip it out to Vienna for BOSC, the Bioinformatics Open Source Conference. As the title would indicate, the focus of the occasion was to showcase the latest in not-for-profit, academic software that is out there and have a thorough crack at laying some plans for the future. Collaborations have been formed which would have been nigh-on impossible to form over the internet and which truly reflect the open nature that the attendees preach, with everyone jumping in and trying to contribute.

I was in attendance in the wake of the Biolinux Team funded by NEBC (NERC (Natural Environment Research Council) Environmental Bioinformatics Centre), aka the Wallingford arm of NBAF (NERC Biomolecular Analysis Facility) which is funding my PhD. (N.B. Acronyms are fun!) It was a fully immersive experience being surrounded by the big hitters in the field and I saw the Biolinux group of about half a dozen which I was acquainted with grow by multiples of itself. By the minute there were fantastic projects being squozen into an ever decreasing amount of free space in my brain. Recurring themes in the form of Galaxy, Taverna and the Cloud have henceforth replaced the natural meanings of words in my mind so that I can’t ever have a bar of chocolate, go into a small Greek restaurant or even look at the sky the same way again (more on these later).

The BOSC event was preceded by the obviously named Codefest. This allowed the attendees two days in which to get together and hack together, producing an intensive and productive work environment with results which I’m sure reflect this. This went on in Metalab, a basement hacker space near the Rathaus in down-town Vienna which was kindly opened up to us in return for our buying of their caffeinated lemonade (supplemented by an accidental donation due to a room full of heavily qualified scientists being unable to calculate a restaurant bill!). Again, the opportunity to coalesce ideas and projects didn’t go amiss. Biolinux is an Ubuntu-based OS designed for bioinformaticians and biologists alike, and is a platform where they can perform their analysis in an environment already brimming with bioinformatic tools that have already gone through what many will attest can be some damn tricky set-up procedures. The latest buzz in the Biolinux camp is the future expandability by venturing into the cloud computing environment to handle the data explosion which is all too well understood in the molecular biology world. Creation of this hybrid followed the Ronseal school of naming: CloudBiolinux.

Galaxy, which is a fantastically powerful and simple workflow system for working with genetic data from a web-server, was on its way to being incorporated into Biolinux as soon as its awesomeness was realised (its awesomeness is already well established; however, I think meeting the people involved was the catalyst required!). Another system that quickly became hot currency was CloudMan. This is a simple way of getting into the world of cloud computing on the Amazon cloud without all the dallying around in setup, and it allows straightforward scaling of the processing the user requires in the elastic cloud environment. Natively in allegiance with Galaxy, Biolinux’s tail was quickly wagging.

My association with Biolinux and personal scientific preferences obviously give a strong bent to my observations of what was going on, however the whisperings I heard regarding other tools being developed in the same room were fully presented in the coming days once the official BOSC talks started. I’m writing this from a hazily jet-lagged memory on a cramped National Express Coach and so won’t go into each project discussed (Brad Chapman has already done this here) but here are some notable talks that stuck in my head and general themes found:

The proceedings started off on the first day with UGENE from Unipro. This is a flashy and high quality integration of popular bioinf tools with a whole heap of visualisation kit included. Very nice, shiny and user-friendly. They also commented on their optimisation for multicore and Graphics Processing Units (GPUs). Apparently, running on a GPU will speed up a HMMER2 alignment by 30x vs an Intel Core Duo.

Integrated workbenches and workflow tools were very much the flavour of the day, with Mobyle and Galaxy also giving talks in this category soon after. Mobyle follows along the same lines as UGENE, with powerful visualisations and an alignment viewer in the style of Dalliance, called Jalview. The Galaxy system was heavily cited throughout the whole event with various projects incorporating and contributing to its vast prowess. Galaxy also had a talk dedicated to its genome visualisation tool Trackster, which again is a JavaScript browser built for NGS, and also has a mechanism with which to share and publish the visuals generated. The Dalliance browser itself is a stand-alone web-based genome browser which bridges the gap between web and desktop based genome browsers and looks damn good whilst doing it.

There was lots of talk of Taverna integration with Galaxy, as one workflow tool to another, both in presentations given and in the cutely named ‘Birds of a feather’ meetings at the end of the day.

InterMine was up early in the first day and in my opinion was one of the best presented talks of BOSC. This is a data warehousing system, designed to pull together all of the data on your target (i.e. species) and make it accessible and highly queryable in a supercool (my words) interface. Designed for FlyMine, it’s also widely used as YeastMine, RatMine and others.

Other notable talks of the first day were the WebApollo genome annotation program, which allows multiple users to annotate simultaneously in a very google-doc-esque manner (as Peter Cock put it) and the Biomanycores project which is (although not exclusively, at least heavily) involved in porting bioinformatics on to graphical processors to make use of the multiple core construct they are built upon. (Personally, I find GPUs damn cool but that’s neither here nor there).

Day Two was very definitively a cloud day, starting from a cloud based pun from the chair which received an appreciative groan from the whole audience, and was followed by an interesting and dynamic talk by Amazon’s Elastic Cloud computing Evangelist. This set the scene for various cloud based programs including OBIWEE, a natural Linux shell environment, and the aforementioned CloudMan, which ports (as a proof of concept) the Galaxy system cloud-wards.

The CloudBiolinux presentation was a personal highlight, which showed how an individual can quickly set up an EC2 server and have the whole suite of tools available on Biolinux in a cloud environment, with the full graphical user interface found on the standard desktop Biolinux.

The events ran to a close with semantic web discussions, a talk from Debian Med and updates on the various bioinf standards, i.e. Biopython, GMOD and EMBOSS. The closing words were given to a Microsoft representative, proclaiming the .NET framework for doing bioinformatics on a Microsoft platform. It took some balls to stand in front of a clearly sceptical crowd and there weren’t half as many burning Bill Gates effigies as was expected.

To summarise, BOSC was a fantastic experience to see the very latest in the world of bioinformatic systems and to see the cogs that keep the biology wheels turning. There was so much new stuff pushed into my brain that I’ve only managed to extract a small portion to put down in type here, but I hope it conveys the knowledge and experience gained. Until BOSC 2012…

Edit: I have just come across this blog from Steffen Möller regarding the same events I have described. Summary: Everyone loves BOSC!

PhD Diary – 4 Months down


I’m almost four months into the land of PhD-dom now and am going to document the emotions which I have encountered. I know you don’t care but I’m doing it all the same. It’s for posterity. Or ego. One of the two anyway.

Do you remember that bit in the original Matrix when Neo is being made to jump between the two buildings? When all the characters are watching and saying “What if he makes the jump? What if he’s the One?”…“no-one ever makes their first jump”…“but what if he does?”…. I knew full well I wasn’t going to make the first jump but that’s what I pretty much spent the first few weeks going through. I think really, it was to block out the absolute failure I subconsciously expected. I had no illusions of grandeur or that everything was going to be golden and I’d buck the trend of agonising workload and destructive self-doubt that I’d been told to expect from those who had come before me. It might be a sign of my ego or it might be something everyone has, I don’t know. But the “I might just nail this” came ever so fleetingly on a few occasions, always bracketed in epic failure. During the first PCR I did, the few seconds it took to warm up the gel doc (visualisation-thingy) and wait to see the inevitable awesome results that I had made were exhilarating, and only matched by the cataclysmic comedown when it didn’t work. (This was only compounded by the later realisation that I’d ballsed up within the first 30 minutes of the previous day and the rest was a rather futile experiment on expensive water).

The “It might all work out” was massively overshadowed by long phases of “what the hell am I doing here?”, “I can’t handle this!”, “I didn’t even have a proper interview!”, “How did I scam my way into this?”. Coming out of some of the early meetings with about 10% of the information floating around my head and 30% more in hastily scribbled, incomprehensible notes is not a situation founded in confidence building. You’ll be glad to hear that the getting lost in meetings thing cleared up (on the whole!) with equal doses of reading up on stuff and the more obvious asking questions if you’ve lost the thread, the latter of which I had to have directly pointed out to me.

But plodding along brought some success. Small things worked out and I got somewhere. These led to bigger successes and more confidence in what I was doing. I think I’ve always been good at the keep-on-keeping-on, but the pyramid building of small successes bolstered my belief in it all being worth it. From slow beginnings I think I’ve now hit on a reasonable rate of things working out and intend to push it as much as I can.

So in day-to-day and minute-to-minute events, there are a lot of ups and downs. I spend most of my time at a computer and the thing has almost gone through the window on multiple occasions. I have big plans for cool stuff later in my project and I’m getting bogged down in the early stuff. Every second I spend not working I’m thinking about how I should be working (even whilst writing this) and I’ve woken up in the middle of the night on multiple occasions after a dream-experiment has failed or a deadline has been missed.

But the thing is I have not for one moment regretted the situation I’ve found myself in. I absolutely love what I do and there is honestly nothing else I’d rather be doing. I may complain, swear at the computer, bemoan having to get out of bed in the morning when it’s cold, but I would hate more than anything else to have it taken away from me, or to never have had it in the first place. A friend said recently that if it came to a choice between his girlfriend and his science, he’d choose the science. Whilst I can’t comment on the integrity of that statement, I can totally relate to the sentiment. This is what I do and it’s what I intend to do for as long as I possibly can.

And that’s all I’ve got. I think I’ve finally settled into what I’m doing and I love it. I don’t really sleep or eat any more, but I wouldn’t swap it for the world. The last few months have been sinusoidal, with friends leaving this mortal coil and new ones entering mine. I’m 1/9th of the way through my PhD and so everything will probably change in time.

But for now, work’s ace and I wanted to put that down on paper.


What I do for a living


“So my life for the next 3-4 years is going to be looking at earthworms (yes, earthworms) and all the wonders they hold (and hopefully wangling a trip to Portugal)”

Now yes, I know. As an unemployed graduate this doesn’t seem like a compelling title for an engaging blog post, but with the almost imminent commencement of my PhD studies I can write about the kind of stuff I’m going to be up to whilst still being given the measly sum of £1 per hour allocated to me by Jobseekers Allowance (£35 per week / 5 working days / 7 hours a day (9:00-17:00 minus 1 hour unpaid lunch)). This is before the honorary title of dole-scum will be so patronisingly removed from my shoulders by an enthusiastic “Oh, you’ve found a job? Well done you!” like a grandmother commending a stranger’s child on finding an unusually shaped stick in the park (Trust me, I’ve done this 3 times before. Always the same).

You guys out there will probably know that I’ve got myself a tidy little Genetics degree and intend to crack on with that in a professional capacity. Nothing too flashy, standard 2:1 etc. Landing on my feet, I have found myself with a PhD offer in Cardiff where I’m currently residing and after some hair-raising moments over the last few months it’s all set in stone now. You may remember my previous post on here was about the massively flawed supposition that you could use genetics in the seduction of the ladies (if you don’t, then go read it now. It’s a hoot) and so I’m not going to try and hype this up but lay it out in all its horrific glory. So here goes:

Bacteria are pretty important. They do all kinds of stuff. You’ve got buckets of them in your gut and you need them there or else you would be on a quick train to a slow death. Ok, death is pretty extreme and humans can normally get away without dying and just being pretty ill, but a lot of other animals can’t. All the bacteria in your gut work together and create what’s called a microbiome (pronounced micro-bi-ome, microbes being those small buggers and -ome being the suffix meaning all the microbes involved, but in a genetic capacity). The gist is that these gut microbes are pretty damn important. Some scientists out there are saying that their genomes should be included in with the standard human ones because they’re essential and without the bacterial sequences human life wouldn’t be what it is. Same with other animals, which is what I’m going to be looking at.

So my life for the next 3-4 years is going to be looking at earthworms (yes, earthworms) and all the wonders they hold (in a microbiomic context). Now what follows is my understanding prior to actually starting studying this. I’ve only a superficial understanding but I can tell you’re excited so I’ll get straight to the good bit, but before we start, Pop Quiz:

What was Charles Darwin’s fastest selling book?……….

Wrong.  It was “The Formation of Vegetable Mould through the Action of Worms” in 1881. That was fun wasn’t it?

Anyway, earthworms are pretty damn important. If Darwin said so it must be true. They’re almost ubiquitously found in soil around the world so they must be pretty good at adapting to the environments they find themselves in. They aerate the soil, mix it around and break down leftover organic matter. The bacteria they carry round are nitrogen fixing so they’ve got a role in the nitrogen and carbon cycles too. The common worm has a knack, however, for living in some pretty extreme places, old mines being one of the best. The same worm species can live in your garden and also in abandoned mines where there are all kinds of nasty lead, zinc and other heavy metals killing off everyone else. The worms can store up the metals in their tissue without too much damage to themselves but also convert a much higher proportion into other forms more or less available to other plants and animals, which they excrete. They’re basically just cleaning up the place. Now whilst mines are fun and everything, hopefully some of my work will involve comparing normal, down-the-road worms with the worms from this little volcanic island off Portugal with geysers and all. There you find all kinds of metals from inside the earth and CO2 concentrations of up to 20% in the soil. And still the worms are happy as Larry. My job is to find out how that works (and hopefully wangle a trip to Portugal).

How it works is anyone’s guess (the worm part, not me getting a trip to Portugal). The current thinking is that as there isn’t much difference between the actual worms from these different habitats, then it must be the microbiome that is doing the hard work of dealing with the toxins. I hope so anyway, or else this will be a very short non-PhD.

Earthworms have three sets of separate microbial populations: one in the gut, one in the ‘blood-like’ stuff that they have flowing round them and then a nephridial population (nephridia being a kind of a proto-kidney). The first two get built from the bacteria around them in the soil when they’re born and from other worms around them. This is known as horizontal transfer. But the nephridia has its microbiome transferred directly from the parent with no outside influence. This has led to it being referred to as a symbiont, and I can tell that all you evolutionarily acute people out there have guessed the kicker already. Being vertically transferred means that there should be some kind of selection pressure on the symbiont over generations and so distinct earthworm populations could have wildly divergent microbial symbionts. This could be (and is what we are expecting to be) a key factor in the earthworm’s adaptation to hostile environments. We’re even going to have a go at wiping out all the bacteria of the normal (control) earthworms and transferring the ‘specialised’ earthworm microbiomes in and seeing if it enables them to survive. Only time will tell.

Doesn’t sound like a lot, but there are quite a few facets to this, and it should take a solid 3+ years to get the whole thing done and dusted. I’m pretty excited about the whole thing and very much enjoy talking the ear off of anyone who shows the remotest interest so be warned.


Good. Glad to get that out of the way.



I’m going to start out by stating my position on the subject. If you don’t believe in evolution then you’re an idiot. That may sound a little harsh and I will add an addendum: you’re either an idiot or uninformed.

As one of the most ingenious and well documented ‘theories’ ever created (alongside the likes of gravity and relativity), it astounds me that so many people can dismiss it out of hand as if they’re talking about faking the moon landing or Bush orchestrating 9/11. Clearly idiots.

Firstly, ‘theory’ does not mean “we’ve just come up with this idea…”; that’s a hypothesis. Theory means there is this idea and it’s backed up by this evidence but we don’t have every angle that could ever be conceived explained, set in concrete and sat on a sturdy plate in the back room that can be wheeled out for photographs by the media monkeys. Evolution by natural selection should probably be called a law rather than a theory, like thermodynamics. Something like: “The unit which is most advantaged in the population survives at a higher rate and increases in future generations”. The rest is just details.

That’s where the phrase ‘beyond all reasonable doubt’ comes into play. There is more evidence to prove evolution as the origin of all life on this innocuous planet than any reasonable person could disregard. You would either have to be an unreasonable person or someone who has never been informed of the facts that they’re overlooking. Both of which I believe occur in fairly equal proportions.
Person A is uninformed. Which is fine. I spent years rubbishing The Smiths before actually sitting down and listening to the wonders that they produce. But that’s the thing. Eventually, after spending years with my fingers in my ears I gave it a chance and realised I was wrong. That could be Person A with evolution.
Person B is much more dangerous. They are smart. They are well-read. They are articulate. Statistically they’re probably American but it takes all kinds. They’ll have been brought up all their life to disbelieve evolution by stuck-in-their-ways parents and therefore when they’ve grown up and are presented with a trickle of facts which contravene their deep down indoctrination they block out whatever they don’t like, like a cartoon character with a cork in a leaky boat. “Well, this AND this can’t both be true so, as I had this stupid belief first I’ll stick to it and burn the other one”
Typical subconscious thought stream.

Anyway, there is plenty of literature, websites and bloggers spouting about the evo/creation divide and you are welcome to peruse that at your leisure but here’s my crack at evolution in the basic terms of Action Man (yes, that’s right, Action Man. It’s tangential but go with it).

So, imagine yourself as head of toy design at Action Man HQ. You’ve just invented the Action Man doll; a simple muscley bloke with a gun. It’s an instant hit with the young lads (and some girls). Your bosses are on your back already. The first one was such a success that they need a new design. A version2. But what? You get the thinking cap on, slave away all night and then you’ve got it:
“Let’s give him a bigger gun!”
Genius. Congratulations. They love it. It’s made, it’s sold and it’s got itself a nice place in the toy market. Sales of the original drop but that’s ok, because v2 is better. So you’re on a roll now. But what next? Step up the big brains. How about we release 2 at the same time? That’s all they need to hear. Cash signs in their eyes, the bosses ask for the designs.
“We have one that has a hang-glider, and one that has a bazooka”.
Brilliant. That goes out. Kids love it. But it has a bit of a knock-on effect on the v2. No-one wants a big gun when you can have a bazooka and so sales drop. Eventually, amongst all the bazooka and hang-gliding Men being bought people forget about the lowly v2. The company stops making it and it fades from existence only to fossilise in car-boot sales. The original chugs along nicely. Under the radar but it always has its niche.
By this point the design team is on overdrive, spewing out ideas all over the place. Hang-gliders with machine-guns on them, bazookas with laser-sights, dune-buggies with machine-gunning bazookas. Stocks in hyphens are going through the roof. The company churns them out to the shops and lets the public decide which ones are worth spending their hard earned time playing with. Some succeed, some fail. Some get replaced by newer models. Who needs a hang-glider when you can have a jet-pack? A few years on and there are hundreds of varieties in production. They have all kinds of weapons, vehicles, animals, teammates. There are ones that fly, swim, even talk. What a natural process of development, eh?

Now imagine you’ve never heard of Action Man before. First time you’ve seen it or anything like it and you’re a bit dim and don’t understand this process called ‘development’. Being confronted with so many different types of plastic doll you’d be faced with two polarised opinions:
1. Somebody started with a simple doll and built it up with some becoming obsolete along the way (as above).
2. This one brilliant guy simultaneously came up with hundreds of designs, all of them solid gold, every single one capturing a niche in the market without one failure or significant overlapping characteristic.

Sound familiar?

So there’s no actual intellectual process behind evolution but I think the analogy covers the main points. Evolution is a process of trial and error: what works sticks around and what fails is naturally dropped. It’s not a complex theory, nor does it require any schooling to comprehend. It’s the most simplistic concept known to mankind. Ok, so there are some bits that still have to be worked out and the infinitesimally small nuances can cause some mind bending but the overall concept is sound. And that’s all there is.
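If you want to see “what works sticks around” in numbers, here’s a toy sketch in Python. It is my own simplification, not a proper population-genetics model: two made-up variants, fixed made-up sales rates standing in for fitness, and each generation the market share is simply reweighted by those rates.

```python
def next_share(share_a: float, rate_a: float, rate_b: float) -> float:
    """Share of variant A after one generation of proportional selection."""
    weighted_a = share_a * rate_a
    weighted_b = (1.0 - share_a) * rate_b
    return weighted_a / (weighted_a + weighted_b)


def run(share_a: float, rate_a: float, rate_b: float, generations: int) -> list:
    """Return the share of variant A at every generation, starting value included."""
    history = [share_a]
    for _ in range(generations):
        share_a = next_share(share_a, rate_a, rate_b)
        history.append(share_a)
    return history


# Hypothetical numbers: the bazooka model starts at 1% of sales
# and sells 10% better than the big-gun v2.
history = run(share_a=0.01, rate_a=1.1, rate_b=1.0, generations=100)
print(f"start: {history[0]:.2%}, after 100 generations: {history[-1]:.2%}")
```

Nobody in the loop decides which model wins. A small, steady advantage is enough to take the newcomer from a rounding error to nearly the whole market, while the other one fades to car-boot sales.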

I refer to my original position.