BOSC 2011


This week I was fortunate enough to trip it out to Vienna for BOSC, the Bioinformatics Open Source Conference. As the title would indicate, the focus of the occasion was to showcase the latest in not-for-profit, academic software that is out there and have a thorough crack at laying some plans for the future. Collaborations were formed which would have been nigh-on impossible to set up over the internet, and they truly reflect the open nature which the attendees preach, with everyone jumping in and trying to contribute.

I attended in the wake of the Biolinux team, funded by NEBC (the NERC (Natural Environment Research Council) Environmental Bioinformatics Centre), aka the Wallingford arm of NBAF (NERC Biomolecular Analysis Facility), which is funding my PhD (N.B. acronyms are fun!). It was a fully immersive experience being surrounded by the big hitters in the field, and I saw the Biolinux group of about half a dozen that I was acquainted with grow by multiples of itself. By the minute, fantastic projects were being squozen into an ever-decreasing amount of free space in my brain. Recurring themes in the form of Galaxy, Taverna and the Cloud have henceforth replaced the natural meanings of those words in my mind, so that I can’t ever have a bar of chocolate, go into a small Greek restaurant or even look at the sky the same way again (more on these later).

The BOSC event was preceded by the obviously named Codefest. This gave the attendees two days in which to get together and hack, producing an intensive and productive work environment with results which I’m sure reflect this. It took place in Metalab, a basement hacker space near the Rathaus in downtown Vienna, which was kindly opened up to us in return for our buying their caffeinated lemonade (supplemented by an accidental donation due to a room full of heavily qualified scientists being unable to calculate a restaurant bill!). Again, the opportunity to coalesce ideas and projects didn’t go amiss.

Biolinux is an Ubuntu-based OS designed for bioinformaticians and biologists alike. It is a platform where they can perform their analysis in an environment already brimming with bioinformatic tools that have already gone through what many will attest can be some damn tricky set-up procedures. The latest buzz in the Biolinux camp is future expandability by venturing into the cloud computing environment to handle the data explosion which is all too well understood in the molecular biology world. Creation of this hybrid followed the Ronseal school of naming: CloudBiolinux.

Galaxy, which is a fantastically powerful and simple workflow system for working with genetic data from a web server, was on its way to being incorporated into Biolinux as soon as its awesomeness was realised (its awesomeness is already well established, but I think meeting the people involved was the catalyst required!). Another system that quickly became hot currency was CloudMan. This is a simple way of getting into the world of cloud computing on the Amazon cloud without all the dallying around in setup, and it allows straightforward scaling of the processing the user requires in the elastic cloud environment. With CloudMan natively in allegiance with Galaxy, Biolinux’s tail was quickly wagging.

My association with Biolinux and my personal scientific preferences obviously give a strong bent to my observations of what was going on. However, the other tools being developed in the same room, which I had only heard whisperings of, were fully presented in the following days once the official BOSC talks started. I’m writing this from a hazily jet-lagged memory on a cramped National Express coach and so won’t go into each project discussed (Brad Chapman has already done this here), but here are some notable talks that stuck in my head and the general themes I found:

The proceedings started off on the first day with UGENE from Unipro. This is a flashy and high-quality integration of popular bioinf tools with a whole heap of visualisation kit included. Very nice, shiny and user-friendly. They also commented on their optimisation for multicore processors and graphics processors (GPUs). Apparently, running on a GPU will speed up a HMMER2 alignment by 30x vs an Intel Core Duo.

Integrated workbenches and workflow tools were very much the flavour of the day, with Mobyle and Galaxy also giving talks in this category soon after. Mobyle follows along the same lines as UGENE, with powerful visualisations and an alignment viewer in the style of Dalliance, called Jalview. The Galaxy system was heavily cited throughout the whole event, with various projects incorporating and contributing to its vast prowess. Galaxy also had a talk dedicated to its genome visualisation tool Trackster, which again is a JavaScript browser built for NGS and also has a mechanism for sharing and publishing the visuals generated. The Dalliance browser itself is a stand-alone web-based genome browser which bridges the gap between web and desktop-based genome browsers and looks damn good whilst doing it.

There was lots of talk of Taverna integration with Galaxy, as one workflow tool to another, both in presentations given and in the cutely named ‘Birds of a feather’ meetings at the end of the day.

InterMine was up early on the first day and in my opinion was one of the best presented talks of BOSC. This is a data warehousing system, designed to pull together all of the data on your target (i.e. species) and make it accessible and highly queryable in a supercool (my words) interface. Designed for FlyMine, it’s also widely used as YeastMine, RatMine and others.

Other notable talks of the first day were the WebApollo genome annotation program, which allows multiple users to annotate simultaneously in a very Google-Docs-esque manner (as Peter Cock put it), and the Biomanycores project, which is (although not exclusively, at least heavily) involved in porting bioinformatics on to graphics processors to make use of the multiple-core construct they are built upon. (Personally, I find GPUs damn cool but that’s neither here nor there.)

Day Two was very definitely a cloud day, starting with a cloud-based pun from the chair, which received an appreciative groan from the whole audience, followed by an interesting and dynamic talk by Amazon’s elastic cloud computing evangelist. This set the scene for various cloud-based programs including OBIWEE, a natural Linux shell environment, and the aforementioned CloudMan, which ports (as a proof of concept) the Galaxy system cloud-wards.

The CloudBiolinux presentation was a personal highlight. It showed how an individual can quickly set up an EC2 server and have the whole suite of tools available on Biolinux in a cloud environment, with the full graphical user interface found on the standard desktop Biolinux.

The events ran to a close with semantic web discussions, a talk from Debian Med and updates on the various bioinf standards, i.e. Biopython, GMOD and EMBOSS. The closing words were given to a Microsoft representative, proclaiming the .NET framework for doing bioinformatics on a Microsoft platform. It took some balls to stand in front of a clearly sceptical crowd, and there weren’t half as many burning Bill Gates effigies as expected.

To summarise, BOSC was a fantastic experience to see the very latest in the world of bioinformatic systems and to see the cogs that keep the biology wheels turning. There was so much new stuff pushed into my brain that I’ve only managed to extract a small portion to put down in type here, but I hope it conveys the knowledge and experience gained. Until BOSC 2012…

Edit: I have just come across this blog from Steffen Möller regarding the same events I have described. Summary: Everyone loves BOSC!
