Archive for the ‘BioRAM linux’ Category

Caos NSA and Perceus: All-in-one Cluster Software Stack

Friday, February 6th, 2009

As part of my best working practice, and as a devoted follower of linux technologies, paradigms and server-side computing, I am a committed reader of the Linux Magazine. Yesterday they had a pretty good article on a linux distribution that I had hitherto been unaware of (and should have known something about), along with a rather basic introduction on how to get the software running.

I am not sure what this means for BioRAM linux. At the moment it is slowly evolving, and it is not really planned as a complete cluster software stack but more as a bioinformatics cluster “central”. Am I re-inventing the wheel?

I have suspicions that BioRAM linux may be dead in the water (already), although it has a current, up-to-date and functioning application stack. I guess that the rPath linux made available through rBuilder may be a little behind the bleeding edge, and I should perhaps again focus my energies elsewhere… This is a rather negative thought, and anyone who works with me will know that negativity is bad and a positive spin must be added to everything…

My feeling is that Caos NSA looks like a very viable software distribution for a formal evaluation and review. To review such a tool we need a set of objectives, and some deliverables that will allow for an objective assessment as to whether the software is fit for purpose. Naturally, it is also rather important that we can package and install the set of software applications that will be needed across the cluster.

I’ll get back to this thought later, but I’m excited about the opportunity to take Caos NSA for a run over the weekend!

I need development help …

Thursday, January 29th, 2009

I love rPath and rBuilder, I love bioinformatics, and computer clustering is something that just rocks. I am now stuck in a rather uncomfortable place where I can move neither forwards nor backwards. The problem is, I think, quite simple. I wish to have native linux clustering within bioram-linux. I have looked at the Condor solution from the University of Wisconsin. I have used this for a long time and it can be made to install.

Building (condor) from source is however quite beyond me, and there are a load of compile errors in my environment that I just can’t get beyond. A pre-compiled binary is therefore perhaps an easier option, but I have run into a slew of broken dependencies … I guess the path of using a standard linux distribution would be easier … In reconsidering the logic of using Condor I have gone back to the (binary distribution of) Sun GridEngine. This looks very much easier to deploy, but has some problems of its own … how can I establish a meaningful and working environment at build time, or do I do something truly ugly (and hard deploy a de facto working environment)?

Running a bioinformatics development unit as a hobby is sometimes more than I’m paid for (oh right, I’m not paid …), and sometimes it would be good to speak with some people who have an idea about what to do and how to do it. Are there any volunteers out there? I am certain that we can have a working GridEngine 6.2u1 in the BioRAM by the end of the day. It’ll be subject to certain requirements (second network card hard-coded to the address 10.0.0.1), but I guess that I can live with it, and I guess that you will all have to live with it too…

linux distributions, custom images and respinning

Tuesday, January 27th, 2009

I have been discussing rPath over the last few days and have put a not inconsiderable amount of effort into getting a bioinformatics linux distribution off the ground. rPath is perhaps not the easiest way to go; Fedora offers its own way using the Revisor tool. I’ve been a fan of RedHat since my first excursion into linux 15 years ago, and have a Fedora workstation at home, but getting software packaged into .rpm is something that still makes me more than just a little crazy (Staden .rpm production caused much hair loss, I am certain of it!).

It is cool to see that on Slashdot this morning there is a link to a new SUSE project going into alpha for exactly this purpose as well. SUSE Studio looks like an interesting tool and something that I should invest some time in exploring.

Bioinformatics is great; I am paid (a modest fee) to do something I enjoy, and I would love to see more people responsibly using a wider range of tools. It is sad that so many so-called “bioinformaticians” don’t know how to compile C code, have little understanding of the tools available, and do not religiously explore the new publications in Bioinformatics, Nucleic Acids Research etc. for new applications that will push the boundaries of what they can do. I vehemently oppose the concept that meaningful bioinformatics can be achieved using a Windows workstation alone!

I guess I should ask you bioinformaticians out there: how would you roll a meaningful bioinformatics distribution, either for use on a workstation or for deployment across a cluster of tens or hundreds of servers?