Archive for March, 2009

EMAAS: An extensible grid-based Rich Internet Application for microarray data analysis and management

Monday, March 23rd, 2009

G Barton et al., BMC Bioinformatics 2008, 9:493 doi:10.1186/1471-2105-9-493

EMAAS is another environment for the handling and analysis of gene expression data. The authors have set about developing a distributed e-support system for the management and analysis of microarray data, to provide access to complex methods and to apply (from a biologist’s POV) non-trivial technologies to handle large multivariate datasets.

Whilst other solutions have missed the point and taken an easy approach to solving the problem, the EMAAS approach is rather more complicated and relies instead on the integration of internet-accessible tools, standard statistical packages (R/Bioconductor) and web resources (CELSIUS, GEO). The decision to aim for a modular and flexible framework is excellent and makes this, in my opinion, a very much more interesting project. The completeness with which tools and environments have been included is breathtaking; the depth of IT and analytical platforms required is rather daunting.

In contrast to the manuscript reviewed in the last post, this resource’s source is available under a suitable GPL license, and some of the demo server also works. I have some problems with the resource (Flash for a start), but this is one smooth implementation and is packaged in such a way that I could take it for a spin if I so wished!

This manuscript is heavy to read, but a damned fine resource is described underneath the technical fluff. This is a great resource and this earns a great recommendation from the bioinformaticsblog.

SiPaGene: A new repository for instant online retrieval, sharing and meta-analyses of GeneChip® expression data

Monday, March 23rd, 2009

Adriane Menßen et al., BMC Genomics 2009, 10:98 doi:10.1186/1471-2164-10-98

This manuscript describes a new database, data warehouse and analytical platform for the handling of Affymetrix-based gene expression data. The authors identify the need for a database that is convenient, facilitates online analysis and provides user-specific sharing options, and further qualify their understanding of an unmet database need with the statement that “… existing tools do not use the whole range of statistical power provided by the MAS5.0/GCOS algorithms”.

I agree with the authors that there is a gap within the database arena for a MIAME-compliant database that provides both data warehousing and data analysis capabilities, and the addition of user-specific access rights is great. The MAS5 and GCOS methods undoubtedly have their place, but is using them alone perhaps naive?

The authors fill a number of quite heavy pages with their description of a refreshingly heavyweight database infrastructure (Java, ancient Oracle) that is currently biased towards their local research environment’s interest in immunology, inflammation, regeneration and cancer. Such a lengthily described database is then populated with only 1000 arrays.

This manuscript is of interest, the approach is nice; a combined warehouse and analysis environment. I have some problems with the database though. “Non-academic commercial use is restricted” is a waste; I would never consider paying for this resource when fantastic solutions from SAS JMP Genomics / GeneData / … with full support, testing and scalability are available with a lower TCO. To see what has been done, how well it performs and to play with a resource is nice.

I suspect that this is another fail – the online demo will not even work.

So, nice try, but no cigar. The manuscript is nice, convincingly written and more professional than some solutions out there. The web presentation looks fugly, and is also broken. The politics of code availability is plainly stupid – those who can pay will not because the implementation is not sufficiently good – Charite, please make the code a little more available!

crashing visitor stats – not enough new content

Monday, March 23rd, 2009

I have been distracted. After a rather busy week on the road, and a week of catching up with paperwork and a few rather critical tasks within the corporate bioinformatics environment, the bioinformaticsblog is left feeling a little blue and rather unloved. There appear to be pretty good numbers of visitors, but the numbers have crashed over the last couple of weeks – there is a dearth of new content.

A quick check of the server logs shows that “aroma.affymetrix” is again one of the top search strings that bring people to this place – my planned “aroma.affymetrix tutorials for bioinformatics dummies” is still in progress, and awaiting release. It is pretty crazy, but the next most popular theme for the bioinformaticsblog search is “bioinformatics iPhone”. I am not sure what I should be reading into this, but it seems that this is a hot topic for those of you out there writing BSc and MSc projects in bioinformatics at the moment.

I am really worried by some of the terms that bring people here – who got here through a search for “I love bioinformatics”? A good sentiment, but one that you should be careful in announcing – I will not name and shame (yet)! An equally good query is the “what to do with my life bioinformatics”. I guess that many of us have thought about this, but someone has actually used Google to solve their problem! Related queries include “what to do bioinformatics in industry” and “best working practice bioinformatics”.

Yikes – is this a good thing or something I should be very afraid of?

Bioinformatics and best working practices are somehow interlinked within industrial bioinformatics; and this is likely to include enterprise biocomputing solutions, but will probably not involve any aroma.affymetrix or iPhone. There does appear to be a collective angst in bioinformatics; but it really is a great place to work in-between the corporate reorganisations ;-)

Phenoforms, social classes and sitting in front of a computer with cookies?

Monday, March 16th, 2009

Blogs, webpages and rants are out there to be read, to inspire and to establish dialogue. This blog page at the rather bluer than necessary Torygraph has an unnecessarily harsh dig at the obese poor. As someone who has lived with the indignity of X(n) sized trousers I can read this article with a mix of mirth and anger.

While the proletariat with TV dinners may show susceptibility to obesity, is there not also a correlation between BMI and career? At bioinformatics meetings there is typically a Gaussian distribution of phenoforms, and I would argue that sitting at a computer as a productive “middle-class” bio-IT professional, with background consumption of coffee (with full-fat milk), donuts and other fat- and carbohydrate-enriched snacks and a slightly more sedentary than absolutely necessary work style, can lead to more issues.

The article in the Telegraph has a cheap dig at a consequence, not a cause. Why is obesity such an issue? It seems to be the easy availability of pre-processed foods, easily digested and stored by the body. The lazy are more susceptible to the easy gratification from these well (synthetically) flavoured foods, and a vicious cycle is born. I am uncomfortable with the politicisation of this problem – let us consider the stereotypical gentleman of 150 years ago; a comfortable diet of meats and fortified wines and the corresponding problems with gout, diabetes and girth …

Bioinformatician on the road

Friday, March 13th, 2009

I have had a pretty amazing week in Munich and Freising and have learned things that I needed to know (and unfortunately a few things I wish I didn’t need to know – qPCR really is a bizarre technology). Some former colleagues working with a pharmacogenomics CRO on the edge of Starnberger See recommended the failblog as a good place to waste some time! So winding knowledge acquisition down and exploring their suggestion has yielded a little understanding of the site and some pretty hearty laughs! Excellent!

I’ll be back in Finland tomorrow – I can start blogging again then and we should start having a look at the bioinformatics of quantitative PCR – a horrible subject! This has at least given me some enthusiasm to implement an R-based RDML parser – another contribution to head to Bioconductor!

Preclinical drug development explained?

Tuesday, March 10th, 2009


A crash course for bioinformaticians presented by the Deutsches Museum in Munich.

Posted by ShoZu

Drug discovery at the Deutsches Museum! A great exhibition.

Tuesday, March 10th, 2009


Posted by ShoZu

Bad food, the Burger King XXL

Tuesday, March 10th, 2009


I have been out of Germany for too long! As a postdoc this meal was a favourite, but after 5 years in Finland it now seems excessive! I am on the road in Munich for the next few days at a bioinformatics solution provider, receiving training for an enterprise system the corporation licensed, and will attend a couple of sessions from qPCR 2009 whilst here. How do we find those biomarkers eh?

Posted by ShoZu

Knoppix to the rescue

Sunday, March 8th, 2009

The MacBook Pro that I use by choice remains dead to the world and even formatting the disk has failed. I am therefore travelling in Germany with the not-really-fit-for-purpose Windows XP corporate laptop that I was issued with. I have a connection to the web, so how can we enjoy a few hours of interruption-free time to do something interesting? My solution was to buy a German computer magazine (ct for those in the know) and to my absolute delight the company laptop will boot from CD/DVD prior to the disk encryption password stuff… I am now stuck with a computer that thinks it has a German kezboard (there we go again) when it in fact has a Finnish keyboard; the letters are all hopelessly mixed up, but I feel strangely empowered. Confusingly, all websites in the UK are blocked from this hotel, so I am unable to collect my mail (private), upload images from my cell phone or do other nice things, but a Linux computer almost makes up for it all.

Bioinformatics, backups and disk disasters …

Thursday, March 5th, 2009

I guess that, as a bioinformatician and as someone who works hard to stress a computer, failures should be part of the deal and something that we can deal with. I guess that hardware failure and software failure are part of the rich cycle of life? Within the last year I have had a completely failed RAID system (thanks LaCie, there went 1.5TB of disk space and several hundred GB of data that needed to be recovered), a Lenovo laptop that no longer communicates with projectors, batteries or disks, and a failed disk on my wife’s ancient Vaio. Yesterday on the train the disk on my *new* MacBook Pro gave up the ghost; some C code was compiling (that Taxonomy project again ;-) ) and it just sort-of waited and nothing happened.

Last night I tried all possible routes of disk disaster recovery; I cannot mount the disk using Target Disk Mode on other Macs, DiskWarrior refuses to even look at it, and with some overseas travel coming up a week without a fit-for-purpose computer is looking inevitable. I know that hardware fails, so why don’t I keep backups? Sure, all of my code is kept in an SVN repository and datasets are typically mirrored across different computers, but a load of stuff like photos and iTunes lived only on the laptop.

Apple computers are pretty good, pretty smart and make life rather easy. I think that I really should get a TimeCapsule or an external disk so at least I can start routinely copying the valuable parts of my computational existence. We have the information management organisations in industry who make sure that we can’t waste our time or lose our data and establish meaningful processes. Why can’t I learn from their example?
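For what it is worth, "routinely copying the valuable parts" does not need much machinery. Below is a minimal sketch in Python, not a replacement for a proper backup tool: it assumes a plain one-way mirror is good enough, never deletes anything from the backup, and decides whether to copy by comparing file size and modification time. The function names `mirror` and `_changed` and the two-second tolerance are my own illustrative choices.

```python
import os
import shutil


def _changed(source, target):
    """Return True when source differs in size or is clearly newer than target."""
    s, t = os.stat(source), os.stat(target)
    # Assumption: allow 2 seconds of slack, since some filesystems store
    # modification times with coarse resolution.
    return s.st_size != t.st_size or (s.st_mtime - t.st_mtime) > 2


def mirror(src, dst):
    """One-way copy of src into dst; returns the relative paths that were copied.

    Files are copied when they are missing from dst, or when they differ in
    size or are newer in src. Nothing in dst is ever deleted.
    """
    copied = []
    for root, _dirs, files in os.walk(src):
        rel_root = os.path.relpath(root, src)
        target_root = os.path.normpath(os.path.join(dst, rel_root))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            source = os.path.join(root, name)
            target = os.path.join(target_root, name)
            if not os.path.exists(target) or _changed(source, target):
                shutil.copy2(source, target)  # copy2 also copies timestamps
                copied.append(os.path.normpath(os.path.join(rel_root, name)))
    return sorted(copied)
```

Calling something like `mirror("/Users/me/Pictures", "/Volumes/Backup/Pictures")` (both paths hypothetical) from a scheduled job would cover the photos that only lived on the laptop; a second run copies nothing unless files have changed.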

Now to find the time to buy a new disk, a backup disk and start the slow process of recovering what may or may not be recoverable!