Wednesday, 6 July 2011

ARC quirks and keeping track

Another week, another apology for having very little new to say.

Our excuse is that we have all been busy preparing for a meeting of NGS Collaborators (today, 6 July), a workshop on Moonshot, Grid and High Performance Computing (on Thursday) and for a Town Meeting on the future of e-Science and HPC Infrastructures and Applications in the UK (Friday).

On the plus side, when we have all recovered, there should be plenty to write about.

There has been a small amount of time available to work on Leeds' ARC grid software deployment - concentrating on the dull-but-useful task of tracking down and reporting minor bugs.

One such quirk appears when logfiles are rotated - that is, renamed and compressed at regular intervals to conserve disk space. ARC continues to write to the original - now renamed - file rather than to a new one. We found the bug and reported it, only to discover that the developers were already aware of it and that it will be fixed in the next release.
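
For anyone wondering why a program would keep writing to a renamed file, here is a minimal sketch of the general behaviour on a POSIX system. It is not ARC's code, and the file names are made up for illustration. An open file handle follows the file itself, not its name, so a rename does not redirect the writes. The usual remedy is for the program to close and reopen its log by path after rotation.

```python
import os
import tempfile


def demonstrate_rotation(directory):
    """Show where writes land before and after a log file is renamed."""
    # Hypothetical file names, chosen only for this illustration.
    log_path = os.path.join(directory, "service.log")
    rotated_path = os.path.join(directory, "service.log.1")

    log = open(log_path, "a")
    log.write("before rotation\n")
    log.flush()

    # Rotation step: the file is renamed while the writer still has it open.
    os.rename(log_path, rotated_path)

    # The open handle still refers to the same file, now under its new name.
    log.write("after rotation, old handle\n")
    log.flush()
    new_file_exists = os.path.exists(log_path)

    # Remedy: close the old handle and reopen by path, creating a fresh file.
    log.close()
    log = open(log_path, "a")
    log.write("after reopen\n")
    log.close()

    with open(rotated_path) as f:
        rotated_lines = f.read().splitlines()
    with open(log_path) as f:
        current_lines = f.read().splitlines()
    return new_file_exists, rotated_lines, current_lines


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(demonstrate_rotation(scratch))
```

Run on Linux, the second write ends up in the renamed file and no new logfile appears until the program reopens it.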

Which gives me a chance to opine...

In the 4-years-or-so that I have been involved with The Grid, I have needed to contact software developers all over the world.

With very few exceptions, the developers have been capable, helpful and responsive.
(And I am not going to identify those very few exceptions.)

The open nature of much of the development work, with bug databases and source code repositories readable by anyone, gives bug-hunters from outside the development team as much information as those inside. If used wisely, this information means better bug reports and faster fixes.

Unfortunately, bug-hunting is becoming harder. It is a side effect of the European Grid Infrastructure and European Middleware Initiative - and their remit to co-ordinate development activity from many disparate teams.

The individual development teams have their own development processes and tools.
On top of this is the all-seeing EMI GGUS tracking system that sends bug reports to the individual teams and the EMI wiki.

There are different interfaces for different tools: sometimes you need a certificate, sometimes you need an account. As someone slightly outside - but with an interest in - the development of grid software, I know how hard it can be to check whether a bug has been reported, whether it has been fixed and when the fix will be available.

There is clearly work being done to improve the situation and no-one would claim that distributed, international software development is easy to do. What the grid really does not want to do is weaken whatever connections there are with the system administrators who deploy the software and the external developers who use it.

2 comments:

Brian Bockelman said...

Hi Jason,

Just a comment - the CVS site you linked to is in read-only mode, as all the gLite projects appear to have been migrated to CERN SVN.

Unfortunately, it makes it hard (for me) to find the various components. For example, the lcg-utils software is here:

https://svnweb.cern.ch/trac/lcgutil

but you have to go elsewhere to find FTS. It's a brave new world!

Brian

Jason Lander said...

Thank you. I've updated the post to point to the CERN TRAC.