Thursday, 29 July 2010

And relax

This time of year is always a bit quiet and it's an ideal opportunity to catch up on things and to get organised for the forthcoming conference season.

At the moment I've mainly been focusing on the organisation of the NGS Innovation Forum which will be held in November at STFC RAL.

One of the main new features of the IF will be a poster session on the Tuesday night which will enable NGS users to demonstrate how they have used NGS resources in their work. We're currently calling for abstracts and we'd like to strongly encourage all NGS users to submit a 200 word abstract for the event. There will also be a prize for the best poster as voted for by the delegates on the day.

Deadline for abstracts is the 10th of September!

I have also been busy confirming speakers, so announcements will start appearing in the fortnightly email bulletin and on the website - keep an eye out for those! First announcement tomorrow (Friday), just to keep you all in suspense...

Thursday, 22 July 2010

NGS clouds take off!

On Tuesday we announced that the NGS cloud prototypes were ready and that we were seeking users. Well I'm glad to say that we have had an excellent response and we have identified several use cases to kick off with. NGS staff are now in discussions with users to take these forward and get them up and running ASAP.

If you would like to be amongst the first to use the new NGS cloud prototypes and to receive assistance in getting started then let us know by contacting the helpdesk with details of your research. Contact details are available on the NGS website.

Tuesday, 20 July 2010

Cloudy skies

There has been a lot of news about clouds recently, and looking out of my office window, I can certainly see plenty! I thought this was supposed to be summer?

Here at the NGS we are talking about clouds in a good way with the announcement that our clouds prototype is now live and we are looking for NGS users who would be interested in using the resources.

We have two cloud infrastructure prototypes available - one based in Edinburgh and one based in Oxford. Both sites have staff on hand to help you get started using the resources. If you are interested in finding out more please see the announcement on the NGS website.

Friday, 16 July 2010

Delivering data

There now follows a Public Service announcement from The National Grid Service Department of stating the bleeding obvious.

There is very little point in using Grid software on a machine in Daresbury to run an application on a computer near Didcot if the data you need is stuck on a server in Darwin.

That statement is not going to be a surprise to anyone. After all, the Worldwide LHC Computing Grid was built to ship the flood of data from CERN to somewhere where it could be stored and then on to somewhere where it can be analysed.

When delivering data, there is definitely more than one way to do it: you could use GridFTP or SRB or iRODS or SRM, or SFTP or FTP or WebDAV or HTTP - or even, if you are feeling old-fashioned, read and write files on a local disk.

Things get more complicated when you need to collect data through one mechanism and deliver it through another. In practice, this almost inevitably means that the data is copied onto local storage before being sent to its final destination.
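To make that staging problem concrete, here is a minimal Python sketch in which the fetch and deliver steps are stand-ins for real protocol clients (the function names and URLs are invented for illustration):

```python
import os
import tempfile

def fetch(source_url, local_path):
    # Stand-in for the 'collect' step -- in real life this might be
    # urllib.request.urlretrieve(), a GridFTP client, or an SRB call.
    with open(local_path, "w") as f:
        f.write("payload from " + source_url)

def deliver(local_path, dest_url):
    # Stand-in for the 'deliver' step over a different protocol.
    with open(local_path) as f:
        return len(f.read())

def relay(source_url, dest_url):
    # The whole payload touches local disk between the two transfers --
    # workable for small files, hopeless for big data on a slow link.
    fd, staging = tempfile.mkstemp()
    os.close(fd)
    try:
        fetch(source_url, staging)
        return deliver(staging, dest_url)
    finally:
        os.unlink(staging)
```

The `relay` function is exactly the "copy onto local storage first" pattern described above - the staging file sits between two transfers that otherwise have nothing in common.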

This is not practical if there is a lot of data and you are on a comparatively slow network connection.

This is one of the problems that the DataMINX Data Transfer Service (DTS) aims to solve.

The DTS is an international collaboration jointly funded by the Australian Research Collaboration Service and OMII-UK. It isn't really NGS R+D but it is built on earlier work from the NGS and staff from the NGS have provided much of the development effort.

The idea behind DTS is that you give the job of delivering your data to the DTS in very much the same way as you would give the job of delivering a favourite Aunt's birthday present to a parcel courier service.

A courier will have a network of planes, trains, vans and delivery drivers to collect the parcel and carry it to its destination. You just have to book your collection. Auntie just needs to sign for the parcel.

Delivery in the DTS is done by pools of worker nodes with fast network connections and the wherewithal to send and receive data using the many network protocols. An internal messaging system allows requests for data transfers to be made and the status of the transfers to be reported.
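A toy version of that architecture, assuming nothing about the real DTS internals - a pool of worker threads pulling transfer requests from one queue and reporting status on another:

```python
import queue
import threading

def transfer(job):
    # Stand-in for a real protocol copy (GridFTP, SFTP, ...).
    return ("done", job["source"], job["dest"])

def worker(requests, status):
    # Each worker pulls transfer requests and reports status back --
    # mirroring a pool of worker nodes plus internal messaging.
    while True:
        job = requests.get()
        if job is None:  # sentinel: shut this worker down
            break
        status.put(transfer(job))
        requests.task_done()

requests, status = queue.Queue(), queue.Queue()
pool = [threading.Thread(target=worker, args=(requests, status))
        for _ in range(3)]
for t in pool:
    t.start()
for i in range(5):
    requests.put({"source": "srb://in/%d" % i, "dest": "gsiftp://out/%d" % i})
requests.join()            # wait until every request has been processed
for _ in pool:
    requests.put(None)     # one sentinel per worker
for t in pool:
    t.join()
results = [status.get() for _ in range(5)]
```

The two queues play the role of the messaging system: callers never talk to a worker directly, they just post a request and later read back a status.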

In software terms, the developers of DTS have deliberately avoided reinventing the wheel - something for which the Grid has a not-entirely-undeserved reputation. Where possible, they have adopted and adapted existing widely-used libraries.

There is much more to DTS than can be covered in a blog post. If you want to know more: a PowerPoint presentation describing how DTS works can be found, with the source code, on the project's web site, and a formal paper describing the work is due to be published in Philosophical Transactions of the Royal Society A in late July or early August.

[With thanks to David Meredith of the DTS project.]

Wednesday, 14 July 2010

Head in the clouds

That's certainly where my head has been for the last 2 weeks as I've been on holiday! A big thank you to Jason and Jens for keeping the blog ticking over whilst I've been away.

However the title doesn't just refer to holidays (even though that's what most people are thinking of at this time of year) but also cloud computing. When I was away JISC released 2 reports on Cloud for Research.

The first report is entitled "Using cloud computing for research" and aimed to
  • document use cases for cloud computing in research for data storage and computing;
  • develop guidance on the governance, legal and economic issues around using cloud services for storage and computing in academic research;
  • make recommendations to JISC on possible further work in the area for data storage and computing.
The second report is entitled "Technical review of cloud computing for research" and focused on the following areas -
  • the current status of cloud computing in research communities;
  • state-of-the-art cloud technologies in academic, commercial, and industrial domains;
  • technical guidance on the use, adoption and migration to cloud computing for research;
  • recommendations for future technical and standardisation work to JISC.
The reports and the resulting recommendations make for some interesting reading.

Sunday, 11 July 2010

Organising Virtual Organisations

In one version of the future of the grid, we will be awash with Virtual Organisations (VOs).

There will be VOs representing everything from whole institutions and research areas, through regional grids, right down to individual research groups.

Grid service providers will pick and choose the VOs that they are willing, able - or even paid - to support and each and every supported VO will have to be added to the system.

Adding support for a VO is not entirely simple...

Each Virtual Organisation needs a Virtual Organisation Membership Service (VOMS). Unsurprisingly, the VOMS maintains the list of who is in the VO and, through the magic of digital certificates, can act as the definitive source for this information.

The Grid service must 'know' about the VOMS server before it can support the VO. It must also be able to associate the VO with local usernames and groups. So for each VO, you must
  • Add the contact details for the VOMS server to the directory /etc/grid-security/vomses
  • Add the public key for the VOMS server to the directory /etc/grid-security/vomsdir.
  • Create accounts and groups to be associated with the VO.
    This can be complicated where the grid site is part of a network where usernames and groups are managed centrally.
  • Add entries to the LCMAPS gridmapfile and groupmapfile mapping VO membership to local usernames and groups.
    The exact location of LCMAPS configuration files depends on your local configuration - they could be within the $GLITE_LOCATION directory or within /etc/grid-security.
  • If you are providing a 'pool' - a set of accounts set aside for a particular VO - add each account in the pool to the gridmapdir.
  • Apply local tweaks - such as modifying the monitoring/osg-user-vo-map.txt used for configuration of some versions of the Virtual Data Toolkit - to reflect your local VO to account mapping.
This is the kind of task that really needs to be automated.
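As a sketch of what such automation might look like - assuming the common gLite file layout described in the list above, with the `add_vo` function, the vomses field order and the pool group name all invented for illustration:

```python
import os

def add_vo(vo, voms_host, voms_port, voms_dn, cert_pem,
           grid_security="/etc/grid-security"):
    # 1. Contact details for the VOMS server, one file per VO.
    #    The five-field line below follows the usual gLite vomses
    #    layout; check it against your own site before trusting it.
    vomses = os.path.join(grid_security, "vomses")
    os.makedirs(vomses, exist_ok=True)
    with open(os.path.join(vomses, vo), "w") as f:
        f.write('"%s" "%s" "%s" "%s" "%s"\n'
                % (vo, voms_host, voms_port, voms_dn, vo))

    # 2. The VOMS server's public certificate.
    vomsdir = os.path.join(grid_security, "vomsdir")
    os.makedirs(vomsdir, exist_ok=True)
    with open(os.path.join(vomsdir, voms_host + ".pem"), "w") as f:
        f.write(cert_pem)

    # 3. Map members of the VO to a (hypothetical) local pool group
    #    in the LCMAPS groupmapfile.
    with open(os.path.join(grid_security, "groupmapfile"), "a") as f:
        f.write('"/%s" %spool\n' % (vo, vo))
```

Account creation and the gridmapdir entries are deliberately left out - as the list above notes, those steps vary too much between sites to automate generically.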

If you use the YAIM tools to manage your site, you can add the VO details to the vo.d directory and the user accounts to the user configuration. Our colleagues at Glasgow ScotGrid use the widely-used CFEngine automation tool to prepare, configure and run YAIM when creating VOs.

The NGS provides a script called ngs-voms-configure with the VDT installer from the NGS area on NeSCForge.

ngs-voms-configure was written at a time when NGS partner sites were expected to support a common set of recognised VOs. It collects lists of VOs from a central service, locates and downloads their certificates, and (optionally) creates accounts and updates any files that need updating.

The ngs-voms-configure script needs to be modified if - for example - your site uses more sophisticated methods for creating accounts. It also has problems collecting certificates when there are strict outgoing firewall rules in place.

One of the current R+D projects is the ngs-vo-tool, which extends the automation provided by ngs-voms-configure for the brave new world of VOs everywhere.

We are planning to use 'VO Cards' - downloadable blobs of XML containing almost everything you need to know about a VO - and to allow the mappings between VO and local pool accounts to be defined in a configuration file that contains sections like...

vocard =
local_user = ngs0001-1000
local_group = ngspool
The script is being written in Python, developed at Leeds, and - as of a few minutes ago - the local repository is being mirrored to the NGS code repository at NeSCForge in a module called ngs-vo-tool.
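A speculative sketch of how such a configuration might be consumed - the section name, the vocard URL and the 'prefix plus zero-padded range' pool format are all assumptions based on the fragment above, not the tool's documented behaviour:

```python
import configparser

def expand_pool(spec):
    # Expand a pool spec like 'ngs0001-1000' into individual account
    # names. The prefix + zero-padded range form is an assumption
    # based on the example above, not a documented format.
    prefix = spec.rstrip("0123456789-")
    first, last = spec[len(prefix):].split("-")
    width = len(first)
    return ["%s%0*d" % (prefix, width, n)
            for n in range(int(first), int(last) + 1)]

# The section name and vocard URL here are made up for illustration.
cfg = configparser.ConfigParser()
cfg.read_string("""
[example.vo.ngs.ac.uk]
vocard = https://example.org/vocards/example.vo.xml
local_user = ngs0001-1000
local_group = ngspool
""")
accounts = expand_pool(cfg["example.vo.ngs.ac.uk"]["local_user"])
```

With the pool expanded into individual account names, each one can then be dropped into the gridmapdir in the usual way.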

It is not yet complete. When it is, we hope it will be of use in really organising virtual organisations.

Friday, 2 July 2010

Joined up thinking in the NGS

Back in June, in the posting on the pitfalls of licensing on a Grid, I said that we were working on a new way of supporting those users who had licenses for applications such as Amber, Castep, DL_POLY, GAMESS (US) and PC-GAMESS/Firefly.

It demonstrates how the various services that the NGS and its partner sites provide can be linked together to make using the Grid that little bit simpler.

Those who really don't have the time or inclination to wade through past postings only need to know:
  • that these applications are made available to existing license holders by various NGS partner sites;
  • that the sites all have some kind of access control in place to enforce this;
  • that we use groups within a virtual organisation (VO) as a way of recording who has signed up for what - so the access control lists can be kept up to date.
If you have a certificate in your browser, you can visit the VO web interface to see what we think you can do.

Membership information within the VO has so far been managed by pointing and clicking and fiddling in the VO web interface. This is slow and error prone - especially for those of us who still prefer to communicate with computers by typing.

Now - thanks to the efforts of the NGS staff at Manchester - we can update the groups automatically from tags associated with entries in our User Account Service - the definitive database of who is, and was, an NGS account holder.

For example, all NGS users who we know have signed the academic license for Castep will have the tag application-castep assigned to them. When we turn automatic updating on - hopefully on Monday 5 July - any user with this tag will be granted membership of the Castep group.
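The tag-to-group step could be sketched like this - the `groups_for` helper and the mapping values are hypothetical, not the NGS's actual configuration:

```python
def groups_for(tags, tag_to_group):
    # Return the VO groups a user should belong to, given the tags on
    # their User Account Service entry. Unmapped tags are ignored.
    return sorted(tag_to_group[t] for t in tags if t in tag_to_group)

# Illustrative mapping only -- the real group names live in the
# NGS VO configuration.
TAG_TO_GROUP = {
    "application-castep": "castep",
    "application-amber": "amber",
}
```

Run periodically against the account database, a function like this keeps the VO groups - and hence the access control lists - in step with who has actually signed which license.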

If you are an NGS user with a license for one of these applications please let the helpdesk know.