Friday 26 August 2011

Sign Here

I would not want to describe the paperwork that goes with University life as something out of Terry Gilliam's Brazil or Yes Minister.

It would be ill-advised: I haven't completed this month's NGS/B/11347/2(a) (permission to use ironic over-exaggeration within a blog) and submitted it to the appropriate authorities.

Depending on how deeply your institution loves its paperwork, there will be forms to be completed when claiming travel costs, or buying a new HPC system, or obtaining a replacement biro. Inevitably, somebody else needs to sign these forms to show that there has been due diligence and that the trip to Didcot, million-pound compute cluster or cheap plastic pen is fully justified.

Somebody else isn't just anybody. When you present your form to the powers-that-be, the powers-that-be will carefully compare the signature with their collection of scribbles from the great-and-the-good.

Only when you have the right name in the right place on the right form will you receive a new pen and a firm lecture about being more careful in future.

As we have said on a number of occasions, grid security is built on chains of trust. It also relies on the right signature being used in the right place. In our case, these are digital signatures represented by X.509 certificates rather than the spiders-web-on-acid scrawl of a senior University manager.

A certificate in your local list of trusted certificates - typically in /etc/grid-security/certificates - can be accompanied by a file defining its signing-policy. You can see some examples of signing policy files in the UK eScience Certification Authority pages on the website.

The signing policy is particularly influential at the very far end of the chain of trust: the root certificates. The private keys associated with root certificates are kept in a Very Safe Place and are taken out only to sign the certificates of Certification Authorities (CAs).

CAs sign the certificates for the rest of us. The UK has two CAs - the main eScience CA and a SARoNGS CA.

Over the last weeks, thanks to the efforts of the dragon-slayers at the Software Sustainability Institute, we finally found out why certificates from our 'SARoNGS' CA were being rejected by the NGS's Workload Management Service.

There was nothing wrong with the certificates themselves.

The SSI developers quickly identified problems with the SARoNGS Certificate Revocation List (CRL) - a list of known-bad certificates that CAs should distribute.

SARoNGS certificates are designed to be short-lived - they expire before anyone gets a chance to do something bad with them - and the revocation list is empty. But all revocation lists - even empty ones - have expiry dates and ours had, unfortunately, gone stale.
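A staleness check of this kind can be sketched in a few lines of Python. This is only an illustration, not the check the WMS performs: it assumes the nextUpdate line has already been obtained (for example, from `openssl crl -noout -nextupdate`), and the function name and sample dates are invented for the example.

```python
from datetime import datetime, timezone

def crl_is_stale(next_update_line, now):
    """Return True if the CRL's nextUpdate time has already passed.

    next_update_line is assumed to look like the line printed by
    `openssl crl -noout -nextupdate`, for example
    'nextUpdate=Aug 20 12:00:00 2011 GMT'.
    """
    # Keep only the timestamp after the '=' sign.
    _, _, stamp = next_update_line.strip().partition("=")
    expires = datetime.strptime(stamp, "%b %d %H:%M:%S %Y GMT")
    expires = expires.replace(tzinfo=timezone.utc)
    return now >= expires

# Hypothetical dates, purely for illustration.
now = datetime(2011, 8, 26, 9, 0, tzinfo=timezone.utc)
print(crl_is_stale("nextUpdate=Aug 20 12:00:00 2011 GMT", now))
print(crl_is_stale("nextUpdate=Sep 25 12:00:00 2011 GMT", now))
```

Nothing in the check looks at how many certificates the list contains, so an empty list goes stale just like a full one.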

Updating the CRL was comparatively easy but it did not solve the problem. The root cause turned out to be the root certificate's signing policy.

The problem is that there are two signing policies - depending on whether you consider SARoNGS certificates acceptable.

SARoNGS certificates can be obtained using only a UK academic username and password whereas a full eScience certificate requires photo ID and a visit to your local Registration Authority.

The International Grid Trust Federation (IGTF) is responsible for ensuring that certificates are being created and managed in a trustworthy way. It has strict rules on what constitutes sufficient proof of a user's identity and - not to put too fine a point on it - an academic username and password are simply not good enough.

So the signing policy within the IGTF's bundle of UK eScience certificate information does not currently match the version we distribute. The IGTF version will not permit the eScience root to sign for the SARoNGS CA.

The root cause was a misplaced update that installed the IGTF version of the eScience root signing-policy - rather than the NGS's own.
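To see how two versions of a signing policy can disagree, here is a rough Python sketch. It assumes the usual three-line layout of a Globus-style signing_policy file (access_id_CA, pos_rights, cond_subjects) and treats the subject patterns as simple `*` wildcards. All of the DNs are made up for the example and are not the real eScience or SARoNGS names, and the matching in real grid middleware may well differ in detail.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical policies: the "strict" one allows a single namespace,
# the "local" one additionally allows a short-lived-certificate CA.
STRICT_POLICY = """\
access_id_CA X509 '/C=XX/O=ExampleGrid/CN=Example Root'
pos_rights globus CA:sign
cond_subjects globus '"/C=XX/O=ExampleGrid/OU=Authority/*"'
"""

LOCAL_POLICY = """\
access_id_CA X509 '/C=XX/O=ExampleGrid/CN=Example Root'
pos_rights globus CA:sign
cond_subjects globus '"/C=XX/O=ExampleGrid/OU=Authority/*" "/C=XX/O=ExampleGrid/OU=ShortLived/*"'
"""

def allowed_subjects(policy_text):
    """Collect the subject patterns from every cond_subjects line."""
    patterns = []
    for line in policy_text.splitlines():
        line = line.strip()
        if not line.startswith("cond_subjects"):
            continue
        # Patterns are normally double-quoted inside one single-quoted list.
        quoted = re.findall(r'"([^"]*)"', line)
        if not quoted:
            quoted = re.findall(r"'([^']*)'", line)
        patterns.extend(quoted)
    return patterns

def may_sign(policy_text, subject_dn):
    """True if the subject DN matches any pattern the policy permits."""
    return any(fnmatchcase(subject_dn, p) for p in allowed_subjects(policy_text))

subject = "/C=XX/O=ExampleGrid/OU=ShortLived/CN=Example Short-Lived CA"
print(may_sign(STRICT_POLICY, subject))
print(may_sign(LOCAL_POLICY, subject))
```

The certificate being checked is identical in both cases. Only the policy file sitting next to the root certificate changes the answer.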

We should have had the 'IGTF+' certificates - a modified version of the IGTF's certificate collection maintained by the NGS blogs' very own Jens Jensen, and incorporating the NGS's signing policy and some additional certificates.

The IGTF+ certificates are available in a number of formats from Jens's avowedly Web-1.0 certificate repository webpage.

[With thanks to James Perry, Steve Crouch and Rob Baxter of the Software Sustainability Institute]

Monday 22 August 2011

What do you do on the NGS?

It's been a busy month or so even though it's the holidays as this is the ideal time for me to contact many of the NGS users who promised to write user case studies for me regarding their research.

We have a wide range of researchers who use the NGS as part of their everyday research and it's important for the NGS to highlight that our resources aren't just used by the "typical suspects" such as physicists.

Over the last few weeks I've added another 2 user case studies to the NGS website -
Both these case studies demonstrate how the use of NGS resources is helping to speed up research enabling results to be produced and published faster than previously.

Edward's supervisor, Dr Anna Croft, praised the NGS: "The NGS has been an excellent resource for many of our research projects. In particular, I have been able to use it with undergraduate researchers and give them a taste of what it is like to work on large computing infrastructures - an experience that has helped some of them secure PhD funding, both here and overseas, to continue in the computational area. When we had teething problems, the support staff were always friendly, helpful and got things working. Because of this support and the flexibility in requesting computing time, the NGS is one of our first ports of call for projects requiring a larger computing resource."

Thank you Anna!

Thursday 18 August 2011

A good-enough impression

Leeds - as a long-standing NGS partner site - want to hook our HPC service into the Grid.

We hope to fill the gap left when the last of our NGS-funded clusters was turned off back in April. Our main requirement was that the grid front end should be completely separate from the HPC service. In addition, we wanted...
We had originally hoped to follow the particle physicists and deploy CREAM.
Unfortunately, EMI-1 was missing the components needed to make CREAM work with the SGE batch system used locally. The only software within EMI-1 that was SGE-friendly was Nordugrid's Advanced Resource Connector - ARC.

After a few months of work and in the great tradition of the grid: it is sort-of-kind-of-working-after-a-fashion. At the moment:
  • ARC's compute service - A-REX - is accepting jobs: for a very limited set of users and not from the workload management system.
  • ARC's information provider - ARIS - is publishing information about the system and this information is making its way to the NGS's BDII.
I will cover A-REX in a future post. This week, you are getting information about the information provider - and in particular, how it links into the NGS.

A bit of background. The NGS information service is a Berkeley Database Information Index service or BDII. BDIIs are built to collate information, some of which comes from other BDIIs. The NGS's central BDII, for example, collates information published by a BDII, or something that looks like a BDII, at each of the partner sites.

ARIS can do an impression of a BDII. Whether it is a convincing impression depends on what it is talking to.

ARIS produces information in its own Nordic-accented schema, designed to feed the ARC tools. This needs to be translated into GLUE format before a BDII will give it a second glance.

Based on documentation on linking ARC and EGI from Nordugrid, this can all be done via a single ARC configuration file called /etc/arc.conf. arc.conf consists of blocks, denoted by a [name in square brackets], each containing a set of name=value definitions.

arc.conf needs to be tweaked in three places.

Turn on publishing of Glue 1.2 format information - which is close enough to the current common Glue version 1.3 - by adding to the '[infosys]' block:


 [infosys]
  ...
 infosys_compat=disable
 infosys_nordugrid=enable
 infosys_glue12=enable

Add in anything that Glue needs and ARIS does not via the '[infosys/glue12]' block:

 [infosys/glue12]
 glue_site_unique_id="NGS-LEEDS"
 ...
 provide_glue_site_info=true

And finally, arrange for ARIS to collect its own output and present it as if it were a site BDII, using a block such as:


 [infosys/site/NGS-LEEDS]
 unique_id=NGS-LEEDS
 url=ldap://ngs.arc1.leeds.ac.uk:2135/mds-vo-name=resource,o=grid
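As a sanity check on the three tweaks above, here is a small Python sketch that reads a block-structured file of this shape and reports any expected setting that is missing. It is not ARC's own parser. The handling of '#' comments and of quoted values is an assumption, the function names are invented, and the sample text simply repeats the settings shown in this post without the elided lines.

```python
SAMPLE = """\
[infosys]
infosys_compat=disable
infosys_nordugrid=enable
infosys_glue12=enable

[infosys/glue12]
glue_site_unique_id="NGS-LEEDS"
provide_glue_site_info=true

[infosys/site/NGS-LEEDS]
unique_id=NGS-LEEDS
url=ldap://ngs.arc1.leeds.ac.uk:2135/mds-vo-name=resource,o=grid
"""

# The settings this post says are needed, keyed by block name.
REQUIRED = {
    "infosys": {"infosys_glue12": "enable"},
    "infosys/glue12": {"provide_glue_site_info": "true"},
    "infosys/site/NGS-LEEDS": {"unique_id": "NGS-LEEDS"},
}

def parse_blocks(text):
    """Parse '[block]' headers followed by name=value lines into dicts."""
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or (assumed) comment
        if line.startswith("[") and line.endswith("]"):
            current = blocks.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            # Split on the first '=' only: values such as LDAP URLs contain more.
            name, _, value = line.partition("=")
            current[name.strip()] = value.strip().strip('"')
    return blocks

def missing_settings(blocks, required=REQUIRED):
    """List (block, name) pairs whose value is absent or different."""
    return [
        (block, name)
        for block, settings in required.items()
        for name, value in settings.items()
        if blocks.get(block, {}).get(name) != value
    ]

print(missing_settings(parse_blocks(SAMPLE)))
```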

Our initial experiments suggest that the information produced by ARIS is good enough to be accepted by the NGS's central BDII but not good enough to fool our Nagios monitoring.

WLCG Nagios includes a number of BDII-specific tests including one called org.bdii.Entries. org.bdii.Entries only looks for 'services' - or more accurately objects of the 'GlueService' type. While ARIS generates a lot of information, none of it describes a GlueService.
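The flavour of that check can be imitated offline. The Python sketch below is not the org.bdii.Entries probe, which talks to the LDAP server directly. It only counts entries of a given objectClass in LDIF text that has already been dumped to a string. The sample entries and host names are invented, and it assumes unwrapped LDIF lines with entries separated by one blank line.

```python
# Hypothetical LDIF dump: a site and a cluster, but no service.
SAMPLE_LDIF = """\
dn: GlueSiteUniqueID=EXAMPLE-SITE,mds-vo-name=resource,o=grid
objectClass: GlueTop
objectClass: GlueSite
GlueSiteUniqueID: EXAMPLE-SITE

dn: GlueClusterUniqueID=ce.example.org,mds-vo-name=resource,o=grid
objectClass: GlueTop
objectClass: GlueCluster
GlueClusterUniqueID: ce.example.org
"""

def count_object_class(ldif_text, object_class):
    """Count the LDIF entries that carry the given objectClass."""
    wanted = object_class.lower()
    count = 0
    for entry in ldif_text.strip().split("\n\n"):
        classes = [
            line.partition(":")[2].strip().lower()
            for line in entry.splitlines()
            if line.lower().startswith("objectclass:")
        ]
        if wanted in classes:
            count += 1
    return count

for name in ("GlueSite", "GlueCluster", "GlueService"):
    print(name, count_object_class(SAMPLE_LDIF, name))
```

A dump like this one is full of perfectly good Glue objects and would still fail a test that only counts GlueService entries.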

What we don't yet know is whether Nagios is being picky, or whether the existence of a GlueService is vital for some bit of grid wizardry.

Wednesday 10 August 2011

E's no good - Distinguishing between distinguished names


For most people, changes to the policies and standards that describe how the grid should work are met with a resounding 'so what'. For anyone involved in the day-to-day management of grid systems, it is an opportunity to join a collective sigh of relief.

It is another example of where the 'political' aspects of international research collide with the technical solutions and the needs of researchers who don't give a damn how it works, as long as it lets them do their jobs.

X.509 certificates are complicated because what they represent is complicated - a link in a chain of trust between particular individuals or institutions.

Identities within certificates are tied to Distinguished Names or DNs. A DN is a list of attributes - such as country, institution and personal name - that uniquely identifies a single person, or computer, or service.

The way a DN is stored within a certificate is well-defined but completely incomprehensible to anything that is not a computer program. For many practical purposes, the DN needs to be presented so it can be understood by a person.

A glance at the OpenSSL X509_NAME_print_ex documentation shows how brain-twistingly complicated it can be translating a DN into something that a human being can read.

There is a more detailed explanation on the NGS Wiki. This is the quick tour.

Each individual attribute within a DN has a 'type' and a 'value'.

The type identifies what is being represented - a name, or an email address. It isn't really a name but a unique sequence of numbers called an Object Identifier. Something like: 1,2,840,113549,1,9,1.

People, inexplicably, find sequences like 1,2,840,113549,1,9,1 hard to remember so for our benefit, 1,2,840,113549,1,9,1 is also known as "Email", "emailAddress" and - occasionally - "E".

The value depends on the type. For 1,2,840,113549,1,9,1 - it is a string of letters represented in what is known as UTF-8. UTF-8 was developed to represent any letter from any language - but most Grid certification authorities deliberately restrict themselves to the 26 letters of the English alphabet, the numbers 0 to 9 and a few symbols. If they didn't, things would rapidly become even more complicated.

In human-friendly form, the DNs that Jens is working to abolish look very much like

 /C=UK/O=eScience/OU=Manchester/L=MC/CN=voms.ngs.ac.uk/Email=support@grid-support.ac.uk
or maybe
 /C=UK/O=eScience/OU=Manchester/L=MC/CN=voms.ngs.ac.uk/emailAddress=support@grid-support.ac.uk
or even, very rarely
 /C=UK/O=eScience/OU=Manchester/L=MC/CN=voms.ngs.ac.uk/E=support@grid-support.ac.uk
Which variant you get depends on which version of which software is processing the certificate.

The problems appear when DNs are compared as strings of letters rather than in what could be called their 'raw' form.

Most software is smart enough to canonicalise these awkward examples by choosing One True Name for 1,2,840,113549,1,9,1 and substituting this before comparing. Not all software packages agree on which name is the One True Name.
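A minimal sketch of that canonicalisation, in Python, might look like the following. The choice of "emailAddress" as the One True Name is arbitrary, the function names are invented for the example, and splitting on '/' is a simplification that would break on any DN whose values contain a slash.

```python
# All three are human-friendly names for the same attribute type
# (OID 1.2.840.113549.1.9.1 in the usual dotted notation).
EMAIL_ALIASES = {"Email", "emailAddress", "E"}
CANONICAL_EMAIL = "emailAddress"  # arbitrary choice for this sketch

def canonical_dn(dn):
    """Rewrite a slash-separated DN so the email attribute has one name."""
    parts = []
    for component in dn.strip("/").split("/"):
        attr_type, _, value = component.partition("=")
        if attr_type in EMAIL_ALIASES:
            attr_type = CANONICAL_EMAIL
        parts.append(f"{attr_type}={value}")
    return "/" + "/".join(parts)

def same_dn(a, b):
    """Compare two DNs after canonicalising both."""
    return canonical_dn(a) == canonical_dn(b)

base = "/C=UK/O=eScience/OU=Manchester/L=MC/CN=voms.ngs.ac.uk"
variants = [
    base + "/Email=support@grid-support.ac.uk",
    base + "/emailAddress=support@grid-support.ac.uk",
    base + "/E=support@grid-support.ac.uk",
]

print(variants[0] == variants[2])         # naive string comparison
print(same_dn(variants[0], variants[2]))  # comparison after canonicalising
```

Two programs that each do this correctly can still disagree with one another if they settle on different canonical names and then exchange the resulting strings.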

It is now common practice to represent certificate chains in .LSC format - files which are simply lists of human-friendly DNs. These are simple to distribute and do not need to be updated every time the certificate is renewed.

That would be good enough - if it wasn't for that troublesome email address.

Monday 8 August 2011

On email addresses in host certificates

Every so often we get questions about email addresses in the names (distinguished names, ie DNs) of host certificates. The problem is that they are deprecated (see the last two paragraphs of section 4.1.2.6 of RFC5280), and they cause all sorts of problems with software which stringifies the DNs because there is no consistent way of doing it (or rather, there are too many consistent ways.) Arguably the software is not coded correctly, but in this case it'd be better to remove the email.

The email is there for historical reasons: when we rekey a certificate we have to give it the same name as before, so that's why it is still there. Dating back ten years or so, the original raison d'être was that before robot certificates, hosts would sometimes run stuff on behalf of users, ie. act as a client, and the email address was meant to give you something to contact when you read the DN in the log file.

The new policy will permit removing the email address from DNs. That's the easy bit.

The trick is to get the software to optionally (at the owner's request) remove the email address from the DN (because some people may genuinely want to keep it, for whatever reason.) Or rather, optionally keep it. The software cannot do this yet.

In fact, it'd be easier to just remove it for all host certificates, or maybe to handle those "manually" who still want to keep it, as with robots for example. If anyone out there has host certificates and depends on email being present in the DN, could you let us know via the usual channels, please? There are no known problems with removing the email address, only with keeping it, but there may of course be unknown problems - there are lots of weird and wonderful things out there.

As for timescale, it'll be ready at the latest when the new (rollover) CA certificates go live at the end of September.

Thursday 4 August 2011

E-infrastructure summer school - registration open now!

A brief hiatus here on the NGS blog as several of us are / have been on holiday. Back to normal service now hopefully!

A lot of my time since I returned from holiday has been devoted not to the NGS but to another project I am currently working on. Catchily named SeIUCCR (pronounced "sucker"), the project was funded by the EPSRC "Crossing the Chasm" call which called for networks and "advocates" to promote the wider uptake of UK e-infrastructures by researchers in engineering and the physical sciences.

Part of the SeIUCCR project is an e-infrastructure summer school which is due to take place in Abingdon near Oxford in September. The residential summer school will offer an introduction to e-infrastructure including Clouds and Grids to UK PhD students and postdocs over 4 days. The summer school is fully funded including travel expenses and applications are open now.

If you (or anyone you know) would like to apply then be quick as applications will close at 9am on Monday 15th of August. Details of the summer school can be found on the SeIUCCR website and a detailed agenda is available from the registration site.