Thursday, 18 August 2011

A good-enough impression

Leeds - as a long-standing NGS partner site - want to hook our HPC service into the Grid.

We hope to fill the gap left when the last of our NGS-funded clusters was turned off back in April. Our main requirement was that the grid front end should be completely separate from the HPC service. In addition, we wanted...
We had originally hoped to follow the particle physicists and deploy CREAM.
Unfortunately, EMI-1 was missing the components needed to make CREAM work with the SGE batch system used locally. The only software within EMI-1 that was SGE-friendly was Nordugrid's Advanced Resource Connector - ARC.

After a few months of work, and in the great tradition of the grid, it is sort-of-kind-of-working-after-a-fashion. At the moment:
  • ARC's compute service - A-REX - is accepting jobs, though only from a very limited set of users and not from the workload management system.
  • ARC's information provider - ARIS - is publishing information about the system and this information is making its way to the NGS's BDII.
I will cover A-REX in a future post. This week, you are getting information about the information provider - and in particular, how it links into the NGS.

A bit of background. The NGS information service is a Berkeley Database Information Index service or BDII. BDIIs are built to collate information, some of which comes from other BDIIs. The NGS's central BDII, for example, collates information published by a BDII, or something that looks like a BDII, at each of the partner sites.

ARIS can do an impression of a BDII. Whether it is a convincing impression depends on what it is talking to.

ARIS produces information in its own Nordic-accented schema, designed to feed the ARC tools. This needs to be translated into GLUE format before a BDII will give it a second glance.

Based on documentation from Nordugrid on linking ARC and EGI, this can all be done via a single ARC configuration file called /etc/arc.conf. arc.conf consists of blocks, each introduced by a [name in square brackets] and containing a set of name=value definitions.
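To make the block structure concrete, here is a minimal Python sketch that reads arc.conf-style text into nested dictionaries. It is only an illustration, not how ARC itself reads the file: the function name parse_arc_conf is made up for this post, and the handling of '#' comment lines and quoted values is an assumption about the format.

```python
def parse_arc_conf(text):
    """Parse arc.conf-style text into {block_name: {name: value}}.

    Illustrative only: assumes '#' starts a comment line and that
    surrounding double quotes on a value can be dropped.
    """
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            # A new block: everything up to the next [name] belongs to it.
            current = blocks.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            # Split on the first '=' only, so values may contain '='.
            name, value = line.split("=", 1)
            current[name.strip()] = value.strip().strip('"')
    return blocks


SAMPLE = """
[infosys]
infosys_glue12=enable

[infosys/glue12]
glue_site_unique_id="NGS-LEEDS"

[infosys/site/NGS-LEEDS]
url=ldap://ngs.arc1.leeds.ac.uk:2135/mds-vo-name=resource,o=grid
"""

conf = parse_arc_conf(SAMPLE)
print(sorted(conf))
print(conf["infosys/glue12"]["glue_site_unique_id"])
```

Splitting on the first '=' matters here because LDAP URLs, such as the one in the site block, contain '=' characters of their own.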

arc.conf needs to be tweaked in three places.

Turn on publishing of Glue 1.2 format information - which is close enough to the current common Glue version 1.3 - by adding the following to the '[infosys]' block:


 [infosys]
  ...
 infosys_compat=disable
 infosys_nordugrid=enable
 infosys_glue12=enable

Add in anything that Glue needs and ARIS does not via the '[infosys/glue12]' block:

 [infosys/glue12]
 glue_site_unique_id="NGS-LEEDS"
 ...
 provide_glue_site_info=true

And finally, arrange for ARIS to collect its own output and present it as if it were a site BDII, by adding a block called '[infosys/site/NGS-LEEDS]':


 [infosys/site/NGS-LEEDS]
 unique_id=NGS-LEEDS
 url=ldap://ngs.arc1.leeds.ac.uk:2135/mds-vo-name=resource,o=grid

Our initial experiments suggest that the information produced by ARIS is good enough to be accepted by the NGS's central BDII, but not good enough to fool our Nagios monitoring.

WLCG Nagios includes a number of BDII-specific tests, including one called org.bdii.Entries. org.bdii.Entries only looks for 'services' - or more accurately, objects of the 'GlueService' type. While ARIS generates a lot of information, none of it describes a GlueService.
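The kind of check described above can be approximated by hand. The sketch below is not the org.bdii.Entries probe, whose implementation we have not looked at; it simply counts entries with an objectClass of GlueService in LDIF text, such as the output one might save from an ldapsearch against the ARIS endpoint. The function names and the sample entries are invented for illustration, and LDIF line folding and base64 values are ignored.

```python
def object_classes(ldif_text):
    """Return one set of objectClass values per entry in the LDIF text."""
    entries = []
    current = set()
    in_entry = False
    for line in ldif_text.splitlines():
        if not line.strip():
            # A blank line ends the current entry, if there is one.
            if in_entry:
                entries.append(current)
            current = set()
            in_entry = False
        elif line.startswith("dn:"):
            in_entry = True
        elif line.lower().startswith("objectclass:"):
            current.add(line.split(":", 1)[1].strip())
    if in_entry:
        entries.append(current)
    return entries


def count_glue_services(ldif_text):
    """Count the entries that carry the GlueService object class."""
    return sum(1 for classes in object_classes(ldif_text)
               if "GlueService" in classes)


# Hypothetical sample data, not real output from our server.
NO_SERVICE = """
dn: GlueSiteUniqueID=NGS-LEEDS,mds-vo-name=resource,o=grid
objectClass: GlueSite
GlueSiteUniqueID: NGS-LEEDS

dn: GlueClusterUniqueID=example-cluster,mds-vo-name=resource,o=grid
objectClass: GlueCluster
"""

WITH_SERVICE = NO_SERVICE + """
dn: GlueServiceUniqueID=example-service,mds-vo-name=resource,o=grid
objectClass: GlueService
"""

print(count_glue_services(NO_SERVICE))
print(count_glue_services(WITH_SERVICE))
```

A count of zero for the first sample corresponds to the situation we are in: plenty of Glue objects, but no GlueService among them.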

What we don't yet know is whether Nagios is being picky, or whether the existence of a GlueService is vital for some bit of grid wizardry.
