Subject: Re: [regrep] Re: [regrep-comment] Data Grids
We have received a public comment from Dr. Reagan Moore, Director of the DICE Center at University of North Carolina, Chapel Hill.
I have sent him a brief message acknowledging receipt of his comment and letting him know that we will send a formal response on the comment list after due consideration.
I would like to propose that we interactively discuss this public comment (see below) at our next telecon and formulate a response.
My initial thoughts are summarized as follows:
- It would be very desirable to have a standard RegRep ebRS interface on iRODS and to use ebRIM as the format for iRODS metadata. Both the ebRS protocols and the ebRIM model are highly extensible, so this could be done in an extension profile of RegRep 4.
- Some of the challenges posed by the public comment fall within the purview of implementation rather than specification. For example, handling of diverse content and environments may be handled by Repository extension plugins. Since the repository interface is not exposed in RegRep 4, it becomes an implementation issue rather than a specification issue. One may ask whether RegRep should have a public Repository interface as part of a future specification version.
- Policy enforcement is something that the current spec partially addresses via XACML policies and the BPMN-based governance policies. I am quite interested in exploring what, if anything, the standard should consider including to support rule-based policy enforcement.
- It would be good to invite Dr. Moore to attend a TC meeting as an invited expert to discuss the iRODS requirements and any gaps with the RegRep specification.
Please share your thoughts on this email thread until we meet to discuss this interactively in a telecon. Thank you.
On 05/18/2012 11:01 AM, Reagan Moore wrote:
Data grids implement the ability to submit, query and retrieve the contents of a registry and repository. An example is the integrated Rule Oriented Data System, iRODS, available as open source software at
The iRODS software has been under development since 2006 in projects funded by the National Science Foundation and the National Archives and Records Administration. It incorporates registry and repository management functions that were first implemented in the Storage Resource Broker that was developed between 1996 and 2005.
The iRODS software is used to support data sharing environments, digital libraries, archives, and repositories. Examples include the French National Library, the Australian Research Collaboration Service (national data grid), CyberSKA radio astronomy data, the National Optical Astronomy Observatory data grid, genomics data grids (Wellcome Trust Sanger Institute, Broad Institute), satellite data (NASA Center for Climate Simulations), Ocean Observatories Initiative sensor data, and EUDAT data replication.
Some of the challenges that are faced when managing petabytes of internationally distributed data containing hundreds of millions of files include:
- managing interactions with heterogeneous storage systems (Windows, Mac, Unix file systems, tape archives, web sites, databases)
- enforcing assertions about collection properties (policy enforcement through a distributed rule engine)
- automating administrative functions (migration, replication, integrity checking, metadata loading)
- providing efficient data transport mechanisms
- supporting the wide variety of clients requested by user communities (web browsers, web services, load libraries, I/O libraries, file system interfaces, workflows, dropbox style synchronization, digital libraries, portals, webDav, grid tools, Unix tools, etc.)
The capabilities supported by iRODS include:
- submission of files into a repository
- management of descriptive metadata, system metadata, provenance metadata for files, users, storage systems
- queries on metadata, browsing on files
- registration of files from remote systems, web sites, archives
- data management functions such as replication, aggregation, distribution, caching
- policy enforcement for domain specific requirements (access controls, derived data product generation, automated metadata extraction, data processing, etc.)
Given a well-defined API, it is possible to port the ebXML access mechanisms on top of the iRODS data grid. The major concern is that the ebXML protocol is a constrained subset of the operations required by the projects listed above.
Reagan Moore
DICE Center
UNC-CH
--
Regards,
Farrukh Najmi
Web: http://www.wellfleetsoftware.com