OASIS Mailing List Archives

regrep-comment message

Subject: Data Grids


Data grids implement the ability to submit, query, and retrieve the contents of a registry and repository.  An example is the integrated Rule-Oriented Data System (iRODS), available as open source software at
http://irods.diceresearch.org
The iRODS software has been under development since 2006 in projects funded by the National Science Foundation and the National Archives and Records Administration.  It incorporates registry and repository management functions first implemented in the Storage Resource Broker, which was developed between 1996 and 2005.

The iRODS software is used to support data sharing environments, digital libraries, archives, and repositories.  Examples include the French National Library, the Australian Research Collaboration Service (national data grid), CyberSKA radio astronomy data, the National Optical Astronomy Observatory data grid, genomics data grids (Wellcome Trust Sanger Institute, Broad Institute), satellite data (NASA Center for Climate Simulations), Ocean Observatories Initiative sensor data, and EUDAT data replication.

Challenges in managing petabytes of internationally distributed data, spread across hundreds of millions of files, include:
- managing interactions with heterogeneous storage systems (Windows, Mac, Unix file systems, tape archives, web sites, databases)
- enforcing assertions about collection properties (policy enforcement through a distributed rule engine)
- automating administrative functions (migration, replication, integrity checking, metadata loading)
- providing efficient data transport mechanisms
- supporting the wide variety of clients requested by user communities (web browsers, web services, load libraries, I/O libraries, file system interfaces, workflows, Dropbox-style synchronization, digital libraries, portals, WebDAV, grid tools, Unix tools, etc.)
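
As a rough illustration of the policy enforcement and automation items above, here is a minimal Python sketch of a rule engine that runs rules at a named enforcement point.  It is not the iRODS rule language or API: the names RuleEngine, on, fire, put, and the event name "post_put" are all hypothetical, chosen only for this example.

```python
import hashlib
from collections import defaultdict


class RuleEngine:
    """Toy rule engine: maps an event name to the rules run at that point."""

    def __init__(self):
        self._rules = defaultdict(list)

    def on(self, event):
        """Decorator that registers a rule for the given event name."""
        def register(rule):
            self._rules[event].append(rule)
            return rule
        return register

    def fire(self, event, context):
        """Run every rule registered for the event, in registration order."""
        for rule in self._rules[event]:
            rule(context)
        return context


engine = RuleEngine()


@engine.on("post_put")
def record_checksum(ctx):
    # Integrity checking: store a SHA-256 digest as system metadata.
    ctx["metadata"]["checksum"] = hashlib.sha256(ctx["data"]).hexdigest()


@engine.on("post_put")
def schedule_replica(ctx):
    # Replication policy (assumed for illustration): add a copy on "tape".
    ctx["replicas"].append("tape")


def put(name, data, resource="disk"):
    """Hypothetical ingest call that triggers the post_put rules."""
    ctx = {"name": name, "data": data, "metadata": {}, "replicas": [resource]}
    return engine.fire("post_put", ctx)
```

Calling put("a.txt", b"hello") returns a record that carries a checksum and lists both "disk" and "tape" as replicas.  The point of the design is that the policies live in the rules rather than in the client, so every client in the list above gets the same enforcement.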

The capabilities supported by iRODS include:
- submission of files into a repository
- management of descriptive metadata, system metadata, and provenance metadata for files, users, and storage systems
- queries on metadata, browsing on files
- registration of files from remote systems, web sites, archives
- data management functions such as replication, aggregation, distribution, caching
- policy enforcement for domain-specific requirements (access controls, derived data product generation, automated metadata extraction, data processing, etc.)
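
To make the registry side of these capabilities concrete, the following is a minimal in-memory sketch in Python of registration, metadata management, replication, and metadata queries.  It is a toy model under stated assumptions, not the iRODS catalog or its client API: Entry, Catalog, and every method name are hypothetical, and no bytes are actually moved.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One catalog record: a logical name plus its replicas and metadata."""
    logical_name: str
    replicas: list = field(default_factory=list)   # physical locations
    metadata: dict = field(default_factory=dict)   # attribute -> value


class Catalog:
    """Toy in-memory registry; a real data grid keeps this in a database."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_name, location):
        """Record a file that already lives at a (possibly remote) location."""
        entry = self._entries.setdefault(logical_name, Entry(logical_name))
        if location not in entry.replicas:
            entry.replicas.append(location)
        return entry

    def annotate(self, logical_name, **attributes):
        """Attach descriptive metadata to an existing entry."""
        self._entries[logical_name].metadata.update(attributes)

    def replicate(self, logical_name, location):
        """Note an additional copy of an existing entry (transfer omitted)."""
        entry = self._entries[logical_name]  # KeyError if never registered
        if location not in entry.replicas:
            entry.replicas.append(location)
        return entry

    def query(self, **conditions):
        """Return logical names whose metadata matches every condition."""
        return sorted(
            name for name, entry in self._entries.items()
            if all(entry.metadata.get(k) == v for k, v in conditions.items())
        )


catalog = Catalog()
catalog.register("/zone/a", "disk:/a")
catalog.annotate("/zone/a", instrument="CCD", year=2012)
catalog.replicate("/zone/a", "tape:/a")
matches = catalog.query(instrument="CCD")
```

The logical name stays fixed while replicas come and go, which is what lets replication, caching, and migration happen without clients changing the names they use.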

Given a well-defined API, it is possible to port the ebXML access mechanisms on top of the iRODS data grid.  The major concern is that the ebXML protocol is a constrained subset of the operations required by the projects listed above.

Reagan Moore
DICE Center
UNC-CH

