OASIS Mailing List Archives

 




Subject: Re: [office] metadata in new ECMA OXML spec


Bruce,

 From below:

> 1) like ODF, they use DC

OK.

> 2) also like ODF, this is VERY close to RDF. But rather than reuse 
> the  standard, there's some NIH here (RDF has an equivalent mechanism 
> to  assign a datatype to a property, but they've created their own; 
> more  below)

Prefer standard mechanism + possibility of alternatives.
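
To make the comparison concrete, here is a minimal Python sketch that reads the same dcterms:created value from both styles: the OXML form, which tags the datatype with xsi:type, and an RDF/XML form, which uses rdf:datatype. The RDF/XML sample is my own illustration and does not come from either spec, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"
RDF_DATATYPE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}datatype"
DCTERMS = "http://purl.org/dc/terms/"

# Trimmed from the OXML core properties sample quoted in this thread.
OXML_STYLE = """<cp:coreProperties
  xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
  xmlns:dcterms="http://purl.org/dc/terms/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dcterms:created xsi:type="dcterms:W3CDTF">2006-06-13T14:33:00Z</dcterms:created>
</cp:coreProperties>"""

# Assumed RDF/XML rendering of the same statement (illustration only).
RDF_STYLE = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="">
    <dcterms:created rdf:datatype="http://purl.org/dc/terms/W3CDTF">2006-06-13T14:33:00Z</dcterms:created>
  </rdf:Description>
</rdf:RDF>"""


def read_created(xml_text):
    """Return (datatype label, parsed datetime) for the first dcterms:created."""
    root = ET.fromstring(xml_text)
    elem = root.find(".//{%s}created" % DCTERMS)
    # Accept either mechanism for naming the datatype.
    datatype = elem.get(RDF_DATATYPE) or elem.get(XSI_TYPE)
    # Both samples use the UTC "Z" form of W3CDTF, so a fixed pattern is enough here.
    stamp = datetime.strptime(elem.text, "%Y-%m-%dT%H:%M:%SZ")
    return datatype, stamp


oxml_type, oxml_stamp = read_created(OXML_STYLE)
rdf_type, rdf_stamp = read_created(RDF_STYLE)
```

The only difference a consumer sees is how the datatype is named: a prefixed name in the xsi:type case, a full URI in the rdf:datatype case.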

>
> 3) unlike ODF, they also use Extended DC (dcterms)

Yes to Extended DC.

>
> 4) also unlike ODF, they separate metadata about the document from  
> metadata about the application.

Well, actually there is:

a. metadata about the document (creator, etc.)
b. metadata about the application
c. metadata about particular content in the document

While I don't think separation in markup is all that important I do 
think it is important to recognize the different types of metadata.

While some users will be satisfied with a + b, I think it is important 
to enable metadata as I describe in c.

Some applications, or even the same application in particular 
circumstances, may not need to process the metadata I describe in c. 
But if it is simply preserved, it can persist for later use by that 
application or perhaps another one.
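
As a rough sketch of what "simply preserved" could mean in practice, the Python below updates a property it understands (dc:creator) and writes everything else back untouched. The ex: namespace, the annotation element, and the function name are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/content-metadata"  # hypothetical namespace

SOURCE = """<meta xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:ex="http://example.org/content-metadata">
  <dc:creator>John Doe</dc:creator>
  <ex:annotation target="para-3">reviewed by legal</ex:annotation>
</meta>"""


def update_creator(xml_text, new_creator):
    """Change dc:creator and leave every element we do not understand as it was."""
    root = ET.fromstring(xml_text)
    root.find("{%s}creator" % DC).text = new_creator
    return ET.tostring(root, encoding="unicode")


# Re-parse the output to confirm the content-level metadata survived the edit.
updated = ET.fromstring(update_creator(SOURCE, "Jane Roe"))
annotation = updated.find("{%s}annotation" % EX)
```

The application never interprets ex:annotation; it only avoids dropping it, which is all that is needed for a later tool to pick it up.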

Hope you are having a great day!

Patrick
Bruce D'Arcus wrote:

> FWIW, here's what the standard doc metadata looks like in OXML:
>
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <cp:coreProperties
>   xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
>   xmlns:dc="http://purl.org/dc/elements/1.1/"
>   xmlns:dcterms="http://purl.org/dc/terms/"
>   xmlns:dcmitype="http://purl.org/dc/dcmitype/"
>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>   <dc:title>Title</dc:title>
>   <dc:subject/>
>   <dc:creator>John Doe</dc:creator>
>   <cp:keywords/>
>   <dc:description/>
>   <cp:lastModifiedBy>doejb</cp:lastModifiedBy>
>   <cp:revision>6</cp:revision>
>   <dcterms:created xsi:type="dcterms:W3CDTF">2006-06-13T14:33:00Z</dcterms:created>
>   <dcterms:modified xsi:type="dcterms:W3CDTF">2006-06-15T23:42:00Z</dcterms:modified>
> </cp:coreProperties>
>
> So:
>
> 1) like ODF, they use DC
> 2) also like ODF, this is VERY close to RDF. But rather than reuse 
> the  standard, there's some NIH here (RDF has an equivalent mechanism 
> to  assign a datatype to a property, but they've created their own; 
> more  below)
> 3) unlike ODF, they also use Extended DC (dcterms)
> 4) also unlike ODF, they separate metadata about the document from  
> metadata about the application.
>
> Here's an example of the latter:
>
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <Properties
>   xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
>   xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
>   <Template>Normal</Template>
>   <TotalTime>272</TotalTime>
>   <Pages>1</Pages>
>   <Words>57</Words>
>   <Characters>329</Characters>
>   <Application>Microsoft Office Word</Application>
>   <DocSecurity>0</DocSecurity>
>   <Lines>2</Lines>
>   <Paragraphs>1</Paragraphs>
>   <ScaleCrop>false</ScaleCrop>
>   <HeadingPairs>
>     <vt:vector size="2" baseType="variant">
>       <vt:variant>
>         <vt:lpstr>Title</vt:lpstr>
>       </vt:variant>
>       <vt:variant>
>         <vt:i4>1</vt:i4>
>       </vt:variant>
>     </vt:vector>
>   </HeadingPairs>
>   <TitlesOfParts>
>     <vt:vector size="1" baseType="lpstr">
>       <vt:lpstr/>
>     </vt:vector>
>   </TitlesOfParts>
>   <Company>MU</Company>
>   <LinksUpToDate>false</LinksUpToDate>
>   <CharactersWithSpaces>385</CharactersWithSpaces>
>   <SharedDoc>false</SharedDoc>
>   <HyperlinksChanged>false</HyperlinksChanged>
>   <AppVersion>12.0000</AppVersion>
> </Properties>
>
> I actually like the notion of separating out the metadata like this,  
> though it's probably not important enough for us to change. 
> Supporting  dcterms makes sense though.
>
> Picking up on the comparison with RDF and the need for a model,
> again, they've invented their own solution for extension. From the
> spec (part 4*):
>
>> 7.4 Variant Types
>> Variant types define storage elements for a comprehensive list of 
>> data  types. These elements serve as the framework for representing 
>> and  round-tripping complex properties and custom file properties. 
>> Each  variant type is defined as an element where the element name 
>> indicates  the type and element value represents the data stored. 
>> Variant type  elements may contain other variant type elements as 
>> child elements.
>
>
> In other words, these are complex extension properties. The  
> vt-namespaced stuff above is this sort of content.
>
> I actually find the spec here quite confusing.
>
> Bruce
>
> *  <http://www.ecma-international.org/news/TC45_current_work/tc45-2006-338.pdf>
>
>
>
>
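
For what it is worth, the nesting described in the quoted section 7.4 can be read recursively. The Python below is a minimal sketch that decodes the HeadingPairs vector from the quoted sample. It handles only the element types that appear there (vector, variant, lpstr, i4), and the function name is my own rather than anything from the spec.

```python
import xml.etree.ElementTree as ET

VT = "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"

# The HeadingPairs payload from the quoted extended properties sample.
SAMPLE = """<vt:vector
    xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"
    size="2" baseType="variant">
  <vt:variant><vt:lpstr>Title</vt:lpstr></vt:variant>
  <vt:variant><vt:i4>1</vt:i4></vt:variant>
</vt:vector>"""


def decode(elem):
    """Decode one variant-type element; the element name selects the type."""
    kind = elem.tag.replace("{%s}" % VT, "")
    if kind == "vector":
        # A vector holds a sequence of typed children.
        return [decode(child) for child in elem]
    if kind == "variant":
        # In the sample, a variant wraps exactly one typed child.
        return decode(elem[0])
    if kind == "lpstr":
        return elem.text or ""
    if kind == "i4":
        return int(elem.text)
    raise ValueError("unhandled variant type: " + kind)


heading_pairs = decode(ET.fromstring(SAMPLE))
```

Any other variant type raises an error rather than being guessed at, since the full list of types is in the spec and not in this thread.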

-- 
Patrick Durusau
Patrick@Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005

Topic Maps: Human, not artificial, intelligence at work! 



