docbook-apps message



Subject: RE: [docbook-apps] automated keywords and chunking


Hi Bob,
Yes, that footnote processing code is pretty complex. I took a stab at this first, basing my ideas on the code David sent earlier. It works for my toy examples, but it may be naive or only work for my situation. To repeat the original problem:
I'd like to automatically create keywords for each output file that consist of the primary index terms in that file.

<xsl:template name="user.head.content">
  <xsl:call-template name="keywordset"/>
</xsl:template>

<!-- Include as keywords only the indexterms that belong to the current
     chunk (not those in child sections that get their own chunks),
     unless this is the lowest chunking level, in which case include
     *all* descendant indexterms.
-->

<xsl:template name="keywordset">
  <!-- what is the current section level depth? -->
  <xsl:variable name="section_level">
    <xsl:number value="count(ancestor-or-self::d:section)"/>
  </xsl:variable>

  <!-- get the terms appropriate for this chunk -->
  <xsl:variable name="indexterms">
    <xsl:choose>
      <xsl:when test="$section_level = $chunk.section.depth">
        <xsl:copy-of select=".//d:indexterm/d:primary"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:copy-of select="./*[not(self::d:section)]//d:indexterm/d:primary|./d:indexterm/d:primary"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!-- get unique set of index terms -->
  <xsl:variable name="indexterms-unique">
    <xsl:for-each select="exslt:node-set($indexterms)/*[not(. = preceding-sibling::*)]">
      <xsl:value-of select="normalize-space(.)"/><xsl:if test="not(position() =  last())">, </xsl:if>
    </xsl:for-each>
  </xsl:variable>

  <!-- if index terms are present, put them in the meta keywords tag -->
  <xsl:if test="not($indexterms-unique = '')">
    <meta name="keywords">
      <xsl:attribute name="content">
        <xsl:value-of select="$indexterms-unique" />
      </xsl:attribute>
    </meta>
  </xsl:if>

</xsl:template>
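
For the templates above to run, the d: and exslt: prefixes have to be declared in the customization layer that holds them. A minimal sketch of such a wrapper, assuming the namespaced DocBook 5 stylesheets (docbook-xsl-ns); the import path is a placeholder, and the chunk.section.depth value of 3 is only an assumption taken from the "third section level" mentioned further down in the quoted thread:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:d="http://docbook.org/ns/docbook"
    xmlns:exslt="http://exslt.org/common"
    exclude-result-prefixes="d exslt">

  <!-- Placeholder path (assumption): point this at the chunking
       stylesheet of your own docbook-xsl-ns installation. -->
  <xsl:import href="/path/to/docbook-xsl-ns/html/chunk.xsl"/>

  <!-- Assumed value: chunk down to the third section level.
       The keywordset template compares against this parameter. -->
  <xsl:param name="chunk.section.depth" select="3"/>

  <!-- The user.head.content and keywordset templates from this
       message go here, unchanged. -->

</xsl:stylesheet>
```

The exclude-result-prefixes attribute is there so the d and exslt namespace declarations are not copied onto the generated meta element. With xsltproc this would be run as something like "xsltproc custom.xsl book.xml", where both file names are hypothetical.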

> -----Original Message-----
> From: Bob Stayton [mailto:bobs@sagehill.net]
> Sent: Thursday, September 24, 2009 1:04 PM
> To: Tim Arnold; DocBook Apps
> Subject: Re: [docbook-apps] automated keywords and chunking
> 
> Actually, David is correct, and it is more complicated than I originally
> stated. The context node when user.head.content is called is a chunked
> section element, but it also includes *all* of its children, including the
> sections that will be in their own chunks and should not be processed for
> the keywords for the current chunk.
> 
> The chunking process works by recursive chunking.  That is, when the current
> section is chunked, it is processed until a child section is reached, at
> which point the processor will recursively start a new chunk on that, and so
> on.  Only when the last nested section is chunked is the original chunk
> finished off.
> 
> This makes it very difficult to detect which indexterms are contained in the
> current chunk by just using XPath from the top of a chunk element.
> 
> Footnote processing in chunked output has a similar problem, as the footnote
> list at the end of a chunk should only include those footnote elements
> within the current chunk text, not the child sections.  See the template
> named 'process.chunk.footnotes' in html/chunk-common.xsl and follow where it
> goes to see how that works to exclude footnotes from other chunks.
> 
> Bob Stayton
> Sagehill Enterprises
> bobs@sagehill.net
> 
> 
> ----- Original Message -----
> From: "Tim Arnold" <a_jtim@bellsouth.net>
> To: "DocBook Apps" <docbook-apps@lists.oasis-open.org>
> Sent: Thursday, September 24, 2009 6:03 AM
> Subject: [docbook-apps] Fwd: failure notice
> 
> 
> >> Thank you both for thinking about this. Sorry I originally posted to
> >> the wrong group--I'm still learning my way around. From the looks of
> >> the project in front of me, my copy of Bob's book is going to be dog-
> >> eared soon.
> >>
> >> I think David is right for my case--I'm chunking down to the third
> >> section level and I only want the keywords that are actually in the
> >> current html file. Otherwise I would be messing up the search engines
> >> by advertising keywords that are not actually present in the file. But
> >> on re-reading Bob's message, I see that that's already taken care of
> >> since user.head.content is called with the node being chunked.
> >>
> >> I'm pretty sure my head will hurt before it's over, but I'll play
> >> around with it and step through the code; it looks very close to what
> >> I'll need to do. Thanks for the inspiration!
> >> --Tim Arnold
> >>
> >> On Sep 23, 2009, at 1:39 PM, David Cramer wrote:
> >>
> >>> Tim may have to worry about chunking and the chunking settings,
> >>> depending on his situation. Say you chunk everything. If you have a
> >>> chapter with several subsections, then the first chunk contains the
> >>> chapter's toc and any content before the first section. You probably
> >>> don't want that chunk's html page to have all of the indexterms in
> >>> all of the child sections. Or if you stop chunking at a certain depth,
> >>> you DO want the section at which you stop chunking to contain keywords
> >>> for the indexterms of all its child sections. If he uses <?bjhtml
> >>> stop-chunking?> to stop chunking at arbitrary places, then that would
> >>> have to be addressed too.
> >>>
> >>> I wanted to do something similar to what Tim is asking for, but didn't
> >>> go to the trouble of addressing all those because it made my head
> >>> hurt.
> >>>
> >>> If I recall correctly, the following tries to assemble a keywordset
> >>> unless there already is a keywordset. It gets all the indexterms in the
> >>> chapter or section but does not get any from child sections. It also
> >>> removes duplicate index entries.
> >>>
> >>> Anyway, perhaps the following can be a starting point or source of
> >>> inspiration.
> >>>
> >>> David
> >>>
> >>> <xsl:template name="user.head.content">
> >>> <xsl:call-template name="keywordset"/>
> >>> </xsl:template>
> >>>
> >>> <xsl:template name="keywordset">
> >>> <xsl:if test="not(self::book) and not(self::part) and
> >>> not(./*/keywordset) and (./*[not(starts-with(local-name(.),'sect'))
> >>> and
> >>> not(self::chapter)]//indexterm or ./indexterm)">
> >>>   <xsl:variable name="indexterms">
> >>> <!-- Get all the indexterms in this section/chapter only
> >>> (i.e. not in child sections) -->
> >>> <xsl:copy-of
> >>> select="./*[not(self::section)]//indexterm/*|./indexterm/*"/>
> >>>   </xsl:variable>
> >>>   <xsl:variable name="indexterms-unique">
> >>> <xsl:for-each
> >>> select="exslt:node-set($indexterms)/*[not(. = preceding-sibling::*)]">
> >>>   <xsl:value-of select="normalize-space(.)"/><xsl:if
> >>> test="not(position() =  last())">, </xsl:if>
> >>> </xsl:for-each>
> >>>   </xsl:variable>
> >>>   <xsl:if test="not($indexterms-unique = '')">
> >>> <meta name="keywords">
> >>>   <xsl:attribute name="content">
> >>> <xsl:value-of select="$indexterms-unique"/>
> >>>   </xsl:attribute>
> >>> </meta>
> >>>   </xsl:if>
> >>> </xsl:if>
> >>> </xsl:template>
> >>>
> >>>> -----Original Message-----
> >>>> From: Bob Stayton [mailto:bobs@sagehill.net]
> >>>> Sent: Wednesday, September 23, 2009 11:11 AM
> >>>> To: DocBook Apps; Tim Arnold
> >>>> Subject: [docbook-apps] Re: [docbook] automated keywords and chunking
> >>>>
> >>>> [moving this over to docbook-apps mailing list where
> >>>> stylesheet issues are discussed]
> >>>>
> >>>> By keywords I presume you mean generating <meta
> >>>> name="keywords"> in the HEAD element of each chunk?  If so,
> >>>> then you don't have to mess with the chunking machinery, you
> >>>> can use the placeholder template named 'user.head.content'
> >>>> that is available for inserting custom content.  See:
> >>>>
> >>>> http://www.sagehill.net/docbookxsl/HtmlHead.html
> >>>>
> >>>> That template is called with the node being chunked as the
> >>>> context node, so you should be able to select
> >>>> descendant::indexterm/primary elements and process them into
> >>>> the value of the meta element.
> >>>>
> >>>> Bob Stayton
> >>>> Sagehill Enterprises
> >>>> bobs@sagehill.net
> >>>>
> >>>>
> >>>> ----- Original Message -----
> >>>> From: "Tim Arnold" <a_jtim@bellsouth.net>
> >>>> To: <docbook@lists.oasis-open.org>
> >>>> Sent: Tuesday, September 22, 2009 4:35 PM
> >>>> Subject: [docbook] automated keywords and chunking
> >>>>
> >>>>
> >>>>> Hi,
> >>>>> I'm using DocBook 5 and the html chunking stylesheets. I'd like to
> >>>>> automatically create keywords for each output file that consist of the
> >>>>> primary index terms in that file. I can manage that in postprocessing
> >>>>> using Python and lxml, but I wanted to ask here to see if this has been
> >>>>> done before. It seems like something people might like to do, but after
> >>>>> looking at the chunking code and reading about it, it sounds like it
> >>>>> might be pretty complicated to implement.
> >>>>>
> >>>>> Any ideas on this?
> >>>>> thanks,
> >>>>> --Tim Arnold
> >>>>>
> >>>>>
> >>>>>
> >>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: docbook-unsubscribe@lists.oasis-open.org
> >>>>> For additional commands, e-mail: docbook-help@lists.oasis-open.org
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: docbook-apps-unsubscribe@lists.oasis-open.org
> >>>> For additional commands, e-mail: docbook-apps-help@lists.oasis-open.org
> >>>>
> >>>>
> >>


