OASIS Mailing List Archives

docbook-apps message



Subject: RE: DOCBOOK-APPS: use.id.as.filename may produce invalid filenames



This may work most of the time, but IMHO such procedures must be foolproof.
My German is somewhat rusty, but there are surely many cases where this may
produce clashing filenames; for example, füllen and fällen would both lead
to f_llen.
A better solution would be to use alternative spellings for these characters
in ids: replace ü with ue, ß with ss, etc. before chunking. Then again,
this can also produce clashes, e.g. weiss and weiß would both lead to
weiss.
I've had the same problem with Portuguese, and found no general solution. I
simply switched to using only English in the ids, where non-ASCII letters
appear rarely, mostly in imported French words (naïve, façade, etc.). I
either avoid them or write them incorrectly, as most Americans actually do
(naive, facade).
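
To make the transliteration idea above concrete, here is a rough Python
sketch. It is only an illustration, not anything the DocBook stylesheets do:
the TRANSLIT table and the transliterate() function are hypothetical names,
and the ä and ö entries are my assumption for the "etc." (only ü and ß are
named above).

```python
# Hypothetical transliteration table. Only "ü" -> "ue" and "ß" -> "ss" come
# from the message above; "ä" and "ö" are assumed by analogy.
TRANSLIT = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def transliterate(id_value: str) -> str:
    """Replace each mapped character with its ASCII spelling."""
    return "".join(TRANSLIT.get(ch, ch) for ch in id_value)


# füllen and fällen no longer clash ...
print(transliterate("füllen"))  # fuellen
print(transliterate("fällen"))  # faellen

# ... but weiß and weiss now map to the same id.
print(transliterate("weiß") == transliterate("weiss"))  # True
```

So the table trades one kind of clash for another, which is why I see no
general solution here.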

Cheers.

=============================================
Marcelo Jaccoud Amaral
Petrobrás (http://www.petrobras.com.br)
mailto:jaccoud@petrobras.com.br
=============================================
On the other hand, you have different fingers.




                                                                                                           
Gisbert Amm <gia@webde-ag.de>
12/02/2003 13:12

To:      'Jirka Kosek' <jirka@kosek.cz>
cc:      docbook-apps@lists.oasis-open.org
Subject: RE: DOCBOOK-APPS: use.id.as.filename may produce invalid filenames
                                                                                                           




> The best you can do is simply avoid these characters in IDs. Not all
> currently used systems are prepared for characters outside ASCII in
> filenames, URLs and so on.

I want the IDs as similar to the headings as possible (much easier to
maintain).
Therefore an umlaut is nothing exotic in such an ID for me in German.

> > 2) Wouldn't it be reasonable to provide a parameter to avoid any
> > special characters in the generated filenames?
>
> IDs should be persistent and unique. What if you have two IDs
> like "huh" and "hüh"? After converting you will have two identical IDs.

After converting I'll have one file called huh.html and one called h_h.html.
Surely you meant something like 'hüh' and 'höh', which would both result in
h_h.html.
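
The mapping I have in mind can be sketched in a few lines of Python. This is
only an illustration under the assumption that every non-ASCII character is
replaced by a single underscore; underscore_filename() is a made-up name and
not an existing stylesheet parameter or function.

```python
def underscore_filename(id_value: str, ext: str = ".html") -> str:
    """Replace every non-ASCII character with "_" and append the extension."""
    safe = "".join(ch if ord(ch) < 128 else "_" for ch in id_value)
    return safe + ext


print(underscore_filename("huh"))  # huh.html
print(underscore_filename("hüh"))  # h_h.html
print(underscore_filename("höh"))  # h_h.html
```

So 'huh' and 'hüh' stay distinct, while 'hüh' and 'höh' collapse to the same
filename.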

Actually the IDs in my documents are much longer, so I don't see a big
problem.
If the IDs are non-trivial, such similarities will be very rare.
E.g. there is no such word as 'Einföhrung' or 'Einfährung' that could clash
with 'Einführung'. And the IDs have to be unique within the same document
anyway.

The only thing that could happen is that the first HTML file would be
overwritten by the second one.
And this would be noticed immediately. One would change the ID in question
and move on.
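
If one would rather not rely on noticing the overwritten file, the IDs could
be checked before chunking. The following Python sketch assumes the same
underscore rule as discussed in this thread; both function names are
hypothetical and nothing here is part of the DocBook stylesheets.

```python
from collections import defaultdict


def underscore_filename(id_value: str, ext: str = ".html") -> str:
    """Assumed rule: every non-ASCII character becomes "_"."""
    return "".join(ch if ord(ch) < 128 else "_" for ch in id_value) + ext


def find_filename_clashes(ids, to_filename=underscore_filename):
    """Group ids by generated filename; keep only names used more than once."""
    by_name = defaultdict(list)
    for id_value in ids:
        by_name[to_filename(id_value)].append(id_value)
    return {name: group for name, group in by_name.items() if len(group) > 1}


clashes = find_filename_clashes(["huh", "hüh", "höh", "Einführung"])
for name, group in clashes.items():
    print(name, "<-", ", ".join(group))  # h_h.html <- hüh, höh
```

An empty result means no file would be overwritten under that rule.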

Regards
Gisbert Amm








