Subject: RE: [xliff] 2.0 Validations Module Proposal
Hi Ryan, all,

I think a validation module would be quite nice to have. It would allow catching many issues where they really need to be caught: when translating.

A few notes of things to possibly consider:

- What regular expression syntax should the module use? ICU? .NET? Perl? Java? XSD? ECMA? Other?
- I notice the maxLength rule. How would this fit with the proposal that Fredrik put forward about length and size restrictions?
- Maybe the ‘custom rule’ could be defined with a clearer PR. For example, the case of the email pattern doesn’t tell you if there is a problem. Maybe a more generic way to work with custom patterns could be to see if a pattern in the source matches the same number of occurrences in the target. For the email example, it would mean a red flag if the email is not found in the target.
- It seems noLoc would be very similar to <mrk id='1' translate='no'>...</mrk>. A rationale to justify both methods would be nice to offer to the implementers.

That’s all I have for now.
-yves

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King

In anticipation of closing down on 2.0, we have two new proposals for modules. In this mail, we are proposing the first of the two, a Validation module.

Validating localized target data is a very important part of the business of outsourcing localization, especially when the extracted source content comes from software. Typically, content providers and localization suppliers use a plethora of tools to perform a multitude of validations. There is a strong desire in the industry to bring some consistency to this space, but there are currently no accepted standards or interchange formats that facilitate this activity. We would like to propose a Validation module that would help with standardizing this crucial activity.
The basic idea would be to define a small set of standard validation rules and standard descriptions for them that tool developers could consistently build business logic around. How a rule is applied to a string or sub-string would be specified using regular expressions. These would all be contained in a Validations module. Here’s a draft of the Module for comment:

Validations Module

Module Specification

Module Namespace
The namespace for the Validations module is: urn:oasis:names:tc:xliff:validations:2.0

Module Elements
The elements defined in the Validations module are: <validations>, <validation>, and <matchExpression>.

Tree Structure
Legend:
1 = one
+ = one or more
? = zero or one

<validations>
|
+---<validation> +
    |
    +---<matchExpression> 1

validations
Collection of validations to be applied by a validation engine.
Contains:
- One or more <validation> elements
Parents: <file>, <group>, <unit> and <segment>
Attributes:
- name

validation
Specifies a validation rule, together with a description and a regular expression, which define how to apply that validation rule to the target text.
Contains:
- One <matchExpression> element
Parents: <validations>
Attributes:
- id, rule, desc

matchExpression
A regular expression used to match the target text or substring to which the validation rule is applied.
Contains: A regular expression
Parents: <validation>
Attributes:
- none

Module Attributes
The attributes defined in the Validations module are: name, id, rule, and desc.

name |
Rule | Description |
maxLength:100 | Match string can’t be longer than # of chars specified. |
minLength:10 | Match string can’t be shorter than # of chars specified. |
noLoc | Match string shouldn’t be localized |
Etc. | Etc. |
Any custom rule | Any custom description |
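As a rough illustration of how a tool might read the rule values in the table above, here is a minimal Python sketch. It assumes that a rule value is a name optionally followed by a colon and an integer parameter, as in maxLength:100; the draft does not define this syntax formally, and parse_rule is a hypothetical helper name.

```python
from typing import Optional, Tuple


def parse_rule(rule: str) -> Tuple[str, Optional[int]]:
    """Split a rule value such as 'maxLength:100' into (name, parameter).

    Assumption: the parameter, when present, is an integer after a colon.
    """
    name, sep, param = rule.partition(":")
    # Rules without a parameter (for example 'noLoc') yield None.
    return name, (int(param) if sep else None)


print(parse_rule("maxLength:100"))
print(parse_rule("noLoc"))
```

A custom rule name passes through unchanged, so a tool could look the name up in its own table of supported rules and skip the ones it does not implement.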
Examples in XLIFF:
Using the following segment as an example
<segment>
<source> Contact me at someCompany: user@somecompany.com</source>
<target> Kontaktieren Sie mich unter someFirma: user@somecompany.com</target>
</segment>
maxLength:100
.* matches “Kontaktieren Sie mich unter someFirma: user@somecompany.com”.
The match succeeds, so the validation business logic checks whether the matched string is no longer than 100 chars. That check also succeeds, and the business logic then takes the appropriate action.
<val:validations>
<validation rule="maxLength:100" desc="Match string can’t be longer than # of chars specified.">
<matchExpression>.*</matchExpression>
</validation>
</val:validations>
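The maxLength check could be sketched in Python roughly as follows. This is only an illustration under several assumptions: the val prefix is bound to the namespace given in the draft, the child elements are unprefixed as in the snippet, the expression is written as .* so that it spans the whole target, and Python's re syntax stands in for whatever regular expression flavor the module ends up choosing. check_max_length is a hypothetical name.

```python
import re
import xml.etree.ElementTree as ET

VAL_NS = "urn:oasis:names:tc:xliff:validations:2.0"

# Assumed serialization: namespace declared on the root, straight quotes.
VALIDATIONS_XML = f"""
<val:validations xmlns:val="{VAL_NS}">
  <validation rule="maxLength:100" desc="Match string can't be longer than # of chars specified.">
    <matchExpression>.*</matchExpression>
  </validation>
</val:validations>
"""


def check_max_length(validations_xml: str, target: str) -> bool:
    """Return True when every maxLength rule is satisfied by the target."""
    root = ET.fromstring(validations_xml.strip())
    for validation in root.iter("validation"):
        name, _, limit = validation.get("rule", "").partition(":")
        if name != "maxLength":
            continue  # other rules are out of scope for this sketch
        pattern = validation.findtext("matchExpression", default="")
        match = re.search(pattern, target)
        # Only the matched substring is measured against the limit.
        if match is not None and len(match.group(0)) > int(limit):
            return False
    return True


target = "Kontaktieren Sie mich unter someFirma: user@somecompany.com"
print(check_max_length(VALIDATIONS_XML, target))
```

What a tool does after a failed check (warn, block, log) is left to its business logic, as in the proposal.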
noLoc
\bsomeCompany\b finds no match in the target text, because “someCompany” was localized to “someFirma”.
Validation business logic takes the appropriate action for the match failure.
<val:validations>
<validation rule="noLoc" desc="Match string shouldn’t be localized.">
<matchExpression>\bsomeCompany\b</matchExpression>
</validation>
</val:validations>
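A minimal Python sketch of the noLoc check, assuming a case-sensitive search of the target with Python's re syntax (the draft does not fix a regular expression flavor); no_loc_violated is a hypothetical name.

```python
import re


def no_loc_violated(pattern: str, target: str) -> bool:
    """Return True when the protected term cannot be found in the target."""
    return re.search(pattern, target) is None


target = "Kontaktieren Sie mich unter someFirma: user@somecompany.com"

# The search is case-sensitive, so the lowercase "somecompany" inside
# the email address does not count as an occurrence of the term.
print(no_loc_violated(r"\bsomeCompany\b", target))
```

Under a case-insensitive flavor the email address would satisfy the expression, which is one reason the choice of syntax and default flags matters for this rule.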
Rules not defined in the Module can still be defined using the same mechanisms, though user agents that support the Validation Module may or may not have built-in implementation for them. An example might be to check if the target text contains a valid email address.
validEmail
\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b matches “user@somecompany.com” (with case-insensitive matching, since the character classes list only uppercase letters).
Validation business logic takes appropriate action for the match success.
<val:validations>
<validation rule="validEmail" desc="Match string is a valid email address.">
<matchExpression>\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b</matchExpression>
</validation>
</val:validations>
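For the custom email rule, a Python sketch might look like the following. It assumes case-insensitive matching, which this expression needs in order to accept lowercase addresses, and it also illustrates the idea raised in the reply at the top of this thread: flag the segment when the pattern occurs a different number of times in source and target. Both function names are hypothetical.

```python
import re

EMAIL = r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"


def contains_email(text: str) -> bool:
    """Return True when the text contains something shaped like an email."""
    return re.search(EMAIL, text, re.IGNORECASE) is not None


def occurrences_preserved(pattern: str, source: str, target: str) -> bool:
    """Return True when the pattern occurs equally often in source and target."""
    found_in_source = re.findall(pattern, source, re.IGNORECASE)
    found_in_target = re.findall(pattern, target, re.IGNORECASE)
    return len(found_in_source) == len(found_in_target)


source = "Contact me at someCompany: user@somecompany.com"
target = "Kontaktieren Sie mich unter someFirma: user@somecompany.com"

print(contains_email(target))
print(occurrences_preserved(EMAIL, source, target))
```

The count comparison only shows that an address is present in both texts, not that it is the same address; comparing the matched strings themselves would be a stricter variant.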
Please let us know your opinion on this proposal.
Thanks,
Microsoft Corporation
(Ryan King, Kevin O'Donnell, Uwe Stahlschmidt, Alan Michael)