ciq message

Subject: FW: [ciq] what is a street name?
From: Max Voskob <max.voskob@paradise.net.nz>
To: ciq@lists.oasis-open.org
Date: Sat, 12 Feb 2005 08:25:52 +1300

John,

Thanks for the info.

Basically, the benefit you outlined relates to parsing and matching technology that needs to know
what is what in the street name.
I have to agree that this is an important feature for vendors of such systems.

I'm not so sure about the translation thing.
If I'm sending mail to Spain I'm likely to use their local names. I'm not aware of addresses
actually being translated.
E.g. the address of the US embassy in Korea: 32 Sejong-no, Jongno-gu, Seoul
or the NZ embassy in Madrid: 3rd floor, Plaza de La Lealtad 2, 28014 Madrid, Spain

Another example is NZ Post Addressing guidelines.
They say that the only part of an international address NZ Post is interested in is the country name
which has to be in English, preferably in capital letters.

Can you give me a real use case where someone would want to translate an address?

Cheers,
Max



-----Original Message-----
From: John D. Putman [mailto:jdputman@scanningtech.fedex.com] 
Sent: Saturday, 12 February 2005 07:48
To: Max Voskob
Cc: John D. Putman
Subject: RE: [ciq] what is a street name?

My link to the mailing list is mucked up, so you can pass this on if you like.

The initial benefit is for matching / searching.  Heuristic rules and element name / abbreviation
data tables can be used to ATTEMPT to split a "raw" address up into these parts.  Matching /
searching algorithms (associated with address correction facilities or other such services) can then
better "target" their match / search attempts to the "higher-level" entities in their reference
data structures.

In the examples, this is the street name.  That is, it becomes the "target" once one has, again by
such parsing, narrowed things down to a country, postal code and/or city and state.  Once "there",
comparisons can be made against the other "lower"-level parts (directionals, suffix/type, etc.) to
attempt to achieve a match / hit.  These often center on another element you have not identified
- the edifice number (for instance, the "100" in "100 N Main St").  However, there ARE some
addresses with no edifice number!
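
As a rough illustration of the heuristic split described above, here is a minimal Python sketch.
The DIRECTIONS and TYPES tables, the function name parse_street_line and the part names are all
assumptions made for this example (the part names follow the list in the original question); a real
parser would carry far larger tables and many more rules, and would still only ATTEMPT the split.

```python
import re

# Hypothetical abbreviation tables; real ones would be much larger.
DIRECTIONS = {"N": "NORTH", "S": "SOUTH", "E": "EAST", "W": "WEST",
              "NORTH": "NORTH", "SOUTH": "SOUTH", "EAST": "EAST", "WEST": "WEST"}
TYPES = {"ST": "STREET", "STREET": "STREET", "AVE": "AVENUE", "AVENUE": "AVENUE",
         "BLVD": "BOULEVARD", "BOULEVARD": "BOULEVARD", "RD": "ROAD", "ROAD": "ROAD"}


def parse_street_line(raw):
    """Attempt to split a raw street line into parts using simple heuristics."""
    tokens = [t.strip(".,").upper() for t in raw.split()]
    tokens = [t for t in tokens if t]
    parts = {"number": None, "pre_direction": None, "leading_type": None,
             "name": None, "trailing_type": None, "post_direction": None}

    # Edifice number: a leading numeric token. Some addresses have none.
    if tokens and re.fullmatch(r"\d+[A-Z]?", tokens[0]):
        parts["number"] = tokens.pop(0)

    # Work inwards from both ends, always leaving at least one token for the name.
    if len(tokens) > 1 and tokens[-1] in DIRECTIONS:
        parts["post_direction"] = DIRECTIONS[tokens.pop()]
    if len(tokens) > 1 and tokens[-1] in TYPES:
        parts["trailing_type"] = TYPES[tokens.pop()]
    if len(tokens) > 1 and tokens[0] in DIRECTIONS:
        parts["pre_direction"] = DIRECTIONS[tokens.pop(0)]
    if parts["trailing_type"] is None and len(tokens) > 1 and tokens[0] in TYPES:
        parts["leading_type"] = TYPES[tokens.pop(0)]

    parts["name"] = " ".join(tokens) or None
    return parts
```

With these tables, "100 N Main St" splits into number "100", pre-direction NORTH, name MAIN and
trailing type STREET, while "Baker st North" yields name BAKER, trailing type STREET and
post-direction NORTH, with no edifice number.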

Beyond that, phonetic and other algorithms operate on the other parts for added match confidence
and/or correction.  (Some of these have tables for common variants of certain street parts - though
that too may be INITIALLY handled / guessed in the front-end parser, BUT checked again in
conjunction with the reference data.)  The additional parts may suggest that a different street is /
should actually be the target, because the street name was misspelled.
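
The message does not say which phonetic algorithm is meant.  As an illustrative stand-in only, here
is a minimal sketch of American Soundex, one widely known phonetic code; the function name soundex
and the CODES table layout are choices made for this example.

```python
# Consonant groups of American Soundex; vowels, Y, H and W carry no code.
CODES = {letter: digit
         for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                                "4": "L", "5": "MN", "6": "R"}.items()
         for letter in letters}


def soundex(word):
    """Return the 4-character American Soundex code of a word ('' if no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        # Emit a digit only when it differs from the previous letter's code.
        if code and code != prev:
            result += code
        # H and W do not separate equal codes; vowels do (they reset prev).
        if c not in "HW":
            prev = code
    return (result + "000")[:4]
```

Under this code "Main", "Man" and "Mein" all reduce to M500, so a misspelled street name can still
land on the right group of candidates, which the remaining parts then help to narrow down.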

Once an address has been matched, the parsed output that is delivered can be used to populate other
data stores - where searching based on an individual part or a group of such parts can be
accomplished.  A customer may say - "I can't remember if I sent the package to 'Main' or 'Man'
(street), but I'm pretty sure it was a 'Boulevard'".  That, then, can be used in choosing /
targeting a search toward whichever address is a "Boulevard".  For the example, that may be neither
of those, but could help one say "I only find a 'Mein Boulevard' ... is that it?"
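
The "Boulevard" search above might look roughly like the following sketch.  The RECORDS list, the
function name search_by_parts and the 0.5 similarity cut-off are assumptions for illustration, and
difflib's SequenceMatcher merely stands in for whatever comparison a real system would use.

```python
from difflib import SequenceMatcher

# Hypothetical data store of already-parsed addresses.
RECORDS = [
    {"name": "MAIN", "trailing_type": "STREET"},
    {"name": "MEIN", "trailing_type": "BOULEVARD"},
    {"name": "OAK", "trailing_type": "BOULEVARD"},
    {"name": "MAN", "trailing_type": "ROAD"},
]


def search_by_parts(records, name_guesses, trailing_type, min_ratio=0.5):
    """Keep records of the given type whose name resembles any of the guesses.

    name_guesses must contain at least one guess.
    """
    hits = []
    for rec in records:
        # The part the customer is sure about acts as a hard filter.
        if rec["trailing_type"] != trailing_type:
            continue
        # The uncertain part is compared loosely against every guess.
        best = max(SequenceMatcher(None, guess.upper(), rec["name"]).ratio()
                   for guess in name_guesses)
        if best >= min_ratio:
            hits.append((best, rec))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [rec for _, rec in hits]


found = search_by_parts(RECORDS, ["Main", "Man"], "BOULEVARD")
print([f"{r['name']} {r['trailing_type']}" for r in found])  # ['MEIN BOULEVARD']
```

Neither guess is itself a Boulevard in this store, but the type filter plus a loose name comparison
surfaces the one plausible candidate.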

In addition, some (variable) address block formatting may be facilitated by having the parsed parts.
The more pedestrian use of that is for getting the whole address into a limited number of characters
/ space per line (splitting parts across lines, squashing spaces out of an address but capitalizing
the first letter of each part, etc.).  More esoteric uses involve converting the address into some
other language or format (English to Spanish, for instance - where one would need to convert
"Street" to "Calle" and place it at the "front" of a street address line, rather than have that as a
suffix).  Obviously for that purpose, it is better / easier to have / know each "part" to be
converted (translated and shifted appropriately).  Some time ago I also put the parsed parts to
good use in a back-end audit (a rather lengthy and complex process to detect "false positive"
matches).
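
The "Street" to "Calle" conversion could be sketched as below.  The TYPE_ES table, the function name
to_spanish_line and the dictionary keys are assumptions for this example, as is the choice to place
the edifice number after the street name; directionals are left out of the sketch entirely.

```python
# Hypothetical translation table for street types (English to Spanish).
TYPE_ES = {"STREET": "Calle", "AVENUE": "Avenida"}


def to_spanish_line(parts):
    """Rebuild a street line with the type translated and moved to the front.

    `parts` is a dict with "name" and optional "trailing_type" and "number".
    """
    out = []
    street_type = parts.get("trailing_type")
    if street_type:
        # Unknown types are kept untranslated rather than guessed.
        out.append(TYPE_ES.get(street_type, street_type.title()))
    out.append(parts["name"].title())
    if parts.get("number"):
        # Assumption: the number follows the street name.
        out.append(parts["number"])
    return " ".join(out)


line = to_spanish_line({"number": "100", "name": "MAIN", "trailing_type": "STREET"})
print(line)  # Calle Main 100
```

Doing the same on an unparsed string would mean first guessing which token is the type, which is
exactly the work the parsed parts save.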

Thank you,
David Putman

-----Original Message-----
From: Max Voskob [mailto:max.voskob@paradise.net.nz]
Sent: Friday, February 11, 2005 11:38 AM
To: ciq@lists.oasis-open.org
Subject: [ciq] what is a street name?

Hi all,

Can anyone explain to me what the benefit is of splitting street names into multiple parts, such as:
- pre-direction (NORTH Baker st)
- leading type (AVENUE of John Banks)
- name (North BAKER st)
- trailing type (North Baker STREET)
- post-direction (Baker st NORTH)

This looks very normalised in the schema, but for what purpose?

Can anyone use those elements separately from one another?
Any examples?

Cheers,
Max
