Title
Relationship between traits_inst.lookup_collatename and the regex FSM is underspecified with regards to ClassAtomCollatingElement
Status
new
Section
[re.grammar]
Submitter
Hubert Tong

Created on 2017-06-25.00:00:00 last changed 90 months ago

Messages

Date: 2017-07-12.01:35:02

[ 2017-07 Toronto Monday issue prioritization ]

Priority 3

Date: 2017-06-25.00:00:00

For a user to implement a regular expression traits class meaningfully, the relationship between the return value of traits_inst.lookup_collatename to the behaviour of the finite state machine corresponding to a regular expression needs to be better specified.

From N4660 subclause 31.13 [re.grammar], traits_inst.lookup_collatename only feeds clearly into two operations:

  1. a test if the returned string is empty ([re.grammar]/8), and

  2. a test if the result of traits_inst.transform_primary, with the returned string, is empty ([re.grammar]/10).

Note: It is unclear if bullet 14.3 in [re.grammar]/14 refers to the result of traits_inst.lookup_collatename when it refers to a "collating element"; and if it does, it is unclear what input is to be used.

It is therefore unclear what the effect is if traits_inst.lookup_collatename substitutes another member of the equivalence class as its output.

For example, when processing "[[.AA.]]" as a pattern under a locale da_DK.utf8, what is the expected behaviour difference (if any) should traits_inst.lookup_collatename return, for "AA", "\u00C5" (where U+00C5 is A with ring, which sorts the same as "AA")?

History
Date User Action Args
2017-07-12 01:35:02adminsetmessages: + msg9350
2017-06-25 00:00:00admincreate