IBM Power Ideas Portal


This portal is to open public enhancement requests against IBM Power Systems products, including IBM i. To view all of your ideas submitted to IBM, create and manage groups of Ideas, or create an idea explicitly set to be either visible by all (public) or visible only to you and IBM (private), use the IBM Unified Ideas Portal (https://ideas.ibm.com).



Status Future consideration
Workspace AIX
Created by Guest
Created on Aug 31, 2020

AIX Unicode Regular Expression Support

IBM locale definitions were used to build all locale objects until 2003, when the Unicode Consortium developed the Common Locale Data Repository (CLDR) for building new UTF-8 locales. These IBM-defined locales (the IBM legacy locales) were named language[_territory][.codeset], with the language tag entirely in upper case.
* For example: EN_US (EN_US.UTF-8).

The new CLDR-based AIX UTF-8 locales are built from CLDR source. They are also named language[_territory][.codeset], but the language tag is entirely in lower case.

* For example: en_US.UTF-8.

There is no short name alias for the CLDR UTF-8 locales.
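The naming convention above can be checked mechanically. The following is a minimal sketch that looks only at the locale name, assuming the language[_territory][.codeset] pattern described here; `classify_locale` and its return values are hypothetical names chosen for illustration, and the code does not query the locales installed on a system.

```python
import re

# Pattern for language[_territory][.codeset], as described in the text.
LOCALE_NAME = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z]+))?"
    r"(?:\.(?P<codeset>[A-Za-z0-9-]+))?$"
)


def classify_locale(name):
    """Classify a locale name by the case of its language tag.

    Hypothetical helper: returns "legacy-style" for an upper case
    language tag, "cldr-style" for a lower case one, and None if the
    name does not fit the pattern or the tag is mixed case.
    """
    match = LOCALE_NAME.match(name)
    if match is None:
        return None
    language = match.group("language")
    if language.isupper():
        return "legacy-style"
    if language.islower():
        return "cldr-style"
    return None


print(classify_locale("EN_US.UTF-8"))  # upper case language tag
print(classify_locale("en_US.UTF-8"))  # lower case language tag
```

The classification rests on the name alone, so it only reflects the convention stated above and says nothing about how a given locale object was actually built.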

The data in CLDR is gathered through the Unicode Consortium's Survey Tool (http://cldr.unicode.org/index/survey-tool).

Contributors from Unicode Consortium members, other organizations, and the public at large are invited to review the data for their languages and countries, and propose new translations of terms or modifications.

There are variations in locale behavior (for example, collation, date formats, etc.) between the older IBM locale definitions and CLDR definitions.

-Some open source products hard code [a-z] and [A-Z] as the English lower case letters a to z (code points 97-122) and upper case letters A to Z (code points 65-90), so only ASCII characters are matched.

This does not conform to the Open Group standard's definition of regular expressions, which states:
https://pubs.opengroup.org/onlinepubs/007908799/xbd/re.html

*A range expression represents the set of collating elements that fall between two elements in the current collation sequence, inclusively. It is expressed as the starting point and the ending point separated by a hyphen (-). Range expressions must not be used in portable applications because their behavior is dependent on the collating sequence. Ranges will be treated according to the current collating sequence, and include such characters that fall within the range based on that collating sequence, regardless of character values. This, however, means that the interpretation will differ depending on collating sequence. If, for instance, one collating sequence defines ä as a variant of a, while another defines it as a letter following z, then the expression [ä-z] is valid in the first language and invalid in the second.*

Hard coding these values omits accented lower case characters such as ä, which is not correct behavior under collation-based range semantics.
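The effect of a code-point-based range can be seen in a short sketch. Python's standard `re` module is used here only as an illustration: it interprets bracket ranges by code point rather than by collation sequence, so it behaves like the hard-coded case described above. The comparison with `unicodedata` is an assumption about what a property-based alternative might look like, not a description of any AIX interface.

```python
import re
import unicodedata

# "a", "ä" (U+00E4), "z"
text = "a\u00e4z"

# [a-z] is a code point range (97-122) in Python's re,
# so U+00E4 falls outside it.
ascii_only = re.findall(r"[a-z]", text)

# Selecting by Unicode general category "Ll" (lowercase letter)
# keeps the accented character as well.
lowercase_letters = [ch for ch in text if unicodedata.category(ch) == "Ll"]

print(ascii_only)
print(lowercase_letters)
```

The first list contains only the two ASCII letters, while the second also contains ä.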

Some regex engines already support Unicode:
-Perl
-PCRE (Perl Compatible Regular Expressions)
-Java

This enhancement request is for a native AIX Unicode regular expression engine that provides Level 1 conformance as described in Unicode® Technical Standard #18, Unicode Regular Expressions (https://unicode.org/reports/tr18/):
RL1.1 Hex Notation
RL1.2 Properties
RL1.2a Compatibility Properties
RL1.3 Subtraction and Intersection
RL1.4 Simple Word Boundaries
RL1.5 Simple Loose Matches
RL1.6 Line Boundaries
RL1.7 Supplementary Code Points
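To give a feel for what some of the Level 1 requirements listed above mean in practice, here is a rough sketch using Python's standard `re` module. It is an illustration only and not a proposal for the AIX interface. Python's `re` does not offer the `\p{...}` property syntax (RL1.2) or set subtraction and intersection (RL1.3), so those items are left out.

```python
import re

# RL1.1 Hex notation: \u00e4 names U+00E4 by its code point.
hex_match = re.fullmatch(r"\u00e4", "\u00e4") is not None

# RL1.4 Simple word boundaries: \b and \w are Unicode-aware for str
# patterns, so accented letters stay inside their words.
words = re.findall(r"\b\w+\b", "na\u00efve caf\u00e9")

# RL1.5 Simple loose matches: case-insensitive matching beyond ASCII
# (U+00E4 "ä" against U+00C4 "Ä").
loose_match = re.fullmatch("\u00e4", "\u00c4", re.IGNORECASE) is not None

# RL1.7 Supplementary code points: "." consumes one whole code point
# above U+FFFF rather than half of a surrogate pair.
supplementary_match = re.fullmatch(r".", "\U0001F600") is not None

print(hex_match, words, loose_match, supplementary_match)
```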

Idea priority Medium
  • Guest
    Oct 6, 2020

    This would be a good feature.