GUIDELINE 6.57
Cutting Texts
ABAP_BACKGROUND
Usually, each character in a character string is represented in a code
page using a fixed number of bytes. This means that the positions in
memory where a character begins and ends are always known. However, in
some code pages, a character can be formed from a combination of several
separately stored characters.
This is the case for the combined characters of some non-Unicode code
pages in non-Unicode systems, where a sequence of characters produces a
grapheme (the smallest unit of a writing system in a specific language).
It is also the case for the characters of the surrogate area of the
Unicode character set, each of which is represented in the Unicode
character representation UTF-16 by two consecutive 16-bit surrogate
codes (surrogates). The surrogate area includes, for example,
several Chinese characters that are predominantly used in Hong Kong. The
ABAP programming language does not support this area. ABAP supports the
subset of UTF-16 covered by UCS-2,
in which each character occupies two bytes. A character in the
surrogate area occupies four bytes and is handled as two characters by
ABAP.
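The difference between two-byte and four-byte characters can be made concrete with a small sketch. It is written in Python rather than ABAP, purely as a language-neutral illustration of UTF-16 code units, which correspond to what ABAP counts as characters. The helper names utf16_units and is_surrogate are hypothetical and not part of any SAP API.

```python
def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as integers (big-endian order)."""
    raw = text.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


def is_surrogate(unit: int) -> bool:
    """True if the 16-bit code unit lies in the surrogate range D800..DFFF."""
    return 0xD800 <= unit <= 0xDFFF


bmp_char = "\u4e2d"        # CJK character inside the range covered by UCS-2
supp_char = "\U00020000"   # CJK character outside that range (surrogate pair)

# One code unit (two bytes) versus two code units (four bytes).
print(len(utf16_units(bmp_char)), len(utf16_units(supp_char)))
print([hex(u) for u in utf16_units(supp_char)])
```

A system that works on 16-bit units, as described above for ABAP, sees the second character as two separate characters, each of which is a surrogate and has no meaning on its own.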
ABAP_RULE
Only cut texts between characters
Make sure that statements do not cut character strings at positions
within composite characters or surrogates.
ABAP_DETAILS
Operations that cut character strings include:
Subfield accesses with offsets/lengths or substring functions
The SPLIT statement
Every assignment to a character-like field that is too short, where one
side of the original value is cut off
If texts containing combined characters or surrogates are cut,
this can result in undefined characters that cannot be displayed. If
there is a risk of this occurring, you can determine a suitable
separation position by using the method SPLIT_STRING_AT_POSITION of
class CL_SCP_LINEBREAK_UTIL.
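The idea of a suitable separation position can be sketched as follows. This is a Python illustration under stated assumptions, not the algorithm or the interface of SPLIT_STRING_AT_POSITION, whose signature is not given here. The text is modeled as a list of UTF-16 code units, the names to_units and safe_cut are hypothetical, and combining marks are detected with the standard unicodedata module. Combining marks that are themselves encoded as surrogate pairs are not handled.

```python
import unicodedata


def to_units(text: str) -> list[int]:
    """Model a text as a list of 16-bit UTF-16 code units."""
    raw = text.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


def safe_cut(units: list[int], max_len: int) -> int:
    """Return the largest cut position <= max_len that neither splits a
    surrogate pair nor separates a combining mark from its base character."""
    pos = max(0, min(max_len, len(units)))
    while 0 < pos < len(units):
        nxt = units[pos]  # first code unit that would be cut off
        if 0xDC00 <= nxt <= 0xDFFF:
            pos -= 1      # low surrogate: the cut would split a pair
        elif not (0xD800 <= nxt <= 0xDFFF) and unicodedata.combining(chr(nxt)):
            pos -= 1      # combining mark: the cut would detach it from its base
        else:
            break
    return pos


# "a", one surrogate-pair character, then "e" followed by a combining accent.
units = to_units("a\U00020000e\u0301")
print(safe_cut(units, 2))  # position moved back in front of the surrogate pair
print(safe_cut(units, 4))  # position moved back in front of "e" plus accent
```

Moving the position backward rather than forward keeps the result within the requested maximum length, which matters when the target field has a fixed size.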
Documentation extract taken from SAP system, © Copyright SAP AG. All rights reserved