Unicode With Presentation

Software systems use fixed-length bit sequences to represent characters internally.
The length determines how many distinct characters can be represented, and a character
set table maps each character to its bit sequence.
For example, the ASCII character set uses 7 bits and defines 128 characters; the common
8-bit extensions of ASCII (code pages) provide 256.
If you were using one such character set and needed to process characters outside it,
you would have to load a different character set table.
Extra work is therefore involved when users work with different character sets and their
texts must be processed side by side. Exchanging data between such systems is also
difficult.
The Unicode character set was defined to solve this problem, and it is large enough to
contain all the current character sets. In the form described here it has a 16-bit code
length, which gives 65,536 possible codes (later versions of the Unicode standard extend
the code space beyond 16 bits). SAP has supported the Unicode character set since SAP Web
Application Server 6.10.
In Unicode programs, the predefined character-type data types are C, N, D, T and
STRING. Structure types whose components are all of these types are also treated as
character-type.
In non-Unicode systems, a character of these types occupies one byte.
In Unicode systems, it is as long as a character on the respective platform.
Variables of types X and XSTRING are described as byte-type. Previously, all arguments
were treated as character-type, but as of SAP Web Application Server 6.10 a distinction
is made between character-type and byte-type arguments.
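The difference between character length and byte length can be made visible with a
small sketch. It assumes a Unicode program and uses the DESCRIBE FIELD statement with
its MODE addition; the program and variable names are hypothetical.

```abap
REPORT zdemo_char_vs_byte.  " hypothetical program name

DATA: text     TYPE c LENGTH 10,
      char_len TYPE i,
      byte_len TYPE i.

" Defined length counted in characters: 10, regardless of the system
DESCRIBE FIELD text LENGTH char_len IN CHARACTER MODE.

" Defined length counted in bytes: depends on how many bytes
" the platform uses for one character
DESCRIBE FIELD text LENGTH byte_len IN BYTE MODE.

WRITE: / 'Length in characters:', char_len,
       / 'Length in bytes:     ', byte_len.
```

In a non-Unicode system both values would be the same; in a Unicode system the byte
length is larger, which is why a program should never assume that one character equals
one byte.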
For compatibility, character string statements in their standard form always expect
character-type arguments. The system then processes the operands character by
character. The corresponding variants of these statements for byte sequence processing
are recognizable by the IN BYTE MODE addition. With this addition, the statements
expect byte-type arguments and process them byte by byte.
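As an illustration, the following sketch uses CONCATENATE once in its standard form and
once with the IN BYTE MODE addition. The program name, variable names and values are
made up for the example.

```abap
REPORT zdemo_byte_mode.  " hypothetical program name

DATA: first_name TYPE c LENGTH 4 VALUE 'John',
      last_name  TYPE c LENGTH 3 VALUE 'Doe',
      full_name  TYPE string,
      hex_a      TYPE x LENGTH 2 VALUE 'CAFE',
      hex_b      TYPE x LENGTH 2 VALUE 'BABE',
      hex_all    TYPE xstring,
      byte_count TYPE i.

" Standard form: operands must be character-type
CONCATENATE first_name last_name INTO full_name SEPARATED BY space.

" Byte sequence variant: operands must be byte-type (X or XSTRING)
CONCATENATE hex_a hex_b INTO hex_all IN BYTE MODE.

byte_count = xstrlen( hex_all ).

WRITE: / full_name,
       / 'Bytes in hex_all:', byte_count.
```

Mixing the two kinds of operand in one statement, for example passing hex_a to the
standard form, is what the Unicode syntax check rejects.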
The STRLEN function always expects character-type variables and returns their length in
characters. With type C variables, only the occupied length is relevant and trailing blanks
are not counted.
The XSTRLEN function returns the length of byte sequences. It always expects byte-type
variables and returns the current length for type XSTRING and the defined length in
bytes for type X.
The image on your screen shows sample code for both these functions.
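Since the slide image is not reproduced in this transcript, here is a minimal sketch of
how the two functions might be used. It is not the original sample code; all names and
values are assumptions chosen for illustration.

```abap
REPORT zdemo_strlen.  " hypothetical program name

DATA: text     TYPE c LENGTH 10 VALUE 'ABAP',
      hex      TYPE x LENGTH 4  VALUE 'FF00',
      xstr     TYPE xstring,
      char_len TYPE i,
      x_len    TYPE i,
      xstr_len TYPE i.

" Type C: only the occupied length counts, trailing blanks are ignored,
" so the result is 4 although the field is 10 characters long
char_len = strlen( text ).

" Type X: the defined length in bytes is returned, here 4
x_len = xstrlen( hex ).

" Type XSTRING: the current length is returned
CONCATENATE hex hex INTO xstr IN BYTE MODE.
xstr_len = xstrlen( xstr ).

WRITE: / 'strlen( text )  =', char_len,
       / 'xstrlen( hex )  =', x_len,
       / 'xstrlen( xstr ) =', xstr_len.
```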
Apart from the comparison operators shown on the left of the image, you will notice six
new operators that have been defined for byte sequences and are identified by the prefix
BYTE-: BYTE-CO, BYTE-CN, BYTE-CA, BYTE-NA, BYTE-CS and BYTE-NS.
Their usage is also shown in the sample code on the screen.
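The on-screen sample is not part of this transcript, so the following sketch shows three
of the byte operators in use. The names and byte values are hypothetical.

```abap
REPORT zdemo_byte_operators.  " hypothetical program name

DATA: payload TYPE x LENGTH 4 VALUE '0A0B0A0B',
      allowed TYPE x LENGTH 2 VALUE '0A0B',
      marker  TYPE x LENGTH 1 VALUE 'FF'.

" BYTE-CO (contains only): every byte of payload occurs in allowed
IF payload BYTE-CO allowed.
  WRITE / 'payload contains only bytes from allowed'.
ENDIF.

" BYTE-NA (contains not any): no byte of marker occurs in payload
IF payload BYTE-NA marker.
  WRITE / 'payload contains none of the bytes in marker'.
ENDIF.

" BYTE-CS (contains string): the byte sequence in allowed occurs in payload
IF payload BYTE-CS allowed.
  WRITE / 'payload contains the byte sequence in allowed'.
ENDIF.
```

The operators mirror the character comparisons CO, CN, CA, NA, CS and NS, but they
expect byte-type operands and compare byte by byte.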
Depending on the platform, some data types must meet a specific alignment requirement.
For example, a field may have to begin at a particular kind of memory address. Within a
structure, the system therefore inserts "alignment" bytes at runtime, either before or
after the component that has the alignment requirement.
When one structure is to be converted to another, the system first creates a Unicode
fragment view of each to check whether the conversion is possible. The view groups
together adjacent components and alignment gaps. This view
can be seen in the classic debugger. A sample is shown on your screen for reference.
If the fragments of the source and target structures match in type and length over the
length of the shorter structure, the conversion is allowed. Otherwise, an error occurs
in the Unicode check.
If the target structure is longer than the source structure, the character-type components
of the remainder are filled with space characters. All other components in the remainder
are filled with their type-specific initial values. Alignment gaps are filled with null
bytes. Components of other types, such as P, F, STRING and XSTRING, are not grouped
together but are each treated individually.
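The rules above can be sketched with two flat structures whose leading components agree.
This is an illustration under assumptions, not the sample from the slide: the structure
and component names are invented, and it relies on the fragment rule exactly as stated in
this text.

```abap
REPORT zdemo_struct_conversion.  " hypothetical program name

DATA: BEGIN OF short_rec,
        id  TYPE c LENGTH 4,
        qty TYPE i,
      END OF short_rec,
      BEGIN OF long_rec,
        id    TYPE c LENGTH 4,
        qty   TYPE i,
        city  TYPE c LENGTH 8,
        total TYPE i,
      END OF long_rec.

short_rec-id  = 'A001'.
short_rec-qty = 5.

" Both structures start with a character-type fragment of length 4
" followed by an integer, so the fragments match over the length
" of the shorter structure and the assignment is permitted.
long_rec = short_rec.

" Remainder of the longer target: city is filled with spaces,
" total receives its type-specific initial value (zero).
WRITE: / long_rec-id, long_rec-qty, long_rec-city, long_rec-total.
```

If short_rec began with a component of type X instead of type C, the first fragments
would differ in type and the Unicode check would report an error.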
Continuing with the rules for conversion, we will now look at some rules for conversion
between structures and elementary data objects.
• If a structure contains only character-type data, it is treated like a type C data
object during conversion.
• If the structure is not completely character-type, the single field must be of type C
and the structure must begin with a character-type fragment that is at least as long
as the single field.
If the target field is a structure, the remaining character-type fragments are filled with
space characters and all other components with the type-specific initial value.
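Both cases can be sketched as follows. The names and values are hypothetical, and the
example assumes the two rules behave exactly as described in this text.

```abap
REPORT zdemo_struct_to_field.  " hypothetical program name

DATA: BEGIN OF name_rec,
        fname TYPE c LENGTH 5 VALUE 'Grace',
        lname TYPE c LENGTH 6 VALUE 'Hopper',
      END OF name_rec,
      flat_name TYPE c LENGTH 11,

      BEGIN OF mixed_rec,
        code TYPE c LENGTH 6 VALUE 'ABC123',
        qty  TYPE i,
      END OF mixed_rec,
      code_part TYPE c LENGTH 4.

" Case 1: purely character-type structure, handled like one
" type C field of length 11
flat_name = name_rec.

" Case 2: mixed structure. The single field is type C (length 4) and
" the structure starts with a character-type fragment of length 6,
" which is at least as long as the field.
code_part = mixed_rec.

WRITE: / flat_name,
       / code_part.
```

A type C field longer than 6 characters would not satisfy the second rule for mixed_rec,
because it would extend past the leading character-type fragment.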
From our earlier learning in this course, you will remember that for character-type
variables, offset and length are interpreted character by character, whereas for types X
and XSTRING, the values for offset and length are interpreted byte by byte.
For structures, offset and length accesses are only permitted in Unicode programs if
the structure is flat and the offset and length specification covers only character-type
fields, starting from the beginning of the structure.
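A short sketch of these access rules, with made-up names and values:

```abap
REPORT zdemo_offset_length.  " hypothetical program name

DATA: BEGIN OF rec,
        code TYPE c LENGTH 3 VALUE 'ABC',
        name TYPE c LENGTH 5 VALUE 'HELLO',
        qty  TYPE i,
      END OF rec,
      hex TYPE x LENGTH 4 VALUE '01020304'.

" Flat structure: the accesses stay inside the leading
" character-type fields (3 + 5 = 8 characters)
WRITE: / rec(3),      " the code component
       / rec+3(5).    " the name component

" Byte-type field: offset and length are counted in bytes,
" so this addresses the second and third byte
WRITE / hex+1(2).

" Not permitted in a Unicode program: the range would reach
" past the character-type fields into the integer component
" WRITE / rec+3(9).
```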
ABAP has supported Unicode character sets since SAP Web Application Server 6.10.
However, you must take care that your programs do not depend on assumptions about the
internal length of a character.
While the ABAP Workbench supports you when working with existing code, you may
have to make certain adjustments. The syntax check has been extended to include
Unicode compatibility also.
To execute the relevant syntax checks, you must set the indicator Unicode Checks Active
in the program (or class) attributes. This is the standard setting in Unicode systems.
If the Unicode indicator is set for a program (or a class), the syntax check and program
are executed in accordance with the rules described in the Unicode online help. (This is
irrespective of whether the system is a Unicode or a non-Unicode system).
If the Unicode indicator is not set, the program can only be executed in a non-Unicode
system. For such programs, Unicode-specific changes of syntax and semantics do not
apply. However, you can use all the language enhancements introduced in connection
with the conversion to Unicode.
The abap/unicode_check parameter controls the execution of Unicode checks during the
syntax check and at ABAP program runtime in a non-Unicode system.
The parameter can take on the following values:
On: The Unicode checks are performed for each ABAP program. The system behaves as
though the program attribute Unicode Checks Active were set for all ABAP programs.
This option is normally used for preparing a conversion to Unicode.
Off: Unicode checks are only performed in those programs for which the program
attribute Unicode Checks Active has been set.
As of SAP Web Application Server 6.10, you can use the transaction UCCHECK to
check several Repository objects for Unicode compatibility at the same time. The
transaction always checks the active program version. You can also use it to apply the
Unicode Checks Active attribute to several programs (which must be original programs in
the system). However, you should only do this with programs that are actually Unicode-
enabled, otherwise the program terminates when it is executed.
