The New C Standard- P9

Document information

6.4.2.1 General

Coding Guidelines
The visual similarity of these letters is discussed elsewhere (see 792, character visual similarity).

795 There is no specific limit on the maximum length of an identifier.

Commentary
The standard does specify a minimum limit on the number of characters a translator must consider as significant (see 282, internal identifier significant characters, and 283, external identifier significant characters). Implementations are free to ignore characters once this limit is reached. The ignored characters do not form part of another token. It is as if they did not appear in the source at all.

C90
The C90 Standard does not explicitly state this fact.

Other Languages
Few languages place limits on the maximum length of an identifier that can appear in a source file. Like C, some specify a lower limit on the number of characters that must be considered significant.

Coding Guidelines
Using a large number of characters in an identifier spelling has many potential benefits; for instance, it provides the opportunity to supply a lot of information to readers, or to reduce dependencies on existing reader knowledge by spelling words in full rather than using abbreviations. There are also potential costs; for instance, long identifiers can cause visual layout problems in the source (requiring new-lines within an expression in an attempt to keep the maximum line length within the bounds that can be viewed within a fixed-width window), or increase the cognitive effort needed to visually scan source containing them.

The length of an identifier is not itself directly a coding guideline issue. However, length is indirectly involved in many identifier memorability, confusability, and usability issues, which are discussed elsewhere (see 792, identifier syntax).

Usage
The distribution of identifier lengths is given in Figure 792.7.
796 Each universal character name in an identifier shall designate a character whose encoding in ISO/IEC 10646 falls into one of the ranges specified in annex D.60)

Commentary
Using other UCNs results in undefined behavior (in some cases even using these UCNs can be a constraint violation; see 816, UCNs not basic character set). These character encodings could be thought of as representing letters in the specified national character set.

C90
Support for universal character names is new in C99.

Other Languages
The ISO/IEC 10646 standard is relatively new and languages are only just starting to include support for the characters it specifies (see 28, ISO 10646). Java specifies a similar list of UCNs.

Common Implementations
A collating sequence may not be defined for these universal character names. In practice the lack of a defined collating sequence is not an implementation problem, because a translator only ever needs to compare the spelling of one identifier for equality with another identifier, which involves a simple character-by-character comparison (the issue of the ordering of diacritics is handled by not allowing them to occur in an identifier).

Support for this functionality is new and the extent to which implementations are likely to check that UCN values fall within the list given in annex D is not known.

[v 1.2, June 24, 2009]

Coding Guidelines
The intended purpose for supporting universal character names in identifiers is to reduce the developer effort needed to comprehend source. Identifiers spelled in the developer’s native tongue are more immediately recognizable (because of greater practice with those characters) and also have semantic associations that are more readily brought to mind.

The ISO 10646 Standard does not specify which languages contain the characters it specifies, although it does give names to some sets of characters that correspond to a language that contains them (see 28, ISO 10646).
The written forms of some human languages share common characters; for instance, the characters a through z (and their uppercase forms) appear in many European orthographies (see 792, orthography). The following discussion refers to using UCNs from more than one human language. This is to be taken to mean using UCNs that are not part of the written form of the native language of the developer (the case of developers having more than one native language is not considered). For instance, the character a is used in both Swedish and German; the character û is used in Swedish, but not German; the character ß is used in German but not Swedish. Both Swedish and German developers would be familiar with the character a, but the character ß would be considered foreign to a Swedish developer, and the character û foreign to a German one.

Some coding guideline documents recommend against the use of UCNs. Their use within identifiers can increase the portability cost of the source. The use of UCNs is an economic issue; the potential cost of not permitting their use in identifiers needs to be compared against the potential portability benefits. (Alternatively, the benefits of using UCNs could be compared against the possible portability costs.)

Given the purpose of using UCNs, is there any rationale for identifiers to contain characters from more than one human language? As an English speaker, your author can imagine a developer wanting to use an English word, or its common abbreviation, as a prefix or suffix to an identifier name. Perhaps an Urdu speaker can imagine a similar usage with Urdu words. The issue is whether the use of characters in the same identifier from different human languages has meaning to the developers who write and maintain the source.

Identifiers very rarely occur in isolation. Should all the identifiers in the same function, or even source file, only contain UCNs that form the set of characters used by a single human language?
Using characters from different human languages when it is possible to use only characters from a single language potentially increases the cost of maintenance. Future maintainers are either going to have to be familiar with the orthography and semantics of the two human languages used or spend additional time processing instances of identifiers containing characters they are not familiar with. However, in some cases it might not be possible to enforce a single human language rule. For instance, a third-party library may contain callable functions whose spellings use characters from a human language different from that used in the source code that contains calls to it.

Support for the use of UCNs in identifiers is new in C99 (and other computer languages) and at the time of this writing there is almost no practical experience available on the sort of mistakes that developers make with them.

797 The initial character shall not be a universal character name designating a digit.

Commentary
The terminal identifier-nondigit that appears in the syntax implies that the possible UCNs exclude the digit characters (see 792, identifier syntax). Also the list given in annex D does not include the digit characters. This means that an identifier containing a UCN designating a digit in any position results in undefined behavior.

The syntax for constants does not support the use of UCNs (see 822, constant syntax). This sentence, in the standard, reminds implementors that such usage could be supported in the future and that, while they may support UCN digits within an identifier, it would not be a good idea to support them as the initial character.

Table 797.1: The Unicode digit encodings.
    Encoding Range   Language
    0030–0039        ISO Latin-1
    0660–0669        Arabic–Indic
    06F0–06F9        Eastern Arabic–Indic
    0966–096F        Devanagari
    09E6–09EF        Bengali
    0A66–0A6F        Gurmukhi
    0AE6–0AEF        Gujarati
    0B66–0B6F        Oriya
    0BE7–0BEF        Tamil (has no zero)
    0C66–0C6F        Telugu
    0CE6–0CEF        Kannada
    0D66–0D6F        Malayalam
    0E50–0E59        Thai
    0ED0–0ED9        Lao
    FF10–FF19        Fullwidth digits

C++
This requirement is implied by the terminal non-name used in the C++ syntax. Annex E of the C++ Standard does not list any UCN digits in the list of supported UCN encodings.

Other Languages
Java has a similar requirement.

Coding Guidelines
The extent to which different cultural conventions support the use of a digit as the first character in an identifier is not known to your author. At some future date the Committee may choose to support the writing of integer constants using UCNs. If this happens, any identifiers that start with a UCN designating a digit are liable to result in syntax violations. There does not appear to be a worthwhile benefit in a guideline recommendation dealing with the case of an identifier beginning with a UCN designating a digit.

Example

    int \u1f00\u0ae6;
    int \u0ae6;

798 An implementation may allow multibyte characters that are not part of the basic source character set to appear in identifiers;

Commentary
Prior to C99 there was no standardized method of representing nonbasic source character set characters in the source code. Support for multibyte characters in string literals and constants was specified in C90; some implementations extended this usage to cover identifiers. They are now officially sanctioned to do this.

Support for the ISO 10646 Standard is new in C99 (see 28, ISO 10646). However, there are a number of existing implementations that use a multibyte encoding scheme and this usage is likely to continue for many years. The C committee recognized the importance of this usage and does not force developers to go down a UCN-only path.
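Returning to Table 797.1, the ranges listed there can be turned into a simple membership test, for instance by a tool that wants to flag identifiers starting with a UCN digit. The following is a minimal sketch, not how any particular translator does it; the names digit_range, digit_ranges and is_table_797_1_digit are hypothetical, and the only data used is the table itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Inclusive code point ranges copied from Table 797.1. */
struct digit_range {
    unsigned long first;
    unsigned long last;
};

static const struct digit_range digit_ranges[] = {
    {0x0030, 0x0039}, /* ISO Latin-1 */
    {0x0660, 0x0669}, /* Arabic-Indic */
    {0x06F0, 0x06F9}, /* Eastern Arabic-Indic */
    {0x0966, 0x096F}, /* Devanagari */
    {0x09E6, 0x09EF}, /* Bengali */
    {0x0A66, 0x0A6F}, /* Gurmukhi */
    {0x0AE6, 0x0AEF}, /* Gujarati */
    {0x0B66, 0x0B6F}, /* Oriya */
    {0x0BE7, 0x0BEF}, /* Tamil (has no zero) */
    {0x0C66, 0x0C6F}, /* Telugu */
    {0x0CE6, 0x0CEF}, /* Kannada */
    {0x0D66, 0x0D6F}, /* Malayalam */
    {0x0E50, 0x0E59}, /* Thai */
    {0x0ED0, 0x0ED9}, /* Lao */
    {0xFF10, 0xFF19}  /* Fullwidth digits */
};

/* Hypothetical helper: true if code_point lies in any range of the table. */
bool is_table_797_1_digit(unsigned long code_point)
{
    size_t count = sizeof digit_ranges / sizeof digit_ranges[0];

    for (size_t i = 0; i < count; i++) {
        if (code_point >= digit_ranges[i].first &&
            code_point <= digit_ranges[i].last)
            return true;
    }
    return false;
}
```

Applied to the example declarations, the first UCN of \u1f00\u0ae6 (1F00) is not in the table, while the first UCN of \u0ae6 (0AE6, in the Gujarati range) is.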
The standard says nothing about the behavior of the __func__ reserved identifier in the case when a function name is spelled using wide characters (see 810, __func__).

C90
This permission is new in C99.

C++
The C++ Standard does not explicitly contain this permission. However, translation phase 1 performs an implementation-defined mapping of the source file characters (see 116, translation phase 1), and an implementation may choose to support multibyte characters in identifiers via this route.

Other Languages
While other language standards may not mention multibyte characters, the problem they address is faced by implementations of those languages. For this reason, it is to be expected that some implementations of other languages will contain some form of support for multibyte characters.

Coding Guidelines
UCNs may be the preferred, C Standard, way of representing nonbasic character set characters in identifiers. However, developers are at the mercy of editor support for how they enter and view characters that are not in the basic source character set (see 815, universal character name syntax).

799 which characters and their correspondence to universal character names is implementation-defined.

Commentary
Various national bodies have defined standards for representing their national character sets in computer files. While ISO 10646 is intended to provide a unified standard for all characters, it may be some time before existing software is converted to use it (see 28, ISO 10646).

Common Implementations
It is common to find translators aimed at the Japanese market supporting JIS, shift-JIS, and EUC encodings (see Table 243.3). These encodings use different numeric values than those given in ISO 10646 to represent the same national character.

800 When preprocessing tokens are converted to tokens during translation phase 7, if a preprocessing token could be converted to either a keyword or an identifier, it is converted to a keyword.
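The rule in sentence 800 can be pictured as a table lookup performed when an identifier-like preprocessing token is converted to a token. The following is a minimal sketch under that assumption, not a description of any real translator; the names token_kind, c99_keywords and classify_identifier_like are hypothetical. The keyword list is the set of 37 keywords in C99.

```c
#include <stddef.h>
#include <string.h>

enum token_kind { TOKEN_KEYWORD, TOKEN_IDENTIFIER };

/* The C99 keywords. */
static const char *const c99_keywords[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "inline", "int", "long", "register", "restrict", "return", "short",
    "signed", "sizeof", "static", "struct", "switch", "typedef", "union",
    "unsigned", "void", "volatile", "while", "_Bool", "_Complex", "_Imaginary"
};

/*
 * Hypothetical phase 7 helper: a spelling that matches a keyword is
 * always classified as a keyword; anything else is an identifier.
 * The comparison is case sensitive, as keyword spellings are.
 */
enum token_kind classify_identifier_like(const char *spelling)
{
    size_t count = sizeof c99_keywords / sizeof c99_keywords[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(spelling, c99_keywords[i]) == 0)
            return TOKEN_KEYWORD;
    }
    return TOKEN_IDENTIFIER;
}
```

Because the keyword check comes first and there is no separate name space to consult, a developer has no way of declaring an identifier spelled while or _Bool.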
Commentary
The Committee could have created a separate name space for keywords and allowed developers to define identifiers having the same spelling as a keyword. The complexity added to a translator by such a specification would be significant (based on implementation experience for languages that support this functionality), while a developer’s inability to define identifiers having these spellings was considered a relatively small inconvenience.

C90
This wording is a simplification of the convoluted logic needed in the C90 Standard to deduce from a constraint what C99 now says in semantics. The removal of this C90 constraint is not a change of behavior, since it was not possible to write a program that violated it.

    In translation phase 7 and 8, an identifier shall not consist of the same sequence of characters as a keyword. (C90 6.1.2 Constraints)

Other Languages
Some languages allow keywords to be used as variable names (e.g., PL/1), using the context to disambiguate intended use.

801 60) On systems in which linkers cannot accept extended characters, an encoding of the universal character name may be used in forming valid external identifiers. (footnote 60)

Commentary
This is really an implementation tip for translators. The standard defines behavior in terms of an abstract machine that produces external output. The tip given in this footnote does not affect the conformance status of an implementation that chooses to implement this functionality in another way.

The only time such a mapping might be visible is through the use of a symbolic execution-time debugging tool, or by having to link against object files created by other translators.

C90
Extended characters were not available in C90, so the suggestion in this footnote does not apply (see 215, extended characters).

Other Languages
Issues involving third-party linkers are common to most language implementations that compile to machine code.
Some languages, for instance Java, define the characteristics of an implementation at translation and execution time. The Java language specification goes to the extreme (compared to other languages) of specifying the format of the generated object code file.

Common Implementations
There is a long-standing convention of prefixing externally visible identifier names with an underscore character when information on them is written out to an object file. There is little experience available on implementation issues involving UCNs, but many existing linkers do assume that identifiers are encoded using 8-bit characters.

Coding Guidelines
The encoding of external identifiers only needs to be considered when interfacing to, or from, code written in another language. Cross-language interfacing is outside the scope of these coding guidelines.

802 For example, some otherwise unused character or sequence of characters may be used to encode the \u in a universal character name.

Commentary
Some linkers may not support an occurrence of the backslash ( \ ) character in an identifier name. One solution to this problem is to create names that cannot be declared in the source code by the developer; for instance, by deleting the \ characters and prefixing the name with a digit character.

Common Implementations
There are no standards for encoding of universal character names in object files. The requirement to support this form of encoding is too new for it to be possible to say anything about common encodings.

803 Extended characters may produce a long external identifier.

Commentary
Here the word long does not have any special meaning. It simply suggests an identifier containing many characters.
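The encoding suggested in the commentary on sentence 802 (delete the \ characters and prefix the name with a digit) can be sketched as follows. This is an illustration of that one suggestion only; the function name mangle_external_name, the choice of '0' as the prefix digit, and the buffer-size convention are all assumptions made for the example, not something specified by the standard or taken from an existing linker or translator.

```c
#include <stddef.h>

/*
 * Hypothetical mangling of an external name as spelled in the source:
 * writes '0' followed by src with every backslash removed.
 * Returns the length of the result, or 0 if dst is too small
 * (in which case the contents of dst are unspecified).
 */
size_t mangle_external_name(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    if (dst_size < 2)
        return 0;
    dst[out++] = '0';       /* no source identifier can start with a digit */
    for (; *src != '\0'; src++) {
        if (*src == '\\')
            continue;       /* drop the backslash of \u or \U */
        if (out + 1 >= dst_size)
            return 0;       /* no room for this character and the terminator */
        dst[out++] = *src;
    }
    dst[out] = '\0';
    return out;
}
```

With this scheme a source spelling such as \u30CE_count would be written to the object file as 0u30CE_count. Each UCN still contributes five or nine characters to the external name, which is one way in which extended characters may produce a long external identifier.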
(See 282, internal identifier significant characters.)

Implementation limits

804 As discussed in 5.2.4.1, an implementation may limit the number of significant initial characters in an identifier;

Commentary
This subclause lists a number of minimum translation limits (see 276, translation limits).

C90
The C90 Standard does not contain this observation.

C++

    All characters are significant.20) (C++ 2.10p1)

C identifiers that differ after the last significant character will cause a diagnostic to be generated by a C++ translator.

Annex B contains an informative list of possible implementation limits. However, “. . . these quantities are only guidelines and do not determine compliance.”

805 the limit for an external name (an identifier that has external linkage) may be more restrictive than that for an internal name (a macro name or an identifier that does not have external linkage).

Commentary
External identifiers have to be processed by a linker, which may not be under the control of a vendor’s C implementation (see 283, external identifier significant characters). In theory, any tool that performs the linking process falls within the remit of the C Committee. However, the Committee recognized that, in practice, it is not always possible for translator vendors to supply their own linker. The limitations of existing linkers needed to be factored into the limits specified in the standard.

Internal identifiers only need to be processed by the translator and the standard is in a strong position to specify the behavior (see 282, internal identifier significant characters).

Other Languages
Most other language implementations face similar problems with linkers as C does. However, not all language specifications explicitly deal with the issue (by specifying the behavior). The Java standard defines a complete environment that handles all external linkages.
Coding Guidelines
What are the costs associated with a change to the linkage of an identifier during program maintenance, from internal linkage to external linkage? (Experience shows that identifier linkage is rarely changed from external to internal.) In most cases implementations support a sufficiently large number of significant characters in an external name that a change of identifier linkage makes no difference to its significant characters, i.e., the number of characters it contains falls inside the implementation limit (see 283, external identifier significant characters, and 792, identifier number of characters). In those cases where a change of identifier linkage results in some of its significant characters being ignored, the effect may be benign (there is no other identifier defined with external linkage whose name is the same as the truncated name) or may result in undefined behavior (the program defines two identifiers with external linkage with the same name; see 1818, external linkage exactly one external definition).

806 The number of significant characters in an identifier is implementation-defined.

Commentary
Subject to the minimum requirements specified in the standard (see 282, internal identifier significant characters).

C++

    All characters are significant.20) (C++ 2.10p1)

References to the same C identifier, which differs after the last significant character, will cause a diagnostic to be generated by a C++ translator.

There is also an informative annex which states (Annex Bp2):

    Number of initial characters in an internal identifier or a macro name [1024]
    Number of initial characters in an external identifier [1024]

Other Languages
Some languages require all characters in an identifier to be significant (e.g., Java, Snobol 4), while others don’t (e.g., Cobol, Fortran).

Common Implementations
It is rare to find an implementation that does not meet the minimum limits specified in the standard. A few translators treat all characters as significant.
Most have a limit of between 256 and 2,000 significant characters. The POSIX standard requires that any language that binds to its API support 14 significant characters in an external identifier.

Coding Guidelines
While the C90 minimum limits for the number of significant characters in an identifier might be considered unacceptable by many developers, the C99 limits are sufficiently generous that few developers are likely to complain.

Automatically generated C source sometimes relies on a large number of significant characters in an identifier. This can occur because of the desire to simplify the implementation of the generator. Character sequences in different offsets within an identifier might be reserved for different purposes. A predefined default character sequence is used to pad the identifier spelling where necessary.

As the following example shows, it is possible for a program’s behavior to change, both when the number of significant characters is increased and when it is decreased.

    /*
     * Yes, C99 does specify 63 significant characters in an internal
     * identifier. But to keep this example within the page width
     * we have taken some liberties.
     */

    extern float _________1_________2_________3___bb;

    void f(void)
    {
        int _________1_________2_________3___ba;

        /*
         * If there are 34 significant characters, the following operand
         * will resolve to the locally declared object.
         *
         * If there are 35 significant characters, the following operand
         * will resolve to the globally declared object.
         */
        _________1_________2_________3___bb++;
    }

    void g(void)
    {
        int _________1_________2_________3___aa;

        /*
         * If there are 34 significant characters, the following operand
         * will resolve to the globally declared object.
         *
         * If there are 33 significant characters, the following operand
         * will resolve to the locally declared object.
         */
        _________1_________2_________3___bb++;
    }

The following issues need to be addressed:

• All references to the same identifier should use the same character sequence; that is, all characters are intended to be significant. References to the same identifiers that differ in nonsignificant characters need to be treated as faults.

• Within how many significant characters should different identifiers differ? Should identifiers be required to differ within the minimum number of significant characters specified by the standard, or can a greater number of characters be considered significant?

[Figure 806.1 (plot not reproduced): percentage of identical matches (logarithmic axis, 0.001 to 100) against number of significant characters (6 to 50), with one data series each for gcc, idsoftware, linux, and mozilla.]

Figure 806.1: Occurrence of unique identifiers whose significant characters match those of a different identifier (as a percentage of all unique identifiers in a program), for various numbers of significant characters. Based on the visible form of the .c files.

Readers do not always carefully check all characters in the spelling of an identifier. The contribution made by characters occurring in different parts of an identifier will depend on the pattern of eye movements employed by readers, which in turn may be affected by their reasons for reading the source (see 770, reading kinds of), plus cultural factors (e.g., direction in which they read text in their native language, or the significance of word endings in their native language). Characters occurring at both ends of an identifier are used by readers (at least native English- and French-speaking ones) when quickly scanning text (see 792, identifiers Greek readers).
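How the number of significant characters decides whether two spellings denote the same identifier can be sketched with a prefix comparison. This is a minimal illustration that assumes a translator which simply ignores everything after the significant prefix (the standard does not require that strategy); the function name same_identifier is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if spellings a and b are indistinguishable
 * to a translator that examines only the first `significant` characters.
 * strncmp stops at the first difference, at a terminating null, or after
 * `significant` characters, whichever comes first.
 */
bool same_identifier(const char *a, const char *b, size_t significant)
{
    return strncmp(a, b, significant) == 0;
}
```

For the 35-character identifiers in the example above, the spellings ending in bb and ba compare equal with 34 significant characters and unequal with 35, while the spellings ending in bb and aa compare equal with 33 and unequal with 34. A checking tool following the guideline below would pass the length of the longer spelling, so that every character takes part in the comparison.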
(See 770, word reading individual.)

Cg 806.1 When performing similarity checks on identifiers, all characters shall be considered significant.

807 Any identifiers that differ in a significant character are different identifiers.

Commentary
In many cases different identifiers also denote different entities. In some cases they denote the same entity (e.g., two different typedef names that are synonyms for the type int).

Other Languages
This statement is common to all languages (but it does not always mean that they necessarily denote different entities).

Coding Guidelines
Identifiers that differ in a single significant character may be considered to be

• different identifiers by a translator, but considered to be the same identifier by some readers of the source (because they fail to notice the difference).

• the same identifiers by a translator (because the difference occurs in a nonsignificant character), but considered to be different identifiers by some readers of the source (because they treat all characters as being significant).

• different identifiers by both a translator and some readers of the source.

The possible reasons for readers making mistakes are discussed elsewhere (see 0, developer errors), as are the guideline recommendations for reducing the probability that these developer mistakes become program faults (see 792, identifier filtering spellings).

Example

    extern int e1;
    extern long el;
    extern int a_longer_more_meaningful_name;
    extern int a_longer_more_meeningful_name;
    extern int a_meaningful_more_longer_name;

808 If two identifiers differ only in nonsignificant characters, the behavior is undefined.

Commentary
While the obvious implementation strategy is to ignore the nonsignificant characters, the standard does not require implementations to use this strategy.
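One way a translator might ignore nonsignificant characters during symbol-table lookup is to compute the hash value over the significant prefix only. The following is a minimal sketch of that idea; the function name hash_significant and the multiplier 31 are arbitrary choices made for illustration and are not taken from any particular translator.

```c
#include <stddef.h>

/*
 * Hypothetical hash over at most `significant` leading characters of an
 * identifier spelling. Characters after the significant prefix do not
 * contribute, so spellings that differ only in nonsignificant characters
 * land in the same hash bucket. Unsigned arithmetic wraps on overflow.
 */
unsigned long hash_significant(const char *spelling, size_t significant)
{
    unsigned long hash = 0;

    for (size_t i = 0; i < significant && spelling[i] != '\0'; i++)
        hash = hash * 31u + (unsigned char)spelling[i];
    return hash;
}
```

The loop can run as the characters of the identifier are read, so no second pass over the spelling is needed. An implementation that instead hashed every character would compute different values for such spellings, and the lookup would be likely to fail.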
To speed up identifier lookup many implementations use a hashed symbol table; the hash value for each identifier is computed from the sequence of characters it contains. Computing this hash value as the characters are read in, to form an identifier, saves a second pass over those same characters later. If nonsignificant characters were included in the original computed hash value, a subsequent occurrence of that identifier in the source, differing in nonsignificant characters, would result in a different hash value being calculated and a strong likelihood that the hash table lookup would fail.

Developers generally expect implementations to ignore nonsignificant characters. An implementation that behaved differently because identifiers differed in nonsignificant characters might not be regarded as being very user friendly. Highlighting misspellings that occur in nonsignificant characters is not always seen in a positive light by some developers.

C++
In C++ all characters are significant, thus this statement does not apply in C++.

Other Languages
Some languages specify that nonsignificant characters are ignored and have no effect on the program, while others are silent on the subject.

Common Implementations
Most implementations simply ignore nonsignificant characters. They play no part in identifier lookup in symbol tables.

Coding Guidelines
The coding guideline issues relating to the number of characters in an identifier that should be considered significant are discussed elsewhere (see 792, identifier guideline significant characters).

809 Forward references: universal character names (6.4.3), macro replacement (6.10.3).

6.4.2.2 Predefined identifiers

Semantics

810 The identifier __func__ shall be implicitly declared by the translator as if, immediately following the opening brace of each function definition, the declaration

    static const char __func__[] = "function-name";

appeared, where function-name is the name of the lexically-enclosing function.
(See footnote 61.)

Commentary
Implicitly declaring __func__ immediately after the opening brace in a function definition means that the first, developer-written declaration within that function can access it. Giving __func__ static storage duration enables its address to be referred to outside the lifetime of the function that contains it (e.g., enabling a call history to be displayed at some later stage of program execution). This is not a storage overhead because space needs to be allocated for the string literal denoted by __func__. The const qualifier ensures that any attempts to modify the value cause undefined behavior. The identifier __func__ has an array type, and is not a string literal, so the string concatenation that occurs in translation phase 6 is not applicable (see 135, translation phase 6).

This identifier is useful for providing execution trace information during program testing. Developers who make use of UCNs may need to ensure that the library they use supports the character output required by them:

    #include <stdio.h>

    void \u30CE(void)
    {
        printf("Just entered %s\n", __func__);
    }

The issue of wide characters in identifiers is discussed elsewhere (see 798, identifier multibyte character in).

Which function name is used when a function definition contains the inline function specifier? In:

    #include <stdio.h>

    /* static gives f internal linkage, so the example links on its own. */
    static inline void f(void)
    {
        printf("We are in %s\n", __func__);
    }

    int main(void)
    {
        f();
        printf("We are in %s\n", __func__);
    }

the name of the function f is output by the call to f, even if that function is inlined into main.

C90
Support for the identifier __func__ is new in C99.

C++
Support for the identifier __func__ is new in C99 and is not available in the C++ Standard.

Common Implementations
A translator only needs to declare __func__ if a reference to it occurs within a function.
An obvious storage saving optimization is to delay any declaration until such time as it is known to be required. Another optimization is for the storage allocated for __func__ to exactly overlay that allocated to the string literal. Allocating storage for a string literal and copying the characters to the separately allocated object it initializes is not necessary when that object is defined using the const qualifier.

gcc also supports the built-in form __FUNCTION__.

Example
Debugging code in functions can provide useful information. But when there are lots of functions, the quantity of useless information can be overwhelming. Controlling which functions are to output debugging information by using conditional compilation requires that code be edited and the program rebuilt. The names of functions can be used to dynamically control which functions are to output debugging information. This control not only reduces the amount of information output, but can also reduce execution time by orders of magnitude (output can be a resource-intense operation).

flookup.h

    typedef struct f__rec {
        char * func_name;
        _Bool enabled;
        struct f__rec * next;
    [...]

... source. However, the issue is not how many times a constant having a particular semantic association occurs, but how many times the particular constant value occurs. The same constant value can appear because of different semantic associations. A search for a sequence of digits (a constant value) will locate all occurrences, irrespective of semantic association. While an argument can always be made for certain ...
... characters that are in the ASCII character set, but not in the basic source character set. The ranges 0D800 through 0DBFF and 0DC00 through 0DFFF are known as the surrogate ranges. The purpose of these ranges is to allow representation of rare characters in future versions of the Unicode standard. This constraint means that source files cannot contain the UCN equivalent for any members of the basic source character ...

... treated as UCNs by a translator, although other tools may choose to do so, in this context. The mapping of UCNs in character constants and string literals to the execution character set occurs in translation phase 5. The constraint on the range of values that a UCN may take prevents them from being used to represent keywords (see 816, UCNs not basic character set).

C++
The C++ Standard also supports the use of ...

... uses of the integer constants 0 and 1 in the visible source often have no special semantics associated with their usage. They also represent a significant percentage of the total number of integer constants in the source code (see Figure 825.1). The frequency of occurrence of these values (most RISC processors dedicate a single register to permanently hold the value zero) comes about through commonly ...

820 ... reserved by ISO/IEC 10646 for control characters, the character DELETE, and the S-zone (reserved for use by UTF-16). (footnote 62)

Commentary
Requiring that characters in the basic character set not be represented using UCN notation helps guarantee that existing tools (e.g., editors) continue to be able to process source files (see 215, basic character set). The control characters may have special meaning for ...

... may be conditional on the setting of some translation time option, for instance, -D) (see 1931, macro object-like).

The use of constants in source code creates a number of possible maintenance issues, including:

• A constant value, representing some quantity, often needs to occur in multiple locations within source code. Searching for and replacing all occurrences of a particular numeric value in the code is ...

... character constant)

One solution to these problems is to use an identifier to give a symbolic name (footnote 822.1) to the constant, and to use that symbolic name wherever the constant would have appeared in the source. Changes to the value of the constant can then be made by a single modification to the definition of the identifier and a well-chosen name can help readers make the appropriate semantic association. The ...

... any control character.

C++

    If the hexadecimal value for a universal character name is less than 0x20 or in the range 0x7F–0x9F (inclusive), or if the universal character name designates a character in the basic source character set, then the program is ill-formed. (C++ 2.2p2)

The range of hexadecimal values that are not permitted in C++ is a subset of those that are not permitted in C. This means that source ...

... because an object declared with the const qualifier really is constant and a translator need not allocate storage for it, or because use of the preprocessor (often called the C preprocessor, as if it were not also in C++) is frowned on in the C++ community and is left to the reader to decide. The enumeration constant versus macro name issue is discussed in detail elsewhere.

What name to choose? The constant ...

... universal character names in these contexts, but does not say in words what it specifies in the syntax (although 2.2p2 comes close for identifiers).

Other Languages
In Java, UnicodeInputCharacters can represent any character and is mapped in lexical translation step 1. It is possible for every character in the source to appear in this form. The mapping only occurs once, so \u005cu005a becomes \u005a, not Z (005c ...

... represent characters in the basic source character set. The exceptions listed enumerate characters that are in the ASCII character set, but not in the basic source. The disallowed characters are the characters in the basic character set and the code positions reserved by ISO/IEC 10646 for control characters, ... (footnote 62)

Date posted: 20/10/2013, 10:15
