C language【1】

character set

In C, the character set is the collection of characters that may appear in the source code of a program. The C standard defines these characters, which include the following:


C Language Character Set

1. Letters (Alphabetic Characters)
  • Uppercase letters: A to Z
  • Lowercase letters: a to z
2. Digits
  • 0 to 9
3. Special Characters

These include punctuation marks, operators, and other symbols used in C programming:

  • Arithmetic operators: +, -, *, /, %
  • Relational operators: <, >, <=, >=, ==, !=
  • Logical operators: &&, ||, !
  • Bitwise operators: &, |, ^, ~, <<, >>
  • Assignment operators: =, +=, -=, *=, /=, etc.
  • Other special symbols: ; : , . [ ] { } ( ) # \ ? " ' _
  • @ has no role in C syntax and belongs to the basic character set only from C23 onward.
4. White Spaces

Whitespace characters include:

  • Space ( )
  • Horizontal tab (\t)
  • Vertical tab (\v)
  • Form feed (\f)
  • Newline (\n)
  • Carriage return (\r)
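
The whitespace characters listed above are the ones that the standard library function isspace() reports as whitespace in the default "C" locale. The following is a minimal sketch of that check; the helper name count_whitespace is hypothetical and not part of any library.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the whitespace characters in a string. */
size_t count_whitespace(const char *s) {
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        /* Cast to unsigned char: passing a negative value to isspace()
           is undefined behavior. */
        if (isspace((unsigned char)*s)) {
            ++count;
        }
    }
    return count;
}
```

For instance, count_whitespace("a b\tc") returns 2 (one space and one horizontal tab).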
5. Escape Sequences

Escape sequences are used to represent certain special characters within string or character literals:

  • \n – Newline
  • \t – Tab
  • \\ – Backslash
  • \' – Single quote
  • \" – Double quote
  • \0 – Null character
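
Each escape sequence above denotes a single character, even though it takes two characters to write in source code. As an illustrative sketch, the hypothetical helper escape_spelling below maps a character back to the way it is spelled in source.

```c
#include <stddef.h>

/* Hypothetical helper: return the source-code spelling of a character
   that is normally written as an escape sequence, or NULL if the
   character is not one of the escapes handled here. */
const char *escape_spelling(char c) {
    switch (c) {
    case '\n': return "\\n";   /* newline */
    case '\t': return "\\t";   /* horizontal tab */
    case '\\': return "\\\\";  /* backslash */
    case '\'': return "\\'";   /* single quote */
    case '\"': return "\\\"";  /* double quote */
    case '\0': return "\\0";   /* null character */
    default:   return NULL;
    }
}
```

Because an escape sequence is one character, sizeof("a\tb") is 4: three characters plus the terminating null.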
6. Universal Character Names

Since C99, C supports Unicode via universal character names:

  • \uXXXX – Unicode character with 4 hexadecimal digits.
  • \UXXXXXXXX – Unicode character with 8 hexadecimal digits.
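
A universal character name can be written inside a string literal. The sketch below assumes a C11 or later compiler, because it uses the u8 prefix to request UTF-8 encoding; without that prefix the encoding of the literal is implementation-defined. The function names are hypothetical.

```c
#include <stddef.h>

/* Bytes needed to store "café" as UTF-8, excluding the terminating null.
   \u00E9 names U+00E9 (é), which UTF-8 encodes in two bytes. */
size_t cafe_utf8_length(void) {
    return sizeof(u8"caf\u00E9") - 1;
}

/* Bytes needed to store U+1F600 as UTF-8, excluding the terminating null.
   Code points above U+FFFF need the eight-digit \U form. */
size_t emoji_utf8_length(void) {
    return sizeof(u8"\U0001F600") - 1;
}
```

Here cafe_utf8_length() returns 5 and emoji_utf8_length() returns 4, since one universal character name can expand to several bytes.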
7. Digraphs and Trigraphs
  • Digraphs: Alternative two-character spellings for some symbols. Examples: <: for [, :> for ], <% for {, %> for }, %: for #.
  • Trigraphs: Three-character sequences beginning with ?? that stand for certain symbols. Example: ??= for #. Trigraphs were removed from the language in C23.
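
To show that digraphs are ordinary tokens, here is a small sketch of a function written entirely with them. The function name sum_with_digraphs and the macro COUNT are hypothetical; real code rarely uses digraphs, so this is for illustration only.

```c
/* %: is the digraph for #, so this line is a normal #define. */
%:define COUNT 3

/* Sum a small array. <% %> stand for { }, and <: :> stand for [ ]. */
int sum_with_digraphs(void) <%
    int values<:COUNT:> = <% 1, 2, 3 %>;
    int total = 0;
    for (int i = 0; i < COUNT; ++i) <%
        total += values<:i:>;
    %>
    return total;
%>
```

The compiler treats this exactly like the same function written with #, {, }, [ and ], so sum_with_digraphs() returns 6.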

Example Usage in C

#include <stdio.h>

int main(void) {
    // Using various characters in a C program
    char letter = 'A';     // Letter
    char digit = '5';      // Digit
    char symbol = '#';     // Special character
    char newline = '\n';   // Escape sequence

    // The final %c prints the newline variable, so every variable is used
    printf("Letter: %c\nDigit: %c\nSymbol: %c%c", letter, digit, symbol, newline);
    return 0;
}

Important Notes

  1. C is case-sensitive: A and a are different characters, so identifiers such as count and Count name different things.
  2. Character constants in C are enclosed in single quotes ('A'), while string literals are enclosed in double quotes ("A").
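  3. Which characters are available beyond the basic set depends on the compiler and platform, because the source and execution character sets are implementation-defined. ASCII and Unicode encodings such as UTF-8 are the most common, and modern compilers typically accept UTF-8 source files.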
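
The first two notes can be checked with a short sketch. The function names below are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* "A" is an array of two chars: 'A' plus the terminating null character. */
size_t string_literal_size(void) {
    return sizeof("A");
}

/* 'A' and 'a' are different characters, so this comparison yields 1. */
int differ_by_case(void) {
    return 'A' != 'a';
}

/* Identifiers that differ only in case are separate variables. */
int case_sensitive_identifiers(void) {
    int total = 1;
    int Total = 2;  /* not the same variable as total */
    return total + Total;
}
```

Here string_literal_size() returns 2, differ_by_case() returns 1, and case_sensitive_identifiers() returns 3. Note that in C a character constant such as 'A' has type int, so sizeof('A') equals sizeof(int) rather than 1.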

References

  1. ChatGPT