Injection attacks often rely on special characters such as the single quote (') and the double quote ("). For safety, developers often use the escape character “\” to neutralize these special characters, but when the database uses a wide character set, this can lead to unexpected vulnerabilities. For example, when MySQL uses GBK encoding, a byte pair such as 0xbf 5c is treated as one (double-byte) character.
Before the data reaches the database, the web language does not take double-byte characters into account and treats one double-byte character as two separate bytes. For example, in PHP, when the addslashes() function is applied or magic_quotes_gpc is turned on, the escape character “\” is added before each special character.
The addslashes() function will escape the four characters:
Description
string addslashes ( string $str )
Returns a string with backslashes before characters that need to be quoted in database queries etc. These characters are single quote ('), double quote ("), backslash (\) and NUL (the NULL byte).
Therefore, if the attacker inputs
0xbf27 or 1=1
then after escaping, the leading bytes become 0xbf 5c 27 (the ASCII code of “\” is 0x5c). However, 0xbf 5c is itself another character, so the escaping “\” is absorbed into that character in the database, and the single quote 0x27 is left unescaped.
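The byte-level effect can be reproduced outside PHP and MySQL. The following is a minimal Python sketch, not the PHP implementation itself: `addslashes_bytes` is a hypothetical helper that imitates the documented behavior of addslashes() on raw bytes, and Python's `gbk` codec stands in for the database's view of the data.

```python
def addslashes_bytes(data: bytes) -> bytes:
    """Hypothetical byte-wise imitation of PHP's addslashes()."""
    out = bytearray()
    for b in data:
        if b == 0x00:
            out += b"\\0"          # NUL is written as backslash followed by '0'
        elif b in (0x27, 0x22, 0x5C):
            out.append(0x5C)       # prefix ', " and \ with a backslash
            out.append(b)
        else:
            out.append(b)
    return bytes(out)


# Attacker input: the byte 0xbf followed by a single quote and the payload.
payload = b"\xbf\x27 or 1=1"

# The escaping layer works on single bytes and inserts 0x5c before 0x27.
escaped = addslashes_bytes(payload)   # b"\xbf\x5c\x27 or 1=1"

# A GBK-aware layer reads 0xbf 0x5c as ONE character, so the backslash
# no longer protects the quote that follows it.
decoded = escaped.decode("gbk")
quote_is_unescaped = decoded[1] == "'" and "\\" not in decoded
```

Here `quote_is_unescaped` being true corresponds to the situation described above: the layer that understands GBK sees a wide character followed by a bare single quote.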
To solve this problem, the database, the operating system, and the web application need to use a consistent character set, so that the layers do not interpret the same bytes differently. Using UTF-8 throughout is a good way to do this.
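One reason UTF-8 helps is that a backslash byte (0x5c) can never be the second byte of a multi-byte UTF-8 sequence, whereas it can be in GBK. The sketch below checks this with Python's built-in codecs; `backslash_can_be_absorbed` is a hypothetical helper name, and the check only covers two-byte combinations.

```python
def backslash_can_be_absorbed(charset: str) -> bool:
    """Return True if some high byte followed by 0x5c decodes as ONE character."""
    for lead in range(0x80, 0x100):
        try:
            decoded = bytes([lead, 0x5C]).decode(charset)
        except UnicodeDecodeError:
            # Not a valid sequence in this charset, so nothing is absorbed.
            continue
        if len(decoded) == 1:
            return True
    return False


gbk_vulnerable = backslash_can_be_absorbed("gbk")
utf8_vulnerable = backslash_can_be_absorbed("utf-8")
```

Under these assumptions the GBK check finds such a pair and the UTF-8 check does not, which is why an inserted escape character cannot be swallowed when every layer consistently uses UTF-8.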
Character set attacks are not limited to SQL injection. The problem can occur wherever data is parsed. XSS, for example, can be triggered through character sets when the browser and the server assume different character encodings. The solution is to specify the charset of the current page in a <meta> tag in the HTML page. If, for whatever reason, you cannot use Unicode, you need a security function that filters or escapes input, and it must take the possible range of characters into account.
Depending on the character set in use, filters with different allowed ranges can be set to ensure security.
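A charset-aware filter of this kind could look roughly like the following Python sketch. It is an illustration under stated assumptions rather than a recommended production routine: `escape_in_charset` is a hypothetical function, the default charset is assumed to be GBK, and the set of escaped characters simply mirrors the four characters handled by addslashes().

```python
def escape_in_charset(raw: bytes, charset: str = "gbk") -> bytes:
    """Decode first, then escape characters (not bytes), then re-encode."""
    # Strict decoding rejects byte sequences outside the charset's valid
    # range, such as 0xbf 0x27 in GBK, instead of passing them through.
    text = raw.decode(charset, errors="strict")

    # Escaping operates on whole characters, so a 0x5c byte that is part of
    # a wide character is never mistaken for a standalone backslash.
    replacements = {"'": "\\'", '"': '\\"', "\\": "\\\\", "\x00": "\\0"}
    escaped = "".join(replacements.get(ch, ch) for ch in text)
    return escaped.encode(charset)


# Invalid input for the assumed charset is refused outright.
try:
    escape_in_charset(b"\xbf\x27 or 1=1")
    rejected = False
except UnicodeDecodeError:
    rejected = True

# A valid wide character followed by a quote keeps its escape character.
safe = escape_in_charset(b"\xbf\x5c\x27")
```

The design choice here is to make the filter agree with the database about where one character ends and the next begins, which is the consistency requirement described above.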