Abstract
- A set of rules that determine how text data is sorted and compared in a database, including case sensitivity
Case sensitivity
Database data is typically stored in a case-insensitive manner (e.g.,
"a" = "A"
).For example,
name = 'John'
will match “John"
,"john"
, and other variations in capitalisation.
Important
Case insensitivity affects only how the data is compared or searched, not how it is stored or retrieved.
For example: If the data stored in the database is
John
and you query withSELECT name FROM users WHERE name = 'john';
, the database will returnJohn
because it was stored that way, even though the search was case-insensitive.Therefore, it is important to process the value before insertion. For case-insensitive searches, storing all values in lowercase ensures consistency. For instance, if both
john
andJohn
are stored and you search forjohn
, the database returns both, which can lead to errors if you expect to retrieve only one result. To avoid this, you can standardize by storing all data in lowercase (e.g.,john
) so that the result is consistent regardless of the case used in the query.
Specifying case sensitivity
Use
utf8_general_ci
for case-insensitive storage. In MySQL, it also treats"hello "
and"hello"
as equal!Use
utf8mb4_general_ci
for case-insensitive storage with support for up to 4 bytes per character, accommodating a wider range of Unicode characters, including emojis and rare Chinese characters.