MySQL UTF-8 Encoding Bug
Published on 14 Nov 2022
TL;DR: MySQL's utf8 encoding was one byte short! It stores at most 3 bytes per character, while real UTF-8 needs up to 4.
While working with a LAMP stack in 2015, I encountered something weird. Some emojis worked while some didn't 🫨. Emojis were not so popular back then, and the stakeholders were not pleased to see me working on them as they didn't solve any business problems. I loved emojis like 🐙; they express emotions. Typing on slow Android phones was a hassle back then, and a single emoji conveyed more.
TIL, the issue was in the MySQL character set. Although it was set to utf8, MySQL's utf8 is an alias for utf8mb3, which stores at most 3 bytes per character. Most emojis take 4 bytes in UTF-8, so to store them you need to use utf8mb4. This should have been mentioned in MySQL's documentation.
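To see why some emojis worked and others didn't, here is a rough Python sketch that counts how many UTF-8 bytes each character needs. The helper names max_utf8_bytes and fits_utf8mb3 are made up for illustration and are not part of MySQL or any library. The check only mirrors the 3-byte limit described above; it does not talk to a database.

```python
def max_utf8_bytes(text: str) -> int:
    """Return the largest number of UTF-8 bytes used by any single character."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_utf8mb3(text: str) -> bool:
    """Hypothetical helper: True if every character needs at most 3 UTF-8 bytes."""
    return max_utf8_bytes(text) <= 3


if __name__ == "__main__":
    # U+263A (☺) sits in the Basic Multilingual Plane, U+1F419 (🐙) does not.
    for sample in ["hello", "é", "☺", "🐙"]:
        print(repr(sample), max_utf8_bytes(sample), fits_utf8mb3(sample))
```

Older symbols such as ☺ need only 3 bytes and so fit within a 3-byte limit, while 🐙 needs 4. That is probably the split I was seeing, though I have not gone back to confirm it against the old database.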
Who should care?
Anyone working with legacy MySQL databases (around version 5).
Photo by Denis Cherkashin on Unsplash