stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

New
mikecargal
Title: Hands-on Rust: question about get_component (page 295) (feel free to respond. “You dug you’re own hole… good luck”) I have somet...
New
raul
Page 28: It implements io.ReaderAt on the store type. Sorry if it’s a dumb question but was the io.ReaderAt supposed to be io.ReadAt? ...
New
jskubick
I found an issue in Chapter 7 regarding android:backgroundTint vs app:backgroundTint. How to replicate: load chapter-7 from zipfile i...
New
digitalbias
Title: Build a Weather Station with Elixir and Nerves: Problem connecting to Postgres with Grafana on (page 64) If you follow the defau...
New
Charles
In general, the book isn’t yet updated for Phoenix version 1.6. On page 18 of the book, the authors indicate that an auto generated of ro...
New
brunogirin
When trying to run tox in parallel as explained on page 151, I got the following error: tox: error: argument -p/–parallel: expected one...
New
New
taguniversalmachine
It seems the second code snippet is missing the code to set the current_user: current_user: Accounts.get_user_by_session_token(session["...
New
roadbike
From page 13: On Python 3.7, you can install the libraries with pip by running these commands inside a Python venv using Visual Studio ...
New

Other popular topics Top

PragmaticBookshelf
Write Elixir tests that you can be proud of. Dive into Elixir’s test philosophy and gain mastery over the terminology and concepts that u...
New
Exadra37
I am thinking in building or buy a desktop computer for programing, both professionally and on my free time, and my choice of OS is Linux...
New
Exadra37
Please tell us what is your preferred monitor setup for programming(not gaming) and why you have chosen it. Does your monitor have eye p...
New
siddhant3030
I’m thinking of buying a monitor that I can rotate to use as a vertical monitor? Also, I want to know if someone is using it for program...
New
DevotionGeo
I know that -t flag is used along with -i flag for getting an interactive shell. But I cannot digest what the man page for docker run com...
New
AstonJ
I’ve been hearing quite a lot of comments relating to the sound of a keyboard, with one of the most desirable of these called ‘thock’, he...
New
AstonJ
Do the test and post your score :nerd_face: :keyboard: If possible, please add info such as the keyboard you’re using, the layout (Qw...
New
New
New
AstonJ
This is cool! DEEPSEEK-V3 ON M4 MAC: BLAZING FAST INFERENCE ON APPLE SILICON We just witnessed something incredible: the largest open-s...
New

Sub Categories: