stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

abtin
page 20: … protoc command… I had to additionally run the following go get commands in order to be able to compile protobuf code using go...
New
ianwillie
Hello Brian, I have some problems with running the code in your book. I like the style of the book very much and I have learnt a lot as...
New
mikecargal
Title: Hands-On Rust (Chap 8 (Adding a Heads Up Display) It looks like ​.with_simple_console_no_bg​(SCREEN_WIDTH*2, SCREEN_HEIGHT*2...
New
Mmm
Hi, build fails on: bracket-lib = “~0.8.1” when running on Mac Mini M1 Rust version 1.5.0: Compiling winit v0.22.2 error[E0308]: mi...
New
swlaschin
The book has the same “Problem space/Solution space” diagram on page 18 as is on page 17. The correct Problem/Solution space diagrams ar...
New
leba0495
Hello! Thanks for the great book. I was attempting the Trie (chap 17) exercises and for number 4 the solution provided for the autocorre...
New
leonW
I ran this command after installing the sample application: $ cards add do something --owner Brian And got a file not found error: Fil...
New
fynn
This is as much a suggestion as a question, as a note for others. Locally the SGP30 wasn’t available, so I ordered a SGP40. On page 53, ...
New
jgchristopher
“The ProductLive.Index template calls a helper function, live_component/3, that in turn calls on the modal component. ” Excerpt From: Br...
New
dtonhofer
@parrt In the context of Chapter 4.3, the grammar Java.g4, meant to parse Java 6 compilation units, no longer passes ANTLR (currently 4....
New

Other popular topics Top

PragmaticBookshelf
Take your Go skills to the next level by learning how to design, develop, and deploy a distributed service. Start from the bare essential...
New
brentjanderson
Bought the Moonlander mechanical keyboard. Cherry Brown MX switches. Arms and wrists have been hurting enough that it’s time I did someth...
New
AstonJ
Continuing the discussion from Thinking about learning Crystal, let’s discuss - I was wondering which languages don’t GC - maybe we can c...
New
AstonJ
We’ve talked about his book briefly here but it is quickly becoming obsolete - so he’s decided to create a series of 7 podcasts, the firs...
New
PragmaticBookshelf
Build efficient applications that exploit the unique benefits of a pure functional language, learning from an engineer who uses Haskell t...
New
PragmaticBookshelf
Programming Ruby is the most complete book on Ruby, covering both the language itself and the standard library as well as commonly used t...
New
hilfordjames
There appears to have been an update that has changed the terminology for what has previously been known as the Taskbar Overflow - this h...
New
sir.laksmana_wenk
I’m able to do the “artistic” part of game-development; character designing/modeling, music, environment modeling, etc. However, I don’t...
New
AstonJ
If you’re getting errors like this: psql: error: connection to server on socket “/tmp/.s.PGSQL.5432” failed: No such file or directory ...
New
mindriot
Ok, well here are some thoughts and opinions on some of the ergonomic keyboards I have, I guess like mini review of each that I use enoug...
New

Sub Categories: