stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

mikecargal
Title: Hands-On Rust (Chap 8 (Adding a Heads Up Display) It looks like ​.with_simple_console_no_bg​(SCREEN_WIDTH*2, SCREEN_HEIGHT*2...
New
rmurray10127
Title: Intuitive Python: docker run… denied error (page 2) Attempted to run the docker command in both CLI and Powershell PS C:\Users\r...
New
oaklandgit
Hi, I completed chapter 6 but am getting the following error when running: thread 'main' panicked at 'Failed to load texture: IoError(O...
New
andreheijstek
After running /bin/setup, the first error was: The foreman' command exists in these Ruby versions: That was easy to fix: gem install fore...
New
a.zampa
@mfazio23 I’m following the indications of the book and arriver ad chapter 10, but the app cannot be compiled due to an error in the Bas...
New
dtonhofer
@parrt In the context of Chapter 4.3, the grammar Java.g4, meant to parse Java 6 compilation units, no longer passes ANTLR (currently 4....
New
bjnord
Hello @herbert ! Trying to get the very first “Hello, Bracket Terminal!" example to run (p. 53). I develop on an Amazon EC2 instance runn...
New
davetron5000
Hello faithful readers! If you have tried to follow along in the book, you are asked to start up the dev environment via dx/build and ar...
New
dachristenson
@mfazio23 Android Studio will not accept anything I do when trying to use the Transformations class, as described on pp. 140-141. Googl...
New
New

Other popular topics Top

PragmaticBookshelf
Machine learning can be intimidating, with its reliance on math and algorithms that most programmers don't encounter in their regular wor...
New
Exadra37
I am thinking in building or buy a desktop computer for programing, both professionally and on my free time, and my choice of OS is Linux...
New
DevotionGeo
I know that -t flag is used along with -i flag for getting an interactive shell. But I cannot digest what the man page for docker run com...
New
AstonJ
Continuing the discussion from Thinking about learning Crystal, let’s discuss - I was wondering which languages don’t GC - maybe we can c...
New
PragmaticBookshelf
Use WebRTC to build web applications that stream media and data in real time directly from one user to another, all in the browser. ...
New
AstonJ
Biggest jackpot ever apparently! :upside_down_face: I don’t (usually) gamble/play the lottery, but working on a program to predict the...
New
PragmaticBookshelf
Build efficient applications that exploit the unique benefits of a pure functional language, learning from an engineer who uses Haskell t...
New
CommunityNews
A Brief Review of the Minisforum V3 AMD Tablet. Update: I have created an awesome-minisforum-v3 GitHub repository to list information fo...
New
AstonJ
Curious what kind of results others are getting, I think actually prefer the 7B model to the 32B model, not only is it faster but the qua...
New
NewsBot
Node.js v22.14.0 has been released. Link: Release 2025-02-11, Version 22.14.0 'Jod' (LTS), @aduh95 · nodejs/node · GitHub
New

Sub Categories: