stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

jon
Some minor things in the paper edition that says “3 2020” on the title page verso, not mentioned in the book’s errata online: p. 186 But...
New
johnp
Running the examples in chapter 5 c under pytest 5.4.1 causes an AttributeError: ‘module’ object has no attribute ‘config’. In particula...
New
JohnS
I can’t setup the Rails source code. This happens in a working directory containing multiple (postgres) Rails apps. With: ruby-3.0.0 s...
New
curtosis
Running mix deps.get in the sensor_hub directory fails with the following error: ** (Mix) No SSH public keys found in ~/.ssh. An ssh aut...
New
jskubick
I think I might have found a problem involving SwitchCompat, thumbTint, and trackTint. As entered, the SwitchCompat changes color to hol...
New
jskubick
I’m under the impression that when the reader gets to page 136 (“View Data with the Database Inspector”), the code SHOULD be able to buil...
New
Charles
In general, the book isn’t yet updated for Phoenix version 1.6. On page 18 of the book, the authors indicate that an auto generated of ro...
New
akraut
The markup used to display the uploaded image results in a Phoenix.LiveView.HTMLTokenizer.ParseError error. lib/pento_web/live/product_l...
New
EdBorn
Title: Agile Web Development with Rails 7: (page 70) I am running windows 11 pro with rails 7.0.3 and ruby 3.1.2p20 (2022-04-12 revision...
New
dtonhofer
@parrt In the context of Chapter 4.3, the grammar Java.g4, meant to parse Java 6 compilation units, no longer passes ANTLR (currently 4....
New

Other popular topics Top

Devtalk
Reading something? Working on something? Planning something? Changing jobs even!? If you’re up for sharing, please let us know what you’...
1050 21151 394
New
PragmaticBookshelf
Free and open source software is the default choice for the technologies that run our world, and it’s built and maintained by people like...
New
New
AstonJ
We have a thread about the keyboards we have, but what about nice keyboards we come across that we want? If you have seen any that look n...
New
New
AstonJ
This looks like a stunning keycap set :orange_heart: A LEGENDARY KEYBOARD LIVES ON When you bought an Apple Macintosh computer in the e...
New
PragmaticBookshelf
Tailwind CSS is an exciting new CSS framework that allows you to design your site by composing simple utility classes to create complex e...
New
rustkas
Intensively researching Erlang books and additional resources on it, I have found that the topic of using Regular Expressions is either c...
New
AstonJ
If you want a quick and easy way to block any website on your Mac using Little Snitch simply… File > New Rule: And select Deny, O...
New
AstonJ
This is a very quick guide, you just need to: Download LM Studio: https://lmstudio.ai/ Click on search Type DeepSeek, then select the o...
New

Sub Categories: