stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

johnp
Running the examples in chapter 5 c under pytest 5.4.1 causes an AttributeError: ‘module’ object has no attribute ‘config’. In particula...
New
ianwillie
Hello Brian, I have some problems with running the code in your book. I like the style of the book very much and I have learnt a lot as...
New
herminiotorres
Hi! I know not the intentions behind this narrative when called, on page XI: mount() |> handle_event() |> render() but the correc...
New
alanq
This isn’t directly about the book contents so maybe not the right forum…but in some of the code apps (e.g. turbo/06) it sends a TURBO_ST...
New
adamwoolhether
When trying to generate the protobuf .go file, I receive this error: Unknown flag: --go_opt libprotoc 3.12.3 MacOS 11.3.1 Googling ...
New
adamwoolhether
I’m not quite sure what’s going on here, but I’m unable to have to containers successfully complete the Readiness/Liveness checks. I’m im...
New
hgkjshegfskef
The test is as follows: Scenario: Intersecting a scaled sphere with a ray Given r ← ray(point(0, 0, -5), vector(0, 0, 1)) And s ← sphere...
New
New
jwandekoken
Book: Programming Phoenix LiveView, page 142 (157/378), file lib/pento_web/live/product_live/form_component.ex, in the function below: d...
New
gorkaio
root_layout: {PentoWeb.LayoutView, :root}, This results in the following following error: no “root” html template defined for PentoWeb...
New

Other popular topics Top

dasdom
No chair. I have a standing desk. This post was split into a dedicated thread from our thread about chairs :slight_smile:
New
New
PragmaticBookshelf
Create efficient, elegant software tests in pytest, Python's most powerful testing framework. Brian Okken @brianokken Edited by Kat...
New
AstonJ
Biggest jackpot ever apparently! :upside_down_face: I don’t (usually) gamble/play the lottery, but working on a program to predict the...
New
Maartz
Hi folks, I don’t know if I saw this here but, here’s a new programming language, called Roc Reminds me a bit of Elm and thus Haskell. ...
New
New
Help
I am trying to crate a game for the Nintendo switch, I wanted to use Java as I am comfortable with that programming language. Can you use...
New
PragmaticBookshelf
Author Spotlight: VM Brasseur @vmbrasseur We have a treat for you today! We turn the spotlight onto Open Source as we sit down with V...
New
DevotionGeo
I have always used antique keyboards like Cherry MX 1800 or Cherry MX 8100 and almost always have modified the switches in some way, like...
New
AstonJ
This is cool! DEEPSEEK-V3 ON M4 MAC: BLAZING FAST INFERENCE ON APPLE SILICON We just witnessed something incredible: the largest open-s...
New

Sub Categories: