stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

jimmykiang
This test is broken right out of the box… — FAIL: TestAgent (7.82s) agent_test.go:77: Error Trace: agent_test.go:77 agent_test.go:...
New
jon
Some minor things in the paper edition that says “3 2020” on the title page verso, not mentioned in the book’s errata online: p. 186 But...
New
iPaul
page 37 ANTLRInputStream input = new ANTLRInputStream(is); as of ANTLR 4 .8 should be: CharStream stream = CharStreams.fromStream(i...
New
telemachus
Python Testing With Pytest - Chapter 2, warnings for “unregistered custom marks” While running the smoke tests in Chapter 2, I get these...
New
jesse050717
Title: Web Development with Clojure, Third Edition, pg 116 Hi - I just started chapter 5 and I am stuck on page 116 while trying to star...
New
Mmm
Hi, build fails on: bracket-lib = “~0.8.1” when running on Mac Mini M1 Rust version 1.5.0: Compiling winit v0.22.2 error[E0308]: mi...
New
swlaschin
The book has the same “Problem space/Solution space” diagram on page 18 as is on page 17. The correct Problem/Solution space diagrams ar...
New
jskubick
I found an issue in Chapter 7 regarding android:backgroundTint vs app:backgroundTint. How to replicate: load chapter-7 from zipfile i...
New
s2k
Hi all, currently I wonder how the Tailwind colours work (or don’t work). For example, in app/views/layouts/application.html.erb I have...
New
dachristenson
I’ve got to the end of Ch. 11, and the app runs, with all tabs displaying what they should – at first. After switching around between St...
New

Other popular topics Top

Devtalk
Hello Devtalk World! Please let us know a little about who you are and where you’re from :nerd_face:
New
AstonJ
What chair do you have while working… and why? Is there a ‘best’ type of chair or working position for developers?
New
AstonJ
Or looking forward to? :nerd_face:
487 12810 260
New
New
New
AstonJ
I have seen the keycaps I want - they are due for a group-buy this week but won’t be delivered until October next year!!! :rofl: The Ser...
New
AstonJ
If you get Can't find emacs in your PATH when trying to install Doom Emacs on your Mac you… just… need to install Emacs first! :lol: bre...
New
New
husaindevelop
Inside our android webview app, we are trying to paste the copied content from another app eg (notes) using navigator.clipboard.readtext ...
New
Margaret
Ask Me Anything with Mark Volkmann @mvolkmann On February 24 and 25, we are giving you a chance to ask questions of PragProg author M...
New

Sub Categories: