stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

telemachus
Python Testing With Pytest - Chapter 2, warnings for “unregistered custom marks” While running the smoke tests in Chapter 2, I get these...
New
simonpeter
When I try the command to create a pair of migration files I get an error. user=> (create-migration "guestbook") Execution error (Ill...
New
mikecargal
Title: Hands-On Rust (Chap 8 (Adding a Heads Up Display) It looks like ​.with_simple_console_no_bg​(SCREEN_WIDTH*2, SCREEN_HEIGHT*2...
New
jgchristopher
“The ProductLive.Index template calls a helper function, live_component/3, that in turn calls on the modal component. ” Excerpt From: Br...
New
creminology
Skimming ahead, much of the following is explained in Chapter 3, but new readers (like me!) will hit a roadblock in Chapter 2 with their ...
New
s2k
Hi all, currently I wonder how the Tailwind colours work (or don’t work). For example, in app/views/layouts/application.html.erb I have...
New
jwandekoken
Book: Programming Phoenix LiveView, page 142 (157/378), file lib/pento_web/live/product_live/form_component.ex, in the function below: d...
New
bjnord
Hello @herbert ! Trying to get the very first “Hello, Bracket Terminal!" example to run (p. 53). I develop on an Amazon EC2 instance runn...
New
roadbike
From page 13: On Python 3.7, you can install the libraries with pip by running these commands inside a Python venv using Visual Studio ...
New
dachristenson
I’ve got to the end of Ch. 11, and the app runs, with all tabs displaying what they should – at first. After switching around between St...
New

Other popular topics Top

AstonJ
A thread that every forum needs! Simply post a link to a track on YouTube (or SoundCloud or Vimeo amongst others!) on a separate line an...
New
AstonJ
If it’s a mechanical keyboard, which switches do you have? Would you recommend it? Why? What will your next keyboard be? Pics always w...
New
PragmaticBookshelf
Write Elixir tests that you can be proud of. Dive into Elixir’s test philosophy and gain mastery over the terminology and concepts that u...
New
brentjanderson
Bought the Moonlander mechanical keyboard. Cherry Brown MX switches. Arms and wrists have been hurting enough that it’s time I did someth...
New
AstonJ
I’ve been hearing quite a lot of comments relating to the sound of a keyboard, with one of the most desirable of these called ‘thock’, he...
New
AstonJ
Continuing the discussion from Thinking about learning Crystal, let’s discuss - I was wondering which languages don’t GC - maybe we can c...
New
foxtrottwist
A few weeks ago I started using Warp a terminal written in rust. Though in it’s current state of development there are a few caveats (tab...
New
New
PragmaticBookshelf
Author Spotlight: Peter Ullrich @PJUllrich Data is at the core of every business, but it is useless if nobody can access and analyze ...
New
mindriot
Ok, well here are some thoughts and opinions on some of the ergonomic keyboards I have, I guess like mini review of each that I use enoug...
New

Sub Categories: