stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

herminiotorres
Hi! I know not the intentions behind this narrative when called, on page XI: mount() |> handle_event() |> render() but the correc...
New
AleksandrKudashkin
On the page xv there is an instruction to run bin/setup from the main folder. I downloaded the source code today (12/03/21) and can’t see...
New
jskubick
I’m running Android Studio “Arctic Fox” 2020.3.1 Patch 2, and I’m embarrassed to admit that I only made it to page 8 before running into ...
New
jskubick
I’m under the impression that when the reader gets to page 136 (“View Data with the Database Inspector”), the code SHOULD be able to buil...
New
hgkjshegfskef
The test is as follows: Scenario: Intersecting a scaled sphere with a ray Given r ← ray(point(0, 0, -5), vector(0, 0, 1)) And s ← sphere...
New
brunogirin
When running tox for the first time, I got the following error: ERROR: InterpreterNotFound: python3.10 I realised that I was running ...
New
taguniversalmachine
It seems the second code snippet is missing the code to set the current_user: current_user: Accounts.get_user_by_session_token(session["...
New
tkhobbes
After some hassle, I was able to finally run bin/setup, now I have started the rails server but I get this error message right when I vis...
New
davetron5000
Hello faithful readers! If you have tried to follow along in the book, you are asked to start up the dev environment via dx/build and ar...
New
roadbike
From page 13: On Python 3.7, you can install the libraries with pip by running these commands inside a Python venv using Visual Studio ...
New

Other popular topics Top

New
PragmaticBookshelf
Learn from the award-winning programming series that inspired the Elixir language, and go on a step-by-step journey through the most impo...
New
brentjanderson
Bought the Moonlander mechanical keyboard. Cherry Brown MX switches. Arms and wrists have been hurting enough that it’s time I did someth...
New
AstonJ
If you get Can't find emacs in your PATH when trying to install Doom Emacs on your Mac you… just… need to install Emacs first! :lol: bre...
New
PragmaticBookshelf
Author Spotlight Mike Riley @mriley This month, we turn the spotlight on Mike Riley, author of Portable Python Projects. Mike’s book ...
New
AstonJ
If you’re getting errors like this: psql: error: connection to server on socket “/tmp/.s.PGSQL.5432” failed: No such file or directory ...
New
PragmaticBookshelf
Get the comprehensive, insider information you need for Rails 8 with the new edition of this award-winning classic. Sam Ruby @rubys ...
New
PragmaticBookshelf
Explore the power of Ash Framework by modeling and building the domain for a real-world web application. Rebecca Le @sevenseacat and ...
New
AstonJ
Curious what kind of results others are getting, I think actually prefer the 7B model to the 32B model, not only is it faster but the qua...
New
NewsBot
Node.js v22.14.0 has been released. Link: Release 2025-02-11, Version 22.14.0 'Jod' (LTS), @aduh95 · nodejs/node · GitHub
New

Sub Categories: