stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

ianwillie
Hello Brian, I have some problems with running the code in your book. I like the style of the book very much and I have learnt a lot as...
New
mikecargal
Title: Hands-On Rust (Chapter 11: prefab) Just played a couple of amulet-less games. With a bit of debugging, I believe that your can_p...
New
raul
Page 28: It implements io.ReaderAt on the store type. Sorry if it’s a dumb question but was the io.ReaderAt supposed to be io.ReadAt? ...
New
Charles
In general, the book isn’t yet updated for Phoenix version 1.6. On page 18 of the book, the authors indicate that an auto generated of ro...
New
jonmac
The allprojects block listed on page 245 produces the following error when syncing gradle: “org.gradle.api.GradleScriptException: A prob...
New
EdBorn
Title: Agile Web Development with Rails 7: (page 70) I am running windows 11 pro with rails 7.0.3 and ruby 3.1.2p20 (2022-04-12 revision...
New
a.zampa
@mfazio23 I’m following the indications of the book and arriver ad chapter 10, but the app cannot be compiled due to an error in the Bas...
New
gorkaio
root_layout: {PentoWeb.LayoutView, :root}, This results in the following following error: no “root” html template defined for PentoWeb...
New
davetron5000
Hello faithful readers! If you have tried to follow along in the book, you are asked to start up the dev environment via dx/build and ar...
New
dachristenson
I’ve got to the end of Ch. 11, and the app runs, with all tabs displaying what they should – at first. After switching around between St...
New

Other popular topics Top

dasdom
No chair. I have a standing desk. This post was split into a dedicated thread from our thread about chairs :slight_smile:
New
PragmaticBookshelf
Design and develop sophisticated 2D games that are as much fun to make as they are to play. From particle effects and pathfinding to soci...
New
New
AstonJ
poll poll Be sure to check out @Dusty’s article posted here: An Introduction to Alternative Keyboard Layouts It’s one of the best write-...
New
Exadra37
Oh just spent so much time on this to discover now that RancherOS is in end of life but Rancher is refusing to mark the Github repo as su...
New
DevotionGeo
The V Programming Language Simple language for building maintainable programs V is already mentioned couple of times in the forum, but I...
New
AstonJ
If you’re getting errors like this: psql: error: connection to server on socket “/tmp/.s.PGSQL.5432” failed: No such file or directory ...
New
NewsBot
Node.js v22.14.0 has been released. Link: Release 2025-02-11, Version 22.14.0 'Jod' (LTS), @aduh95 · nodejs/node · GitHub
New
RobertRichards
Hair Salon Games for Girls Fun Girls Hair Saloon game is mainly developed for kids. This game allows users to select virtual avatars to ...
New
PragmaticBookshelf
A concise guide to MySQL 9 database administration, covering fundamental concepts, techniques, and best practices. Neil Smyth MySQL...
New

Sub Categories: