stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

abtin
page 20: … protoc command… I had to additionally run the following go get commands in order to be able to compile protobuf code using go...
New
GilWright
Working through the steps (checking that the Info,plist matches exactly), run the demo game and what appears is grey but does not fill th...
New
yulkin
your book suggests to use Image.toByteData() to convert image to bytes, however I get the following error: "the getter ‘toByteData’ isn’t...
New
jskubick
I’m running Android Studio “Arctic Fox” 2020.3.1 Patch 2, and I’m embarrassed to admit that I only made it to page 8 before running into ...
New
adamwoolhether
Is there any place where we can discuss the solutions to some of the exercises? I can figure most of them out, but am having trouble with...
New
s2k
Hi all, currently I wonder how the Tailwind colours work (or don’t work). For example, in app/views/layouts/application.html.erb I have...
New
jwandekoken
Book: Programming Phoenix LiveView, page 142 (157/378), file lib/pento_web/live/product_live/form_component.ex, in the function below: d...
New
dtonhofer
@parrt In the context of Chapter 4.3, the grammar Java.g4, meant to parse Java 6 compilation units, no longer passes ANTLR (currently 4....
New
bjnord
Hello @herbert ! Trying to get the very first “Hello, Bracket Terminal!" example to run (p. 53). I develop on an Amazon EC2 instance runn...
New
davetron5000
Hello faithful readers! If you have tried to follow along in the book, you are asked to start up the dev environment via dx/build and ar...
New

Other popular topics Top

PragmaticBookshelf
Stop developing web apps with yesterday’s tools. Today, developers are increasingly adopting Clojure as a web-development platform. See f...
New
Exadra37
Please tell us what is your preferred monitor setup for programming(not gaming) and why you have chosen it. Does your monitor have eye p...
New
AstonJ
There’s a whole world of custom keycaps out there that I didn’t know existed! Check out all of our Keycaps threads here: https://forum....
New
PragmaticBookshelf
Rust is an exciting new programming language combining the power of C with memory safety, fearless concurrency, and productivity boosters...
New
AstonJ
In case anyone else is wondering why Ruby 3 doesn’t show when you do asdf list-all ruby :man_facepalming: do this first: asdf plugin-upd...
New
Exadra37
Oh just spent so much time on this to discover now that RancherOS is in end of life but Rancher is refusing to mark the Github repo as su...
New
foxtrottwist
A few weeks ago I started using Warp a terminal written in rust. Though in it’s current state of development there are a few caveats (tab...
New
AstonJ
Curious what kind of results others are getting, I think actually prefer the 7B model to the 32B model, not only is it faster but the qua...
New
PragmaticBookshelf
Fight complexity and reclaim the original spirit of agility by learning to simplify how you develop software. The result: a more humane a...
New
mindriot
Ok, well here are some thoughts and opinions on some of the ergonomic keyboards I have, I guess like mini review of each that I use enoug...
New

Sub Categories: