stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

Alexandr
Hi everyone! There is an error on the page 71 in the book “Programming machine learning from coding to depp learning” P. Perrotta. You c...
New
leba0495
Hello! Thanks for the great book. I was attempting the Trie (chap 17) exercises and for number 4 the solution provided for the autocorre...
New
leonW
I ran this command after installing the sample application: $ cards add do something --owner Brian And got a file not found error: Fil...
New
brunogirin
When I run the coverage example to report on missing lines, I get: pytest --cov=cards --report=term-missing ch7 ERROR: usage: pytest [op...
New
brunogirin
When installing Cards as an editable package, I get the following error: ERROR: File “setup.py” not found. Directory cannot be installe...
New
hazardco
On page 78 the following code appears: <%= link_to ‘Destroy’, product, class: ‘hover:underline’, method: :delete, data: { confirm...
New
taguniversalmachine
Hi, I am getting an error I cannot figure out on my test. I have what I think is the exact code from the book, other than I changed “us...
New
andreheijstek
After running /bin/setup, the first error was: The foreman' command exists in these Ruby versions: That was easy to fix: gem install fore...
New
dtonhofer
@parrt In the context of Chapter 4.3, the grammar Java.g4, meant to parse Java 6 compilation units, no longer passes ANTLR (currently 4....
New
davetron5000
Hello faithful readers! If you have tried to follow along in the book, you are asked to start up the dev environment via dx/build and ar...
New

Other popular topics Top

PragmaticBookshelf
Machine learning can be intimidating, with its reliance on math and algorithms that most programmers don't encounter in their regular wor...
New
Exadra37
Please tell us what is your preferred monitor setup for programming(not gaming) and why you have chosen it. Does your monitor have eye p...
New
Rainer
My first contact with Erlang was about 2 years ago when I used RabbitMQ, which is written in Erlang, for my job. This made me curious and...
New
AstonJ
Thanks to @foxtrottwist’s and @Tomas’s posts in this thread: Poll: Which code editor do you use? I bought Onivim! :nerd_face: https://on...
New
AstonJ
If you are experiencing Rails console using 100% CPU on your dev machine, then updating your development and test gems might fix the issu...
New
AstonJ
Saw this on TikTok of all places! :lol: Anyone heard of them before? Lite:
New
AstonJ
Was just curious to see if any were around, found this one: I got 51/100: Not sure if it was meant to buy I am sure at times the b...
New
New
PragmaticBookshelf
Programming Ruby is the most complete book on Ruby, covering both the language itself and the standard library as well as commonly used t...
New
AstonJ
Curious what kind of results others are getting, I think actually prefer the 7B model to the 32B model, not only is it faster but the qua...
New

Sub Categories: