stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

jimschubert
In Chapter 3, the source for index introduces Config on page 31, followed by more code including tests; Config isn’t introduced until pag...
New
New
jdufour
Hello! On page xix of the preface, it says there is a community forum "… for help if your’re stuck on one of the exercises in this book… ...
New
leonW
I ran this command after installing the sample application: $ cards add do something --owner Brian And got a file not found error: Fil...
New
curtosis
Running mix deps.get in the sensor_hub directory fails with the following error: ** (Mix) No SSH public keys found in ~/.ssh. An ssh aut...
New
brunogirin
When I run the coverage example to report on missing lines, I get: pytest --cov=cards --report=term-missing ch7 ERROR: usage: pytest [op...
New
brunogirin
When installing Cards as an editable package, I get the following error: ERROR: File “setup.py” not found. Directory cannot be installe...
New
New
hazardco
On page 78 the following code appears: <%= link_to ‘Destroy’, product, class: ‘hover:underline’, method: :delete, data: { confirm...
New
bjnord
Hello @herbert ! Trying to get the very first “Hello, Bracket Terminal!" example to run (p. 53). I develop on an Amazon EC2 instance runn...
New

Other popular topics Top

Exadra37
I am thinking in building or buy a desktop computer for programing, both professionally and on my free time, and my choice of OS is Linux...
New
AstonJ
poll poll Be sure to check out @Dusty’s article posted here: An Introduction to Alternative Keyboard Layouts It’s one of the best write-...
New
AstonJ
Do the test and post your score :nerd_face: :keyboard: If possible, please add info such as the keyboard you’re using, the layout (Qw...
New
AstonJ
Biggest jackpot ever apparently! :upside_down_face: I don’t (usually) gamble/play the lottery, but working on a program to predict the...
New
AstonJ
If you want a quick and easy way to block any website on your Mac using Little Snitch simply… File > New Rule: And select Deny, O...
New
husaindevelop
Inside our android webview app, we are trying to paste the copied content from another app eg (notes) using navigator.clipboard.readtext ...
New
CommunityNews
A Brief Review of the Minisforum V3 AMD Tablet. Update: I have created an awesome-minisforum-v3 GitHub repository to list information fo...
New
PragmaticBookshelf
Explore the power of Ash Framework by modeling and building the domain for a real-world web application. Rebecca Le @sevenseacat and ...
New
AstonJ
This is cool! DEEPSEEK-V3 ON M4 MAC: BLAZING FAST INFERENCE ON APPLE SILICON We just witnessed something incredible: the largest open-s...
New
AstonJ
Curious what kind of results others are getting, I think actually prefer the 7B model to the 32B model, not only is it faster but the qua...
New

Sub Categories: