stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

iPaul
page 37 ANTLRInputStream input = new ANTLRInputStream(is); as of ANTLR 4 .8 should be: CharStream stream = CharStreams.fromStream(i...
New
ianwillie
Hello Brian, I have some problems with running the code in your book. I like the style of the book very much and I have learnt a lot as...
New
simonpeter
When I try the command to create a pair of migration files I get an error. user=> (create-migration "guestbook") Execution error (Ill...
New
AleksandrKudashkin
On the page xv there is an instruction to run bin/setup from the main folder. I downloaded the source code today (12/03/21) and can’t see...
New
AndyDavis3416
@noelrappin Running the webpack dev server, I receive the following warning: ERROR in tsconfig.json TS18003: No inputs were found in c...
New
oaklandgit
Hi, I completed chapter 6 but am getting the following error when running: thread 'main' panicked at 'Failed to load texture: IoError(O...
New
akraut
The markup used to display the uploaded image results in a Phoenix.LiveView.HTMLTokenizer.ParseError error. lib/pento_web/live/product_l...
New
rainforest
Hi, I’ve got a question about the implementation of PubSub when using a Phoenix.Socket.Transport behaviour rather than channels. Before ...
New
andreheijstek
After running /bin/setup, the first error was: The foreman' command exists in these Ruby versions: That was easy to fix: gem install fore...
New
New

Other popular topics Top

PragmaticBookshelf
Machine learning can be intimidating, with its reliance on math and algorithms that most programmers don't encounter in their regular wor...
New
ohm
Which, if any, games do you play? On what platform? I just bought (and completed) Minecraft Dungeons for my Nintendo Switch. Other than ...
New
brentjanderson
Bought the Moonlander mechanical keyboard. Cherry Brown MX switches. Arms and wrists have been hurting enough that it’s time I did someth...
New
AstonJ
I ended up cancelling my Moonlander order as I think it’s just going to be a bit too bulky for me. I think the Planck and the Preonic (o...
New
PragmaticBookshelf
Use WebRTC to build web applications that stream media and data in real time directly from one user to another, all in the browser. ...
New
New
PragmaticBookshelf
Author Spotlight Rebecca Skinner @RebeccaSkinner Welcome to our latest author spotlight, where we sit down with Rebecca Skinner, auth...
New
AstonJ
If you want a quick and easy way to block any website on your Mac using Little Snitch simply… File > New Rule: And select Deny, O...
New
PragmaticBookshelf
Author Spotlight: VM Brasseur @vmbrasseur We have a treat for you today! We turn the spotlight onto Open Source as we sit down with V...
New
AstonJ
Curious what kind of results others are getting, I think actually prefer the 7B model to the 32B model, not only is it faster but the qua...
New

Sub Categories: