stmpjmpr

stmpjmpr

Programming Ruby 3.3 (5th Edition, p 122): Unicode Character Definition Incorrect

@noelrappin

On page 122, in the discussion of “Strings and Encodings”, it is stated: “A Unicode character is two bytes long, which makes a Unicode string twice as long internally as a Latin-1 string.” This is incorrect. Unicode represents a 21-bit code space; two bytes are not enough to cover it.

The Unicode Standard says, "Unicode characters are represented in one of three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been designed for ease of use with existing ASCII-based systems. "

I’d further argue that this is not just being pedantic: the idea that Unicode can be captured completely in two bytes is one of the more dangerous lingering misconceptions. It leads people to think that UTF-16, for instance, will always be 2-byte units (it is, like UTF-8, a variable-length encoding).

Where Next?

Popular Pragmatic Bookshelf topics Top

iPaul
page 37 ANTLRInputStream input = new ANTLRInputStream(is); as of ANTLR 4 .8 should be: CharStream stream = CharStreams.fromStream(i...
New
jeffmcompsci
Title: Design and Build Great Web APIs - typo “https://company-atk.herokuapp.com/2258ie4t68jv” (page 19, third bullet in URL list) Typo:...
New
nicoatridge
Hi, I have just acquired Michael Fazio’s “Kotlin and Android Development” to learn about game programming for Android. I have a game in p...
New
adamwoolhether
I’m not quite sure what’s going on here, but I’m unable to have to containers successfully complete the Readiness/Liveness checks. I’m im...
New
brunogirin
When running tox for the first time, I got the following error: ERROR: InterpreterNotFound: python3.10 I realised that I was running ...
New
oaklandgit
Hi, I completed chapter 6 but am getting the following error when running: thread 'main' panicked at 'Failed to load texture: IoError(O...
New
akraut
The markup used to display the uploaded image results in a Phoenix.LiveView.HTMLTokenizer.ParseError error. lib/pento_web/live/product_l...
New
jwandekoken
Book: Programming Phoenix LiveView, page 142 (157/378), file lib/pento_web/live/product_live/form_component.ex, in the function below: d...
New
redconfetti
Docker-Machine became part of the Docker Toolbox, which was deprecated in 2020, long after Docker Desktop supported Docker Engine nativel...
New
dachristenson
I’ve got to the end of Ch. 11, and the app runs, with all tabs displaying what they should – at first. After switching around between St...
New

Other popular topics Top

Devtalk
Hello Devtalk World! Please let us know a little about who you are and where you’re from :nerd_face:
New
PragmaticBookshelf
Learn from the award-winning programming series that inspired the Elixir language, and go on a step-by-step journey through the most impo...
New
DevotionGeo
I know that these benchmarks might not be the exact picture of real-world scenario, but still I expect a Rust web framework performing a ...
New
PragmaticBookshelf
Design and develop sophisticated 2D games that are as much fun to make as they are to play. From particle effects and pathfinding to soci...
New
brentjanderson
Bought the Moonlander mechanical keyboard. Cherry Brown MX switches. Arms and wrists have been hurting enough that it’s time I did someth...
New
AstonJ
poll poll Be sure to check out @Dusty’s article posted here: An Introduction to Alternative Keyboard Layouts It’s one of the best write-...
New
New
PragmaticBookshelf
Author Spotlight: Peter Ullrich @PJUllrich Data is at the core of every business, but it is useless if nobody can access and analyze ...
New
First poster: bot
zig/http.zig at 7cf2cbb33ef34c1d211135f56d30fe23b6cacd42 · ziglang/zig. General-purpose programming language and toolchain for maintaini...
New
PragmaticBookshelf
Get the comprehensive, insider information you need for Rails 8 with the new edition of this award-winning classic. Sam Ruby @rubys ...
New

Sub Categories: