It's funny, but the JS specification says that strings use UTF-16 (originally UCS-2) as their internal representation. This means any code point above U+FFFF is stored as a surrogate pair of two 16-bit code units... I kind of wish they'd just switched to UTF-8 around ECMAScript 3, before it was too entrenched; now we're effectively stuck with it.
String lengths would still be surprising with UTF-8, but at least slightly more predictable... String.prototype.normalize is an example of all that is wrong in this space... it's really wild.
It's also really important to be aware of this and normalize before doing password hashing in your applications — otherwise a user who types the "same" password on two devices that compose characters differently can get locked out.