Hacker Times
CRConrad on June 6, 2024 | on: How to chop off bytes of an UTF-8 string to fit in...
I'm thinking even bog-standard European umlauts, cedillas, etc. go multi-byte in UTF-8? (Take a string of ÅÄÖåäöÜü and chop it off at various byte limits and see.)
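The experiment suggested above can be sketched in a few lines of Python. This is only an illustration, assuming the letters are in precomposed (NFC) form, where each of them takes two bytes in UTF-8. The helper name `truncate_utf8` is made up for this sketch, and relying on `errors="ignore"` to drop a cut-off trailing sequence is just one possible approach.

```python
import unicodedata

# Normalize to NFC so every letter is a single precomposed code point.
sample = unicodedata.normalize("NFC", "ÅÄÖåäöÜü")
encoded = sample.encode("utf-8")  # 8 letters, 2 bytes each


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Hypothetical helper: cut text to at most max_bytes of UTF-8.

    The input is valid UTF-8, so the only invalid part after slicing can be
    an incomplete sequence at the very end, which errors="ignore" drops.
    """
    raw = text.encode("utf-8")[:max_bytes]
    return raw.decode("utf-8", errors="ignore")


def naive_chop_fails(text: str, max_bytes: int) -> bool:
    """Return True if a plain byte slice no longer decodes as UTF-8."""
    try:
        text.encode("utf-8")[:max_bytes].decode("utf-8")
        return False
    except UnicodeDecodeError:
        return True


if __name__ == "__main__":
    for limit in range(0, 6):
        print(limit, naive_chop_fails(sample, limit), repr(truncate_utf8(sample, limit)))
```

With this input, every odd byte limit lands in the middle of a two-byte sequence, so the plain slice fails to decode while the helper falls back to the previous whole letter.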
gmueckl on June 6, 2024
This is just the general behavior of truncating strings by code point when they contain decomposed characters (a base letter followed by combining marks). This can also affect accents, etc.
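To make the decomposed case concrete, here is a rough Python sketch. It assumes NFD-normalized input, and both function names are hypothetical. The second function only backs off over code points with a non-zero canonical combining class, so it is not full grapheme cluster segmentation (it would not keep emoji sequences together, for example).

```python
import unicodedata

# "Über" in decomposed form: U, U+0308 (combining diaeresis), b, e, r.
word = unicodedata.normalize("NFD", "\u00dcber")


def truncate_code_points(text: str, max_cp: int) -> str:
    """Plain truncation by code point; may strip marks off a base letter."""
    return text[:max_cp]


def truncate_keep_marks(text: str, max_cp: int) -> str:
    """Hypothetical helper: truncate by code point, but never cut between a
    base character and the combining marks that follow it."""
    if len(text) <= max_cp:
        return text
    cut = max_cp
    # If the first dropped code point is a combining mark, the cut would
    # split a sequence, so move the cut back to before the base character.
    while cut > 0 and unicodedata.combining(text[cut]) != 0:
        cut -= 1
    return text[:cut]


if __name__ == "__main__":
    print(repr(truncate_code_points(word, 1)))  # bare "U", diaeresis lost
    print(repr(truncate_keep_marks(word, 1)))   # empty, the letter does not fit
    print(repr(truncate_keep_marks(word, 2)))   # "U" plus its diaeresis
```

The trade-off is that the mark-aware version may return fewer code points than the limit allows, since it drops a letter entirely rather than showing it without its accent.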
panzi on June 6, 2024
I don't remember the details, only that it was a bigger deal than with umlauts. I'll see if I can find the talk again.