Hacker Times
vardump on May 16, 2018 | on: Validating UTF-8 strings using as little as 0.7 cy...
> I think it would be faster to OR the entire string with itself, then finally check the 8th bit though.
The string could have NUL (zero) bytes in between.
zbjornson on May 16, 2018
You're right that it changes the behavior compared to the current implementation, but 0x0 is a valid ASCII character.
vardump on May 16, 2018
While you're technically right that NUL is part of the ASCII set, in practice it's rarely wanted in the data.
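For anyone following along, here is a rough sketch of the OR idea being discussed, in plain Python rather than SIMD. It is not the article's implementation; the function names `is_ascii_or_reduce` and `is_ascii_no_nul` are made up for illustration. It shows why NUL bytes come up: a zero byte contributes nothing to the OR accumulator, so the check accepts it unless you test for it separately.

```python
def is_ascii_or_reduce(data: bytes) -> bool:
    """OR every byte together, then test the high (8th) bit once at the end.

    Hypothetical helper: any byte >= 0x80 sets bit 7 in the accumulator,
    so a clear bit 7 means every byte was in the 0x00..0x7F range.
    """
    acc = 0
    for b in data:
        acc |= b
    return (acc & 0x80) == 0


def is_ascii_no_nul(data: bytes) -> bool:
    """Same check, but also reject embedded NUL bytes.

    Hypothetical stricter variant for callers that do not want 0x00 in the data.
    """
    return is_ascii_or_reduce(data) and 0 not in data


print(is_ascii_or_reduce(b"hello"))               # plain ASCII
print(is_ascii_or_reduce(b"a\x00b"))              # NUL does not set the high bit
print(is_ascii_or_reduce("é".encode("utf-8")))    # multi-byte UTF-8 sets the high bit
print(is_ascii_no_nul(b"a\x00b"))                 # stricter variant rejects NUL
```

Whether NUL should pass is a policy choice; the stricter variant costs one extra scan in this sketch.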