That strikes me as a common hugepages win. People never believe you, though, when you say you can make their thing 20% faster for free.
Then it should be pretty easy to demonstrate that 20% "faster for free", no? But as always, the devil is in the details. I experimented a lot with huge pages, and although in theory you should see a performance boost, the workloads I used to test this hypothesis didn't show anything statistically significant or measurable. So my conclusion was... it depends.

Try a big Factorio map as a test case. It's a bit of an outlier performance-wise; in particular, it's very heavy on memory bandwidth.

Of course, it only helps workloads that exhibit high rates of page table walking per instruction. But those are really common.

Yes, I understand that. The implication is a high TLB miss rate. However, I'm wondering whether the penalty (up to 4 extra memory accesses for a 4-level page table walk, roughly ~20 cycles if the page-table entries are already in L1 cache, or ~60-200 cycles if they're in L2/L3) would be noticeable in workloads that are IO-bound. In other words, would such workloads benefit from switching to huge pages when the CPU spends most of its time waiting for data to arrive from storage anyway?

In a multi-tenant environment, yes. The faster they can get off the CPU and yield to some other tenant, the better it is.
