That's the abstraction they want you to work with, yes. That doesn't mean it's w...

jsnell · on Feb 20, 2024

I've linked to public documentation that is pretty clearly in conflict with what you said. There's no wiggle room in how AWS describes their service without it being false advertising. There's no "ah, but what if we define the entire building to be the host computer, then the networked SSDs really are inside the host computer" sleight of hand to pull off here.

You've provided cryptic hints and a suggestion to watch some unnamed presentation.

At this point I really think the burden of proof is on you.

stingraycharles · on Feb 21, 2024

You are correct, and the parent you’re replying to is confused. Nitro is for EBS, not the i3 local NVMe instances.

Those i3 instances lose your data whenever you stop and start them again (ie migrate to a different host machine), there’s absolutely no reason they would use network.

EBS itself uses a different network than the “normal” internet, if I were to guess it’s a converged Ethernet network optimized for iSCSI. Which is what Nitro optimizes for as well. But it’s not relevant for the local NVMe storage.

MichaelZuo · on Feb 21, 2024

The argument could also be resolved by just getting the latency numbers for both cases and compare them, on bare metal it shouldn't be more than a few hundred nanoseconds.

fcsp · on Feb 21, 2024

I see wiggle room in the statement you posted in that the SSD storage that is physically inside the machine hosting the instance might be mounted into the hypervised instance itself via some kind of network protocol still, adding overhead.

eek2121 · on Feb 21, 2024

At minimum, the entire setup will be virtualized, which does add overhead.

jng · on Feb 20, 2024

Nitro "virtual NVME" device are mostly (only?) for EBS -- remote network storage, transparently managed, using a separate network backbone, and presented to the host as a regular local NVME device. SSD drives in instances such as i4i, etc. are physically attached in a different way -- but physically, unlike EBS, they are ephemeral and the content becomes unavaiable as you stop the instance, and when you restart, you get a new "blank slate". Their performance is 1 order of magnitude faster than standard-level EBS, and the cost structure is completely different (and many orders of magnitude more affordable than EBS volumes configured to have comparable I/O performance).

rcarmo · on Feb 21, 2024

This is the way Azure temporary volumes work as well. They are scrubbed off the hardware once the VM that accesses them is dead. Everything else is over the network.

jasonwatkinspdx · on Feb 20, 2024

Both the documentation and Amazon employees are in here telling you that you're wrong. Can you resolve that contradiction or do you just want to act coy like you know some secret? The latter behavior is not productive.

stingraycharles · on Feb 21, 2024

The parent thinks that AWS' i3 NVMe local instance storage is using a PCIe switch, which is not the case. EBS (and the AWS Nitro card) use a PCIe switch, and as such all EBS storage is exposed as e.g. /dev/nvmeXnY . But that's not the same as the i3 instances are offering, so the parent is confused.

dekhn · on Feb 20, 2024

it sounds like you're trying to say "PCI switch" without saying "PCI switch" (I worked at Google for over a decade, including hardware division).

pclmulqdq · on Feb 21, 2024

That is what I am trying to say without actually giving it out. PCIe switches are very much not transparent devices. Apparently AWS has not published anything about this, and doesn't have Nitro moderating access to "local" SSD, though - that I did get confused with EBS.

pzb · on Feb 21, 2024

AWS has stated that there is a "Nitro Card for Instance Storage"[0][1] which is a NVMe PCIe controller that implements transparent encryption[2].

I don't have access to an EC2 instance to check, but you should be able to see the PCIe topology to determine how many physical cards are likely in i4i and im4gn and their PCIe connections. i4i claims to have 8 x 3,750 AWS Nitro SSD, but it isn't clear how many PCIe lanes are used.

Also, AWS claims "Traditionally, SSDs maximize the peak read and write I/O performance. AWS Nitro SSDs are architected to minimize latency and latency variability of I/O intensive workloads [...] which continuously read and write from the SSDs in a sustained manner, for fast and more predictable performance. AWS Nitro SSDs deliver up to 60% lower storage I/O latency and up to 75% reduced storage I/O latency variability [...]"

This could explain the findings in the article - they only meared peak r/w, not predictability.

[0] https://perspectives.mvdirona.com/2019/02/aws-nitro-system/ [1] https://aws.amazon.com/ec2/nitro/ [2] https://d1.awsstatic.com/events/reinvent/2019/REPEAT_2_Power...

rowanG077 · on Feb 21, 2024

Why are you acting as if PCIe switches are some secret technology? It was extremely grating for me to read your comments.

the-rc · on Feb 21, 2024

Although it used them for years, the first mention by Google of PCIe switches was probably in the 2022 Aquila paper, which doesn't really talk about storage anyway...

rowanG077 · on Feb 21, 2024

I don't understand why you would expect Google to state that. They have been standard technology for almost 2 decades. You don't see google claiming they use jtag or using SPI flash or whatever. It's just not special.

the-rc · on Feb 22, 2024

Google didn't invent the Clos network, either, but it took years before they started talking about its adoption and with what kind of proprietary twists. Same with power supplies. You're right, a PCIe switch is not special, unless maybe it's integrated in some unconventional way. It's in Google's DNA to be cagey by default on a lot of details, to avoid giving ideas to the competition. Or misleading others down rabbit holes, like with shipping container datacenters.

sitkack · on Feb 22, 2024

No, it dismisses technology until it does a 180 and then pretends it innovated in ways everyone is too stupid to understand. Google exceptionalism 101.

stingraycharles · on Feb 21, 2024

Because the parent works/worked for Google, so obviously it must be super secret sauce that nobody has heard of. /s

Next up they’re going explain to us that iSCSI wants us to think it’s SCSI but it’s actually not!

dekhn · on Feb 21, 2024

Like many other people in this thread, I think we disagree that a PCI switch means that an SSD "is connected over a network" to the host bus.

Now if you can show me two or more hosts connected to a box of SSDs through a PCI switch (and some sort of cool tech for coordinating between the hosts), that's interesting.