Dataflow is a fault-tolerant, deterministic processing framework, engine, and service, not a message queue, so it doesn't "do that" by definition. Wrong product :)
That said, one may order and dedupe the message stream with Dataflow using message metadata, time windows, watermarks and triggers.
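To make that concrete, here is a rough plain-Python sketch of the idea, not Dataflow or Beam code. It assumes each message carries a unique ID and an event timestamp in its metadata; the `Message` type, its field names, and `dedupe_and_order` are all made up for illustration. It only models fixed time windows, and leaves out watermarks and triggers, which in a real pipeline decide when a window is complete enough to emit.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class Message(NamedTuple):
    msg_id: str        # hypothetical unique ID carried in message metadata
    event_time: float  # hypothetical event timestamp, in seconds
    payload: str


def dedupe_and_order(messages: Iterable[Message],
                     window_secs: float = 60.0) -> List[List[Message]]:
    """Bucket messages into fixed event-time windows, drop redelivered
    copies, and sort each window by event time."""
    windows = defaultdict(dict)  # window index -> {msg_id: Message}
    for msg in messages:
        window_index = int(msg.event_time // window_secs)
        # At-least-once delivery may hand us the same message twice;
        # keep only the first copy seen for each ID.
        windows[window_index].setdefault(msg.msg_id, msg)
    # Emit windows in time order, each sorted by (event_time, msg_id).
    return [
        sorted(windows[index].values(), key=lambda m: (m.event_time, m.msg_id))
        for index in sorted(windows)
    ]


if __name__ == "__main__":
    stream = [
        Message("b", 12.0, "second"),
        Message("a", 5.0, "first"),
        Message("b", 12.0, "second"),  # redelivered duplicate
        Message("c", 61.0, "third"),
    ]
    for window in dedupe_and_order(stream):
        print([m.payload for m in window])
```

The catch, and the reason this is a trade-off rather than a free lunch, is that ordering only holds within a window, and you pay latency equal to however long you wait before closing it.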
Sure. I think it is fair to say that AWS does not offer a "Global seamlessly scalable durable message delivery service". What doesn't seem fair to me is to complain that AWS offers too many products, or to ding Kinesis for making normal/understandable engineering trade-offs.
It turns out that while AWS doesn't offer a single "Global seamlessly scalable durable message delivery service", Google doesn't offer a single strictly ordered, at-least-once message delivery service.
Personally, I think that's OK, since a variety of solutions is great for all of us, but it's hard to say one decision is better than the other when they are solving different problems.
I agree with you, but re-reading the original commenter's argument, he was saying that with AWS services you don't get seamless scalability, even though there's SQS, Kinesis, and Firehose. I don't think he was complaining about the number of products; I think he was making the point that most AWS services don't scale seamlessly the way Google services do.
There's an interesting blog post in the works by one of our customers, who "surprised" PubSub with 4.5 million messages per second and kept the test running for about a week. One hell of a load test :)
And this is especially true when looking at the product I work on, BigQuery.
PubSub offers at-least-once delivery semantics.
And I agree with your last statement.