> Every few seconds one of the writes takes forever [~5s]. You can notice the long periods of inactivity, and after that a green dot at the right of the chart: that’s our slow call. What is likely happening is: the local cache saturates and when that happens the application has to wait until the local data is pushed to the remote volume. Boy, you sure don’t want one of your critical code paths to hit one of these slow calls.
I'm surprised that there's no asynchronous way for the FS cache to flush itself, e.g. when it reaches 50% capacity, and to rate-limit incoming requests if it's too full. The idea that an FS cache is so dumb that it can't do anything while it's flushing its entire self is a bit scary - I'd expect that circular buffers and granular locking mechanisms could be used to great effect here. Is this kernel code? Userspace code? Is there research into this? Fundamental tradeoffs that I'm missing?
It would be interesting to see the client/benchmarking program. It almost sounds like it could be single-threaded ... which would mean the delay is an artifact of the benchmark only having one op outstanding, rather than something inherent in the storage layer.
Even with one client thread, though, shouldn't there be a background OS thread maintaining the FS cache and flushing parts of it? I don't think it should block the client just because it decided it was too full.
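To make the idea concrete, here is a rough sketch of the behaviour being asked for: a background thread starts draining the buffer once it passes a soft threshold, and the writer blocks only when the buffer is completely full. This is a toy under stated assumptions, not how any particular kernel or EBS client works; `WriteBehindCache`, `sink`, `capacity` and `flush_at` are all made-up names.

```python
import threading
from collections import deque


class WriteBehindCache:
    """Toy write-behind buffer (hypothetical, for illustration only).

    A background thread drains the buffer while it holds at least
    `flush_at` items; writers block only once `capacity` is reached.
    """

    def __init__(self, sink, capacity=8, flush_at=4):
        self.sink = sink          # assumed: callable that persists one item
        self.capacity = capacity  # hard limit: writers wait here
        self.flush_at = flush_at  # soft limit: background flushing starts here
        self.buf = deque()
        self.cond = threading.Condition()
        self.closed = False
        self.worker = threading.Thread(target=self._flush_loop, daemon=True)
        self.worker.start()

    def write(self, item):
        with self.cond:
            # Backpressure: block only while the buffer is completely full.
            while len(self.buf) >= self.capacity:
                self.cond.wait()
            self.buf.append(item)
            if len(self.buf) >= self.flush_at:
                self.cond.notify_all()  # wake the flusher

    def _flush_loop(self):
        while True:
            with self.cond:
                # Sleep until the soft limit is crossed or we are shutting down.
                while len(self.buf) < self.flush_at and not self.closed:
                    self.cond.wait()
                if not self.buf:
                    return  # closed and fully drained
                item = self.buf.popleft()
                self.cond.notify_all()  # a slot is free: wake blocked writers
            self.sink(item)  # slow I/O happens outside the lock

    def close(self):
        """Flush everything that is still buffered, then stop the worker."""
        with self.cond:
            self.closed = True
            self.cond.notify_all()
        self.worker.join()
```

With something like this, a single-threaded client would only stall when it outruns the sink for long enough to fill the whole buffer, instead of waiting for a full flush every time.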
That's clever and well executed. Wrong palette though :P
Red implies problems, green implies "normality", but here this association is misplaced. Perhaps a typical "fire" palette would be better - from dark brown to red to orange to yellow and, ultimately, to white for the extremes.
OP here. Unfortunately the ansi palette is pretty limited so I didn't have a lot of flexibility in the color choice. That said, this can definitely be improved. I can work on it if people find it useful.
> Unfortunately the ansi palette is pretty limited so I didn't have a lot of flexibility in the color choice.
I believe the issue raised isn't the palette range itself, but that the mapping is the reverse of what is typically expected. The current red area "should" be green, indicating that there are many calls in the fast region, while the current trailing green blocks "should" be red, indicating problems.
Here, "good" is on the left and "bad" is on the right. The color is orthogonal (it gives the number of operations with latencies in a given bucket). For example, a red square on the right side of the output would have definitely been "bad".
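A small sketch of that encoding, in case it helps: position along the row is the latency bucket, and the intensity of each cell is how many operations landed in it. The bucket edges and the ASCII shades standing in for colors are assumptions for illustration, not what the tool actually uses.

```python
from bisect import bisect_right

# Hypothetical bucket boundaries in milliseconds; left = fast, right = slow.
EDGES_MS = [1, 10, 100, 1000]
# Stand-in for color intensity: blank = no samples, '#' = busiest bucket.
SHADES = " .:*#"


def bucket_counts(latencies_ms):
    """Count how many samples fall into each latency bucket."""
    counts = [0] * (len(EDGES_MS) + 1)
    for ms in latencies_ms:
        counts[bisect_right(EDGES_MS, ms)] += 1
    return counts


def render_row(counts):
    """Map each count to a shade, relative to the busiest bucket."""
    peak = max(counts) or 1
    top = len(SHADES) - 1
    # Ceiling division so that any non-empty bucket is at least faintly visible.
    return "".join(SHADES[(c * top + peak - 1) // peak] for c in counts)


if __name__ == "__main__":
    samples = [0.5] * 100 + [5000]  # many fast writes, one ~5s outlier
    print(repr(render_row(bucket_counts(samples))))
```

In this sketch the lone slow call shows up as a faint cell at the far right: its position says "bad", while its intensity only says "rare".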
Neat! This is definitely a step forward -- and thanks for the shout-out to our (that is, Sun's and Joyent's) prior work here. Tempted to also incorporate this into agghist and aggpack, the new DTrace actions I added for this kind of functionality.[1] Anyway, good stuff -- it's always good to see new visualizations of system behavior!
It would be interesting to run these tests on different instance sizes, specifically for data on the instance store. The larger the instance, the fewer neighbors you have to worry about spending those precious IOPS.
As for SSD vs Magnetic EBS, I can't say that I'm surprised. I'd assume that EBS implements some sort of cache in between you and your actual disk on the other side of the network so that the writes can return even faster. Try doing this again with reads and I'd bet you'd get some interesting results.
Yes, I did pre-warm the volumes before using them.
And yes, there are several interesting workloads that I didn't test, including read only and read+write. It's potential material for another blog post.
In a world of provisioned IOPS, with applications demanding faster and faster I/O, this tool is handy for a devops person who wants to find out how many IOPS are really being used and how the storage is performing, and to decide whether it needs an upgrade.