Pathological performance of file_handle_cache read buffer when synchronising queues #69

Closed
simonmacmullen opened this issue Mar 13, 2015 · 7 comments

@simonmacmullen
Contributor

Noticed while working on #65.

The queue eager-sync process ends up reading messages from the queue in the opposite order to that in which they were published. If the messages are being read from the message store this causes very poor performance since 3.5.0.

If we have 10kB messages then we read 1MB from near the end of the file, then seek back 10kB, decide that's outside the buffer, read another 1MB, seek back 10kB, and so on. On my machine we sync ~250 msg/s like that. Oops!
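
Back-of-the-envelope, that access pattern means almost every byte read from disk is thrown away. A minimal model of it (illustrative only, not the real file_handle_cache code; the module and function names are made up):

    -module(backwards_read_model).
    -export([waste/3]).

    %% waste(NumMsgs, MsgSize, BufSize) -> {BytesUsed, BytesReadFromDisk}.
    %% Every backwards seek lands outside the cached range, so each message
    %% costs a full BufSize read-ahead instead of a buffer hit.
    waste(NumMsgs, MsgSize, BufSize) when MsgSize =< BufSize ->
        {NumMsgs * MsgSize, NumMsgs * BufSize}.

At 250 reads/s with a 1MB buffer that is ~2.5MB/s of useful message data for ~250MB/s pulled from disk, which matches the I/O stats mentioned below.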

Ways we could fix this, in ascending order of cleverness:

  • Hard code a smaller buffer size
  • Dynamically shrink the buffer size if we determine it is not working
  • Read the buffer backwards from our seek point if we detect we are seeking backwards
@simonmacmullen simonmacmullen self-assigned this Mar 13, 2015
simonmacmullen pushed a commit that referenced this issue Mar 13, 2015
...such as when we are reading backwards through a file. References #69.
@simonmacmullen
Contributor Author

Hard code a smaller buffer size

Seems rather sad.

Dynamically shrink the buffer size if we determine it is not working

This is the approach I've gone for, primarily because it should also catch other pathological behaviour.

Read the buffer backwards from our seek point if we detect we are seeking backwards

This might be a nice option to get syncing going still faster, but it's also fiddly and only solves this exact problem. I'll settle for having it no worse than it was in 3.4.4.
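
For reference, a minimal sketch of the shrink-the-buffer-when-it-is-not-helping idea. This is not the actual file_handle_cache.erl code: the module, the outcome atoms and the 1 MiB cap are made up for illustration. (The real change, per the commits referenced on this issue, also adds an extra byte when shrinking so that the next read can tell how the smaller buffer did.)

    -module(fhc_buffer_sketch).
    -export([adjust/3]).

    -define(MAX_BUF, 1048576).  %% assumed 1 MiB cap

    %% adjust(CurrentBufSize, LastReadSize, Outcome) -> NewBufSize
    adjust(_Cur, ReadSize, buffer_wasted) ->
        %% The read-ahead we just did was discarded unused (e.g. a backwards
        %% seek): drop straight back to the size of the actual read, plus one
        %% byte so that next time we can tell whether even that was fully used.
        ReadSize + 1;
    adjust(Cur, _ReadSize, buffer_exhausted) ->
        %% Sequential reads consumed the whole buffer: double it, up to the cap.
        min(Cur * 2, ?MAX_BUF);
    adjust(Cur, _ReadSize, buffer_hit) ->
        %% The read was served from the buffer: leave the size alone.
        Cur.

With 10 kB reads, shrinking to roughly the read size and then doubling again is consistent with the buffer sizes flickering between about 10 kB and 20 kB reported further down.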

@simonmacmullen
Contributor Author

Just for clarity, this bug:

  • Only affects 3.5.0
  • Requires messages to be larger than the queue index embedding threshold (by default 4kB)
  • Requires messages to be paged out before synchronisation starts

You can see in the I/O stats on the master that if (say) 250 messages are read from disk per second, we also read 250MB/s even if the messages are much smaller than that.
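
For reference, the embedding threshold above should be the queue_index_embed_msgs_below setting; a rabbitmq.config sketch with its default value (classic Erlang-terms format):

    [
      {rabbit, [
        {queue_index_embed_msgs_below, 4096}
      ]}
    ].

Messages at or above this size go to the message store rather than being embedded in the queue index.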

@dumbbell
Member

Here is what I did to test the fix:

  1. I started two nodes, A and B, with a very low vm_memory_high_watermark to make them page messages out early (see the config sketch after this list), clustered them, and added the following HA policy on node B:

    rabbitmqctl -n B set_policy ha-all "." '{"ha-mode":"all"}'
    
  2. I stopped node B using:

    rabbitmqctl -n B stop_app
    
  3. I used PerfTest to publish 10 kB messages with a rate-limited consumer so messages stay in RabbitMQ:

    PerfTest -s 10240 -R 100
    
  4. The producer could publish around 40,000 messages before being throttled.

  5. I started node B again and forced synchronisation from the management UI.
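
For step 1, a sketch of the kind of rabbitmq.config entry used (classic Erlang-terms format; 0.05 is an illustrative low value only, not necessarily the one used here):

    [
      {rabbit, [
        {vm_memory_high_watermark, 0.05}
      ]}
    ].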

With the stable branch, the management UI reports I/O read rates of:

  • 150 messages/s
  • 150 MB/s

With the rabbitmq-server-69 branch (this fix), it reports:

  • 1000 messages/s
  • 15 MB/s

I logged the size of the read buffer in file_handle_cache.erl at the same time. With stable, the buffer remains at an expected 1MB size. With the fix, the size continuously switches between 10468 and 20936, with an occasional jump to 4 MB.

@simonmacmullen
Contributor Author

Note that you don't need the -R 100; you can use -y0 -u test -p to get PerfTest to publish to a queue with no consumers, which might be easier to work with.

The 4MB sizes probably refer to other files (queue index files?)

Not sure whether the flickering between 10468 and 20936 is worth fixing; what do you think?

@simonmacmullen
Contributor Author

Oh, also: you can set a very low vm_memory_high_watermark_paging_ratio rather than vm_memory_high_watermark; that way you can publish indefinitely but messages get paged out rapidly.
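
A sketch of the corresponding rabbitmq.config entry (0.0001 is an arbitrary, aggressively low value chosen only for illustration):

    [
      {rabbit, [
        {vm_memory_high_watermark_paging_ratio, 0.0001}
      ]}
    ].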

@dumbbell
Member

One correction to my previous comment:

  • 150 messages/s
  • 1000 messages/s

Those should read:

  • 150 reads/s
  • 1000 reads/s

The 4MB sizes probably refer to other files (queue index files?)

You're right, the file handle differs for those reads.

Not sure whether the flickering between 10468 and 20936 is worth fixing; what do you think?

After a test:

  • With the flickering buffer:
    • 1000 reads/s
    • 15 MB/s from disc
    • 10-12 MB/s sent to node B
  • With a constant buffer:
    • 750 reads/s
    • 7 MB/s from disc
    • 7 MB/s sent to node B

In the first case, we read 20 kB but only use 10 kB of it, then we read 10 kB, then we double again, and so on: roughly 30 kB read from disk for every 20 kB of messages delivered. We don't do this in the second case (we always read exactly 10 kB). Comparing the number of reads to the disk throughput, the second case reads about 33% fewer bytes per read, which corresponds to no longer wasting 10 kB on every other read. However, I can't explain why it is slower...

dumbbell added a commit that referenced this issue Mar 20, 2015
@dumbbell
Member

Here are new, more meaningful numbers comparing stable and 8faf4ee.

The protocol is:

  1. Start nodes A and B, cluster them, add a HA policy.
  2. Create a queue from the management UI.
  3. Stop node B.
  4. Use PerfTest to queue 300,000 messages, which is enough to page them out (the filesystem is tmpfs). No clients are connected after that.
  5. Start B and force synchronization. While this happens, look at the time the full sync takes, as well as I/O and network statistics.

Results with stable:

  • Synchronization finished in 1'55".
  • Reads: 1600/s (1.6 GiB/s)
  • Network (from A to B): 18 MiB/s while messages are paged in, then 58 MiB/s

Results with 8faf4ee:

  • Synchronization finished in 1'10".
  • Reads: 4500/s (57 MiB/s)
  • Network (from A to B): 58 MiB/s

simonmacmullen pushed a commit that referenced this issue Mar 23, 2015
...such as when we are reading backwards through a file. References #69.
dumbbell added a commit that referenced this issue Mar 23, 2015
simonmacmullen pushed a commit that referenced this issue Mar 23, 2015
Take into account the size of the current read if appropriate, add an
extra byte when shrinking so next time we know how we did, and fix
comments all round.

References #69.
@dumbbell dumbbell added this to the 3.5.1 milestone Mar 23, 2015
dcorbacho pushed a commit that referenced this issue May 27, 2016
Updates queue_master_location/1 parameter spec from pid() to #amqqueue{}
dcorbacho pushed a commit that referenced this issue Nov 18, 2020
dcorbacho pushed a commit that referenced this issue Jul 5, 2023
…ffset-epoch-atom

Make init_reader and init_data_reader use a consistent return value