ByteToMessageDecoder to call fireChannelRead more frequently #4284

Closed
Scottmitch opened this issue Sep 25, 2015 · 3 comments

@Scottmitch
Member

If a large number of objects is added to the out list in ByteToMessageDecoder.decode for a single channelRead operation, this can result in a large number of ByteBuf objects being allocated. There may be an opportunity for improvement, in terms of ByteBuf reuse and reduced allocations, if we call fireChannelRead more frequently.

From @ninja- #4205 (comment):

@Scottmitch yeah, actually I pointed at a bad place. At the moment I am using a hack that fires it inside decode(), so it's more like:

Inside the frame decoder:

    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception {
        Object o = decode(ctx, in);
        if (o != null) {
            ctx.fireChannelRead(o); // instant fire, reuse buffers quicker :)
            if (out.isEmpty()) {
                out.add(Unpooled.EMPTY_BUFFER); // placeholder so the base class counts the read() correctly
            }
        }
    }

    // Note: not an @Override -- this overload is a helper, not part of ByteToMessageDecoder.
    protected ByteBuf decode(ChannelHandlerContext ctx, ByteBuf in) throws Exception {
        // decode a frame here
        return null;
    }

so I am not changing the original class.

@Scottmitch
Member Author

@ninja- Feel free to add more description.

@Scottmitch Scottmitch changed the title ByteToMessageDecoder to call channelRead more frequently ByteToMessageDecoder to call fireChannelRead more frequently Sep 26, 2015
@ninja-

ninja- commented Sep 26, 2015

Looks good to me. I should clarify that in my use case (a proxy) it wouldn't change the behaviour without the message squasher, so it complements what the squasher does, rel #4285.
Unless @normanmaurer has something against it, we could make the default "call when out.size() > 16, after each call to decode(...)" and add a setter, for example setFireMessageCount(1). It seems like 10 minutes to implement.
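
For illustration only, a minimal sketch of what that proposal could look like in a ByteToMessageDecoder subclass. setFireMessageCount and the 16-message default come from the comment above, and the EMPTY_BUFFER placeholder reuses the trick from the earlier hack; none of this is existing Netty API:

    import io.netty.buffer.ByteBuf;
    import io.netty.buffer.Unpooled;
    import io.netty.channel.ChannelHandlerContext;
    import io.netty.handler.codec.ByteToMessageDecoder;

    import java.util.List;

    public abstract class EagerFiringDecoder extends ByteToMessageDecoder {

        private int fireMessageCount = 16; // proposed default threshold

        // Hypothetical setter from the proposal above; not part of Netty's API.
        public void setFireMessageCount(int fireMessageCount) {
            this.fireMessageCount = fireMessageCount;
        }

        @Override
        protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception {
            decodeFrames(ctx, in, out);
            if (out.size() >= fireMessageCount) {
                // Forward eagerly instead of waiting for the outer decode
                // loop to drain the whole list at the end of channelRead.
                for (Object msg : out) {
                    ctx.fireChannelRead(msg);
                }
                out.clear();
                // Placeholder so the base class still sees that progress
                // was made (same trick as the hack above).
                out.add(Unpooled.EMPTY_BUFFER);
            }
        }

        // Subclasses decode zero or more frames from 'in' into 'out'.
        protected abstract void decodeFrames(ChannelHandlerContext ctx, ByteBuf in, List<Object> out)
                throws Exception;
    }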

ninja- pushed a commit to ninja-/netty that referenced this issue Sep 27, 2015
…ead(...) at ByteToMessageDecoder

Motivation:

A long fight against spam of ChannelOutboundBuffer.Entry instances revealed a need to pass decoded messages up the pipeline earlier in the decoding process, so that their ByteBufs can be released and reused sooner.

Modifications:

ByteToMessageDecoder has been modified to fire channelRead(...) every 16 decoded messages, with an option to override that number.

Results:

Buffers are reused more quickly, and the number of recycled entries is reduced.

Related issue: netty#4285
ninja- pushed a commit to ninja-/netty that referenced this issue Sep 27, 2015
…(with voidPromise).

Motivation:

A long fight against spam of ChannelOutboundBuffer.Entry instances revealed a need to squash these entries, along with their buffers, into one; the small buffers (small packets) in particular were a big overhead.

Modifications:

The modified version of ChannelOutboundBuffer#addMessage now includes a fast path that appends the data to the latest entry's buffer (aka tailEntry) if the following conditions are met:
- both the requested message and tailEntry's message are ByteBufs
- both the requested promise and promise of the tailEntry are void promises

If needed, the tailEntry's buffer will be expanded by using the logic from ByteToMessageDecoder.Cumulator.

Result:

- dramatically reduces the number of ChannelOutboundBuffer.Entry instances for proxy applications
- less overhead for small packets
- reduces the number of pooled buffers along with the entries, so they can be reused more efficiently
- the behaviour can be disabled with a system property

Real-life test case of a proxy app:
- before this change: 1844 outbound entries and 1845 buffers
- after this change: 4 entries and 4 buffers (!)

(the test-case also required changes from netty#4284)
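
A self-contained sketch of the squashing idea described in that commit message. Entry here is a simplified stand-in for ChannelOutboundBuffer.Entry; the real fast path also tracks promises and pending-byte accounting, so treat this as an illustration, not Netty's code:

    import io.netty.buffer.ByteBuf;
    import io.netty.buffer.ByteBufAllocator;

    final class OutboundQueueSketch {

        static final class Entry {
            ByteBuf msg;
            boolean voidPromise;
            Entry next;

            Entry(ByteBuf msg, boolean voidPromise) {
                this.msg = msg;
                this.voidPromise = voidPromise;
            }
        }

        private final ByteBufAllocator alloc = ByteBufAllocator.DEFAULT;
        private Entry head;      // queue of pending writes
        private Entry tailEntry;

        void addMessage(ByteBuf msg, boolean voidPromise) {
            Entry tail = tailEntry;
            // Fast path: both the new message and the tail entry use void
            // promises, so no per-message notification is needed and the
            // bytes can simply be appended to the tail entry's buffer.
            if (voidPromise && tail != null && tail.voidPromise) {
                ByteBuf cum = tail.msg;
                if (cum.writableBytes() < msg.readableBytes()) {
                    // Expand the cumulation buffer, same idea as the merge
                    // logic in ByteToMessageDecoder.Cumulator.
                    ByteBuf expanded = alloc.buffer(cum.readableBytes() + msg.readableBytes());
                    expanded.writeBytes(cum);
                    cum.release();
                    tail.msg = cum = expanded;
                }
                cum.writeBytes(msg);
                msg.release();
                return; // no new Entry allocated
            }
            // Slow path: link a new Entry, as before the change.
            Entry e = new Entry(msg, voidPromise);
            if (tail == null) {
                head = e;
            } else {
                tail.next = e;
            }
            tailEntry = e;
        }
    }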
@ninja-

ninja- commented Sep 27, 2015

done @ PR #4288

@normanmaurer normanmaurer self-assigned this Oct 6, 2015
@normanmaurer normanmaurer added this to the 4.0.33.Final milestone Oct 6, 2015
normanmaurer added a commit that referenced this issue Oct 7, 2015
Motivation:

At the moment we only forward decoded messages that were added to the out List once the full decode loop has completed. This has the effect that resources may not be released as fast as possible, and an application may encounter higher latency if the user triggers a writeAndFlush(...) as a result of the decoded messages.

Modifications:

- forward decoded messages after each decode call

Result:

Decoded messages are now forwarded through the pipeline more eagerly.
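
In other words, the decode loop goes from draining the out list once at the end to firing after every decode(...) call. A simplified sketch of the loop shape (the names mirror ByteToMessageDecoder.callDecode, but this is not the exact patch):

    import io.netty.buffer.ByteBuf;
    import io.netty.channel.ChannelHandlerContext;

    import java.util.List;

    final class DecodeLoopSketch {

        interface Decoder {
            void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception;
        }

        static void callDecode(Decoder decoder, ChannelHandlerContext ctx, ByteBuf in, List<Object> out)
                throws Exception {
            while (in.isReadable()) {
                int oldInputLength = in.readableBytes();
                decoder.decode(ctx, in, out);

                // After the change: forward whatever was decoded right away,
                // so downstream handlers run (and buffers can be released)
                // before the next decode iteration.
                for (int i = 0; i < out.size(); i++) {
                    ctx.fireChannelRead(out.get(i));
                }
                out.clear();

                if (oldInputLength == in.readableBytes()) {
                    break; // nothing consumed, wait for more data
                }
            }
        }
    }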
ninja- pushed a commit to ninja-/netty that referenced this issue Oct 11, 2015
…(with voidPromise).
JLofgren added a commit to JLofgren/netty that referenced this issue Apr 6, 2018
Motivation:

When connecting to an HTTP/2 server that did not set any value for the
SETTINGS_MAX_HEADER_LIST_SIZE in the settings frame, the netty client was
imposing an arbitrary maximum header list size of 8kB. There should be no need
for the client to enforce such a limit if the server has not specified any
limit. This caused an issue for a grpc-java client that needed to send a large
header to a server via an Envoy proxy server. The error condition is
demonstrated here: https://github.com/JLofgren/demo-grpc-java-bug-4284

Fixes grpc-java issue netty#4284 - grpc/grpc-java#4284
and netty issue netty#7825 - netty#7825

Modifications:

In HpackEncoder, use MAX_HEADER_LIST_SIZE as the default max header list size.

Result:

HpackEncoder will only enforce a max header list size if the server has
specified a limit in its settings frame.
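
The gist of that fix, as a hypothetical sketch; the constant below stands in for Netty's Http2CodecUtil.MAX_HEADER_LIST_SIZE, and the real change lives inside HpackEncoder:

    final class HeaderListSizeSketch {

        // Largest value SETTINGS_MAX_HEADER_LIST_SIZE can carry (unsigned
        // 32-bit), i.e. effectively "no limit".
        static final long MAX_HEADER_LIST_SIZE = 0xFFFFFFFFL;

        private long maxHeaderListSize = MAX_HEADER_LIST_SIZE; // was: an arbitrary 8 KiB

        // Called when the server's SETTINGS frame is received; null means
        // the server sent no SETTINGS_MAX_HEADER_LIST_SIZE value.
        void onSettings(Long serverMaxHeaderListSize) {
            if (serverMaxHeaderListSize != null) {
                maxHeaderListSize = serverMaxHeaderListSize;
            }
            // If the server sent no value, the default above means the
            // client never rejects its own outgoing headers as too large.
        }

        void checkHeaderListSize(long encodedSize) {
            if (encodedSize > maxHeaderListSize) {
                throw new IllegalStateException("header list size " + encodedSize
                        + " exceeds server limit " + maxHeaderListSize);
            }
        }
    }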
Scottmitch pushed a commit that referenced this issue Apr 16, 2018
pulllock pushed a commit to pulllock/netty that referenced this issue Oct 19, 2023