Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

5. Byte-by-Byte Wire Format Analysis

This section shows exactly how every struct looks on the wire, before and after the fixes. Open your hex editor.


How Bincode Serializes Things

All Solana gossip messages use bincode::serialize() with default options:

  • Little-endian byte order
  • FixintEncoding for enums (u32 discriminant)
  • u64 for Vec lengths (unless overridden by short_vec)
  • u64 for integer types (unless overridden by serde_varint)

The Version Struct: 17 bytes → 12 bytes

Before (broken wire format):

Offset  Hex        Field              Notes
0       02 00      major = 2          u16 LE = 2 bytes
2       00 00      minor = 0          u16 LE = 2 bytes
4       00 00      patch = 0          u16 LE = 2 bytes
6       01         Some tag           Option<u32> = 1 byte prefix
7       00 00 00 00 commit = Some(0)  u32 LE = 4 bytes
11      00 00 00 00 feature_set = 0   u32 LE = 4 bytes
15      03 00      client = 3 (Agave) u16 LE = 2 bytes
                      TOTAL = 17 bytes

After (correct wire format):

Offset  Hex        Field              Notes
0       02         major = 2          varint = 1 byte (0x02)
1       00         minor = 0          varint = 1 byte (0x00)
2       00         patch = 0          varint = 1 byte (0x00)
3       00 00 00 00 commit = 0        u32 LE = 4 bytes
7       00 00 00 00 feature_set = 0   u32 LE = 4 bytes
11      03         client = 3 (Agave) varint = 1 byte (0x03)
                      TOTAL = 12 bytes

Savings: 5 bytes (29% reduction)


The ContactInfo Struct: 307 bytes → 65 bytes

Before (broken wire format, typical case: 1 addr, 1 socket, 0 extensions):

Offset  Hex (length)  Field
0       32 bytes      pubkey (32 bytes)
32      8 bytes       wallclock (u64) ← WRONG: should be varint, ~5 bytes
40      8 bytes       outset (u64)
48      2 bytes       shred_version (u16)
50      17 bytes      version (wrong format)
67      8 bytes       addrs len (u64 = 1) ← WRONG: should be varint, 1 byte
75      4 bytes       addr[0] (Ipv4: 4 bytes)
79      8 bytes       sockets len (u64 = 1) ← WRONG: should be varint, 1 byte
87      4 bytes       socket[0] (1 + 1 + 2 = 4 bytes)
91      8 bytes       extensions len (u64 = 0) ← WRONG: should be varint, 1 byte
99      208 bytes     cache[13] (13 × SocketAddr = 208 bytes ← WRONG: should be SKIPPED
                      TOTAL = ~307 bytes

After (correct wire format):

Offset  Hex (length)  Field
0       32 bytes      pubkey (32 bytes)
32      ~5 bytes      wallclock (varint, ~5 bytes for timestamp ~1.7e15)
37      8 bytes       outset (u64)
45      2 bytes       shred_version (u16)
47      12 bytes      version (correct format)
59      ~1 byte       addrs len (varint = 1)
60      4 bytes       addr[0] (Ipv4: 4 bytes)
64      ~1 byte       sockets len (varint = 1)
65      ~4 bytes      socket[0] (key=1, index=1, offset=varint)
69      ~1 byte       extensions len (varint = 0)
--      SKIPPED       cache (not serialized)
                      TOTAL = ~65 bytes

Savings: ~242 bytes (79% reduction)


The CrdsValue: 32 bytes of hash poison

Before (CrdsValue on wire):

Offset  Hex           Field
0       64 bytes      signature (Signature = 64 bytes)
64      variable      data (CrdsData enum, variable length)
64+N    32 bytes      hash (Hash = 32 bytes) ← WRONG: Agave doesn't send this
                      TOTAL = 96 + N bytes

After (CrdsValue on wire):

Offset  Hex           Field
0       64 bytes      signature (Signature = 64 bytes)
64      variable      data (CrdsData enum, variable length)
                      TOTAL = 64 + N bytes
                      hash SKIPPED

Savings: 32 bytes per CrdsValue

Since our PullRequest contains 1 CrdsValue (the caller ContactInfo), this saved 32 bytes in the PullRequest. These 32 bytes were then reclaimed by the bloom filter (the bloom budgeting code in pull_request.rs fills up to PACKET_DATA_SIZE = 1232 bytes).