High-performance Golang struct optimizations: Paddings and Alignments.

When we create structs in Golang, we typically think about the fields we need and their types, and about readability and maintainability. Sometimes we sort fields by name, sometimes by logic, and sometimes we just put them in the order we like. But what if I told you that the order of fields in a Golang struct can significantly affect the memory usage and performance of your application?

In certain scenarios, you can reduce your application's RAM usage by 20-50% and speed up allocation and field access for free. To achieve this, we need to understand how the Golang compiler works with paddings and alignments in structs.

Struct Layout.

The Golang compiler follows fixed rules when it lays out a struct, so that every field can be accessed efficiently on the target hardware architecture. Each field is placed at an offset that is a multiple of its own type's alignment, and the compiler adds padding bytes between fields wherever that is needed. The struct as a whole takes the alignment of its most strictly aligned field, and its total size is rounded up to a multiple of that alignment.

So if your struct has 2 fields, one of type int32 followed by one of type int64, the compiler will align the int64 field to an 8-byte boundary by adding 4 bytes of padding after the int32 field. This 8-byte boundary applies to 64-bit architectures, which are the most common today. 64 bits equal 8 bytes, so the word size of a 64-bit architecture is 8 bytes, and the word size of a 32-bit architecture is 4 bytes.
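
You can check this two-field case yourself with the unsafe package. The sketch below is only an illustration: the Pair type and the pairLayout helper are names I made up, and the values in the comments assume a 64-bit target such as amd64 or arm64.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Pair is a hypothetical two-field struct used only for illustration.
type Pair struct {
	A int32 // offset 0, 4 bytes
	// 4 bytes of padding on 64-bit targets
	B int64 // offset 8, 8 bytes
}

// pairLayout returns the total size of Pair and the offset of its B field.
func pairLayout() (size, offsetB uintptr) {
	var p Pair
	return unsafe.Sizeof(p), unsafe.Offsetof(p.B)
}

func main() {
	var p Pair
	size, offsetB := pairLayout()
	fmt.Println("size:", size)                           // 16 on 64-bit targets
	fmt.Println("offset of B:", offsetB)                 // 8 on 64-bit targets
	fmt.Println("alignment of B:", unsafe.Alignof(p.B)) // 8 on 64-bit targets
}
```

The struct holds only 12 bytes of data, yet it occupies 16 bytes because of the 4 padding bytes in front of B.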

However, for ordinary field types the alignment never grows beyond the word size (8 bytes in our case), even when the type itself is larger than that. A string field, for example, is stored as a 16-byte header aligned to an 8-byte boundary, no matter how long its text is. The size of the whole struct is rounded up in the same way: if the fields of a struct with 8-byte alignment occupy 30 bytes, you can picture it as 4 chunks, 3 chunks of 8 bytes and 1 chunk of 6 bytes plus 2 padding bytes, which makes 32 bytes in total.

Those padding bytes are not used by the application, but they are still allocated in memory, which can lead to increased memory usage and decreased performance. What could you put in those 2 bytes? Two fields of type int8, byte or bool, or one field of type int16 or uint16.
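
A struct with 30 bytes of field data shows this rounding to 32 bytes, and it also shows that two extra bool fields fit into the padding without changing the size. This is a sketch with hypothetical types (Record, RecordWithFlags) and a hypothetical helper (sizes), assuming a 64-bit architecture.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Record is a hypothetical struct holding 30 bytes of field data.
type Record struct {
	A, B, C int64 // 24 bytes
	D       int32 // 4 bytes
	E       int16 // 2 bytes
	// 2 bytes of trailing padding round the size up to 32
}

// RecordWithFlags reuses the 2 padding bytes for two bool fields.
type RecordWithFlags struct {
	A, B, C int64
	D       int32
	E       int16
	F, G    bool // take the place of the former padding
}

// sizes returns the sizes of both structs in bytes.
func sizes() (plain, withFlags uintptr) {
	return unsafe.Sizeof(Record{}), unsafe.Sizeof(RecordWithFlags{})
}

func main() {
	plain, withFlags := sizes()
	fmt.Println("Record:", plain)              // 32
	fmt.Println("RecordWithFlags:", withFlags) // 32
}
```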

So, the order of fields in your structure affects the number of padding bytes the compiler adds between them and the overall memory usage of your application. The more padding bytes you have, the more memory your application uses and the fewer structures fit into each CPU cache line, which tends to make it slower.
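
Here is a minimal sketch of that effect. Loose and Tight are hypothetical structs with the same three fields in a different order, looseAndTight is a helper I made up, and the sizes in the comments assume a 64-bit architecture.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Loose places the 8-byte field between two 1-byte fields.
type Loose struct {
	A bool  // offset 0, then 7 bytes of padding
	B int64 // offset 8
	C bool  // offset 16, then 7 bytes of trailing padding
}

// Tight holds the same fields with the largest one first.
type Tight struct {
	B int64 // offset 0
	A bool  // offset 8
	C bool  // offset 9, then 6 bytes of trailing padding
}

// looseAndTight returns the sizes of both structs in bytes.
func looseAndTight() (loose, tight uintptr) {
	return unsafe.Sizeof(Loose{}), unsafe.Sizeof(Tight{})
}

func main() {
	loose, tight := looseAndTight()
	fmt.Println("Loose:", loose) // 24
	fmt.Println("Tight:", tight) // 16
}
```

Both structs carry 10 bytes of data, but the first one spends 14 bytes on padding and the second one only 6.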

How paddings are born.

Let's take a look at a simple structure example first:

  // NestedLayout represents a nested structure simple example.
  // Size: 24 bytes
  type NestedLayout struct {
    ID    int64 // 8 bytes
    Phone int64 // 8 bytes
    Age   int32 // 4 bytes
    // 4 bytes padding to align the structure to 8 bytes
  }

  // UnoptimizedLayout - not optimized for size and performance.
  // Size: 96 bytes
  type UnoptimizedLayout struct {
    AreaID         int32        // 4 bytes
    IsActive       bool         // 1 byte
    BalanceInCents int64        // 8 bytes
    IsSpecial      bool         // 1 byte
    IdempotencyKey int64        // 8 bytes
    ID             uint32       // 4 bytes
    User           NestedLayout // 24 bytes
    IsMigrated     bool         // 1 byte
    CreatedAt      int32        // 4 bytes
    Status         uint16       // 2 bytes
    Key            float64      // 8 bytes
    TenantID       int8         // 1 byte
    UpdatedAt      int32        // 4 bytes
  }

The structure UnoptimizedLayout may look normal to the untrained eye, but it has a lot of padding bytes between fields. Once the compiler has laid it out, the total size of the structure is 96 bytes. But if you manually add up the sizes of all the fields (counting the nested NestedLayout as 24 bytes), you get 70 bytes. The size of your structure grew by 26 padding bytes. You can check the size of this structure yourself using the Sizeof function from the unsafe package:

  // Requires the "fmt" and "unsafe" packages to be imported.
  func main() {
    fmt.Printf("UnoptimizedLayout size: %d bytes\n", unsafe.Sizeof(UnoptimizedLayout{}))
  }

UnoptimizedLayout size: 96 bytes
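
If you want to see where those 26 bytes hide without an external tool, the reflect package exposes the offset and size of every field. The paddingBytes helper below is my own sketch, not part of any library; it only looks at top-level fields and panics if you pass it something that is not a struct. The numbers assume a 64-bit architecture.

```go
package main

import (
	"fmt"
	"reflect"
)

type NestedLayout struct {
	ID    int64
	Phone int64
	Age   int32
}

type UnoptimizedLayout struct {
	AreaID         int32
	IsActive       bool
	BalanceInCents int64
	IsSpecial      bool
	IdempotencyKey int64
	ID             uint32
	User           NestedLayout
	IsMigrated     bool
	CreatedAt      int32
	Status         uint16
	Key            float64
	TenantID       int8
	UpdatedAt      int32
}

// paddingBytes reports the gaps between the top-level fields of the struct
// value v and returns the total padding, including trailing padding.
func paddingBytes(v any) uintptr {
	t := reflect.TypeOf(v)
	var end, total uintptr // end is the first byte after the previous field
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if gap := f.Offset - end; gap > 0 {
			fmt.Printf("%d bytes of padding before %s\n", gap, f.Name)
			total += gap
		}
		end = f.Offset + f.Type.Size()
	}
	return total + (t.Size() - end)
}

func main() {
	fmt.Println("total padding:", paddingBytes(UnoptimizedLayout{})) // 26
}
```

The nested User field counts as a single 24-byte field here, so the 4 padding bytes inside NestedLayout are not part of the 26.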

aligo: visualizing paddings of your structure.

To visualize the paddings and alignments of your structure, I recommend the aligo tool. It is a simple command-line tool that visualizes the layout of your Golang structures and helps you optimize them. You can install it with the go install command:

go install github.com/essentialkaos/aligo/v2@latest

Next, you can run it against your structure:

aligo -s UnoptimizedLayout view main.go

This will output a visual representation of your structure with paddings and alignments:

[Figure: aligo view output for UnoptimizedLayout]

Paddings marked with red color can be optimized by changing the order of fields in your structure. Let's take a look at another aligo command that will help us to optimize our structure:

aligo -s UnoptimizedLayout check main.go

This will output a sorted struct with optimized paddings and alignments:

[Figure: aligo check output, UnoptimizedLayout with optimized field order]

You can see that just by changing the order of fields, we reduced the size of our structure from 96 bytes to 72 bytes, a 25% reduction in memory usage.

aligo: comparing structures.

Let's rename our optimized structure to OptimizedLayout and run aligo view to visualize the difference:

[Figure: aligo view output for OptimizedLayout]

You can see that the number of padding bytes dropped from 26 to 2, which is where the 24 saved bytes per structure come from. We still have 2 padding bytes left, so we can add 2 more boolean fields to our structure for free and it will still be 72 bytes.
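
For reference, here is a sketch of one field order that gives the 72-byte layout. I wrote it by hand by sorting the fields from the largest alignment to the smallest, so the exact order aligo suggests may differ. The optimizedSize helper is a made-up name, and the sizes assume a 64-bit architecture.

```go
package main

import (
	"fmt"
	"unsafe"
)

type NestedLayout struct {
	ID    int64
	Phone int64
	Age   int32
}

// OptimizedLayout holds the same fields as UnoptimizedLayout,
// ordered from the largest alignment to the smallest.
type OptimizedLayout struct {
	BalanceInCents int64        // 8 bytes
	IdempotencyKey int64        // 8 bytes
	User           NestedLayout // 24 bytes
	Key            float64      // 8 bytes
	AreaID         int32        // 4 bytes
	ID             uint32       // 4 bytes
	CreatedAt      int32        // 4 bytes
	UpdatedAt      int32        // 4 bytes
	Status         uint16       // 2 bytes
	IsActive       bool         // 1 byte
	IsSpecial      bool         // 1 byte
	IsMigrated     bool         // 1 byte
	TenantID       int8         // 1 byte
	// 2 bytes of trailing padding
}

// optimizedSize returns the size of OptimizedLayout in bytes.
func optimizedSize() uintptr {
	return unsafe.Sizeof(OptimizedLayout{})
}

func main() {
	fmt.Printf("OptimizedLayout size: %d bytes\n", optimizedSize()) // 72
}
```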

If you have a million of those records sitting in your memory, your unoptimized structure will use 96 MB of memory, while the optimized one will use only 72 MB. This is a 24 MB difference, which is a pretty significant number for a million records.

Benchmarks.

Now that we know how to optimize our structures, let's take a look at the performance difference between them.

  // Requires the "testing" import and Go 1.24 or newer for b.Loop.
  func Benchmark(b *testing.B) {
    k := 1000000 // Number of records to allocate and access
    unoptimized := make([]UnoptimizedLayout, k)
    optimized := make([]OptimizedLayout, k)

    b.Run("MemoryAllocation-UnoptimizedLayout", func(b *testing.B) {
      for b.Loop() {
        _ = make([]UnoptimizedLayout, k)
      }
    })

    b.Run("MemoryAllocation-OptimizedLayout", func(b *testing.B) {
      for b.Loop() {
        _ = make([]OptimizedLayout, k)
      }
    })

    b.Run("FieldAccess-UnoptimizedLayout", func(b *testing.B) {
      for i := 0; b.Loop(); i++ {
        for j := range unoptimized {
          unoptimized[j].BalanceInCents = int64(i + j)
        }
      }
    })

    b.Run("FieldAccess-OptimizedLayout", func(b *testing.B) {
      for i := 0; b.Loop(); i++ {
        for j := range optimized {
          optimized[j].BalanceInCents = int64(i + j)
        }
      }
    })
  }

Now run the benchmarks with the go test -bench=. -benchmem command. The output should look similar to this:

Benchmark/MemoryAllocation-UnoptimizedLayout-14  2470  447593  ns/op  96002053 B/op 1 allocs/op
Benchmark/MemoryAllocation-OptimizedLayout-14    3802  314902  ns/op  72007681 B/op 1 allocs/op
Benchmark/FieldAccess-UnoptimizedLayout-14        948  1118201 ns/op  0        B/op 0 allocs/op
Benchmark/FieldAccess-OptimizedLayout-14         1202  927754  ns/op  0        B/op 0 allocs/op

Of course, the numbers will vary depending on your hardware and Go version, but you can see that the optimized structure has better memory allocation and field access performance. Allocation takes 447593 ns/op for the unoptimized structure and 314902 ns/op for the optimized one. Field access takes 1118201 ns/op for the unoptimized structure and 927754 ns/op for the optimized one. That is roughly a 30% and a 17% improvement respectively for 1 million records. Absolutely for free!

I described how to read benchmarks, and what else you can benchmark, in my article Optimization Odyssey: pprof-ing & Benchmarking Golang App.

The elephant in the room.

Let's take a look at the elephant in the room. The elephants, to be precise. Types that I never mentioned in this article on purpose.

Pointer - a pointer to a structure in Golang takes only 8 bytes of memory on a 64-bit architecture. But it hides the real size of the structure it points to and gives the garbage collector extra work as well.

Interface - a field of an interface type can hold any value that implements the interface, so we can't say for sure how much memory it will use in total because that depends on the implementation. But the interface value itself has a fixed size of 16 bytes on a 64-bit architecture. Why 16 bytes and not 8? Because an interface value consists of two words: a pointer to information about the dynamic type of the stored value and a pointer to the value itself. The runtime uses the first word for method calls and type assertions. So, if you have a structure with an interface field, it will use 16 bytes for the interface and additional memory for the implementation it points to.

String - the string type in Golang behaves like a reference to its data: the field itself has a fixed size, and the text lives elsewhere. A string is implemented as a small header with two fields: a pointer to the string data and the length of the string. The pointer is 8 bytes and the length is 8 bytes (int type), so a string field always takes 16 bytes on a 64-bit architecture. This means that a structure with a string field uses 16 bytes for the pointer and the length, plus additional memory for the string data itself.

Slice - the slice type in Golang works the same way. It is implemented as a header with three fields: a pointer to the slice data, the length of the slice, and the capacity of the slice. That is 24 bytes per slice field on a 64-bit architecture, plus the memory of the backing array.

The same goes for the map type in Golang and others: a map field is a single pointer to a structure managed by the runtime. I avoided them here on purpose because in an educational article it is hard to say how much memory they will use in total. It depends on the implementation and the data you put in them, and it opens the door to more optimizations and considerations.
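
What you can measure directly is the fixed part that each of these types occupies inside a struct. The sketch below uses a made-up headerSizes helper and assumes a 64-bit architecture. Keep in mind that unsafe.Sizeof never includes the data these values point to.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerSizes returns the in-struct size in bytes of a pointer,
// an interface, a string, a slice and a map, in that order.
func headerSizes() [5]uintptr {
	var (
		p *int
		i any
		s = "the length of the text does not change the size of the header"
		b []byte
		m map[string]int
	)
	return [5]uintptr{
		unsafe.Sizeof(p),
		unsafe.Sizeof(i),
		unsafe.Sizeof(s),
		unsafe.Sizeof(b),
		unsafe.Sizeof(m),
	}
}

func main() {
	sizes := headerSizes()
	fmt.Println("pointer:  ", sizes[0]) // 8
	fmt.Println("interface:", sizes[1]) // 16
	fmt.Println("string:   ", sizes[2]) // 16
	fmt.Println("slice:    ", sizes[3]) // 24
	fmt.Println("map:      ", sizes[4]) // 8
}
```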

Some tips and considerations.

Now that you know how to optimize your structures in Golang, you probably wonder why the compiler doesn't do it for you automatically. Don't we have a compiler flag for that? The answer is quite simple: the compiler does not know the context of your application and the logic behind your structures. You can use your structures for serialization and deserialization, for database operations, for network communication, etc. The compiler cannot know whether the order of the fields in your structure is important for your application logic.

At this point, I need you to stop and think. Should you really optimize your structures? In most cases, the answer is no. The performance gain becomes visible only in certain cases, such as working with large datasets, parallel batch processing with workflows, or high-performance applications. If you are working on a simple HTTP router, you most likely won't notice any difference in performance.

But let's say you want to make a habit of optimizing your structures anyway. The rule of thumb is to group the fields by size, moving the largest fields to the top of the structure and the smallest to the bottom. This is enough for most cases. You can also get help from the govet linter in the golangci-lint tool, which you probably already use in your project: just enable the fieldalignment analyzer for govet in your .golangci.yml configuration file. But it might be overkill.
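
To see why the rule of thumb works, here is a simplified model of the layout rules applied to the fields of UnoptimizedLayout. Everything in it (the field type, layoutSize, roundUp, sortedByAlign) is my own sketch rather than the compiler's actual code. It ignores special cases such as zero-size fields, and the sizes and alignments in the table are the ones for a 64-bit architecture.

```go
package main

import (
	"fmt"
	"sort"
)

// field describes one struct field by its size and alignment in bytes.
type field struct {
	name        string
	size, align uintptr
}

// roundUp rounds n up to the next multiple of align.
func roundUp(n, align uintptr) uintptr {
	return (n + align - 1) / align * align
}

// layoutSize places every field at the next multiple of its alignment and
// rounds the total size up to the largest alignment among the fields.
func layoutSize(fields []field) uintptr {
	var offset, maxAlign uintptr = 0, 1
	for _, f := range fields {
		offset = roundUp(offset, f.align) + f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
	}
	return roundUp(offset, maxAlign)
}

// sortedByAlign returns a copy ordered from the largest alignment to the smallest.
func sortedByAlign(fields []field) []field {
	out := append([]field(nil), fields...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].align > out[j].align })
	return out
}

// unoptimized lists the fields of UnoptimizedLayout in declaration order.
var unoptimized = []field{
	{"AreaID", 4, 4}, {"IsActive", 1, 1}, {"BalanceInCents", 8, 8},
	{"IsSpecial", 1, 1}, {"IdempotencyKey", 8, 8}, {"ID", 4, 4},
	{"User", 24, 8}, {"IsMigrated", 1, 1}, {"CreatedAt", 4, 4},
	{"Status", 2, 2}, {"Key", 8, 8}, {"TenantID", 1, 1}, {"UpdatedAt", 4, 4},
}

func main() {
	fmt.Println("declared order:", layoutSize(unoptimized))                // 96
	fmt.Println("sorted order:  ", layoutSize(sortedByAlign(unoptimized))) // 72
}
```

Sorting by alignment rather than by raw size is what keeps the nested 24-byte structure among the 8-byte fields, where it causes no padding.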

You can also run this analyzer manually, since under the hood it is the fieldalignment tool from the golang.org/x/tools module:

go install golang.org/x/tools/go/analysis/passes/fieldalignment/cmd/fieldalignment@latest
fieldalignment -fix main.go

I hope this article inspired you to take a look at your code and find some places where you can benefit from padding optimizations. If you are working with large datasets - every byte counts.

14 Jun 2025