From Sequential Scripts to Concurrent AI Pipelines in Go — Part 6

Part 6 — Buffered vs Unbuffered Channels: Choosing the Right Pipe

Part of the series: Production-Grade Concurrent AI Systems in Go

Full code for this post: github.com/madmmas/go-concurrent-ai-systems/tree/part-06Diff from Part 5: compare/part-05...part-06Run it: go run ./cmd/news-processor -mode=unbuffered inside arc-1-foundations/part-06-buffered-channels


Part 5 introduced channels as a way to pass results between goroutines without shared memory. We used an unbuffered channel and it worked cleanly — no mutex, no race, one goroutine owned the results slice.

But Part 5 used one specific kind of channel without explaining the choice. There are two kinds:

unbuffered := make(chan model.AIResult)     // blocks on send until received
buffered   := make(chan model.AIResult, 10) // sends up to 10 values without blocking

These behave very differently under load. Understanding that difference is not academic — it directly affects how your pipelines handle bursts, slow consumers, and backpressure. Choosing the wrong one produces either deadlocks (Part 4's territory) or silently degraded throughput.


Unbuffered Channels: Synchronous Handoff

An unbuffered channel is a rendezvous point. The sender blocks until a receiver is ready. The receiver blocks until a sender sends. Neither side can proceed without the other.

ch := make(chan model.AIResult) // no buffer

// Sender blocks here until collector receives
resultsCh <- result

// Collector blocks here until a sender sends
result := <-resultsCh

Think of it as a direct handoff — like passing a physical document from one person to another. You can't put it down; someone has to take it from your hand.

In our news pipeline:

func (p *UnbufferedPipeline) ProcessAll(articles []model.Article) ([]model.AIResult, time.Duration) {
    resultsCh := make(chan model.AIResult) // unbuffered
    var wg sync.WaitGroup

    for _, article := range articles {
        wg.Add(1)
        go func(a model.Article) {
            defer wg.Done()
            result := p.processArticle(a)
            resultsCh <- result // blocks until collector receives
        }(article)
    }

    go func() {
        wg.Wait()
        close(resultsCh)
    }()

    var results []model.AIResult
    for r := range resultsCh {
        results = append(results, r)
    }
    return results, time.Since(start)
}

When article 3 finishes before articles 1 and 2, its goroutine tries to send on resultsCh. If the collector is busy processing article 5's result, article 3's goroutine waits. It can't proceed until the collector is ready.

This creates implicit backpressure: fast workers slow down to match a slow collector. That's sometimes exactly what you want. A slow downstream database write, for example, should naturally throttle upstream processing rather than accumulating unbounded results in memory.


Buffered Channels: Asynchronous Queue

A buffered channel has internal capacity. A send succeeds immediately as long as the buffer isn't full. The sender only blocks when the buffer is at capacity.

ch := make(chan model.AIResult, 10) // buffer of 10

// Succeeds immediately if fewer than 10 values are queued
resultsCh <- result

// Still blocks if buffer is empty — receiver waits for a sender
result := <-resultsCh

Think of it as an inbox. You drop documents into it without waiting for someone to take each one. The recipient picks them up when ready. But if the inbox fills up, you have to wait for it to drain before you can add more.

In our pipeline:

func (p *BufferedPipeline) ProcessAll(articles []model.Article) ([]model.AIResult, time.Duration) {
    resultsCh := make(chan model.AIResult, p.BufferSize) // buffered
    // ... same structure, but workers rarely block on send
}

With BufferSize = len(articles), all workers can send their results without ever waiting for the collector. The collector drains the buffer at its own pace. Workers and collector run independently.


When the Difference Actually Matters

With our simulated LLM latency (500ms–1500ms per call), both versions take roughly the same time. The bottleneck is the network calls, not the channel operations. Channel sends take nanoseconds.

The difference becomes real in two scenarios:

Scenario 1: The collector does meaningful work.

If collecting a result means writing to a database, calling an API, or running further processing — the collector takes real time per result. With an unbuffered channel, every completed worker blocks waiting for the collector to finish its current result. With a buffered channel, workers keep running and queue their results while the collector works through the backlog.

Scenario 2: Burst processing with variable article latency.

Imagine 100 articles where 90 complete in 500ms and 10 take 5 seconds. With an unbuffered channel, the 90 fast workers each complete and immediately block waiting for the collector. The collector processes them one at a time while the slow 10 are still running. No worker can free its resources until the collector cycles to it.

With a buffered channel of size 100, all 90 fast workers send immediately and exit. The collector drains the buffer. Memory is freed as workers complete rather than being held while they wait at the channel.


The Capacity Question

If buffered channels are so flexible, why not always use a large buffer?

Two reasons. First, a large buffer hides backpressure. If your collector is slower than your producers, a buffer lets producers keep running long after the consumer has fallen behind — building up memory that eventually becomes a problem. An unbuffered channel surfaces that imbalance immediately. You feel the backpressure in your benchmark numbers rather than in an out-of-memory crash at 3 a.m.

Second, buffer capacity is a configuration decision that now lives in your code. The right size depends on the specific throughput characteristics of your system. make(chan T, len(articles)) — the pattern from Part 5's worker pool — works when you know the total article count upfront. It's not right for infinite streaming workloads.

The practical guidance:

SituationChannel type
Tight producer/consumer synchronisationUnbuffered
Known batch size, workers should never block on sendmake(chan T, n) where n = batch size
Unknown/streaming input, need to limit memorymake(chan T, smallFixed) — tune empirically
Collector does slow work, producers should stay aheadBuffered with empirically-sized capacity

The select Statement

Part 6's code introduces one more primitive worth naming explicitly: select.

select lets a goroutine wait on multiple channel operations simultaneously and proceed with whichever one is ready first. You've seen it implicitly — in Part 8 (Context and Timeouts), the simulator uses it to race a timer against a context cancellation:

select {
case <-time.After(latency):
    return nil          // normal completion
case <-ctx.Done():
    return ctx.Err()    // cancelled or timed out
}

Without select, you'd have to pick one channel to wait on, potentially blocking on the wrong one. select lets you express "do whatever happens first" — which is exactly the right model for timeout enforcement, graceful shutdown, and any situation where multiple events can unblock a goroutine.

A default case makes select non-blocking:

select {
case result := <-resultsCh:
    process(result)
default:
    // nothing ready, do something else
}

This pattern — non-blocking channel probe with default — appears in the worker pool's cancellation check in Part 9. When you see it, it means "try this, but don't wait."


What's Next

We've now covered channels in depth — unbuffered synchronous handoff, buffered async queuing, close semantics, and select. The next part takes all of this and builds the worker pool that powers the rest of the series: a fixed number of long-lived goroutines pulling from a jobs channel, sending results to a results channel, controlled entirely by channel close signals.

See you in Part 7.


This is Part 6 of the series "Production-Grade Concurrent AI Systems in Go." Read Part 5 — Channels and Message Passing or continue to Part 7 — Worker Pools and Bounded Concurrency.