Adversarially Constrained Alphabet Channel Capacity

In information theory literature, a common setup is that of the imperfect channel. Typically, two people are trying to communicate with one another across a channel. The sender can send one symbol out of a given alphabet at each step, and after they send their symbol an "adversary" possibly introduces error by modifying their symbol with some probability before it gets received by the receiver. This is a reasonable model for many communication channels that exist in the real world, as we often encounter loss in our communication channels in the form of random bit flips.

I became interested in a related problem, which is a setup where the "adversary" acts before the sender does, instead of after. Concretely, at each step the "adversary" somehow constrains the possibilities that are open to the sender. In this blog post I will talk specifically about the scenario where the alphabet of possible symbols to be sent over the channel has size \(m\), and at each step the adversary chooses \(n < m\) of those as the options for the sender to choose from.

The simplest nontrivial case is when \(m=3\) and \(n=2\). Let us define our alphabet as \(\{0,1,2\}\). At every step, our adversary will choose two out of these three options, and our sender will need to select one of the two to send to the receiver. We assume there is no error or loss in the communication channel after a symbol is selected and sent.

We want to know how many bits of information we can actually convey at each step, and what strategy we can use to maximize the information conveyed. There are actually several slightly different formulations of this problem, which drastically change the solution. Let us examine a few.

Active Adversary, Deterministic Encoding Scheme

For this case, we assume that the encoding/decoding scheme we use is deterministic, and that the adversary might have knowledge of it and can therefore choose which options to give us to minimize the information we can send.

One way to think about this is to suppose we are trying to eventually convey one specific possibility out of set \(X\) of length \(x\) total possible messages. For example if we were trying to send a 4-bit binary message, we would be trying to convey one out of the 16 possible 4-bit binary strings. At each step, we can think about how many remaining options we are able to "eliminate". Let \(X_p\subset X\) be the remaining options conditional on receiving \(p\) for \(p\in\{0,1,2\}\). Clearly we need that \(X_p\cup X_q = X\) for \(p\neq q\).

It is not hard to prove that in this formulation, the optimal arrangement is to let \(\vert X_p\vert = \frac23\) for all \(p\). We can therefore eliminate \(\lfloor\frac13\rfloor\) of the options at each step. This suggests a channel capacity of \(\log_2\frac32\approx 0.585\) bits per step.

However, we have a problem. Using this approach, we cannot distinguish between the last two possible messages. In fact, we can prove that there is no encoding that allows us to do this.

Proof. Suppose we are just trying to convey a single bit of information (let's call the options High and Low). Suppose for contradiction that there exists a minimal number of steps \(s\) that is required to convey this bit of information. This means that after \(s-1\) steps, there must be at least one adversarial strategy that results in the sender sending a sequence of symbols that cannot be definitively interpreted as either High nor Low. Then in the following step, by simple pigeonhole argument, there must either be only one out of the three possible symbols that, when received, would definitively convey to the receiver either High or Low. But then the adversary could simply prevent us from sending that symbol, which would prevent us from conveying either High or Low. That means we cannot definitively convey at least one of High or Low after \(s\) steps, which is a contradiction.

This proof was so surprising to me when I discovered it that I actually formalized it in Rocq to make sure I wasn't making a logical error. But it's true, we can't guarantee that we can convey a single bit! This feels extremely surprising to me because it means we can narrow down our space of possibilities exponentially quickly from any arbitrary number, all the way down to 2, but we can't necessarily differentiate between the last two options!

What's more, let's say we have two separate messages we want to send. We can definitely narrow down message 1 to two options (\(a\) and \(b\)), and we can definitely narrow down message 2 to two options (\(c\) and \(d\)). Then, it looks like we are stuck, but actually we can craft a new metamessage with 4 options:

"message 1 has value a, message 2 has value c"
"message 1 has value a, message 2 has value d"
"message 1 has value b, message 2 has value c"
"message 1 has value b, message 2 has value d"

We can now narrow this down again to two options, which means we can now either definitely convey one of the two bits, or we can say how they are related (a implies c and b implies d, for instance).

Going back to the general case where we have an alphabet of size \(m\), out of which ad adversary picks \(n\) options that are valid for us to send, we can show:

for any \(n\leq\lceil\frac m2\rceil\), we have this problem where we cannot distinguish between multiple messages at the end, and
for all larger \(n\) we are definitely able to distinguish between messages in a finite number of steps.

Active Adversary, Nondeterministic Encoding Scheme

Assuming a shared source of randomness between the sender and the receiver, we can use the same approach as before, except that at each step we randomly choose from \(\{0,1,2\}\) with equal probability and add that value to the message (modulo 3). This ensures that the adversary cannot choose a strategy that prevents us from disambiguating between the last 2 options, and the probability that we are able to convey our message definitively approaches 1 as the number of steps approaches infinity.

We might expect that the best solution is again to set \(\vert X_s\vert = \frac23\) for all \(s\), which as we calculated yields a channel capacity of \(\log_2\frac32\approx 0.585\) bits per step. However, let's see if we can do better. We can express the channel capacity in terms of \(X_p\) as:

Work in progress, more to come soon