The sub-text might encode a different type of data from the text. For example, a video stream might encode large features as text and fine detail as sub-text.
A noisy channel might only allow the text to be recovered, whereas a less noisy channel allows both text and sub-text to be recovered. Carrying a sub-text reduces the robustness of the text encoding, but it doesn't remove it entirely: the text keeps some robustness even if the sub-text is transmitted at the maximum possible rate.
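As a rough illustration of this layering, here is a minimal sketch in which the sign of a signal level carries one text bit and a small change in magnitude carries one sub-text bit. The functions `encode` and `decode` and the value `sub_amp=0.25` are made up for the example; they are not taken from any real modulation scheme.

```python
def encode(text_bit, sub_bit, sub_amp=0.25):
    """Map a (text, sub-text) bit pair to one signal level.

    The sign carries the text bit; a small change in magnitude carries
    the sub-text bit. sub_amp is an illustrative value only.
    """
    sign = 1.0 if text_bit else -1.0
    magnitude = 1.0 + (sub_amp if sub_bit else -sub_amp)
    return sign * magnitude


def decode(signal):
    """Recover (text_bit, sub_bit) from a possibly noisy signal level."""
    text_bit = 1 if signal > 0 else 0
    sub_bit = 1 if abs(signal) > 1.0 else 0
    return text_bit, sub_bit


# Quiet channel: both layers survive.
print(decode(encode(1, 1) - 0.2))  # (1, 1)
# Noisier channel: the text bit survives, the sub-text bit is lost.
print(decode(encode(1, 1) - 0.6))  # (1, 0)
```

In this toy, a text-only signal at levels of plus or minus 1 would tolerate noise of magnitude just under 1.0 before the text bit flips. With the sub-text layered on, the worst case drops to just under 0.75. The text is less robust than before, but far from unprotected.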
One example of a redundant code is to encode a model for each datum. Having a model allows the datum to be encoded more concisely, but there are multiple possible models, which introduces redundancy. Such codes are easier to decode than non-redundant codes, since only the sender needs to perform model estimation. MML (Minimum Message Length) is an example of this.
If an MML code is transmitted with a sub-text, the information from the choice of model can be recovered and perhaps shouldn't be counted in the message length. This kind of consideration already occurs in Snob, although hackishly: part of the text is converted into sub-text.
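To make the recovery idea concrete, here is a toy sketch, not the method used by Snob or in Wallace's work. It assumes coin-flip data, a hypothetical grid of candidate biases `thetas` with a uniform prior, and an arbitrary `slack_bits` threshold for which models count as acceptable. The sender lets the choice among acceptable models carry the sub-text, and the receiver, having decoded the data, rebuilds the same set of choices and reads the sub-text back.

```python
import math


def message_length(theta, heads, tails, n_models):
    """Two-part length in bits: state the model, then the data given it."""
    model_bits = math.log2(n_models)  # uniform prior over the candidates
    data_bits = -(heads * math.log2(theta) + tails * math.log2(1 - theta))
    return model_bits + data_bits


def acceptable_models(thetas, heads, tails, slack_bits=1.0):
    """Candidates whose message is within slack_bits of the shortest one."""
    lengths = [message_length(t, heads, tails, len(thetas)) for t in thetas]
    shortest = min(lengths)
    return [t for t, n in zip(thetas, lengths) if n <= shortest + slack_bits]


def choose_model(thetas, heads, tails, sub_text):
    """Sender: let the choice among acceptable models carry the sub-text."""
    choices = acceptable_models(thetas, heads, tails)
    return choices[sub_text % len(choices)]


def recover_sub_text(thetas, heads, tails, theta_sent):
    """Receiver: having decoded the data, rebuild the choices and look up."""
    return acceptable_models(thetas, heads, tails).index(theta_sent)


thetas = [i / 20 for i in range(1, 20)]  # hypothetical model grid
heads, tails = 7, 13
theta = choose_model(thetas, heads, tails, sub_text=2)
n_choices = len(acceptable_models(thetas, heads, tails))
gross = message_length(theta, heads, tails, len(thetas))
net = gross - math.log2(n_choices)  # discount the bits the choice carried
print(recover_sub_text(thetas, heads, tails, theta), gross, net)
```

The discount of `log2(n_choices)` bits is the part of the message that "perhaps shouldn't be counted". The sketch is crude in that the acceptable models are not all equally short, so a careful scheme would weight the choice rather than treat it as uniform.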
I propose that:
- Such a text/sub-text protocol is the best way of thinking about MML. Assume any model/data pair is part of an ongoing communication. This allows the lattice constant terms to be dropped from the equations, making everything rather simpler. (Note: Chris Wallace has published a paper along these lines, so this isn't a new idea on my part. There's been a curious lack of follow-through on the idea, though.)
- Text/sub-text encoding is also a good description of human language usage. Note also that in noisy environments, or environments that require high accuracy (e.g. aeroplane radio, military), sub-text tends to get dropped to increase the robustness of the text.