From Newsgroup: comp.arch
On 3/7/2025 2:29 PM, MitchAlsup1 wrote:
> A "good try" at encryption is what engineers show management
> in order to claim they know what they are doing {{even when
> they really don't}}.
> I was in the meetings where the AMD architecture team discussed
> this "security issue" and I can name names.
Not sure about the specifics of this case.
But, sometimes one can also use encryption mostly as a legal tool (say,
for anti-tampering).
Like, if it is just bare data, they can't do as much.
But, if encryption or similar is involved, they can bring in the full
force of the law...
In the latter case, the encryption would often be something like XOR'ing
with a bit pattern or a Caesar cipher or similar.
Like, say, lazy man's encryption could be something like:
void encode(void *dst, void *src, int sz, uint64_t key)
{
    uint64_t *cs, *ct, *cse;
    //round size up to whole 8-byte words; parens needed, as + binds tighter than >>
    cs=src; cse=cs+((sz+7)>>3); ct=dst;
    while(cs<cse)
        { *ct++=(*cs++)+key; }
}

void decode(void *dst, void *src, int sz, uint64_t key)
    { encode(dst, src, sz, (~key)+1); }   //(~key)+1 == -key, undoes the add
Where, in this case, the strength (or lack thereof) doesn't really matter.
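
As a quick check of why decode works: (~key)+1 is the two's complement negation of key, so adding it cancels the earlier add modulo 2^64. A minimal sketch, assuming whole 8-byte words and at most 16 of them; add_key and roundtrip_ok are names made up for this illustration, not part of the code above.

```c
#include <stdint.h>
#include <stddef.h>

/* Add 'key' to each 64-bit word (unsigned, so it wraps modulo 2^64). */
static void add_key(uint64_t *dst, const uint64_t *src, size_t nwords,
                    uint64_t key)
{
    size_t i;
    for (i = 0; i < nwords; i++)
        dst[i] = src[i] + key;
}

/* Returns 1 if adding key and then adding (~key)+1 restores the input.
   Limited to 16 words to keep the scratch buffers on the stack. */
static int roundtrip_ok(const uint64_t *src, size_t nwords, uint64_t key)
{
    uint64_t enc[16], dec[16];
    size_t i;
    if (nwords > 16)
        return 0;
    add_key(enc, src, nwords, key);
    add_key(dec, enc, nwords, (~key) + 1);   /* same as adding -key */
    for (i = 0; i < nwords; i++)
        if (dec[i] != src[i])
            return 0;
    return 1;
}
```

Calling roundtrip_ok() on any buffer of up to 16 words should return 1 for any key, including 0.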
If you happen to already know some of the non-encoded data, breaking
this is trivial (and figuring out 8 bytes is enough to decode the whole
thing). Only reason to do it 8 bytes at a time (vs 1 byte) is because 8
bytes is faster.
But, if encoding a known format (say, PE/COFF or WAV or similar), one
could probably crack it very quickly relying on some basic knowledge of
the file format (e.g., where to find magic numbers and blobs of NUL bytes).
Could potentially break it in under 1000 clock-cycles this way.
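
The known-plaintext break of the add-a-key scheme is a single subtract. A rough sketch, assuming the attacker knows the original contents of one aligned 8-byte word; recover_key is a hypothetical helper, not something from the code above. Note that a run of 8 NUL bytes leaks the key directly, since 0+key is just key.

```c
#include <stdint.h>
#include <string.h>

/* Recover the additive key from one encoded 8-byte word whose original
   contents are known. */
static uint64_t recover_key(const void *enc_word, const void *known_word)
{
    uint64_t c, p;
    memcpy(&c, enc_word, 8);    /* memcpy avoids alignment/aliasing issues */
    memcpy(&p, known_word, 8);
    return c - p;               /* c = p + key (mod 2^64), so key = c - p */
}
```

With the key in hand, the rest of the message falls to the same decode() call the legitimate user would make.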
Or, maybe they could make it a little stronger by using a PRNG...
uint64_t permuteKey(uint64_t key)
{
    uint64_t ckey, cklo, ckhi, cka;
    cklo=((uint32_t)(key>> 0))*0xE20B7AC6ULL;   //*1
    ckhi=((uint32_t)(key>>32))*0xE20B7AC6ULL;
    cka=(ckhi>>32)|((cklo>>32)<<32);
    ckey=key+cka;
    return(ckey);
}
*1: Written in a form that can be turned into a (faster) 32-bit widening
multiply, for targets where a full 64-bit multiply is unreasonably slow.
In this case, the multiplies serve to mix the bits around somewhat.
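
To spell out the widening-multiply form: both operands fit in 32 bits, so the 64-bit product can't overflow, and a compiler can typically lower it to one 32x32->64 multiply rather than a full 64x64 one. A small sketch; mul32x32_64 and mix_half are names invented here for illustration.

```c
#include <stdint.h>

/* 32x32->64 widening multiply: the product of two 32-bit values always
   fits in 64 bits. */
static uint64_t mul32x32_64(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* One half of the mixing step in permuteKey, written with the helper. */
static uint64_t mix_half(uint32_t half)
{
    return mul32x32_64(half, 0xE20B7AC6u);
}
```

The high 32 bits of each product are what permuteKey folds back into the key.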
void encode(void *dst, void *src, int sz,
    uint64_t key1, uint64_t key2)
{
    uint64_t ckey, cka, ckb, ckc, ckstep, v;
    uint64_t *cs, *ct, *cse;
    int n;

    //parens needed: + binds tighter than >>
    cs=src; cse=cs+((sz+7)>>3); ct=dst;

    //setup cost, likely expensive, probably unavoidable
    cka=key1; ckb=key2; ckc=key1^key2;
    ckey=((uint32_t)ckc)*0xE20B7AC6ULL;
    n=(ckey>>32)&63;
    while(n--)
        cka=permuteKey(cka);
    n=(ckey>>38)&63;
    while(n--)
        ckb=permuteKey(ckb);
    n=(ckey>>44)&15;
    while(n--)
        ckc=permuteKey(ckc);

    ckey=cka+ckb; n=64;
    ckstep=ckey+ckc;
    ckey=permuteKey(ckey);      //(strength boost)
    ckstep=permuteKey(ckstep);  //?

    while(cs<cse)
    {
        v=(*cs++);
        n--;
        *ct++=v^ckey;
        ckey+=ckstep;   //weak, but cheap-ish...
        ckstep=(ckstep<<1)^(ckstep>>27);    //? (strength boost)

        //permute key, stronger but slow
        if(!n)
        {
            cka=permuteKey(cka);
            ckb=permuteKey(ckb);
            ckc=permuteKey(ckc);
            ckey=cka+ckb;
            ckstep=ckey+ckc;
            ckey=permuteKey(ckey);      //? (strength boost)
            ckstep=permuteKey(ckstep);  //?
            n=64;   //so only do it rarely
        }
    }
}
Where, it would be no longer sufficient to know N bytes of payload data
to break it. As for whether it would be acceptably cheap/fast is unknown.
To try to limit computational cost, only permute keys once every 512
bytes or so (though, it would still be fairly weak within each 512-byte
block; but doing this too often could negatively affect data throughput).
Could be made faster (say, by working 32 bytes at a time), but would
probably get a bit too bulky for use as an example here (but, I suspect
it could be possible to get it within around 80% of memcpy speed with
some creative unrolling).
Switched to XOR in the example (as the final data-facing step), which
avoids needing a separate decoder function.
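
The property being relied on is just that XOR'ing with the same keystream word twice gives back the original. A stripped-down sketch, using a deliberately trivial keystream (seed plus a fixed step) as a stand-in for the ckey/ckstep sequence; xor_stream and its parameters are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* XOR each word with a toy keystream. Running it a second time with the
   same seed and step restores the input, so the one function serves as
   both encoder and decoder. */
static void xor_stream(uint64_t *dst, const uint64_t *src, size_t nwords,
                       uint64_t seed, uint64_t step)
{
    uint64_t k = seed;
    size_t i;
    for (i = 0; i < nwords; i++) {
        dst[i] = src[i] ^ k;
        k += step;      /* keystream depends only on seed/step, not the data */
    }
}
```

This only works because the keystream never depends on the data being encoded; if it did, encode and decode would need to differ again.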
Or, a possible faster/cheaper intermediate option being to not
re-permute mid-stream.
Though, if one had a chunk of known data (*2), it could be possible to
work out the step values (using the power of integer subtract), and
break the rest. So, probably not sufficient... (Maybe passable if this
strategy would only break a small chunk of data).
*2: Say, magic numbers or known locations where one is likely to find
blobs of NUL bytes or similar given the file format.
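
Roughly how the subtract trick would go against the inner loop of the PRNG version: with two consecutive known words (16 bytes), XOR gives two keystream values, their difference gives ckstep, and from there the in-block update is fully predictable. A sketch under the assumption that all the words fall within one 64-word block (so no re-permute happens in between); next_step and break_block_tail are hypothetical names.

```c
#include <stdint.h>
#include <stddef.h>

/* In-block step update, as in the inner loop of the PRNG version. */
static uint64_t next_step(uint64_t s)
{
    return (s << 1) ^ (s >> 27);
}

/* c[0..n-1]: ciphertext words from within a single 64-word block.
   p0, p1:    known plaintext for c[0] and c[1].
   out:       receives the recovered plaintext for all n words. */
static void break_block_tail(uint64_t *out, const uint64_t *c, size_t n,
                             uint64_t p0, uint64_t p1)
{
    uint64_t k, s;
    size_t i;
    if (n < 2)
        return;
    out[0] = p0;
    out[1] = p1;
    k = c[1] ^ p1;          /* keystream word at position 1 */
    s = k - (c[0] ^ p0);    /* step that was added between positions 0 and 1 */
    for (i = 2; i < n; i++) {
        s = next_step(s);   /* step evolves deterministically */
        k += s;
        out[i] = c[i] ^ k;
    }
}
```

This only runs forward to the next re-permute, so what it recovers is the tail of one 512-byte block rather than the whole message.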
Say, it probably at least needs to look like it would be hard to break,
and not something where someone can look at it and figure out that the
key could be broken by subtracting pairs of values and then effectively
having captured the key-state for the whole message...
While, ideally, also not adding too much computational overhead.
Though, not sure where exactly would be the lower bar here (probably
needs to at least appear like it would work).
...