Encoding
Encoding πΌοΈοΈ is a way of presenting data. Unlike encryption π, anyone who can identify the algorithm used can decode the message.
A radix, or base, is a set of unique characters that we can use to encode a message. The most well-known ones are
- binary (radix 2, $[0-1]$)
- octal (radix 8, $[0-7]$)
- decimal (radix 10, $[0-9]$)
- hexadecimal (radix 16, $[0-9]$ and $[A-F]$)
- base32 ($[A-Z]$ and $[2-7]$ or $[A-V]$ and $[0-9]$)
- base64 ($[A-Z]$ and $[0-9]$ and $[+/]$ and "=" for padding)
β‘οΈ "Radix n" or "Radix-n" are both valid and commonly used.
Some common rules π
- In a $radix\ n$, values go from $0$ to $n-1$
- After 9, we are using letters
- After 35, we are using symbols
- ...
- $(n)_{k}$ means that the number $n$ is in a radix $k$
Some tools to detect the encoding/decode/encode π
- CyberChef (21.2k β) | Online version
- Burp Suite Decoder
- Decodify (0.8k β)
- binaryhexconverter.com
- rapidtables.com/base converter
- dencode.com
Division by base
Division by base is a simple and straightforward way to convert numbers from one base to another.
- $a = \text{your_number}$
- $n = \text{your_radix}$
- do while $a > 0$
- $q_i = \frac{a}{n}$
- $r_i = a\ mod\ n$
- $a = q_i$
The output is a set of $r_i$. You may have to convert them. For instance, $15$ will be converted to $F$. The final value is the concatenation of every $r_i$ in the reverse order (from the last to the first).
β‘οΈ There are other techniques.
Example π₯
- $a = 6072$
- $n = 15$
- do while $a > 0$
- $q_0 = 6072 / 15 = 404$
- $r_0 = 6072\ mod\ 15 = 12$
- $q_1 = 404 / 15 = 26$
- $r_1 = 404\ mod\ 15 = 14$
- $q_2 = 26 / 15 = 1$
- $r_2 = 26\ mod\ 15 = 11$
- $q_3 = 1 / 15 = 0$
- $r_3 = 1\ mod\ 15 = 1$
- exit, $a$ is now 0
Then we convert $12=C$, $14=E$, $11=B$, and concatenating them in reverse order giving us $(6072)_{10} = (1BEC)_{15}$.
Radix 2 - π₯οΈ
Radix 2, commonly called binary, is a base made of one and zero. It's the language used by machines π₯οΈ.
To convert a binary to a decimal, and vice versa, you need to know every power of two ($2^9 = 512 \ldots$).
$2^8$ | $2^7$ | $2^6$ | $2^5$ | $2^4$ | $2^3$ | $2^2$ | $2^1$ | $2^0$ |
---|---|---|---|---|---|---|---|---|
256 | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
Every number can be expressed as a sum of powers of two. If you use a power of two when expressing a number, then add a 0, else 1.
Radix 10 to Radix 2
- $125 = 64+32+16+8+4+1$
- $125 = 2^6+2^5+2^4+2^3+2^2+2^0$
- $125 = {\color{red} 1 *} 2^6+ {\color{red} 1 *} 2^5+ {\color{red} 1 *} 2^4+ {\color{red} 1 *} 2^3+ {\color{red} 1 *} 2^2 + {\color{green} 0 *} * 2^1 + {\color{red} 1 *}2^0$
- $(125)_{10} = ({\color{red} 11111} {\color{green}0} {\color{red}1})_{2}$
From Radix 2 to Radix 10
- $(1111101)_{2}$
- There are 7 digits, so the first is $2^6$
- $1 * 2^6 + 1 * 2^5 + 1 * 2^4+ 1 * 2^3+ 1 * 2^2+ 0 * 2^1+ 1 * 2^0$
- $64 + 32 + 16 + 8 + 4 + 0 + 1$
- $125$
Radix 8 - π
Radix 8, commonly called octal, is a base made of numbers from zero to seven. It's not commonly used π.
Radix 8 numbers may, or may not, start with a 0
(zero), such as 07
. The presence of this zero indicates that this is an octal number.
3 binary digits are equal to one octal number.
From Radix 8 to Radix 2
- Given $(175)_{8}$
- $(1)_8 = ({\color{grey}00}1)_2$
- $(7)_8 = (111)_2$
- $(5)_8 = (101)_2$
- Giving us $(175)_{8}=({\color{grey}00}1111101)_{2}=(1111101)_{2}$
Radix 2 to Radix 8
- Given $(1111101)_{2}$, we need 2 leading zeros
- $(001111101)_{2}$
- Convert each group of 3 digits to radix 10 π¦
- $(001)_2$ is equal to $0+0+1=(1)_{10}$
- $(111)_2$ is equal to $4+2+1=(7)_{10}$
- $(101)_2$ is equal to $4+0+1=(5)_{10}$
- So we have $(1111101)_{2}=(175)_{8}=0175$
β‘οΈ We convert each group to radix 10, but it's the same as converting to radix 8, as the maximum value is 7. We say "radix 10" to avoid a recursive explanation.
Radix 16 - π
Radix 16, commonly called hexadecimal, is a base made of numbers from zero to 9, and letters from A to F. It replaced octal, and is the most popular way to write shorter binary numbers π.
Hexadecimal numbers usually start with a 0x
("zero x"), such as 0x7
.
4 binary digits are equal to one hexadecimal number.
Radix 10 and Radix 16
From 0 to 9, there are no changes. From 10 to 15, we use letters:
10 | 11 | 12 | 13 | 14 | 15 |
---|---|---|---|---|---|
A | B | C | D | E | F |
Radix 16 to Radix 2
- Given $(7D)_{16}$
- $(7)_{16} = ({\color{grey}0}111)_2$
- $(D)_{16} = (13)_{10} = (1101)_2$
- Giving us $(7D)_{16}=({\color{grey}0}1111101)_{2}=(1111101)_{2}$
Radix 2 to Radix 16
- Given $(1111101)_{2}$. We need 1 leading zero for 2 groups of 4.
- $(01111101)_{2}$
- Convert each group of 3 digits to radix 10 then radix 16 π¦
- $(0111)_2$ is equal to $1+2+4=(7)_{10}=(7)_{16}$
- $(1101)_2$ is equal to $1+4+8=(13)_{10}=(D)_{16}$
- So we have $(7D)_{16}$ or $\text{0x}7D$
Base64 - βοΈ
Base64 is usually used to encode an image/..., so that we can transfer it as a string. Most base64 strings are ending with "=", or "==", which is the padding.
On Linux
$ echo -n "toto" | base64
dG90bw==
$ cat /etc/passwd | base64 -w 0 # inline
...
$ echo 'dG90bw==' | base64 -d
toto
On Windows
PS> [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes('toto'))
dG90bw==
PS> [Convert]::ToBase64String([System.Text.Encoding]::Unicode.GetBytes('toto'))
dABvAHQAbwA=
PS> [System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String("dG90bw=="))
PS> [IO.File]::WriteAllBytes("file", [Convert]::FromBase64String("dABvAHQAbwA="))
PS> cat file
toto
PS> [Convert]::ToBase64String((Get-Content -path "someFile" -Encoding byte))
...
URL encoding - π
URL encoding, also known as percent-encoding, is an encoding mostly used in URLs and resources, to encode characters that have a special meaning in URLs.
$ echo "encode me" | jq -sRr @uri
π See also: urlencoder or url-encode-decode.
Example of encoding .
- Find the ASCII of "
.
":46
- Convert the value to hexadecimal:
2e
- Add
%
before the result:%2e
Double URL encoding for .
:
- The encoding of
%
is%25
- The encoding of
.
is%2e
- The result:
urlencode(urlencode(.)) == urlencode(%2e) == %252e
π» To-do π»
Stuff that I found, but never read/used yet.
- binary-coded decimal (BCD)
- Can put a file in cyber chef
- Extract strings
- Find/Replace to remove patterns
- Drop bytes to remove chars
- Defang URL (avoid clicking)
- Extract URLs
- leetspeak
hexdump/xxd
: convert some text to octal/hexadecimal, and π
-
-b
: to octal -
-C
: to hexadecimal -
-e
: customize
$ hexdump <<< "Hello, World" > hello_world.hex
$ cat hello_world.hex
0000000 6548 6c6c 2c6f 5720 726f 646c 000a
000000d
# letters were mixed (WTF!!?)
$ echo -e "\x65"
e
$ echo -e "\x48"
H
# here it works fine
$ xxd <<< "Hello, World" | tee hello_world.hex
00000000: 4865 6c6c 6f2c 2057 6f72 6c64 0a Hello, World.
# reverse
$ xxd -r hello_world.hex
Hello, World
$ sudo apt install xxd
$ xxd -p -rd <<< <SomeHexa>
k#n
: convert a number $n$ in radix-$k$ to decimal
$ echo $[2#101] # 5
bytes.fromhex('...').decode()
base64.b64encode(bytes)