Files and Data

To understand the utilities and procedures used in the next challenges, you need the basics of data encoding. The key point is that a computer system stores and manipulates data only in binary form. Here we are using the low-level definition of a file: a finite sequence of discrete values, each with only two possible states, usually labeled zero and one.
So far, we have been dealing with files made entirely of printable characters. These are often called ASCII files in the high-level sense of the term. At the low level, however, those files are obviously made of binary data too, in the way they are arranged on disk or in memory. With this distinction in mind, note that people usually speak of binary files in the high-level sense as well, meaning files that are not plain text.
This distinction makes it clear that a file can hold other kinds of data besides printable characters. A computer file can store encoded data, machine code, directory information, and so on. It can also contain a mixture of two or more of these data types.
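As a small illustration of this point, the sketch below prints the bytes behind two inputs, one printable and one not. The helper name `show_bytes` is made up for this example, and it relies only on `od` and `tr`.

```shell
# Show the raw bytes behind some input as two-digit hexadecimal numbers.
show_bytes() {
    od -An -v -tx1 | tr -d ' \n'
    echo
}

# Printable text: every character is stored as one byte.
printf 'Hi' | show_bytes        # prints 4869

# Non-printable data is stored in exactly the same way.
printf '\000\377' | show_bytes  # prints 00ff
```

Both inputs are plain bytes on disk; the only difference is whether those bytes happen to map to printable characters.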
The following levels introduce several tools. Most of them display the contents of files in various formats, which helps when you are trying to interpret unknown data in a given file.
LEVEL 9 -> LEVEL 10
The password for the next level is stored in the file data.txt in one of the few human-readable strings, preceded by several ‘=’ characters.
Commands you may need to solve this level
grep, sort, uniq, strings, base64, tr, tar, gzip, bzip2, xxd
The goal is to find a particular string of characters in a given file. If you print the file with cat, meaningless characters fill the screen, because the file contains more than ASCII characters. We are told that it holds a few readable strings, but the rest can be any type of data.
The suggested command, strings, is a program that outputs the sequences of printable characters present in a file. After passing it the file data.txt, you can see that the file contains many printable lines. To filter the output properly, rely on the second hint: pipe the output of strings to grep so that only the lines matching the desired pattern are kept.
$ strings data.txt | grep "=="
========== the
bu========== password
4iu========== is
b~==P
========== G7w8LIi6J3kTb8A7j9LgrywtEUlyyp6s
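To see roughly what strings does, here is a sketch that imitates it with tr and grep alone. It is only an approximation under stated assumptions: the helper `printable_runs` and the sample file are hypothetical, and the real tool handles more cases.

```shell
# Rough stand-in for `strings`: turn every non-printable byte into a
# newline, then keep only runs of at least four characters.
printable_runs() {
    LC_ALL=C tr -c '[:print:]' '\n' < "$1" | grep -E '.{4,}'
}

# Hypothetical sample: one readable phrase surrounded by binary junk.
sample=$(mktemp)
printf '\001\002==== secret\000\377ab\003' > "$sample"

printable_runs "$sample" | grep '=='   # prints: ==== secret
rm -f "$sample"
```

The four-character minimum mirrors the default of strings, which is why the short run `ab` does not show up.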
LEVEL 10 -> LEVEL 11
The password for the next level is stored in the file data.txt which contains base64 encoded data.
Commands you may need to solve this level
grep, sort, uniq, strings, base64, tr, tar, gzip, bzip2, xxd
- Helpful Reading Material
Base64 on Wikipedia
Base64 is a binary-to-text encoding scheme. The Base64 alphabet consists of the alphanumeric characters plus the + and / symbols, with the = sign reserved for padding. Encoded data can therefore often be recognised by the one or two equal signs that frequently appear at its end.
The command base64 allows for encoding and decoding in Base64. For decoding, you need to use the flag -d.
$ cat data.txt
VGhlIHBhc3N3b3JkIGlzIElGdWt3S0dzRlc4TU9xM0lSRnFyeEUxaHhUTkViVVBSCg==
$ base64 -d data.txt
The password is IFukwKGsFW8MOq3IRFqrxE1hxTNEbUPR
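The padding rule is easy to check by hand. The sketch below assumes the GNU coreutils `base64` command; the helper `b64` is a name made up for this example.

```shell
# Encode a string (without a trailing newline) as Base64.
b64() {
    printf '%s' "$1" | base64
}

b64 'a'     # YQ==  one input byte, two padding signs
b64 'ab'    # YWI=  two input bytes, one padding sign
b64 'abc'   # YWJj  three input bytes, no padding

# Decoding reverses the encoding and prints abc.
b64 'abc' | base64 -d
```

Padding only appears when the input length is not a multiple of three bytes, so Base64 data does not always end in an equal sign.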
LEVEL 11 -> LEVEL 12
The password for the next level is stored in the file data.txt, where all lowercase (a-z) and uppercase (A-Z) letters have been rotated by 13 positions
Commands you may need to solve this level
grep, sort, uniq, strings, base64, tr, tar, gzip, bzip2, xxd
- Helpful Reading Material
Rot13 on Wikipedia
The password for the next level is in an encrypted file. The plain data has been processed with the ROT13 algorithm, a simple substitution cipher that replaces every letter in its input with the one 13 positions further along the alphabet, wrapping around at the end. Below is a table of the function’s mappings:
Input | Output |
---|---|
a | n |
b | o |
c | p |
d | q |
… | … |
A | N |
B | O |
C | P |
… | … |
Y | L |
Z | M |
This function has the particular property of being its own inverse. This means when you apply the function twice to the same character you ‘return’ to its original value. Observe, for instance, that “Bad” ciphers to “Onq” when applying the algorithm and then “Onq” becomes “Bad” again when applying it a second time.
To decrypt this data, pass the ciphertext through the same function used for encryption:
Map all characters in the set 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
to the corresponding characters of the set 'NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm'
The command tr (translate) can translate from one set to the other. Piping the output of cat to tr and using the two sets as arguments yields the password for the next level:
$ cat data.txt | tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' 'NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm'
The password is JVNBBFSmZwKKOP0XbFXOoW8chDz5yVRv
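The same mapping can be written more compactly with character ranges. This is a sketch of an equivalent form, with a hypothetical helper named `rot13`; it also demonstrates the self-inverse property using the “Bad” example from above.

```shell
# ROT13 using character ranges instead of spelled-out alphabets.
rot13() {
    tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

echo 'Bad' | rot13           # Onq
echo 'Bad' | rot13 | rot13   # Bad
```

Both sets still contain 52 characters in matching order, so the result is the same as with the long form.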
LEVEL 12 -> LEVEL 13
The password for the next level is stored in the file data.txt, which is a hexdump of a file that has been repeatedly compressed. For this level it may be useful to create a directory under /tmp in which you can work using mkdir. For example: mkdir /tmp/myname123. Then copy the datafile using cp, and rename it using mv (read the manpages!)
Commands you may need to solve this level
grep, sort, uniq, strings, base64, tr, tar, gzip, bzip2, xxd, mkdir, cp, mv, file
- Helpful Reading Material
Hex dump on Wikipedia
The advice for this level is to start by copying the file into a temporary folder, so that you can work toward the password in an orderly manner.
Let’s make a directory under /tmp:
$ mkdir /tmp/foobarbaz
Then copy the file to the created folder:
$ cp data.txt /tmp/foobarbaz
Change into the directory:
$ cd /tmp/foobarbaz
Use the command mv to rename the file:
$ mv data.txt data
The file to be manipulated is a hexdump: a file whose binary data has been rewritten as text, usually so that it can be inspected or analyzed further. Each byte (a group of 8 bits) is represented as a two-digit hexadecimal number.
Linux usually ships with several utilities for creating and manipulating hexdumps. After exploring the manpages of the suggested commands, xxd seems to be the proper tool. It is a utility that creates hexadecimal representations of binary files, and vice versa. To reverse a hexdump into a binary, pass the flag -r.
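To make the idea concrete, the following sketch dumps a few bytes as hexadecimal digits and then turns the digits back into bytes, using only od, tr, fold and printf. The helpers `to_hex` and `from_hex` are hypothetical names for this example; a real hexdump tool also prints offsets and a text column, which this sketch leaves out.

```shell
# Dump: print every input byte as a two-digit hexadecimal number.
to_hex() {
    od -An -v -tx1 | tr -d ' \n'
    echo
}

# Reverse: read the digits two at a time and emit the byte each pair names.
from_hex() {
    for pair in $(tr -d ' \n' | fold -w 2); do
        # Convert the hex pair to octal first, because %b expects \0ddd.
        printf '%b' "\\0$(printf '%03o' "0x$pair")"
    done
}

printf 'Hi!\n' | to_hex       # 4869210a
echo '4869210a' | from_hex    # Hi!
```

Nothing is lost in either direction, which is why a hexdump can always be turned back into the original binary.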
$ xxd -r data data2
This writes a new file, data2, to the folder; it should be a compressed file. The file command determines the type of compression or contents of this data.
$ file data2
data2: gzip compressed data, was "data2.bin", last modified: Thu May 7 18:14:30 2020, max compression, from Unix
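One reason a type can be recognised without any file extension is that many formats begin with a fixed signature. As a sketch, assuming gzip is installed, the snippet below shows that gzip output always starts with the bytes 1f 8b; the helper `first_two_bytes` is a made-up name.

```shell
# Print the first two bytes of standard input in hexadecimal.
first_two_bytes() {
    od -An -tx1 -N2 | tr -d ' \n'
    echo
}

# Whatever gets compressed, the gzip stream starts the same way.
printf 'some text' | gzip -c | first_two_bytes    # 1f8b
printf 'other data' | gzip -c | first_two_bytes   # 1f8b
```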
Knowing that its type is gzip tells you how to treat this file. But there is more to the file command. One of its flags, -Z, makes it try to look inside compressed files:
$ file -Z data2
data2: bzip2 compressed data, block size = 900k
More compression. Let’s move on and decompress data2 using gzip -d, which is equivalent to gunzip:
$ gzip -d data2
gzip: data2: unknown suffix -- ignored
A small error. A quick search reveals that gunzip is one of the tools that do care about the extension a file is named with. Use mv to add an extension before trying to decompress again:
$ mv data2 data2.bin.gz
$ gzip -d data2.bin.gz
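Renaming is not the only way around the suffix check. As an alternative sketch, gzip can read the compressed data from standard input, in which case there is no file name for it to inspect. The file names below are hypothetical stand-ins, not the level's actual files.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for data2: gzip data stored under a name without ".gz".
printf 'hello\n' | gzip -c > "$workdir/blob"

# Reading from standard input means gzip never looks at the file name.
result=$(gzip -dc < "$workdir/blob")
echo "$result"   # hello

rm -rf "$workdir"
```

With this form you choose the output name yourself by redirecting standard output to a file.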
Now run file again, and also file -Z to peek inside:
$ file data2.bin
data2.bin: bzip2 compressed data, block size = 900k
$ file -Z data2.bin
data2.bin: gzip compressed data, was "data4.bin", last modified: Thu May 7 18:14:30 2020, max compression, from Unix
You must use bunzip2 to decompress this type of file:
$ bunzip2 data2.bin
bunzip2: Can't guess original name for data2.bin -- using data2.bin.out
The message warns that bunzip2 could not guess the original file name. Nevertheless, the task has been successful: the decompressed file is named data2.bin.out. Run file to check its type and contents:
$ file data2.bin.out
data2.bin.out: gzip compressed data, was "data4.bin", last modified: Thu May 7 18:14:30 2020, max compression, from Unix
$ file -Z data2.bin.out
data2.bin.out: POSIX tar archive (GNU)
After decompressing with gzip once more, you obtain two POSIX tar archives, one nested inside the other. Extract the first:
$ tar -xf data2.bin.out
And then data5.bin:
$ tar -xf data5.bin
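When archives are nested like this, it can help to list what an archive holds before extracting it. The sketch below builds a throwaway archive to show the idea; the file names `inner.bin` and `outer.tar` are hypothetical and unrelated to the level's files.

```shell
workdir=$(mktemp -d)

# Hypothetical archive with a single member, built just for this demo.
printf 'payload\n' > "$workdir/inner.bin"
tar -cf "$workdir/outer.tar" -C "$workdir" inner.bin

# -t lists the member names without writing anything to disk.
members=$(tar -tf "$workdir/outer.tar")
echo "$members"   # inner.bin

rm -rf "$workdir"
```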
The extraction yields a file named data6.bin, of type bzip2, which compresses yet another tar archive:
$ bunzip2 data6.bin
bunzip2: Can't guess original name for data6.bin -- using data6.bin.out
$ tar -xf data6.bin.out
Now, with data8.bin, you may be very close to solving this puzzle. It is a gzip file containing an ASCII text file.
$ file data8.bin
data8.bin: gzip compressed data, was "data9.bin", last modified: Thu May 7 18:14:30 2020, max compression, from Unix
$ file -Z data8.bin
data8.bin: ASCII text
After adding an extension and decompressing the file, the final step is to use cat to read the text it stores:
$ mv data8.bin data8.z
$ gunzip data8.z
$ ls
data data2.bin.out data5.bin data6.bin.out data8
$ cat data8
The password is wbWdlBxEir4CaE8LaPhauuOo6pwRmrDw
Finally, you should clean up and delete the whole folder used. rm with the options -f and -r removes files and folders recursively, without ever prompting.
$ cd ~
$ rm -rf /tmp/foobarbaz
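If you would rather not invent a directory name, mktemp can create a unique one for you. This is a sketch of the same create, work, and clean up cycle, assuming the `mktemp` utility is available; the file `scratch` is only a placeholder for real work.

```shell
# Let the system pick a unique directory name instead of inventing one.
workdir=$(mktemp -d)
echo "working in $workdir"

# Placeholder for the real work, such as copying data.txt here.
touch "$workdir/scratch"

# Remove the directory and everything in it when done.
rm -rf "$workdir"
```

Keeping the path in a variable also avoids typing it twice, which lowers the risk of pointing rm -rf at the wrong folder.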