Cubicle Riddle Writeup - Cyber Apocalypse 2024
→ 1 Introduction
This writeup covers the Cubicle Riddle Misc challenge from the Hack The Box Cyber Apocalypse 2024 CTF, which was rated ‘easy’. The challenge involved generating Python bytecode that meets certain constraints.
The description of the challenge is shown below.
→ 2 Key Techniques
The key techniques employed in this writeup are:
- Manual Python source code review
- Using dis to disassemble Python bytecode
- Generating Python bytecode to meet certain constraints
→ 3 Artifacts Summary
The downloaded artifact had the following hash:
$ shasum -a256 misc_cubicle_riddle.zip
6a658def36d661b328f17c3dc66ddc412c23a66c02a2e65988e2b0983225364a misc_cubicle_riddle.zip
The zip file contained three Python source code files:
$ unzip misc_cubicle_riddle.zip
Archive: misc_cubicle_riddle.zip
creating: challenge/
creating: challenge/riddler/
inflating: challenge/riddler/riddler.py
inflating: challenge/server.py
inflating: challenge/exceptions.py
$ shasum -a256 $(find challenge -type f)
e8402a96b6ac02e87cae04abd62d9d5a660461c17ecf672fabbf249bc66a94a8 challenge/riddler/riddler.py
9141365b9bc69d50f3a2eab927e1267981a9fce73833b1bd0b99efe6e76b5267 challenge/exceptions.py
c8ecb85ef0aebb709f9816d58371f661d6ef6d0b619c856499aa58ff1c50e2d7 challenge/server.py
→ 4 Dynamic analysis
There was a fair amount of code in the downloaded artifact, so to get a feel for the program's behavior, a connection was first established to the remote endpoint. After choosing “Approach the cube…”, a riddle was presented. However, it was not obvious what input format was expected, so static analysis was the next step.
$ nc -n -v 83.136.254.16 33284
(UNKNOWN) [83.136.254.16] 33284 (?) open
___________________________________________________________
While journeying through a dense thick forest, you find
yourself reaching a clearing. There an imposing obsidian
cube, marked with a radiant green question mark,halts your
progress,inviting curiosity and sparking a sense of wonder.
___________________________________________________________
> 1. Approach the cube...
> 2. Run away!
(Choose wisely) > 1
___________________________________________________________
As you approach the cube, its inert form suddenly comes to
life. It's obsidian parts start spinning with an otherwordly
hum, and a distorted voice emanates from withing, posing a
cryptic question that reverbates through the clearing,
, shrouded in mystery and anticipation.
___________________________________________________________
> Riddler: 'In arrays deep, where numbers sprawl,
I lurk unseen, both short and tall.
Seek me out, in ranks I stand,
The lowest low, the highest grand.
What am i?'
(Answer wisely) >
→ 5 Static Analysis
→ 5.1 server.py
server.py contains the code that operates the program’s menu. On line 46, the user’s answer, received after choosing to approach the cube, is decoded from UTF-8 bytes into a str. Line 47 calls self._format_answer to format the answer, then delegates to the Riddler class to test whether the answer is correct. If it is, the flag is printed on line 53.
match chosen_action:
    case 1:
        self.req.sendall(
            b"\n___________________________________________________________\n"
            + b"\nAs you approach the cube, its inert form suddenly comes to \n"
            + b"\nlife. It's obsidian parts start spinning with an otherwordly\n"
            + b"\nhum, and a distorted voice emanates from withing, posing a \n"
            + b"\ncryptic question that reverbates through the clearing, \n"
            + b"\n, shrouded in mystery and anticipation. \n"
            + b"\n___________________________________________________________\n"
        )
        self.req.sendall(b"> Riddler:" + self.riddler.ask_riddle().encode())
        self.req.sendall(b"\n(Answer wisely) > ")
        answer: str = self.req.recv(4096).decode()
        if self.riddler.check_answer(self._format_answer(answer)):
            self.req.sendall(
                b"\n___________________________________________________________\n"
                + b"\nUpon answering the cube's riddle, its parts spin in a \n"
                + b"\ndazzling display of lights. A resonant voice echoes through\n"
                + b"\nthe woods that says... "
                + flag.encode()
                + b"\n___________________________________________________________\n"
            )
The self._format_answer function splits the input on commas, converts each element to an int, then converts the resulting list to bytes. In other words, the answer is expected to consist of comma-separated integers, where each integer is the decimal value of one byte (0 to 255).
def _format_answer(self, answer: str) -> bytearray:
    try:
        return bytes([int(b) for b in answer.strip().split(",")])
    except Exception:
        raise WrongFormatException(
            "\nFormat should be like: int_value1,int_value2,int_value3...\nExample answer: 1, 25, 121...\n"
        )
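To make the expected input format concrete, the following minimal sketch mirrors the parsing done by _format_answer (without the custom exception) and adds an inverse. The names format_answer and encode_answer are hypothetical helpers for illustration and are not part of the challenge code.

```python
def format_answer(answer: str) -> bytes:
    # Same parsing as the server's _format_answer, minus the custom exception.
    return bytes([int(b) for b in answer.strip().split(",")])


def encode_answer(payload: bytes) -> str:
    # Hypothetical inverse: raw bytes -> comma-separated decimal integers.
    return ",".join(str(b) for b in payload)


raw = b"d\x01}\x01"
encoded = encode_answer(raw)
print(encoded)                                     # 100,1,125,1
print(format_answer(encoded) == raw)               # True
print(format_answer(" 100, 1, 125, 1 \n") == raw)  # True
```

Because int() tolerates surrounding whitespace, both "100,1" and "100, 1" parse to the same bytes.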
→ 5.2 riddler.py
In riddler.py, the check_answer function delegates to self._construct_answer on line 63 to construct a function from the answer. The function is then called on line 65, and its result is expected to equal a tuple containing the minimum and maximum values of self.num_list. That list is defined on line 50 and contains 10 integers randomly chosen between -1000 and 1000 (inclusive).
The self._construct_answer function constructs a byte array consisting of self.co_code_start, the bytes from the answer, then self.co_code_end. The byte array is then used on line 79 in the construction of a code object. Thus, the answer is expected to be valid Python bytecode.
import types
from random import randint


class Riddler:
    max_int: int
    min_int: int
    co_code_start: bytes
    co_code_end: bytes
    num_list: list[int]

    def __init__(self) -> None:
        self.max_int = 1000
        self.min_int = -1000
        self.co_code_start = b"d\x01}\x01d\x02}\x02"
        self.co_code_end = b"|\x01|\x02f\x02S\x00"
        self.num_list = [randint(self.min_int, self.max_int) for _ in range(10)]

    def ask_riddle(self) -> str:
        return """ 'In arrays deep, where numbers sprawl,
I lurk unseen, both short and tall.
Seek me out, in ranks I stand,
The lowest low, the highest grand.
What am i?'
"""

    def check_answer(self, answer: bytes) -> bool:
        _answer_func: types.FunctionType = types.FunctionType(
            self._construct_answer(answer), {}
        )
        return _answer_func(self.num_list) == (min(self.num_list), max(self.num_list))

    def _construct_answer(self, answer: bytes) -> types.CodeType:
        co_code: bytearray = bytearray(self.co_code_start)
        co_code.extend(answer)
        co_code.extend(self.co_code_end)
        code_obj: types.CodeType = types.CodeType(
            1,
            0,
            0,
            4,
            3,
            3,
            bytes(co_code),
            (None, self.max_int, self.min_int),
            (),
            ("num_list", "min", "max", "num"),
            __file__,
            "_answer_func",
            "_answer_func",
            1,
            b"",
            b"",
            (),
            (),
        )
        return code_obj
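The positional arguments to types.CodeType are hard to read unlabelled. The sketch below names each one, assuming the Python 3.11 constructor ordering (an assumption based on the opcodes seen later in this writeup, since the constructor signature differs between versions), and decodes the flags value using constants from the inspect module. The code_fields dictionary is purely illustrative, and the two string placeholders stand in for values only known at runtime.

```python
import inspect

# Arguments passed to types.CodeType in riddler.py, labelled with the attribute
# each one populates. ASSUMPTION: Python 3.11 argument order.
code_fields = {
    "co_argcount": 1,
    "co_posonlyargcount": 0,
    "co_kwonlyargcount": 0,
    "co_nlocals": 4,
    "co_stacksize": 3,
    "co_flags": 3,
    "co_code": "<co_code_start + answer + co_code_end>",  # placeholder
    "co_consts": (None, 1000, -1000),
    "co_names": (),
    "co_varnames": ("num_list", "min", "max", "num"),
    "co_filename": "<__file__>",  # placeholder
    "co_name": "_answer_func",
    "co_qualname": "_answer_func",
    "co_firstlineno": 1,
    "co_linetable": b"",
    "co_exceptiontable": b"",
    "co_freevars": (),
    "co_cellvars": (),
}

flags = code_fields["co_flags"]
print(bool(flags & inspect.CO_OPTIMIZED))                          # True
print(bool(flags & inspect.CO_NEWLOCALS))                          # True
print(flags == (inspect.CO_OPTIMIZED | inspect.CO_NEWLOCALS))      # True
```

Two details stand out under this reading: co_varnames reserves slots 1 to 3 for min, max and num, and co_names is empty, which appears to rule out looking up the built-in min() and max() by name from the injected bytecode.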
→ 6 Hypothesizing what bytecode is expected
The dis module was used to disassemble the co_code_start and co_code_end bytecode in an IPython session.
- The module was imported:
In [2]: import dis
- co_code_start was disassembled. The output below has been manually commented to explain what each line does, based on the documented bytecode instructions.
  - co_consts is the “tuple of constants used in the bytecode”.
  - co_varnames is the “tuple of names of arguments and local variables”.
  - Given the constructed function is intended to find the min and max elements in num_list, the hypothesis was made that co_consts[1] and co_consts[2] are each one or other of the 1000 or -1000 constants. The constants tuple passed to types.CodeType in riddler.py, (None, self.max_int, self.min_int), is consistent with this: co_consts[1] is 1000 and co_consts[2] is -1000.
In [3]: dis.dis(b"d\x01}\x01d\x02}\x02")
          0 LOAD_CONST               1   # Pushes co_consts[1] onto the stack.
          2 STORE_FAST               1   # Stores STACK.pop() into the local co_varnames[1].
          4 LOAD_CONST               2   # Pushes co_consts[2] onto the stack.
          6 STORE_FAST               2   # Stores STACK.pop() into the local co_varnames[2].
- co_code_end was similarly disassembled.
  - Given the constructed function is intended to return a tuple containing the min and max elements in num_list, the following hypothesis was made:
    - co_varnames[1] is a local variable containing the min value.
    - co_varnames[2] is a local variable containing the max value.
In [4]: dis.dis(b"|\x01|\x02f\x02S\x00")
          0 LOAD_FAST                1   # Pushes a reference to the local co_varnames[1] onto the stack.
          2 LOAD_FAST                2   # Pushes a reference to the local co_varnames[2] onto the stack.
          4 BUILD_TUPLE              2   # Creates a tuple consuming 2 items from the stack, and pushes the
                                         # resulting tuple onto the stack.
          6 RETURN_VALUE                 # Returns with STACK[-1] to the caller of the function.
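The same two disassemblies can be reproduced by hand, which avoids depending on the local interpreter's opcode table. The sketch below is a minimal decoder: it assumes every instruction is two bytes (opcode, then a one-byte argument) and uses only the opcode numbers visible in the output above. The OPNAMES table and decode function are illustrative names, and the decoder deliberately ignores inline cache entries and extended arguments.

```python
# Opcode numbers as they appear in the disassembly above; this table is only
# valid for the interpreter version that produced that output.
OPNAMES = {
    100: "LOAD_CONST",    # b"d"
    125: "STORE_FAST",    # b"}"
    124: "LOAD_FAST",     # b"|"
    102: "BUILD_TUPLE",   # b"f"
    83: "RETURN_VALUE",   # b"S"
}


def decode(code: bytes) -> list[tuple[int, str, int]]:
    # Walk the bytes two at a time: (offset, opcode name, argument).
    out = []
    for offset in range(0, len(code), 2):
        opcode, arg = code[offset], code[offset + 1]
        out.append((offset, OPNAMES.get(opcode, f"<{opcode}>"), arg))
    return out


for snippet in (b"d\x01}\x01d\x02}\x02", b"|\x01|\x02f\x02S\x00"):
    for offset, name, arg in decode(snippet):
        print(offset, name, arg)
    print()
```

Unknown opcodes are printed as their number in angle brackets rather than guessed.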
Based on the above, the full expected function, including co_code_start and co_code_end, was hypothesized to be the following:
def _answer_func(numlist):
    min = 1000               # co_code_start
    max = -1000
    for num in numlist:      # answer
        if num <= min:
            min = num
    for num in numlist:
        if num >= max:
            max = num
    return (min,max)         # co_code_end
→ 7 Generating the bytecode
A create_hypothesised_answer_func_bytecode.py script was created to generate the bytecode for the hypothesized function.
- Line 15 compiles the source containing the hypothesized function.
- Line 19 prints the co_code of the resulting module-level code object and line 21 disassembles it. This bytecode only defines the function; it is not the bytecode of the function body.
- Line 25 execs the compiled code, making _answer_func available.
- Line 27 disassembles _answer_func and line 30 prints its co_code.
import dis

code = """
def _answer_func(numlist):
    min = 1000
    max = -1000
    for num in numlist:
        if num <= min:
            min = num
    for num in numlist:
        if num >= max:
            max = num
    return (min,max)
"""
compiled = compile(code, "_answer_func", "exec", optimize=0)

# The compiled code's co_code is not actually what we're after
co_code = compiled.co_code
print(f"co_code: {co_code}")
print("disassembled co_code is not what we're after:")
dis.dis(co_code)

# If we exec the compiled code, the _answer_func becomes available and its bytecode is what
# we're after.
exec(compiled)
print("disassembled _answer_func:")
dis.dis(_answer_func)

_answer_func_co_code = dis.Bytecode(_answer_func).codeobj.co_code
print(f"_answer_func_co_code is what we're after: {_answer_func_co_code}")
Running the script results in the following:
- The start (lines 13-17) and end (lines 47-50) of the disassembled _answer_func look very similar to the previously disassembled co_code_start and co_code_end. The only difference is an extra starting instruction on line 11. However, RESUME is a no-op (no operation) and can thus be ignored.
- Similarly, the generated bytecode on line 51 has a start and end matching co_code_start and co_code_end, except for the leading \x97\x00 corresponding to the RESUME instruction.
$ python3 create_hypothesised_answer_func_byte_code_for_writeup.py
co_code: b'\x97\x00d\x00\x84\x00Z\x00d\x01S\x00'
disassembled co_code is not what we're after:
0 RESUME 0
2 LOAD_CONST 0
4 MAKE_FUNCTION 0
6 STORE_NAME 0
8 LOAD_CONST 1
10 RETURN_VALUE
disassembled _answer_func:
2 0 RESUME 0
3 2 LOAD_CONST 1 (1000)
4 STORE_FAST 1 (min)
4 6 LOAD_CONST 2 (-1000)
8 STORE_FAST 2 (max)
5 10 LOAD_FAST 0 (numlist)
12 GET_ITER
>> 14 FOR_ITER 10 (to 36)
16 STORE_FAST 3 (num)
6 18 LOAD_FAST 3 (num)
20 LOAD_FAST 1 (min)
22 COMPARE_OP 1 (<=)
28 POP_JUMP_FORWARD_IF_FALSE 2 (to 34)
7 30 LOAD_FAST 3 (num)
32 STORE_FAST 1 (min)
>> 34 JUMP_BACKWARD 11 (to 14)
8 >> 36 LOAD_FAST 0 (numlist)
38 GET_ITER
>> 40 FOR_ITER 10 (to 62)
42 STORE_FAST 3 (num)
9 44 LOAD_FAST 3 (num)
46 LOAD_FAST 2 (max)
48 COMPARE_OP 5 (>=)
54 POP_JUMP_FORWARD_IF_FALSE 2 (to 60)
10 56 LOAD_FAST 3 (num)
58 STORE_FAST 2 (max)
>> 60 JUMP_BACKWARD 11 (to 40)
11 >> 62 LOAD_FAST 1 (min)
64 LOAD_FAST 2 (max)
66 BUILD_TUPLE 2
68 RETURN_VALUE
_answer_func_co_code is what we're after: b'\x97\x00d\x01}\x01d\x02}\x02|\x00D\x00]\n}\x03|\x03|\x01k\x01\x00\x00\x00\x00r\x02|\x03}\x01\x8c\x0b|\x00D\x00]\n}\x03|\x03|\x02k\x05\x00\x00\x00\x00r\x02|\x03}\x02\x8c\x0b|\x01|\x02f\x02S\x00'
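The opcode names and numbers in this output depend on the interpreter that produced them, so bytecode generated locally is only meaningful if the local interpreter uses the same numbering. The sketch below is one way to check that, using dis.opmap and the opcode numbers read off the output above. OBSERVED_OPCODES and local_interpreter_matches are hypothetical names, and the table is an assumption about the target, not a general fact about Python.

```python
import dis

# Opcode numbers read from the co_code bytes and disassembly above.
OBSERVED_OPCODES = {
    "RESUME": 0x97,
    "LOAD_CONST": ord("d"),
    "STORE_FAST": ord("}"),
    "LOAD_FAST": ord("|"),
    "GET_ITER": ord("D"),
    "FOR_ITER": ord("]"),
    "COMPARE_OP": ord("k"),
    "POP_JUMP_FORWARD_IF_FALSE": ord("r"),
    "JUMP_BACKWARD": 0x8C,
    "BUILD_TUPLE": ord("f"),
    "RETURN_VALUE": ord("S"),
}


def local_interpreter_matches() -> bool:
    # dis.opmap maps opcode names to numbers for the running interpreter.
    return all(
        dis.opmap.get(name) == number
        for name, number in OBSERVED_OPCODES.items()
    )


if local_interpreter_matches():
    print("local opcodes match the disassembly above")
else:
    print("local opcodes differ; locally generated bytecode may not be usable")
```

A mismatch does not prove the payload is wrong, only that the local interpreter is a poor stand-in for generating or testing it.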
→ 8 Generating the payload
A create_encoded_payload.py script was created featuring the following:
- Lines 3-32 contain code copied from Riddler.
- Line 36 defines the bytecode, minus the prefix and suffix.
- Lines 38-41 construct a function from the bytecode in the same way Riddler does.
- Lines 43-46 test that the function behaves correctly.
- Lines 48-50 encode the bytecode as comma-separated integers.
import types

# Code extracted from Riddler
co_code_start = b"d\x01}\x01d\x02}\x02"
co_code_end = b"|\x01|\x02f\x02S\x00"


def _construct_answer(answer: bytes) -> types.CodeType:
    co_code: bytearray = bytearray(co_code_start)
    co_code.extend(answer)
    co_code.extend(co_code_end)
    code_obj: types.CodeType = types.CodeType(
        1,
        0,
        0,
        4,
        3,
        3,
        bytes(co_code),
        (None, 1000, -1000),
        (),
        ("num_list", "min", "max", "num"),
        __file__,
        "_answer_func",
        "_answer_func",
        1,
        b"",
        b"",
        (),
        (),
    )
    return code_obj

# Bytes generated by create_hypothesised_answer_func_byte_code.py, minus the co_code_start prefix and
# co_code_end suffix
answer = b'|\x00D\x00]\n}\x03|\x03|\x01k\x01\x00\x00\x00\x00r\x02|\x03}\x01\x8c\x0b|\x00D\x00]\n}\x03|\x03|\x02k\x05\x00\x00\x00\x00r\x02|\x03}\x02\x8c\x0b'

# Construct a function the same way Riddler does.
_sol_answer_func: types.FunctionType = types.FunctionType(
    _construct_answer(answer), {}
)

# Test whether the function correctly returns the min and max of a list.
test_list = [18, 7, 3, 22]
print(f"test_list: {test_list}")
print(f"test_list min, max: {_sol_answer_func(test_list)}")

# Encode the payload for the challenge.
answer_encoded_bytes = ", ".join(str(b) for b in answer)
print(f"answer_encoded_bytes: {answer_encoded_bytes}")
Running the script results in a payload being generated:
$ python3 create_encoded_payload_for_writeup.py
test_list: [18, 7, 3, 22]
test_list min, max: (3, 22)
answer_encoded_bytes: 124, 0, 68, 0, 93, 10, 125, 3, 124, 3, 124, 1, 107, 1, 0, 0, 0, 0, 114, 2, 124, 3, 125, 1, 140, 11, 124, 0, 68, 0, 93, 10, 125, 3, 124, 3, 124, 2, 107, 5, 0, 0, 0, 0, 114, 2, 124, 3, 125, 2, 140, 11
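The answer bytes used in the script are the generated co_code with the RESUME instruction, the co_code_start prefix and the co_code_end suffix removed. The following sketch shows one way to do that trimming programmatically instead of by hand, using the co_code value printed in the previous section as literal data, so it does not depend on the local interpreter version. The variable names full, body and RESUME are illustrative.

```python
co_code_start = b"d\x01}\x01d\x02}\x02"
co_code_end = b"|\x01|\x02f\x02S\x00"
RESUME = b"\x97\x00"  # leading no-op seen in the generated bytecode

# co_code of _answer_func as printed by the generation script.
full = (
    b"\x97\x00d\x01}\x01d\x02}\x02"
    b"|\x00D\x00]\n}\x03|\x03|\x01k\x01\x00\x00\x00\x00r\x02|\x03}\x01\x8c\x0b"
    b"|\x00D\x00]\n}\x03|\x03|\x02k\x05\x00\x00\x00\x00r\x02|\x03}\x02\x8c\x0b"
    b"|\x01|\x02f\x02S\x00"
)

# Sanity-check the layout before slicing.
assert full.startswith(RESUME)
body = full[len(RESUME):]
assert body.startswith(co_code_start) and body.endswith(co_code_end)

# What remains between prefix and suffix is the answer.
answer = body[len(co_code_start):len(body) - len(co_code_end)]
print(len(answer))  # 52
print(", ".join(str(b) for b in answer))
```

Dropping the two-byte RESUME shifts every instruction offset, which is harmless here because the jump instructions in the disassembly (FOR_ITER, POP_JUMP_FORWARD_IF_FALSE, JUMP_BACKWARD) take relative arguments.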
→ 9 Obtaining the flag
The payload was delivered to the remote endpoint and the flag obtained:
$ nc -n -v 83.136.254.16 33284
(UNKNOWN) [83.136.254.16] 33284 (?) open
___________________________________________________________
While journeying through a dense thick forest, you find
yourself reaching a clearing. There an imposing obsidian
cube, marked with a radiant green question mark,halts your
progress,inviting curiosity and sparking a sense of wonder.
___________________________________________________________
> 1. Approach the cube...
> 2. Run away!
(Choose wisely) > 1
___________________________________________________________
As you approach the cube, its inert form suddenly comes to
life. It's obsidian parts start spinning with an otherwordly
hum, and a distorted voice emanates from withing, posing a
cryptic question that reverbates through the clearing,
, shrouded in mystery and anticipation.
___________________________________________________________
> Riddler: 'In arrays deep, where numbers sprawl,
I lurk unseen, both short and tall.
Seek me out, in ranks I stand,
The lowest low, the highest grand.
What am i?'
(Answer wisely) > 124, 0, 68, 0, 93, 10, 125, 3, 124, 3, 124, 1, 107, 1, 0, 0, 0, 0, 114, 2, 124, 3, 125, 1, 140, 11, 124, 0, 68, 0, 93, 10, 125, 3, 124, 3, 124, 2, 107, 5, 0, 0, 0, 0, 114, 2, 124, 3, 125, 2, 140, 11
___________________________________________________________
Upon answering the cube's riddle, its parts spin in a
dazzling display of lights. A resonant voice echoes through
the woods that says... HTB{r1ddle_m3_th1s_r1ddle_m3_th4t}
___________________________________________________________
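The exchange above was done interactively with nc, but it could also be scripted. The sketch below uses only the standard socket module. The helper names (recv_until, build_answer_line, solve) are hypothetical, the prompt strings are copied from the transcript, and it is an assumption that the server accepts newline-terminated input for both prompts. Nothing is sent unless solve is called explicitly.

```python
import socket


def recv_until(sock, marker: bytes) -> bytes:
    # Read until the marker appears or the peer closes the connection.
    data = b""
    while marker not in data:
        chunk = sock.recv(4096)
        if not chunk:
            break
        data += chunk
    return data


def build_answer_line(answer: bytes) -> bytes:
    # Comma-separated decimal byte values, as _format_answer expects.
    return (",".join(str(b) for b in answer) + "\n").encode()


def solve(host: str, port: int, answer: bytes) -> str:
    # ASSUMPTION: prompts match the transcript and "1" selects the riddle.
    with socket.create_connection((host, port), timeout=10) as sock:
        recv_until(sock, b"(Choose wisely) > ")
        sock.sendall(b"1\n")
        recv_until(sock, b"(Answer wisely) > ")
        sock.sendall(build_answer_line(answer))
        # The flag ends with "}"; a wrong answer may instead hit the timeout.
        return recv_until(sock, b"}").decode(errors="replace")


# Example (not executed here): print(solve("83.136.254.16", 33284, answer))
print(build_answer_line(b"|\x00D\x00"))  # b'124,0,68,0\n'
```

The host and port in the commented example are the ones from the transcript and would differ for any other instance of the challenge.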
→ 10 Conclusion
The flag was submitted and the challenge was marked as pwned.