I was eleven when I started writing scripts. Not to build anything useful, I just wanted to see how the stuff I used every day actually worked under the hood. That curiosity pulled me in further than I expected.
I taught myself through trial and error, mostly by breaking things and figuring out why. I ended up gravitating toward backend work because that is where all the real responsibility lives. Right now I am mostly thinking about event-driven architecture and the problems that come with concurrent state — things like race conditions, ordering guarantees and what happens when a system partially fails.
Almost everything I know came from building something and watching it fall apart in an interesting way. The question that stuck with me is not "what should this do?" but "what happens when something goes wrong?"
Case studies.
These are things I built to figure out how software actually behaves once it is running. None of them stayed simple. Each one broke in a way I did not predict, and that is where the learning happened.
The first version of Mikami worked fine until I tried to add something new. Every feature was tangled up with every other one. Fixing a bug in the moderation logic would quietly break the economy system. There was no safe place to make a change because everything knew too much about everything else.
I did not have a bot with separate features. I had a single file that happened to respond to Discord events. The frustration was specific enough that I wanted to actually understand what people meant when they talked about software architecture, not just read about it.
The redesign flipped the dependency direction. Instead of the router importing modules, modules register themselves into the router at startup. The router just dispatches events to whatever handlers have been registered. It never needs to know what features exist. Adding a module means writing one file and calling register. Nothing else changes.
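A minimal sketch of that registration shape, assuming a simple callback registry. The event names and handler signatures here are illustrative, not Mikami's actual API:

```python
# Minimal sketch of the inverted dependency: the router owns a registry
# and modules add themselves to it at startup. Event names and handler
# signatures are illustrative assumptions, not Mikami's actual API.
from collections import defaultdict

class Router:
    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, event, handler):
        # Modules call this at startup; the router never imports them.
        self._handlers[event].append(handler)

    def dispatch(self, event, payload):
        # Fan the event out to whatever happens to be registered.
        for handler in self._handlers[event]:
            handler(payload)

router = Router()

# A feature module is one file: define a handler, call register.
# The router needs no change when this module appears or disappears.
def moderation_on_message(payload):
    print("moderation saw:", payload)

router.register("message_create", moderation_on_message)
router.dispatch("message_create", "hello")  # prints "moderation saw: hello"
```

Adding a feature is purely additive: a new file registers its own handlers, and removing the feature is deleting the file and its register call.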
The rate limiting problem was the most annoying to figure out. The first time Mikami hit a 429, I caught the error and retried right away, which produced another 429, which I retried again. I made it worse before I understood what was actually happening.
A 429 is not a failure. It is the API telling you exactly when to send that request. Once I saw it that way, the fix was obvious: give each route its own queue, read the retry-after header, and hold the request until the bucket opens back up. Nothing gets dropped. The rate limiter went from causing cascading errors to being invisible.
I had read about dependency inversion before building Mikami. I did not really get it until I had to live with the alternative. When the core does not depend on its features, you can add, remove or replace a feature without touching anything else. The direction of the dependency matters as much as the dependency itself.
Gateway disconnects are not edge cases. Discord has a resume protocol because disconnects are expected. Treating reconnection as the default path rather than the fallback changed how reliable the whole thing felt in practice.
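One way to sketch resume-as-default. The session fields and return shapes are assumptions for illustration; the real gateway flow involves heartbeats and opcodes this omits:

```python
# Sketch of reconnection as the default path: track the last sequence
# number on every event so a resume point always exists, and only fall
# back to a fresh identify when resume is impossible. Field names and
# return shapes are illustrative, not Mikami's actual code.
class GatewaySession:
    def __init__(self):
        self.session_id = None
        self.last_seq = None  # resume point, advanced by every event

    def on_event(self, seq, payload):
        self.last_seq = seq

    def reconnect(self, can_resume):
        # Resume replays everything after last_seq; identify starts clean.
        if can_resume and self.session_id and self.last_seq is not None:
            return ("resume", self.session_id, self.last_seq)
        self.session_id = "fresh-session"  # placeholder for the identify flow
        self.last_seq = None
        return ("identify", self.session_id)

session = GatewaySession()
session.session_id = "s1"
session.on_event(42, {})
session.reconnect(True)   # ("resume", "s1", 42): missed events replayed
session.reconnect(False)  # ("identify", "fresh-session"): clean start
```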
AutoBets started simple. User submits a wager, system rolls an outcome, balance updates. It worked fine when only one person was betting. When two users submitted wagers at the same moment, both pipelines read the same balance, both passed validation, both committed. The balance was decremented twice from the same starting value and nothing threw an error.
The system did exactly what I told it to. The issue was I had not thought about what it means for two things to touch the same state at the same time.
Every wager became a transaction where each step only runs if the previous one succeeded. The most important decision was where to put the balance read relative to the lock. It has to happen inside the lock. Reading before acquiring it means the value you are acting on can be stale by the time you hold it.
The hardest thing to figure out was where the lock boundary had to be. My original code read the balance before acquiring the lock. That sounds fine until you think about what can happen in the gap between the read and the lock. Another pipeline runs, modifies the balance, releases. Now your read is stale and you are about to act on a number that is no longer true.
Moving the read inside the lock closed that gap completely. The second problem was the order of the ledger write and the balance update. I originally updated the balance first. If the ledger write then failed, the balance had changed with no record anywhere. Flipping the order meant a failed ledger write leaves everything clean. A failed balance update after the ledger write leaves an orphaned record that is findable and fixable. I chose the failure I could live with.
Concurrency bugs are quiet by nature. The system does not know it is wrong. That is what makes them genuinely worse than regular bugs. The thing that stuck with me is that a lock and a read are not two separate decisions. They are one unit. Separating them creates the exact window you were trying to close.
The ordering lesson was broader. When you sequence operations you are choosing which failures are possible. There is no arrangement that makes all failures go away. The question is which failure leaves you in a state you can actually reason about. Silent corruption is always the worst outcome, not because it is the most dramatic but because you might not even know it happened.
Search engines have advanced operators like site:, filetype:, and inurl: that are useful for security research and SEO. One keyword combined with a reasonable set of operators and parameters can produce hundreds of distinct patterns. Writing them by hand gets tedious fast.
I wanted to automate the construction and generate thousands of structured, deduplicated queries from a small input set. The interesting engineering problem was not the searching. It was the generation itself and how to keep the output meaningful as the numbers got large.
Three isolated layers. The generation engine handles the combinatorial logic and knows nothing about the interface. Configuration controls the engine without touching its internals. The PyQt6 interface handles input and output without any knowledge of how queries actually get built.
The problem was combinatorial explosion. When a template has multiple placeholders and each one draws from a list of values, the number of possible outputs grows multiplicatively. Four placeholder types each with 20 options gives 160,000 combinations. The naive approach builds all of that in memory before writing anything, which at that scale just crashes.
I fixed it by expanding one placeholder type per pass and writing the intermediate output to disk before moving to the next pass. The working set in memory at any point is one intermediate file, not the final product. That kept memory bounded regardless of how large the target output was, and the generator could scale without any changes to the engine itself.
The quality of a generator's output is decided before generation starts, not during it. A generator sampling randomly from poorly designed pools produces a lot of technically unique results that are not actually useful. The interesting work is in what goes into the pools and how templates are shaped. Everything after that is just mechanics.
Staged writes are not an optimisation. When the output space is too large to hold in memory, the right structure is to never let it exist all at once. That is the correct model, not a workaround.
The Dork Generator produced large query datasets and once I had them I needed to work with them — shuffle, split, extract, check. Each operation became its own script. After a few of them I had a pile of disconnected tools with no consistent way to run them or chain them together.
The friction was not any individual tool. It was the lack of structure around them. I wanted to understand how you keep a growing set of small programs usable as the collection gets bigger without each one turning into its own maintenance problem.
Each tool is a self-contained script with one job. A CLI launcher acts as a dispatcher. It shows a menu, reads the selection, and invokes the chosen module as a subprocess. The launcher has no idea what the tools do. It only knows their names and entry points. Adding a new tool means writing one script and registering its name. The launcher never changes.
The main design decision was whether modules should share state with the launcher or run as fully isolated subprocesses. Shared state is easier to wire up. Modules can read a common context, pass data around, build on each other's output. But it means every module is coupled to every other one through that shared context, and a bug in one can silently corrupt something another module depends on.
Subprocess isolation makes isolation the default. Each module runs in its own process, reads its own inputs, writes its own outputs and exits. There is no shared state to corrupt. The tradeoff is you cannot directly pass data between modules at runtime. But a broken module fails visibly in its own process without touching anything else. That is a trade I will take every time.
The subprocess dispatch model made the toolkit easy to extend almost by accident. Because the launcher has no knowledge of the modules, adding a tool requires zero changes to existing code. Once I understood why that worked, I could think about it deliberately rather than stumbling into it. A system is easy to extend when adding a feature means writing something new rather than modifying something existing.
Isolation also changes what failures look like. Shared state systems fail in tangled ways that are hard to trace. Isolated systems fail locally and visibly. When debugging eight independent tools, knowing exactly which one failed and exactly what state it was in makes an enormous difference.
How I think through problems.
Problems I ran into while building, did not immediately understand, and had to actually work through. Writing them down is how I figure out what I learned.
In AutoBets, two simultaneous wagers from different users could both pass balance validation and both commit because each one read a balance the other had not modified yet. The resulting state was wrong. Nothing had thrown an error.
This is a TOCTOU race. The state you checked at the time of check is not the state you act on at the time of use. The window between those two moments is where the bug lives.
A wager is not a single calculation. It is a sequence of steps that need to behave as one indivisible unit. Once I saw it that way the fix became obvious. The lock has to come before the read, not after. The read and the mutation have to happen inside the same window with nothing in between.
1. Acquire the per-user mutex lock before reading anything
2. Read and validate the balance inside that window
3. Generate and validate the outcome
4. Write the ledger record
5. Mutate the balance
6. Release the lock
The lock comes first. Every step that depends on the balance happens inside it. A second wager for the same user cannot start until the first has fully committed and released.
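The six steps can be sketched with asyncio locks. This is a minimal in-memory model: the storage, the names, and the elided RNG step are illustrative assumptions, not AutoBets' actual code:

```python
# The six steps above as a minimal in-memory model. Storage, names and
# the elided RNG step are illustrative assumptions, not AutoBets' code.
import asyncio
from collections import defaultdict

balances = {"alice": 100}
ledger = []                        # append-only transaction records
locks = defaultdict(asyncio.Lock)  # step 1's per-user mutex

async def place_wager(user, amount, outcome_win):
    async with locks[user]:                  # 1. lock before reading anything
        balance = balances[user]             # 2. read inside the lock window
        if amount > balance:
            return "rejected"
        delta = amount if outcome_win else -amount      # 3. RNG elided
        ledger.append({"user": user, "delta": delta})   # 4. ledger record first
        balances[user] = balance + delta     # 5. mutate only after the record
    return "committed"                       # 6. lock released on exit

async def main():
    # Two simultaneous wagers from the same user serialise on the lock.
    return await asyncio.gather(
        place_wager("alice", 60, outcome_win=False),
        place_wager("alice", 60, outcome_win=False),
    )

results = asyncio.run(main())
# First commits (100 -> 40); second reads 40 inside the lock and is rejected.
```

Without step 1 wrapping step 2, both wagers would read 100 and both would commit, which is exactly the double decrement from the original bug.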
Discord enforces rate limits per route bucket. When Mikami hit them, I caught the 429 and retried immediately. Which produced another 429. Which I retried again. The backlog grew and the problem compounded. I made it worse before I understood it.
A 429 is the API telling you to wait a specific amount of time and try again at that exact moment. Once I thought of it that way, the right approach was clear. Treat outbound requests as a queue, give each route bucket its own cooldown state, and schedule retries at the right moment rather than immediately.
- All outbound requests go through a central scheduler rather than being sent directly
- Each route bucket tracks its own remaining count and reset window independently
- A 429 causes the request to be requeued at the exact retry-after time; nothing gets dropped
- Bursts of activity queue cleanly instead of producing cascading errors
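A toy version of that scheduler, with a fake send function standing in for the API. The bucket names and the response shape are assumptions for illustration:

```python
# Toy per-bucket scheduler: each route bucket has its own queue and
# cooldown, and a 429's retry-after requeues the request instead of
# dropping it. The fake API and bucket names are illustrative.
import time
from collections import defaultdict, deque

class BucketScheduler:
    def __init__(self):
        self.queues = defaultdict(deque)    # route bucket -> pending requests
        self.ready_at = defaultdict(float)  # route bucket -> earliest next send

    def submit(self, bucket, request):
        self.queues[bucket].append(request)

    def run(self, send):
        # `send` returns either a result or ("rate_limited", retry_after).
        results = []
        while any(self.queues.values()):
            progressed = False
            for bucket, queue in list(self.queues.items()):
                if not queue or time.monotonic() < self.ready_at[bucket]:
                    continue
                request = queue.popleft()
                response = send(bucket, request)
                if isinstance(response, tuple) and response[0] == "rate_limited":
                    queue.appendleft(request)  # requeued, never dropped
                    self.ready_at[bucket] = time.monotonic() + response[1]
                else:
                    results.append(response)
                progressed = True
            if not progressed:
                time.sleep(0.005)  # wait for the earliest cooldown to open
        return results

calls = {"n": 0}

def fake_send(bucket, request):
    calls["n"] += 1
    if calls["n"] == 1:
        return ("rate_limited", 0.05)  # first attempt hits the limit
    return f"{request}:ok"

sched = BucketScheduler()
sched.submit("channels/123", "msg-a")
sched.submit("channels/123", "msg-b")
print(sched.run(fake_send))  # ['msg-a:ok', 'msg-b:ok']
```

The same shape generalises: one queue per bucket means a burst on one route never delays another.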
Early AutoBets updated the balance first, then wrote the transaction record. I did not think about the ordering until I asked what happens if the record write fails. The balance would have changed with no trace of why. No way to detect it. No way to reconstruct it.
Silent inconsistency is the worst kind of failure. Not because it is dramatic but because it is invisible.
Reversing the order changed the failure mode entirely. The ledger write became a precondition. If it fails, nothing else runs. The balance is never touched without a record.
A failed ledger write leaves the system clean. A failed balance mutation leaves an orphaned record, which is detectable and fixable. Either way there is no silent inconsistency.
The first Dork Generator substituted all placeholder types in one pass and built the entire dataset in memory before writing anything. With four placeholder types drawing from lists of 20 to 50 values, outputs reached hundreds of thousands of entries. The process was killed by memory pressure before finishing.
The output does not need to fully exist before any of it is written. Each intermediate stage only needs to exist long enough to produce the next one.
- One substitution pass per placeholder type, expand one variable, write to disk, move to next pass
- Memory at any point is proportional to a single intermediate file, not the final product
- Output can grow to any size without changing the engine
- Each pass is independently inspectable which makes debugging much easier
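A small sketch of that pass structure, assuming hypothetical placeholder names and pools. Each pass streams line by line, so memory tracks a single line rather than the whole dataset:

```python
# Staged expansion: one placeholder type per pass, each pass streaming
# from the previous intermediate file on disk. Placeholder names and
# pools are hypothetical, not the generator's real presets.
import os
import tempfile

def expand_pass(in_path, out_path, placeholder, values):
    # Stream line by line: memory holds one line, never the dataset.
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            template = line.rstrip("\n")
            if placeholder in template:
                for value in values:
                    dst.write(template.replace(placeholder, value) + "\n")
            else:
                dst.write(template + "\n")

pools = {
    "{OP}": ["site:", "inurl:"],
    "{DOMAIN}": ["example.com", "example.org"],
    "{TERM}": ["login", "admin"],
}

workdir = tempfile.mkdtemp()
current = os.path.join(workdir, "pass0.txt")
with open(current, "w") as f:
    f.write("{OP}{DOMAIN} {TERM}\n")

# Each completed pass is a valid resume point if the next one fails.
for i, (placeholder, values) in enumerate(pools.items(), start=1):
    nxt = os.path.join(workdir, f"pass{i}.txt")
    expand_pass(current, nxt, placeholder, values)
    current = nxt

with open(current) as f:
    queries = [line.strip() for line in f]
print(len(queries))  # 8: the 2 * 2 * 2 combinations
```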
My first instinct for Dorky-Dorker was to have the launcher share state with its modules through a common context object. That made inter-tool communication easy but it also meant a bug in one module could silently corrupt state that another module depended on. Failures became entangled and tracing them was genuinely painful.
Shared state makes the failure surface of one module equal to the failure surface of the whole system. If a tool runs in isolation with its own process, its own inputs and its own outputs, its failure mode is local by construction.
- Each module runs as an independent subprocess with no shared memory with the launcher
- Modules communicate through files only: read inputs, write outputs, exit
- A failing module crashes visibly in its own process without touching anything else
- Adding a new tool requires zero changes to existing code, just register a name and write a script
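A compressed sketch of the dispatcher, with throwaway stand-in tools generated on the fly so it runs anywhere. The real registry and menu are more involved:

```python
# Sketch of the dispatcher model: the launcher knows only names and entry
# points and runs each tool as an isolated subprocess. The stand-in tools
# are created here so the sketch is self-contained.
import os
import subprocess
import sys
import tempfile

workdir = tempfile.mkdtemp()
demo = os.path.join(workdir, "demo.py")
with open(demo, "w") as f:
    f.write("print('demo tool ran')\n")
crash = os.path.join(workdir, "crash.py")
with open(crash, "w") as f:
    f.write("raise SystemExit(1)\n")

# name -> entry point; adding a tool means adding one line here
TOOLS = {"demo": demo, "crash": crash}

def run_tool(name):
    # The launcher never imports module code: it only invokes by path,
    # so a crash surfaces as an exit code instead of taking it down.
    result = subprocess.run([sys.executable, TOOLS[name]])
    return result.returncode

run_tool("demo")   # prints "demo tool ran" and returns 0
run_tool("crash")  # returns 1; the launcher keeps running
```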
What each system assumes to be true.
Every system rests on assumptions its design does not verify. Writing them down is how you start to know where it will break.
- Gateway delivers events in order within a session
- Each user has at most one active command pipeline at a time
- Module handlers are stateless and all state lives in the data layer
- All API requests must respect Discord route bucket limits
- Analytics events are non-critical and loss is acceptable but delay is not
- Each user has a single active wager pipeline at any moment
- Ledger writes must succeed before balance mutation with no exceptions
- Balance reads must occur inside the lock window, never before acquiring it
- RNG failure must exit before any state mutation occurs
- Transaction records are immutable once written
- Pool sizes must be large enough that collision probability stays low at target output size
- Templates define shape only and pools define values and neither layer touches the other
- Configuration is read-only to the engine and it never modifies its own config
- Deduplication set is the source of truth for uniqueness and no query bypasses it
- Each module is an isolated subprocess with no shared state with the launcher
- Intermediate files between staged passes are the only cross-pass communication
- Preset lists are the engine's only external dependency and are swappable without code changes
- Launcher never imports module code and only invokes by path
Failure modes and responses.
I find it useful to think through what can go wrong before it does. This is a record of the failure scenarios I have considered for these systems, what breaks, how the system responds, and what that response actually guarantees.
| Failure Scenario | System Response | Guarantee |
|---|---|---|
| Network: Discord drops the gateway connection | Resume the session using the last known sequence number. Discord's resume protocol replays missed events. If resume fails, fall back to a full reconnect. | No events dropped on clean disconnect |
| Integrity: Ledger write fails mid-wager | The ledger write is a hard precondition for balance mutation. If the write fails, execution stops. The balance never changes. Clean state. | No balance mutation without a ledger record |
| Integrity: RNG generation fails | Outcome generation is validated before any state is touched. Failure exits the pipeline. No lock held. No balance mutated. | Wager rejected before any state mutation |
| Concurrency: Two wagers from the same user simultaneously | A per-user mutex ensures only one pipeline runs per user. The second request blocks until the first commits, then reads the already-updated balance. TOCTOU safe. | Strict serial ordering per user |
| Network: API request hits a rate limit (429) | The scheduler reads the retry-after header and requeues the request. Not dropped. Not retried immediately. The route bucket cooldown is respected. | Every queued request eventually completes |
| Concurrency: User wagers more than their balance | Two-stage validation. A pre-lock check rejects obvious overdrafts fast. A second check runs inside the lock against the live balance. The in-lock check is what actually prevents overdraft under concurrency. | No overdraft under any concurrency level |
| Integrity: Generator output approaches the combinatorial space size | The deduplication set rejects collisions. Pool sizes are designed so collision probability stays low well past the target output size. The generator keeps sampling until the target is reached. | Output contains no duplicate queries |
| Integrity: Staged expansion pass runs out of memory mid-generation | Each pass writes its output file before the next pass begins. A failed pass leaves the previous intermediate file intact. The run can be resumed from the last completed pass. | No data loss on partial expansion failure |
| Isolation: A Dorky-Dorker module crashes or exits with an error | The module runs as an isolated subprocess. The crash is contained to that process. The launcher stays running, returns to the menu, and can invoke any other module normally. | Module failure never affects the launcher or other tools |
| Integrity: Preset list file is missing or malformed at generation time | The engine validates preset files at startup, before any substitution passes begin. A missing or unreadable file causes an immediate exit with a clear error. No partial output is written. | No silent generation with incomplete inputs |
Notes on things I had to think through.
Writing is how I test whether I have actually understood something. These are problems I ran into while building, written the way that made them click for me.
TOCTOU stands for Time-Of-Check to Time-Of-Use. It describes the gap between when you observe a piece of state and when you act on that observation. In that gap another operation can change what you saw.
In AutoBets, two users submitted wagers at the same time. Both pipelines read the same balance, both found it sufficient, and both committed. The balance got decremented twice from the same starting value because neither operation was aware of the other.
Adding a mutex is not enough on its own. The question is where the read happens relative to acquiring the lock. If you read before the lock, the value you are acting on can go stale by the time you hold it. The read has to happen inside the protected window to mean anything.
The ordering is the guarantee. The lock and the read are not two separate things. They are one unit. Once I understood that, the rest of the wager pipeline was straightforward to reason about.
Discord enforces rate limits per route bucket. Each endpoint has its own limit, its own window, and a retry-after header it sends back when you have exceeded it. The first time I hit a 429 in Mikami, I caught the error and retried. Which caused another 429. Which I retried again.
The shift that fixed it was realising a 429 is not an error in the usual sense. It is a scheduling instruction. The API is working exactly as intended and telling you to send that request at a specific point in the future. Treating it as a failure to retry immediately completely misses the point.
Structuring the outbound layer as a per-bucket queue solved both problems at once. Requests get sent at the right time, and a burst on one endpoint does not affect timing on another. Nothing gets dropped and everything gets scheduled.
If you update the balance first and the ledger write then fails, you have a state change with no record of it. No error to surface, no audit trail, no way to know what the correct state should be. The inconsistency is silent by construction.
Reversing the order changes the failure mode. If the ledger write fails, nothing else runs and the balance is untouched. If the ledger write succeeds and the balance update then fails, there is an orphaned record. That is recoverable. You can detect it by comparing records to balances and replay or reverse it. You know exactly what happened.
The principle that came out of this: sequence operations so the worst reachable failure leaves you with something you can reason about. Silent inconsistency is harder to deal with than a detectable error, even if the detectable error is messier at first.
When you combine inputs from multiple pools, the output space grows multiplicatively. A template with four placeholder types each drawing from 30 values can produce 810,000 combinations. That sounds like abundance but the structure of that space determines whether the outputs are useful, and random sampling cannot fix a poorly designed space.
The problem I ran into was that a naive generator samples randomly and rejects duplicates. That works when the output size is small relative to the combinatorial space. As the target size approaches the space size, the generator spends more and more time producing duplicates and discarding them. If you ask for more outputs than the space contains, it loops forever.
The fix was recognising that deduplication is not where this problem should be solved. Deduplication catches collisions after the fact. The real solution is pool and template design that keeps the space large enough that collision probability stays negligible at any realistic target size. The generator's job is mechanical. The design work happens before it starts.
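A back-of-the-envelope check makes the point concrete. By linearity of expectation, sampling k queries uniformly from a space of size n produces about k(k-1)/(2n) colliding pairs. The pool sizes below are hypothetical, chosen only to show the effect:

```python
# Expected colliding pairs when drawing k samples uniformly from a space
# of size n: each of the k*(k-1)/2 pairs collides with probability 1/n.
def expected_collisions(k: int, n: int) -> float:
    return k * (k - 1) / (2 * n)

target = 100_000  # desired number of unique queries (hypothetical)

for pool_size in (20, 50):
    n = pool_size ** 4  # four placeholder types, equal pool sizes
    print(pool_size, n, round(expected_collisions(target, n)))
# pools of 20 -> space 160,000 -> ~31,250 collisions: constant rejection churn
# pools of 50 -> space 6,250,000 -> ~800 collisions: dedup is nearly free
```

The generator code is identical in both cases; only the pool design changed the behaviour.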
Staged expansion handles a different but related problem which is memory pressure. Expanding one placeholder type per pass and writing intermediate results to disk keeps memory proportional to one file. These are two separate concerns. Output space design prevents logical failure. Staged processing prevents resource failure.
When I first designed Dorky-Dorker, the plan was to have the launcher maintain a shared context that modules could read from and write to. A tool finishes, leaves its results in the context, the next tool picks them up. Intuitive and easy to think about.
The problem showed up when something went wrong. A module that crashed or produced unexpected output did not just fail. It left the shared context in a partially modified state. The next module read that state, behaved unexpectedly, and produced output that was wrong in a way that looked plausible. Debugging meant tracing through several tools to find where the corruption started.
Shared state couples the failure modes of every component that touches it. One module's bug becomes every subsequent module's problem. The failure surface of the system is the union of all its parts.
Subprocess isolation makes each module's failure local by construction. A crash stays in that process. The launcher catches the exit code, reports it, and continues. No shared state is corrupted. Every other module is unaffected. The cost is that modules cannot share data in memory and communication goes through files. That is a real constraint but predictable local failures are worth it.
Get in touch.
Open to conversations about backend engineering, interesting system problems, or anything worth building. I respond to every message.
I am sixteen and still early in all of this. But I have a real foundation in backend architecture, a clear sense of how I approach problems, and a genuine interest in working on things that require careful thinking. If that sounds useful, reach out.