Sunday, August 26, 2018

NetBeans Web Toolkit

I'm exploring NetBeans Web Toolkit with the articles here

NetBeans Web Toolkit is the new name I'm trying to give to Jaroslav Tulach's HTML/Java API, a rather impressive library that deserves more use.

Friday, March 02, 2018

Guards in Java

Haskell functions have this nice concept called 'guards' which allow you to define a condition and return a value when that condition is true.

For example:

abs n
  | n < 0     = -n
  | otherwise =  n

This makes the code rather readable, especially when you have more guards.

Guards build on one another: when a guard's condition is checked, you know that all the previous guards have failed:

something n
  | n < -2 = 10
  -- below we know that n >= -2
  | n < 0 = 8
  | otherwise = n

Back in Java land, where I get paid, I sometimes wondered if I should write a method as:

X method(Y param) {
  if (!param.isSomething()) {
    return null;
  } else {
    return param.getX();
  }
}

or if I should write it as

X method(Y param) {
  if (!param.isSomething()) {
    return null;
  }

  return param.getX();
}

I generally prefer the second variant, and I have now realised that these early returns are a form of function guards!
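To make the parallel concrete, here is a minimal sketch of the Haskell something function from above written with early returns in Java. The class name GuardExample is only an illustration, and the int type is an assumption since the Haskell version is not typed.

```java
final class GuardExample {
    // Java rendition of the Haskell `something` function:
    // each early return plays the role of one guard.
    static int something(int n) {
        if (n < -2) {
            return 10;
        }
        // below we know that n >= -2
        if (n < 0) {
            return 8;
        }
        // the `otherwise` case
        return n;
    }
}
```

As with the Haskell guards, each check can rely on all the earlier checks having failed.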

Wednesday, October 04, 2017

The case of the different jsch 0.1.54 binaries

As part of the Apache NetBeans IP clearance we are combing through all the code and dependencies.

One interesting thing we bumped into was that the jsch 0.1.54 binary JAR we are using has a different hash (and size) than the binary JAR from Maven Central.

The old hash is 0D7D8ABA0D11E8CD2F775F47CD3A6CFBF2837DA4, the new one is DA3584329A263616E277E15462B387ADDD1B208D.

The binaries are 278,612 bytes vs 280,515 bytes in Maven Central.
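For anyone who wants to repeat the comparison, a minimal sketch of computing such a fingerprint in Java could look like the block below. It assumes the hashes above are SHA-1 (they are 40 hex digits long); the class and method names are made up for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class JarFingerprint {
    // Upper-case hex SHA-1 of a byte array (assumption: the post's hashes are SHA-1).
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02X", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this is not expected
            throw new IllegalStateException(e);
        }
    }

    // Hash and size of a JAR on disk, in the same form as quoted in the post.
    static String describe(Path jar) throws IOException {
        byte[] bytes = Files.readAllBytes(jar);
        return sha1Hex(bytes) + " " + bytes.length + " bytes";
    }
}
```

Calling describe on both JARs should be enough to show whether the hash and size differ.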

Our version is actually the same as the one found on

Also, the Maven JAR is properly signed with the author's CA7FA1F0 key.

This is where it becomes clear that reproducible builds are important. You do not want to have to wonder why a binary differs, especially years later when you are doing a review. And this one is a library doing SSH!

So, why the different binaries?

It seems the original JAR was compiled on Aug 30, 2016 with Java 1.4 (major version 48) while the Maven Central JAR was compiled Sep 3, 2016 with Java 5 (major version 49).
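The major version is stored in the class file header: a 4-byte magic number 0xCAFEBABE, then a 2-byte minor version, then the 2-byte major version. A minimal sketch of reading it, assuming you have already extracted the bytes of one .class entry from the JAR (the helper name is hypothetical):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class ClassFileVersion {
    // Returns the class file major version: 48 for Java 1.4, 49 for Java 5.
    static int majorVersion(byte[] classBytes) {
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort();        // minor version, ignored here
            return in.readUnsignedShort(); // major version
        } catch (IOException e) {
            throw new IllegalArgumentException("truncated class file", e);
        }
    }
}
```

Running this over the same class taken from each JAR is one way to see the 48 versus 49 difference.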

The original JAR also concatenates strings using StringBuffer, while the Maven Central JAR uses StringBuilder, which was introduced in Java 5 and should be a bit faster since it is not synchronized.

Next, most of the cipher classes use some reflection via a static java.lang.Class class$(java.lang.String) method.

What is this? It's just the way class literals worked in Java 1.4. As explained here, Java 5 (class file version 49) extended the ldc and ldc_w instructions so that they can load a Class object directly.

In 1.4 the compiler implemented class literals by generating a helper Class class$(java.lang.String className) method and replacing Person.class with a class$("Person") call.
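Written out by hand, the generated helper looks roughly like the sketch below. This is a reconstruction for illustration, not decompiled jsch code: the enclosing class is made up, and the Class<?> generic type is a modern convenience that a 1.4 compiler would not have used.

```java
final class ClassLiteralDemo {
    // Roughly what a Java 1.4 compiler synthesised for class literals:
    // look the class up by name and turn a lookup failure into an Error.
    static Class<?> class$(String className) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new NoClassDefFoundError(e.getMessage());
        }
    }

    // What `String.class` effectively became in the compiled code.
    static Class<?> stringClass() {
        return class$("java.lang.String");
    }
}
```

So the "reflection" seen in the old JAR is just this by-name lookup standing in for a class literal.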

In conclusion, it seems that, apart from the Java 1.4 to Java 5 compiler changes, the two JARs are identical, with the Maven Central JAR even a bit better thanks to StringBuilder.

There is no check so far that the sources do produce the specific JAR. This is an exercise left for the reader.

Note: I have also cross-posted this blog post to the Apache NetBeans blog.

Tuesday, June 20, 2017

Processing a generic Data.Array matrix

I had an interesting Haskell problem the other week: work on the columns and rows of an Array i e from Data.Array.

The only class constraints are Ix i and Ord e, which makes sense: the index type must be an instance of Data.Ix, and the elements must be Ord so they can be compared.

The thing about Data.Ix is that it's very opaque. It only extends Ord. There is nothing matrix-related in it. One could use Data.Array for a lot of data structures!

But if you do know it's a matrix, although you have no explicit class constraint, there is a nice trick to use: two neighbouring cells will have a Data.Ix.rangeSize of 2!

So, the rows may be extracted by this little function:

byRows :: Ix i => [i] -> [[i]]
byRows indices = incGroupBy isNeighbour $ indices
    where isNeighbour x y = 2 == rangeSize (x, y)

which is called like

process :: (Ix i, Ord e) => Array i e -> Something
process matrix =
    let rows = byRows $ indices matrix

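Note incGroupBy, which is not a standard library function: it is a groupBy variant that takes pairs incrementally, comparing each element with the one right before it rather than with the first element of the group, as Data.List.groupBy does.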
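Since incGroupBy is not shown, here is a sketch of one plausible reading of it, written in Java to match the other examples on this blog. The assumption is that a new group starts whenever an element is not related to its immediate predecessor; the names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

final class IncGroupBy {
    // Groups consecutive items, comparing each item with the one right before it.
    static <T> List<List<T>> incGroupBy(BiPredicate<T, T> related, List<T> items) {
        List<List<T>> groups = new ArrayList<>();
        List<T> current = null;
        T previous = null;
        for (T item : items) {
            // start a new group on the first item or when the chain breaks
            if (current == null || !related.test(previous, item)) {
                current = new ArrayList<>();
                groups.add(current);
            }
            current.add(item);
            previous = item;
        }
        return groups;
    }
}
```

With a predicate such as "y is x + 1", the list 1, 2, 3, 5, 6, 8 splits into the runs 1 2 3, then 5 6, then 8, which is the same shape as splitting matrix indices into rows with isNeighbour.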

That's it! I had a lot of ideas about using rangeSize to figure out the matrix dimensions, but pairing cells this way was really clean.

Thursday, March 30, 2017

Retina work

These past months I have done a NetBeans patch for the Apple Retina Display and also made a small Wiki-like site to help me and anyone else interested with finding matching font icons for the NetBeans icons:

The stack is Angular, Prime NG, nginx, Jetty, Servlets, Spring Framework JDBC, HSQLDB. SSL via Let's Encrypt.  Hosted at Scaleway.

It's a fun project since I got to learn Angular, find a bug and submit a patch for Prime NG, see how Let's Encrypt does free SSL certificates, learn about the EU cookie warning and all the many tiny things that are needed for a site.

Angular in particular and the whole webdev ecosystem was a lot of new information for me. I was changing project configuration as Angular progressed along! Which reminds me: @angular/cli reached 1.0 and I should probably see if I need to tweak something.

Read a longer article about my work on JAXenter.

Tuesday, January 31, 2017

Machine learning everywhere!

Samsung announced a while back that they used a "Neural Net based predictor" for their CPU branch prediction.

Shortly after that an Intel person claimed it's no big deal because they have also been using a perceptron for some time.

But to me this seemed a rather big discovery! Previously I would have assumed that branch prediction is a super complex algorithm.

Learning that branch prediction is a basic perceptron reduces Intel's perceived strength.

So companies are openly and covertly using machine learning everywhere.

Machine learning is also a perfect fit for companies because there is no moral filter on a neural network and no chance of whistle blowing.

Volkswagen truly missed a golden opportunity here with their diesel scandal.

They should have just trained a neural network on passing the diesel criteria and then have perfect plausible deniability: "the neural network disabled the pollution filters all by itself!"

Wednesday, December 28, 2016

Migrating the extra large NetBeans Mercurial repository to Git

Note: This article is a living document and will be updated as I learn new useful information (last update 31st December 2016). I will move the helper scripts to a dedicated repository and copy part of this article into the Apache NetBeans wiki.


The NetBeans source code has been stored in a Mercurial repository for almost a decade now.

But starting October 2016 NetBeans is preparing to become an Apache project.

And all incubating projects must store their source code on Apache Software Foundation infrastructure which only provides Subversion or Git hosting.

So, NetBeans must migrate its Mercurial repository to Git.

Size concerns

The NetBeans Mercurial repository covers 17 years of history and has grown to over 3GB.

Apache projects are mirrored on GitHub, which has a repository size limit of 2GB or so. As such, any talk of migration started with ways of reducing the size by potentially splitting up the repository or removing some of the history.

Luckily, it turns out the NetBeans Mercurial server was just using a really old Mercurial version. When Gregory Szorc looked into it we learned that with the format.generaldelta=true and format.aggressivemergedeltas=true flags, the repository drops to about 1GB.

This was great news but, of course, we still have to migrate to Git.

Under Git we have to make sure some compression is applied. This is done with the git gc command which reduces the repository to under 1GB too.

With size out of the way, we can do a straight migration and preserve our mono repository and the whole history.

The case of the corrupted repository

The most important NetBeans repository is releases. This repository holds the release branches such as release82 as well as the current state in the default branch. The main-silver default branch is periodically pushed into the releases/ default branch.

A direct conversion of releases/ is impossible though because the repository is corrupt:

$ hg verify
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
 applemenu/src/org/netbeans/modules/applemenu/layer.xml@?: rev 12 points to unexpected changeset 149753
 (expected 149755)
 defaults/src/org/netbeans/modules/defaults/Eclipse-keybindings-mac.xml@?: rev 0 points to unexpected changeset 149753
 (expected 149755)
 defaults/src/org/netbeans/modules/defaults/Eclipse-keybindings.xml@?: rev 25 points to unexpected changeset 149753
 (expected 149755)
 defaults/src/org/netbeans/modules/defaults/mf-layer.xml@?: rev 74 points to unexpected changeset 149753
 (expected 149755)
192754 files, 313961 changesets, 1122263 total revisions
4 warnings encountered!
4 integrity errors encountered!

Luckily, the corruption seems to be in the default branch.

So, we can get a valid releases/ repository by first making a main-silver clone and then pulling the missing changesets from releases:

mkdir releases.fixed
cd releases.fixed
hg init .
hg pull
hg pull
hg out
#nothing should be displayed here
hg verify

hg-fast-export all the way

Now that we have a valid repository we just follow the steps in the official documentation about migrating from Mercurial:

git clone /tmp/fast-export
git init ~/git-releases
cd ~/git-releases
/tmp/fast-export/ -r ~/releases.fixed
git gc --aggressive --prune=now

and then wait 48 hours for it to finish!

.. but first: removing the unnamed heads

Once you do start you'll notice it fails early with

Error: repository has at least one unnamed head: hg rXXXX

This happens because Git, unlike Mercurial, does not support unnamed branches.

It's not a big problem for the NetBeans repository because there are very few such commits, and they are basically historical mistakes with no relevance.

I have just removed them altogether with hg strip.

Incremental push

Although we have world class internet speed in Romania, I happened to be on a slow connection when the conversion finished. And it is no fun to restart a git push after 400MB have already been uploaded and the connection dropped!

I fixed this by pushing incrementally, one month of history at a time:

echo "Incremental git push"

for year in 2012 2013 2014 2015 2016; do
    for month in 1 2 3 4 5 6 7 8 9 10 11 12; do
        # latest commit on master dated before noon on the 1st of this month
        SHA=$(git rev-list -1 --before="$year-$month-1 12:00" master)
        echo "$SHA $year-$month"
        echo git push origin "$SHA:master"
        git push origin "$SHA:master"
    done
done

Syncing with the old Mercurial repository

Right now my GitHub repositories are just experimental. The real work is still done in the Mercurial repository. As such, I still have to convert the new commits from Mercurial to Git.

hg-fast-export seems designed with this in mind. The -r parameter which specifies the source repository is only needed the 1st time. After that it may be skipped and hg-fast-export will incrementally convert the missing changesets.

So, it's just a matter of:

cd ~/releases.fixed
hg pull
cd ~/git-releases
git push

Note that a pull on releases/ will bring back the stripped commits...

Saving hg-fast-export state

I did run into a deadlock while incrementally converting with hg-fast-export.

The only solution seemed to be to redo the conversion.

At 48 hours per full repository this doesn't seem like fun, so I recommend periodically saving these files from the .git folder: hg2git-heads, hg2git-mapping, hg2git-marks and hg2git-state.

Make sure not to ignoreCase

I discovered that on macOS core.ignoreCase is true which means that changesets that only change the case of a file name will produce an incorrect git changeset.

So on macOS the option needs to be explicitly set to false:

git config core.ignorecase false
