The Insanity of Upstream

Sometimes the Java community, or more specifically the people who write Java open source software, drive me nuts!

For the past couple of weeks I’ve been trying to build a new version of the Jetty package based on the current Jetty6 package from JPackage[1], and in the process combating its hellish dependency tree and the way open source Java projects build upon each other in a complicated, confusing and often circular manner.

Examples of circular dependencies are varied and not that interesting – in order to build Jetty you need some third-party libraries, which in turn need a provider of Servlet API 2.4 in order to build. There are several candidates for that, including the current version of Jetty (you need to build Jetty6 in order to be able to build Jetty6), the previous version of Jetty (you need to build Jetty5 in order to build Jetty6 – a bit less silly but still silly) or Tomcat (you need to build the competitor to Jetty in order to build Jetty – quite silly even if it’s open source we are talking about).

This is not the worst circular dependency – there were others far worse the last time I tackled something like this (a couple of years back), such as needing to have Jakarta’s commons-httpclient installed in order to be able to build Jakarta’s commons-httpclient. Luckily it appears they got that fixed.

Update:

Continuing to work on this, here is another problem I encountered: In order to build Jetty 6 you need wadi 2, which in order to build you need Maven 2 and specifically the Mojo Maven 2 plugin for javacc, which in order to build you need Xfire, which in order to build you need Jetty 6. Lather, Rinse, Repeat.
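A loop like this is easy to spot mechanically once the build requirements are written down as a graph. What follows is only a minimal sketch: the package labels and the `BUILD_DEPS` table are my own hypothetical simplification of the chain described above, not real JPackage metadata, and `find_cycle` is an illustrative helper rather than part of any packaging tool.

```python
from typing import Dict, List, Optional

# Hypothetical, simplified build-requirement graph based on the chain
# described in the post. Real packages have many more edges.
BUILD_DEPS: Dict[str, List[str]] = {
    "jetty6": ["wadi2"],
    "wadi2": ["maven2", "mojo-javacc-plugin"],
    "mojo-javacc-plugin": ["xfire"],
    "xfire": ["jetty6"],
    "maven2": [],
}


def find_cycle(graph: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Depth-first search from `start`; return the first cycle found, or None."""
    path: List[str] = []   # packages on the current DFS path, in order
    on_path = set()        # same packages, for fast membership checks
    done = set()           # packages fully explored without finding a cycle

    def visit(node: str) -> Optional[List[str]]:
        if node in on_path:
            # We came back to a package we are still trying to build.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        on_path.add(node)
        path.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(start)


if __name__ == "__main__":
    cycle = find_cycle(BUILD_DEPS, "jetty6")
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Finding the loop does not break it, of course. Someone still has to bootstrap one of the packages from a prebuilt binary or a stripped-down build.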

But what prompted this rant was another very common problem with Java open source projects – feature creep in the worst kind of way: I’m trying to build the JAXB package, which is a simple interface to parse and build XML. It depends on args4j which is described as “a small Java class library that makes it easy to parse command line arguments”. First I’ve heard of it[2] but oh well, let’s try to build it. Well, apparently building it requires Saxon, which is an XSLT processor[3]. Why?!

Noticed the “small” part that I highlighted earlier? In my dictionary “small” includes many properties, one of which is “Does not have large or complex external dependencies”.

And you get this all the time when trying to build Java libraries and applications – so far I have had to install 2 versions of groovy, 2 versions of saxon and 3 versions of the Java development kit itself[4], and I had to build and install Xalan twice (once for bootstrapping and a second time to actually get the xsltc processor I needed).

A few years back I ranted about this in an email to JPackage maintainer Ralph Apel, to which he replied that this is the best they can do to fight the insanity in the upstream[5].

  1. an excellent, excellent project that is operated by talented people in what I can only guess is what little free time they have
  2. why can’t they use gnu-getopt like everyone else?
  3. as well as maven, which is a huge headache in and of itself and required the installation of 42 packages, but all the cool kids are using it these days so I can’t say anything about it
  4. seems that most stuff I need requires Java 1.5.0 and nothing later, while we use 1.6 and the system itself requires 1.4
  5. For those not “in the know”, “upstream” in open source speak means the project that generates the source code we build upon

3 Responses to “The Insanity of Upstream”

  1. Oded:

    I just came upon this shining piece of dependency from the xerces-j2 jpackage:

    Name: xerces-j2

    BuildRequires: xerces-j2

    Which means, for the uninitiated, that in order to build xerces-j2 I must have xerces-j2. Same thing with ant – to build ant I need to already have ant installed.

  2. David J. Liszewski:

    Could not agree more.

    One of the most salient points of choosing Java as an implementation language is to decouple host system dependencies. It’s really not hard to do once you accept the fact that the only tenable means to produce Java runtime artifacts is necessarily orthogonal to “native” artifact builds.

    It seems to me that some in the FOSS community would rather employ an N-squared dependency solution instead of a one-time embrace of a solution that solves dependency resolution independent of host OS. Effectively throwing the baby out with the bath water.

    I’m replying because of a link to your post on Reddit. I posted my $.02 to the relevant thread on Reddit: http://www.reddit.com/r/techsnap/comments/34izrf/dormant_docker_disasters_techsnap_212/cr14659

    FWIW, Google, Netflix, Amazon, Twitter, banks, insurance companies, hedge traders, and mutual funds don’t follow the JPackage mess.

    Be well.

  3. Oded:

    It’s a question of reproducibility – when your build system relies on binary artifacts retrieved from external upstream servers you have no control over (which is the modus operandi of Maven, Ivy and similar tools), what will you do when you need to release an urgent update to your software, but the upstream servers are down for some reason? Or, say your system has been completely obliterated when your co-location suffered a major failure and, as luck would have it, Apache stores their main servers in the same colo: would you be willing to wait with your rebuild until your upstream (who owes you no SLA) rebuilds their systems?

    I believe living like this is insane and if your DR plan includes “download these binaries from this non-contracted third party” then you should be fired.

    The only question that remains is how you handle that. I guess some people use an artifact cache (such as Artifactory) and pray that it contains the correct binaries when push comes to shove. I prefer to live in certainty that I can build my entire production system from source, with 100% identical artifacts, by having SRPMs stored and backed up for anything over my base OS (whose provider I have an SLA with).

    You have to remember, your DR plan is as weak as the weakest SLA it relies on. If that is a $100 rebate from AWS, then the SLA you can provide to your customers can only guarantee that. If your DR plan relies on non-contracted third parties who have no obligation to you, then the only SLA you can reliably guarantee is “we’ll do our best to not lose your data”. Good luck selling that to your CEO.
