roland-ewald.github.io

Compiling plv8 extension for PostgreSQL 12

2020-11-13T00:00:00+00:00

There are no Linux binaries available for recent versions of the plv8 extension, which adds JavaScript support to PostgreSQL. Compiling the extension manually is a little tricky: additional dependencies are necessary (discussed here), and no other copy of the V8 library, such as libnode-dev, may be installed (discussed here). The build also relies on Python 2. This is how it should work on an Ubuntu 20.10 setup:

# See https://github.com/plv8/plv8/issues/387#issuecomment-605663975 for all libraries that may be an issue:
sudo apt-get remove libnode libnode-dev
# See https://github.com/plv8/plv8/issues/283#issuecomment-397359439
sudo apt-get install python postgresql-server-dev-12 make git pkg-config chromium-browser subversion clang apg ninja-build cmake libc++-dev libc++abi-dev
wget https://github.com/plv8/plv8/archive/v2.3.15.tar.gz
tar xfvz v2.3.15.tar.gz
cd plv8...
sudo make
sudo make install
make installcheck

Enjoy!

IntelliJ IDEA plugin for bulk-renaming Java types

2020-06-01T00:00:00+00:00

Last week I wanted to rename a large number of Java types so that they are consistent with a new naming scheme. This does not seem to be easily possible in IntelliJ, so I wrote a small plugin that imports a CSV file with the ‘refactoring instructions’ and runs them in bulk. It also allows defining type-specific refactoring options, such as telling IntelliJ to also replace occurrences of the type name in other text files.
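To make the idea of CSV-based ‘refactoring instructions’ a bit more concrete, here is a rough Python sketch of how such a file could be read. It is not the plugin's code (the plugin itself is written against the IntelliJ API), and the column layout (old name, new name, optional true/false flag for text occurrences) as well as the names RenameInstruction and parse_instructions are assumptions made up for this illustration.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class RenameInstruction:
    # Hypothetical layout: one rename per CSV row
    old_name: str
    new_name: str
    search_text_occurrences: bool


def parse_instructions(csv_text):
    """Parse rows of the form: old name, new name, optional true/false flag."""
    instructions = []
    for row in csv.reader(io.StringIO(csv_text)):
        cells = [cell.strip() for cell in row]
        if not cells or not cells[0] or cells[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if len(cells) < 2 or not cells[1]:
            raise ValueError(f"missing new name for {cells[0]!r}")
        flag = len(cells) > 2 and cells[2].lower() == "true"
        instructions.append(RenameInstruction(cells[0], cells[1], flag))
    return instructions
```

Validating the whole file before running any rename seems preferable here, since a half-applied bulk refactoring is harder to clean up than a rejected input file.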

The developers at JetBrains really put some effort into supporting plugin developers. There are many helpful context-specific messages, e.g. to inform you about APIs that are soon to be deprecated, or what exactly you forgot to put into your plugin.xml. The latter even comes with a custom auto-complete for module dependencies. There were a few minor roadblocks¹, but nothing serious. This is the power of dogfooding, I guess.

Downloader beware

This was a weekend project, so while I’m happy with the refactoring result itself, there are many rough edges (and I’m pretty sure there are better ways to use the IntelliJ API). Nevertheless, it may be a useful blueprint for implementing similar utilities that help with larger ‘one-off’ refactorings.

For example, the predefined gradle setup requires more than the default RAM on my machine, so a custom setting in the gradle.properties file was required to initially configure the project. ↩

Spring Boot projects in IntelliJ IDEA

2018-04-01T00:00:00+00:00

Problem

Setting up gradle-based Spring Boot projects in IntelliJ IDEA is tricky.

A good development experience means it is easy to

start any @SpringBootApplication (via its main method)
start any test case, including integration tests using @SpringBootTest etc.

Since local development machines cannot reach all remote services and often rely on mock-up beans for those (which, of course, are part of the test code), and also rely on .properties files for application profiles such as test or dev (which, of course, are part of the test resources), you may run into trouble when trying to run applications and tests in IDEA.

This is because IDEA does not allow simple ad-hoc modifications to the classpath. At the same time, it overrides all manually changed dependencies whenever you touch a build.gradle file, and it does not honor the convention of adding test code and test resources to the Java classpath¹.

Solutions

To solve this, there are several options:

You can manually configure your IDEA module to also depend on the test sources and all resource directories. This is only feasible when the gradle setup rarely changes, because the dependencies will be reset when the gradle project is updated.
You can delegate IDEA build and run tasks to Gradle, which works but is rather slow.
You can play around with custom class loaders or -Xbootclasspath/a:, and the like. However, this is also tricky and makes the whole setup more fragile.
You can play around with the raw XML files that define the IDEA modules and add dependencies there (or use the idea gradle plugin’s whenMerged for this). Again, this adds complexity and makes the software fragile (because newer IDEA versions may also have another XML format).

Simpler Solution

There is no perfect solution for this, but if you do not rely on gradle’s resource processing features² you can just throw ‘everything’ together, e.g. via

allprojects {
  apply plugin: 'idea'
  idea {
    module {
      inheritOutputDirs = false
      outputDir file("$buildDir/intellij")
      testOutputDir file("$buildDir/intellij")
    }
  }
}

This is easy to set up (should be picked up by IDEA automatically, otherwise just run gradle idea), easy to debug (if a resource or bean in the same module cannot be found, just check the contents of build/intellij), and easy to use (it is separate from your gradle build output).

Unfortunately, this is not configurable either. ↩
Actually a nice feature, and still useful in all other kinds of files that are not crucial for your Spring Boot setup. ↩

Want to comment? Write an email.

2017-09-26T00:00:00+00:00

I could confirm the results posted by Ayush Sharma (and featured on HN) regarding the large performance hit a website takes when loading Disqus comments, apparently because they collaborate closely with some user-tracking ads company.

This page, for example, loads 10 times slower when tested with GTmetrix (not sure how objective this is): 0.7 seconds vs. 7.2 seconds.

This is quite a lot, and the over 70 redirects they generate, e.g. to the following domains, are certainly not helping:

https://aa.agkn.com
https://beacon.krxd.net
https://cm.g.doubleclick.net
https://d.agkn.com
https://dpm.demdex.net
https://e.nexac.com
https://ei.rlcdn.com
https://i.liadm.com
https://ib.adnxs.com
https://idsync.rlcdn.com
https://io.narrative.io
https://licensebuttons.net
https://loadus.exelator.com
https://p.adsymptotic.com
https://pippio.com
https://pixel.tapad.com
https://pm.w55c.net
https://rc.rlcdn.com
https://secure.insightexpressai.com
https://sp.adbrn.com
https://stags.bluekai.com
https://staticxx.facebook.com
https://sync.mathtag.com
https://tags.bluekai.com
https://usermatch.krxd.net
https://www.facebook.com
https://x.bidswitch.net
https://x.dlx.addthis.com

This is on top of all the privacy and security implications, and since nobody comments here anyway I will just disable them.

If you absolutely need to tell me I’m wrong, you can just write an e-mail :-)

Query AWS CloudWatch in Java

2017-07-09T00:00:00+00:00

If you run a service on the T2 tier of AWS, your machines have a specific compute budget. So, for example, you may want to delay certain CPU-intensive background tasks to a later time, throttling them by the CPUCreditBalance your machine currently has.

The current (averaged) CPUCreditBalance can be retrieved via CloudWatch, but I found the Java API rather unforgiving¹ – it fails silently when certain elements like the unit (StandardUnit.Count) are missing from the API call. Here is a snippet that retrieves the last (average) CPUCreditBalance, as a starting point:

String ec2InstanceId = EC2MetadataUtils.getInstanceId();
// Use Instant so the time window is correct regardless of the JVM's default time zone
Instant endTime = Instant.now();
Instant startTime = endTime.minus(Duration.ofMinutes(120));
GetMetricStatisticsRequest cpuCreditBalanceRequest = new GetMetricStatisticsRequest()
  .withNamespace("AWS/EC2")
  .withStatistics(Statistic.Average)
  .withMetricName("CPUCreditBalance")
  .withPeriod(60)
  .withUnit(StandardUnit.Count)
  .withStartTime(Date.from(startTime))
  .withEndTime(Date.from(endTime))
  .withDimensions(new Dimension().withName("InstanceId").withValue(ec2InstanceId));
GetMetricStatisticsResult cpuCreditBalanceResult = getAmazonCloudWatch().get().getMetricStatistics(cpuCreditBalanceRequest);
List<Datapoint> metricData = cpuCreditBalanceResult.getDatapoints();
if (metricData == null || metricData.isEmpty()) {
  LOG.warn("Metric 'CPUCreditBalance' is not available.");  
} else {
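  // Datapoints are not guaranteed to be in chronological order, so pick the newest explicitly
  Datapoint mostRecentData = Collections.max(metricData, Comparator.comparing(Datapoint::getTimestamp));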
  Double result = mostRecentData.getAverage();
  LOG.info("Latest average of 'CPUCreditBalance': " + result);
}
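The throttling idea itself can be sketched independently of the AWS SDK. The following Python snippet (not the Java used above) assumes the datapoints have already been fetched and are available as dictionaries with Timestamp and Average fields, mirroring the fields of a CloudWatch datapoint. The function names and the threshold of 50 credits are made up for illustration. Since CloudWatch does not promise chronological order, the newest datapoint is selected by timestamp rather than by position.

```python
from datetime import datetime, timezone


def latest_average(datapoints):
    """Return the 'Average' of the datapoint with the newest 'Timestamp', or None."""
    if not datapoints:
        return None
    newest = max(datapoints, key=lambda point: point["Timestamp"])
    return newest["Average"]


def should_defer(credit_balance, min_credits=50.0):
    """Defer CPU-intensive work if the balance is unknown or below the (hypothetical) threshold."""
    return credit_balance is None or credit_balance < min_credits


# Example data, deliberately not in chronological order
EXAMPLE_DATAPOINTS = [
    {"Timestamp": datetime(2017, 7, 9, 10, 5, tzinfo=timezone.utc), "Average": 40.0},
    {"Timestamp": datetime(2017, 7, 9, 10, 15, tzinfo=timezone.utc), "Average": 38.5},
    {"Timestamp": datetime(2017, 7, 9, 10, 10, tzinfo=timezone.utc), "Average": 39.0},
]
```

Treating a missing metric as a reason to defer is a conservative choice; depending on the task, the opposite default may be more appropriate.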

In any case, before using the Java API, I would always suggest trying out the request via the AWS CLI first; this is a complex API with many features. ↩

A (very) simple R -> Python integration

2017-03-04T00:00:00+00:00

For a side project, I needed to run a piece of Python code from R. Given their popularity (e.g., they hold the spots #2 and #9 in the PYPL popularity of programming languages index) and their focus on being pragmatic, I thought this should be very simple.

I was wrong. There seems to be an easy solution to call R from Python (via rpy) but the other way around turns out to be much more difficult. The packages that promise to do so (such as rPython or rPythonWin) do not seem to be in widespread use, and are thus not maintained very well¹.

I shouldn’t complain about open source (and send patches instead ;-), but there seems to be no well-maintained, easy-to-use, cross-platform R plugin that allows calling Python. Since I only have to make a few calls to a single Python function and runtime performance is not an issue, I gave up and simply used JSON and the command line for data transfer and invocation.

Here are some code samples that should get you going, if you also just want to cobble something together without being particularly proficient in either language.

The R snippet (requires an install.packages("rjson") in the R console):

library(rjson)

python_caller <- function(x, y, z) {
    INPUT_FILE <- 'python_input.json'
    OUTPUT_FILE <- 'python_output.json'
    PYTHON_FILE <- 'my_code.py'

    # Write input parameters to JSON file
    json <- toJSON(list(
        x = x,
        y = y,
        z = z))
    fileConn <- file(INPUT_FILE)
    writeLines(json, fileConn)
    close(fileConn)

    # Run Python code
    system(paste('python', PYTHON_FILE))

    # Read results from JSON file
    json <- readChar(OUTPUT_FILE, file.info(OUTPUT_FILE)$size)
    result <- fromJSON(json)
    file.remove(OUTPUT_FILE)

    # Return values
    return(list(result1 = result$'result1', result2 = result$'result2'))
}

The Python snippet for my_code.py:

import os
import json

# Read input parameters from JSON file
INPUT_FILE = 'python_input.json'
OUTPUT_FILE = 'python_output.json'
with open(INPUT_FILE) as json_file:
    my_args = json.load(json_file)
os.remove(INPUT_FILE)

# Run some actual code here instead:
output = {}
output["result1"] = my_args['x'] + my_args['y'] - my_args['z']
output["result2"] = my_args['x'] - my_args['y'] + my_args['z']

# Write results to JSON file
# Note that some Python objects (e.g. NumPy arrays) need to be converted before writing them (e.g. see http://stackoverflow.com/a/32850511/109942)
with open(OUTPUT_FILE, 'w') as f:
    json.dump(output, f)

When calling

python_caller(1,2,3)

in the R console, the intermediate files python_input.json and python_output.json look as expected (before being deleted):

python_input.json:

{"x":1,"y":2,"z":3}

python_output.json:

{"result2": 2, "result1": 0}

This approach is far from perfect (for example: no debug mode that keeps the intermediate files, no support for multi-threading, assumes python to be available on the PATH, python command-line options like -O are missing), but it is also simple enough to understand again in a few months, it requires no extra setup, and it works cross-platform (I hope ;-).
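Two of these limitations (no debug mode, hard-coded file names) could be addressed on the Python side roughly as follows. This is only a sketch: the function names run and main and the --keep-input flag are made up for illustration, and the arithmetic is the same placeholder as in my_code.py.

```python
import argparse
import json
import os


def compute(args):
    """Placeholder computation, same arithmetic as in my_code.py."""
    return {
        "result1": args["x"] + args["y"] - args["z"],
        "result2": args["x"] - args["y"] + args["z"],
    }


def run(input_file, output_file, keep_input=False):
    """Read parameters from input_file and write the results to output_file."""
    with open(input_file) as f:
        my_args = json.load(f)
    with open(output_file, "w") as f:
        json.dump(compute(my_args), f)
    # In 'debug mode' the input file is kept for later inspection
    if not keep_input:
        os.remove(input_file)


def main(argv):
    """Parse the command line: INPUT_FILE OUTPUT_FILE [--keep-input]."""
    parser = argparse.ArgumentParser()
    parser.add_argument("input_file")
    parser.add_argument("output_file")
    parser.add_argument("--keep-input", action="store_true")
    options = parser.parse_args(argv)
    run(options.input_file, options.output_file, options.keep_input)
```

With main(sys.argv[1:]) invoked at the bottom of the script, the R side would pass the file names along, e.g. system(paste('python', PYTHON_FILE, INPUT_FILE, OUTPUT_FILE)). Passing unique file names per call would also be a first step towards running several calls in parallel.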

This is a euphemism for me not being able to set them up, neither on Windows nor on Linux, and neither with a current R version (3.3.2) nor an old one (2.15.1) that they should support (according to their documentation). ↩

Configuring terminal usage in Atom

2017-02-26T00:00:00+00:00

I re-visited the Atom editor over the last few weeks; after all, its performance is being improved step by step¹, and by now the development speed seems to have settled down a little.

Overall, I am quite impressed with how simple the editor is to configure².

For example, I ran into a problem when configuring terminal support. While the terminal-fusion package (a Linux-only fork of platformio-atom-ide-terminal) works well out-of-the-box, it seems to lack a simple way to switch between the terminal window and the current editor pane (and to auto-hide the terminal if it is not in focus).

Luckily, there is a code snippet for platformio-atom-ide-terminal³ that just needs a little tinkering to make it compatible with terminal-fusion, and then goes into init.coffee:

atom.packages.onDidActivatePackage (pack) ->
  if pack.name == 'terminal-fusion'
    atom.commands.add 'atom-workspace',
      'editor:focus-main', ->
        p = atom.workspace.getActivePane()
        panels = atom.workspace.getBottomPanels()
        term = panels.find (pan) ->
          pan.item.constructor.name == 'TerminalFusionView'
        if not term
          # Open a new terminal
          editor = atom.workspace.getActiveTextEditor()
          atom.commands.dispatch(atom.views.getView(editor), 'terminal-fusion:new')
        else if term and p.focused
          term.item.open()
          term.item.focus()
        else if term and !p.focused
          term.hide()
          p.activate()

It’s easy to understand what is going on here, even without knowing much about CoffeeScript.

Now just add this shortcut definition to your keymap.cson:

'.platform-linux, atom-text-editor atom-workspace':
  'ctrl-ä': 'editor:focus-main'

And yes, this is an ä you see there – change as appropriate if you are not blessed with a German keyboard layout :-)

My impression so far: absolutely usable for ‘normal’ documents (e.g. code), but still much slower than Sublime Text on larger files (e.g. log files of a few megabytes, or files with very long lines – although that seems to get fixed in v1.15). Also, starting the editor takes a few seconds, so it’s a little too slow for quick one-off edit tasks (git commits etc.). ↩
Of course it helps that important shortcuts like ctrl-shift-p are consistent with Sublime Text :-) ↩
The last version of that snippet, on which the above code is based, can be found here. ↩

PostgreSQL integration testing troubles

2017-02-19T00:00:00+00:00

I had quite some trouble this weekend getting a database integration test suite running with current versions of Spring Boot (1.5.1) and PostgreSQL (9.6.1).

There was this one error that would pop up sporadically:

org.postgresql.util.PSQLException: ERROR: cached plan must not change result type

(Actually, I first got just the German¹ version of that, which did not help either.)

The funny thing was that this error only happened when I ran the full test suite, and it would only happen after about 15 minutes. So much for rapid trial and error.

After testing out various settings (and different versions of JPA dialects, JDBC drivers, Hibernate versions…), my last guess was that it is not a Spring/JPA/Hibernate-related problem. While each change slightly altered the results (i.e. the number of errors or in which test it would occur), none of the changes really helped.

Finally, after some more searching for just the error message, this GitHub comment (unrelated issue, unrelated project) gave me my Eureka moment: simply appending ?autosave=ALWAYS to the JDBC URI was all it took.

In hindsight, there is probably some side-effect involved in the preceding database tests running in the suite (they also cover some failure scenarios), and this led to this strange issue. Further digging into the code of the Postgres JDBC driver explains what actually happens, and what the options are.

Moral of the story? Append ?autosave=ALWAYS to your JDBC URI if you are running into the same problem! ;-)²
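One detail worth spelling out: if the JDBC URI already carries other parameters, the setting has to be appended with & instead of ?. A small Python sketch of that rule (the helper name with_autosave and the example URLs in the usage note are made up for illustration; in a Spring Boot project you would normally just edit the URL in your properties file by hand):

```python
def with_autosave(jdbc_url, mode="ALWAYS"):
    """Append the autosave parameter to a JDBC URL unless it is already set."""
    if "autosave=" in jdbc_url.lower():
        return jdbc_url  # leave an explicit setting untouched
    # The first parameter is introduced by '?', all further ones by '&'
    separator = "&" if "?" in jdbc_url else "?"
    return f"{jdbc_url}{separator}autosave={mode}"
```

For a hypothetical local test database, jdbc:postgresql://localhost:5432/testdb becomes jdbc:postgresql://localhost:5432/testdb?autosave=ALWAYS, while a URL that already ends in ?ssl=true gets &autosave=ALWAYS appended.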

While I like the idea of translating technical error messages, a translated message is rather useless without a language-invariant error ID: users need to search for this term! ↩
And, also, with a more complex system that has many potential culprits (i.e. debugging ‘targets’), always track back to the question Which part of the setup is most likely the issue? – I got that wrong for quite some time, which slowed down the debugging a lot. ↩

OpenVPN, Ubuntu 16.04, and WiFi connection problems

2017-01-08T00:00:00+00:00

I spent quite some time over the last few weeks debugging a strange issue that happened when connecting to a (newly configured) OpenVPN server with Ubuntu 16.04. Every few minutes the VPN connection simply dropped, with this log message:

Inactivity timeout (--ping-restart), restarting

The problem occurred sporadically, and nothing (firmware update on router, ‘polling’ a server inside the private network to generate traffic, RTFM, etc.) helped.

Yesterday I finally found a workaround in this AskUbuntu answer for a similar connection problem: after editing the WiFi connection and setting the IPv6 settings to Ignore, everything seems to work fine (for now). There are other workarounds for problems with the same symptoms, but I suggest trying this one first because it is simple and quick.

OpenVPN support on Ubuntu can be rather challenging¹ to set up, depending on the server configuration.

Here is what I learned about that over the last few weeks:

Depending on your configuration file, import into the Ubuntu network manager may not work, so you should check if there is a known bug that affects you, e.g. missing support for certain OpenVPN options.
Even once you have the import working, the network manager may simply not support all settings you need. In this case, you will have to run OpenVPN manually: sudo openvpn path/to/my/config.ovpn (this is also useful for debugging).
Some issues are caused by multiple machines being connected to the same OpenVPN server with the same certificate. Only use a single session to rule out any problems in that regard.
WiFi can cause a lot of trouble for OpenVPN and you may not notice very brief connection interruptions yourself, so improve your router configuration if you suspect this to be a problem (e.g. switch off the 5GHz band, only use 802.11b+g, etc.).
DNS setup also needs an extra step that may be missing from your OpenVPN config file. It should contain:

script-security 2
up /etc/openvpn/update-resolv-conf
down /etc/openvpn/update-resolv-conf

For more troubleshooting, change the verbosity level in your config file (e.g. to verb 4) and try again.
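If you have several config files to go through, checking for the three DNS-related lines above can be scripted. Here is a minimal Python sketch; the function name missing_dns_directives is made up, it only looks for exactly these three directives, and it ignores subtleties such as trailing inline comments.

```python
REQUIRED_DIRECTIVES = (
    "script-security 2",
    "up /etc/openvpn/update-resolv-conf",
    "down /etc/openvpn/update-resolv-conf",
)


def missing_dns_directives(config_text):
    """Return the required directives that do not appear in the config text."""
    present = set()
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # OpenVPN treats lines starting with '#' or ';' as comments
        if not line or line.startswith(("#", ";")):
            continue
        # Normalize runs of whitespace between directive and arguments
        present.add(" ".join(line.split()))
    return [d for d in REQUIRED_DIRECTIVES if d not in present]
```

Reading a file with open(path).read() and passing the text to the function returns an empty list if all three lines are present.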

Good luck!

When I configured my clients for the previous VPN setup, it was the other way around: a nightmare on Windows, but a piece of cake on Ubuntu. I guess it really does depend on the specific OpenVPN setup. ↩

Let’s try this again…

2016-12-25T00:00:00+00:00

My previous setup was nice in principle, but still required too much overhead regarding the CGI setup, how to ‘freeze’ the page, and so on. The problem was not so much that it was complicated in principle, but that it was too complicated to simply change or add some content in five minutes every few weeks, without having used any of the tools involved in the meantime (and thus having to go back to the README again and again).

Maybe that was a reason I did not blog as (in)frequently as I would like to have done (and, yes, OK, maybe it is also because I was occupied with other things ;-).

Nevertheless, I have now switched everything to Jekyll and host the site on GitHub. Even the vanilla setup covers 95% of my use case, and I will live without the other 5%¹.

Let’s see how that goes. At least I’ve dealt with that particular ‘technical debt’ before the year ends.

As a bonus, the site is now served via HTTPS. ↩