Docker, MongoDB and insufficient free space

If you run a MongoDB instance inside a Docker container, after a while of starting and shutting down containers you may find that MongoDB no longer starts and fails with the error message

ERROR: Insufficient free space for journal files

Indeed, if you run df -h from within the failing container, you can easily see that the space available for the MongoDB data is less than 3GB. In my case that was /data/db inside the container itself, as I don’t care about preserving the data.
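If you'd rather not eyeball the df output, the check can be scripted. This is a minimal sketch: has_free_space is a name I made up for the example, and the 3GB threshold is simply the figure mentioned above, not a value read from the MongoDB configuration.

```shell
# Succeeds when the filesystem holding a path has at least the
# requested amount of free space, expressed in KB (1024-byte blocks).
# has_free_space is a hypothetical helper, not a docker or mongodb tool.
has_free_space() {
  path="$1"
  required_kb="$2"
  # df -Pk prints one POSIX-format line per filesystem;
  # the 4th column of the 2nd line is the available space in KB.
  available_kb=$(df -Pk "$path" 2>/dev/null | awk 'NR == 2 { print $4 }')
  # An empty value means df failed (e.g. the path does not exist).
  [ -n "$available_kb" ] && [ "$available_kb" -ge "$required_kb" ]
}
```

From within the container, something like has_free_space /data/db $((3 * 1024 * 1024)) || echo "less than 3GB free" should then tell you whether you've hit the problem.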

Running a command like

docker ps --filter status=dead --filter status=exited -aq | xargs docker rm -v

cleans up all the leftover Docker container filesystems, which pile up one on top of the other.

If you then run the Docker container with the --rm flag, it will automatically clean up after exiting.

$ docker run --rm -it <image>


Bash: compute dates

I often find myself needing to do some math on dates. For example, what is 10 weeks from a given date? Rather than opening a calendar and counting, you can quickly open a bash shell.

Example (using the BSD date that ships with OSX): what’s 6th of June 2017 plus 10 weeks?

$ date -j -v '+10w' -f '%Y-%m-%d' '2017-06-06'
Tue 15 Aug 2017 10:00:20 BST
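
The -j, -v and -f flags are specific to the BSD date that ships with OSX; GNU date, as found on most Linux distributions, uses -d instead. Here is a minimal sketch of a helper that tries to cope with both. add_weeks is a name I made up for the example, and the GNU detection via --version is an assumption that may not hold for every date implementation.

```shell
# Add a number of weeks to a date given as YYYY-MM-DD and print
# the result in the same format. Works in UTC to avoid DST surprises.
add_weeks() {
  start="$1"
  weeks="$2"
  if date --version >/dev/null 2>&1; then
    # GNU date: relative items are part of the -d date string
    date -u -d "$start + $weeks weeks" '+%Y-%m-%d'
  else
    # BSD date (OSX): -v adjusts the date parsed by -f
    date -j -u -v "+${weeks}w" -f '%Y-%m-%d' "$start" '+%Y-%m-%d'
  fi
}

add_weeks 2017-06-06 10   # prints 2017-08-15
```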

Maven release plugin and local changes

It may happen that you’re trying to fix an issue that affects the release of a Maven project, so you apply some changes locally and then run

mvn release:prepare -DdryRun=true -Darguments=-DskipTests

Then you get an error message like

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-release-plugin:2.5.3:prepare (default-cli) on project jackrabbit-oak: Cannot prepare the release because you have local modifications :
[ERROR] [oak-api/src/main/java/org/apache/jackrabbit/oak/api/PropertyState.java:modified]
[ERROR] [oak-api/src/main/java/org/apache/jackrabbit/oak/api/jmx/CacheStatsMBean.java:modified]
[ERROR] [oak-api/src/main/java/org/apache/jackrabbit/oak/api/jmx/CheckpointMBean.java:modified]
[ERROR] [oak-api/src/main/java/org/apache/jackrabbit/oak/api/jmx/IndexStatsMBean.java:modified]
[ERROR] [oak-api/src/main/java/org/apache/jackrabbit/oak/api/jmx/RepositoryStatsMBean.java:modified]

Sure: the release plugin is giving you an extra check for local changes that would otherwise not end up in the release. However, you don’t want to commit those changes to the SCM just yet; you first want to see if they work.

A “quick” workaround is available on the command line, in the (verbose) form of

mvn release:prepare -DdryRun=true -Darguments=-DskipTests -DcheckModificationExcludeList=oak-api/src/main/java/org/apache/jackrabbit/oak/api/*,oak-api/src/main/java/org/apache/jackrabbit/oak/api/jmx/*
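
Typing that list by hand gets old quickly. As a sketch, the comma-separated value can be built from a list of modified files. to_exclude_list is a name I made up, and I'm assuming both that the project is a git checkout and that the plugin accepts exact paths as patterns, which I haven't verified.

```shell
# Turn a list of paths (one per line on stdin) into a single
# comma-separated line, the format used by -DcheckModificationExcludeList.
to_exclude_list() {
  paste -sd, -
}

# Possible usage, assuming a git checkout (not run here):
# mvn release:prepare -DdryRun=true -Darguments=-DskipTests \
#   -DcheckModificationExcludeList="$(git diff --name-only HEAD | to_exclude_list)"
```

With another SCM you would feed the function from its own way of listing modified files.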

You can also configure the exclusions directly in the POMs; however, I never had the need.


Firefox install unsigned add-ons

Since version 43, Firefox won’t allow you to install unsigned extensions.

While that’s a good thing, you can revert it if needed with the following steps:

To disable signature checks, you will need to set the xpinstall.signatures.required preference to “false”.

  • type about:config into the URL bar in Firefox
  • in the Search box type xpinstall.signatures.required
  • double-click the preference, or right-click and select “Toggle”, to set it to false.

Firefox will then show your unsigned extension with a warning, but it should work fine.


OSX Restarting GDrive App

If you use Google Drive and its OSX app and, like me, you often travel and change network connections (VPNs count as well), you may have noticed that at some point it won’t connect anymore.

I don’t know exactly why (it may be due to some security mechanism in the app itself), but I noticed that after restarting the app it connects successfully.

Since I have to do it more than once a day, here is a simple script that restarts it with a double-click.

https://github.com/davidegiannella/misc/blob/master/restart-gdrive
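
For reference, a script of this kind can be as small as the sketch below. It is not necessarily what the linked script does: restart_app, the DRY_RUN switch and the app name "Google Drive" are all assumptions made for the example.

```shell
# Run a command, or just print it when DRY_RUN=1 is set.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Quit an OSX application, wait a little, then launch it again.
# The 5 second pause is an arbitrary guess to let the app shut down.
restart_app() {
  app="$1"
  run osascript -e "quit app \"$app\""
  run sleep 5
  run open -a "$app"
}
```

Calling restart_app "Google Drive" would then do the restart, while DRY_RUN=1 restart_app "Google Drive" only prints what it would run.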

The weight of a property name in AEM

Jackrabbit 2.x / CQ5

Back in Jackrabbit 2.x, and therefore in CQ/AEM 5.x, everything was indexed by default unless you stated otherwise.

This meant that every time you ran a query, Lucene was there for you serving an indexed answer.

In this scenario it didn’t really matter what property names you used for your application or whether you defined additional node types.

This had the advantage that everything was indexed and therefore an index was almost always there serving your query and you didn’t have to think about it.

On the other hand we all know that the bigger the index is, the slower it will be in serving you the result set, as it will simply have to analyse more data.

Jackrabbit Oak / AEM6

Nowadays Apache Jackrabbit Oak, aka Jackrabbit 3.x, is the foundation of AEM6.

As opposed to JR2, in Oak almost nothing is indexed by default. This means that if you take a vanilla Oak and run a query, chances are very good that you’re going to traverse the repository (depending on your query).

This has the advantage that you can create very dedicated indexes that will overall perform better as they will be as tailored as possible to your query.

The disadvantages are that you’ll have to define each index and that you’ll have to know how to fine-tune your queries to get the most out of this approach.

Without going deep into the configuration of each available index type, I think the two main properties you’ll end up tuning for better performance are

  • propertyNames
  • declaringNodeTypes

The first one defines which property your index is going to index, while the second restricts the index to a specific node type. In other words, the condition for a node to be included in an index is

$nodetype in ($declaringNodeTypes) AND $property = $propertyNames

caveats

  • indexes on more than one property are not supported (yet)
  • an index cannot serve conditions where you ask something like WHERE property IS NULL.

This takes us to the very topic of this post: be careful about how you use your properties and how you structure your queries.

Remember the rule: the smaller the index the more efficient the query.

Let’s see with an example how important the property and the node type are.

If you have a custom application in which you want to extract nodes after a specific date, a way of doing so would be

SELECT * FROM [nt:base]
WHERE [jcr:lastModified] >= CAST('...' AS DATE)

This query is very bad. It can’t really make use of any index.

Let’s say you create an index on jcr:lastModified. The index itself will be almost as big as the repository, as by default in AEM (almost?) every node has mix:lastModified.

A better way would be

SELECT * FROM [nt:base]
WHERE [myLastModified] >= CAST('...' AS DATE)

This will allow you to define an index on the property myLastModified, which you know will contain only your application data. But we can do even better.

Let’s assume you have a very sparse and large content structure, so you can’t apply path filters, and on the other hand you don’t want to create tons of myLastModified variants to address the different aspects of your information.

Let’s assume then, for sake of example, that you categorise your data into:

  • comments
  • news
  • articles.

What you could do is create three different node types:

  • my:comments
  • my:news
  • my:articles

Now you can define three different, very dedicated indexes:

  • declaringNodeTypes = my:comments AND propertyNames = myLastModified
  • declaringNodeTypes = my:news AND propertyNames = myLastModified
  • declaringNodeTypes = my:articles AND propertyNames = myLastModified

One of the resulting queries will look like

SELECT * FROM [my:comments]
WHERE [myLastModified] >= CAST('...' AS DATE)

Actually, in the example above, assuming your nodes come with mix:lastModified, as soon as you create a custom node type you could simply have used the jcr:lastModified property, as the resulting indexes will be (I expect) the same size. You can repeat the exercise above with any property name, like colours, size, tags, etc.
