The weight of a property name in AEM

Jackrabbit 2.x / CQ5

Back in Jackrabbit 2.x, and therefore in CQ/AEM 5.x, everything was indexed by default unless you stated otherwise.

This meant that every time you ran a query, Lucene was there for you, serving an indexed answer.

In this scenario it didn’t really matter what property names you used for your application or whether you defined additional node types.

This had the advantage that everything was indexed and therefore an index was almost always there serving your query and you didn’t have to think about it.

On the other hand we all know that the bigger the index is, the slower it will be in serving you the result set, as it will simply have to analyse more data.

Jackrabbit Oak / AEM6

Nowadays Apache Jackrabbit Oak, aka Jackrabbit 3.x, is the foundation of AEM6.

As opposed to JR2, in Oak almost nothing is indexed by default. This means that if you take a vanilla Oak and run a query, chances are very good that you’re going to traverse the repository (depending on your query).

This has the advantage that you can create very dedicated indexes that will overall perform better as they will be as tailored as possible to your query.

The disadvantages are that you’ll have to define each index and that you’ll have to know how to fine-tune your queries to get the most out of this approach.

Without going deeply into the configuration of each available index type, I think the two main properties you’ll end up tuning for better performance are

  • propertyNames
  • declaringNodeTypes

The first one defines which property your index is going to index, while the second restricts the index to a specific node type. In other words, the condition for a node to be included in an index is

$nodetype in ($declaringNodeTypes) AND $property = $propertyNames


  • indexes on more than one property are not supported (yet)
  • an index cannot serve conditions where you ask something like WHERE property IS NULL.
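The inclusion condition above can be sketched in a few lines of Java. This is only a simplified model and not the Oak API: the class `PropertyIndexRule` and its method are hypothetical names, node types are compared by exact name (no subtypes or mixins), and an empty set of node types is assumed to mean "any node type". It also illustrates why an IS NULL condition can't be served: a node without the property never enters the index.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of a property index definition; not the Oak API. */
final class PropertyIndexRule {
    private final Set<String> declaringNodeTypes; // assumption: empty means "any node type"
    private final Set<String> propertyNames;

    PropertyIndexRule(Set<String> declaringNodeTypes, Set<String> propertyNames) {
        this.declaringNodeTypes = declaringNodeTypes;
        this.propertyNames = propertyNames;
    }

    /** True when the node type matches and the node carries at least one indexed property. */
    boolean includes(String nodeType, Map<String, Object> properties) {
        boolean typeMatches = declaringNodeTypes.isEmpty() || declaringNodeTypes.contains(nodeType);
        boolean hasProperty = propertyNames.stream().anyMatch(properties::containsKey);
        return typeMatches && hasProperty;
    }

    public static void main(String[] args) {
        PropertyIndexRule rule =
                new PropertyIndexRule(Set.of("my:comments"), Set.of("myLastModified"));
        // Indexed: right node type and the property is present.
        System.out.println(rule.includes("my:comments", Map.of("myLastModified", "2014-01-01")));
        // Not indexed: the property is missing, so the index knows nothing about this node.
        System.out.println(rule.includes("my:comments", Map.of("title", "hello")));
    }
}
```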

This takes us to the very topic of this post: be careful how you name your properties and how you structure your queries.

Remember the rule: the smaller the index the more efficient the query.

Let’s see with an example how important a property name and a node type are.

If you have a custom application in which you want to extract nodes modified after a specific date, a way of doing so would be

SELECT * FROM [nt:base]
WHERE [jcr:lastModified] >= CAST('...' AS DATE)

this query is very bad. It can’t really make use of any index.

Let’s say you create an index on jcr:lastModified. The index itself will be almost as big as the repository, as by default in AEM (almost?) every node has mix:lastModified.

A better way would be

SELECT * FROM [nt:base]
WHERE [myLastModified] >= CAST('...' AS DATE)

this will allow you to define an index on the property myLastModified, which you know will contain only your application data. But we can do even better.

Let’s assume you have a very sparse and large content structure, so you can’t apply path filters, and on the other hand you don’t want to create tons of myLastModified-style properties to address different aspects of your information.

Let’s assume then, for the sake of example, that you categorise your data into:

  • comments
  • news
  • articles.

What you could do is create three different node types:

  • my:comments
  • my:news
  • my:articles

now you can define three different, very dedicated indexes

  • declaringNodeTypes = my:comments AND propertyNames = myLastModified
  • declaringNodeTypes = my:news AND propertyNames = myLastModified
  • declaringNodeTypes = my:articles AND propertyNames = myLastModified

One of the resulting queries would look like

SELECT * FROM [my:comments]
WHERE [myLastModified] >= CAST('...' AS DATE)

Actually, in the example above, assuming your nodes come with mix:lastModified, as soon as you create a custom node type you could simply have used jcr:lastModified, as the indexes will be (I expect) the same size. You can repeat the exercise above with any property name, like colours, size, tags, etc.
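To make the date placeholder in the queries above a bit more concrete, here is a minimal Java sketch that builds the query string for a given node type and property. The class and method names are made up for illustration; the only thing it relies on is that the literal inside CAST(... AS DATE) is an ISO 8601 timestamp with milliseconds and a time zone.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that builds the "modified since" query used in this post. */
final class LastModifiedQuery {
    // ISO 8601 with milliseconds and offset, e.g. 2014-01-01T00:00:00.000Z
    private static final DateTimeFormatter JCR_DATE =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");

    /** Node type and property are trusted names here; no escaping is attempted. */
    static String since(String nodeType, String property, OffsetDateTime from) {
        return "SELECT * FROM [" + nodeType + "] "
                + "WHERE [" + property + "] >= CAST('" + JCR_DATE.format(from) + "' AS DATE)";
    }

    public static void main(String[] args) {
        OffsetDateTime from = OffsetDateTime.of(2014, 1, 1, 0, 0, 0, 0, ZoneOffset.UTC);
        System.out.println(since("my:comments", "myLastModified", from));
    }
}
```

Pointing the same helper at my:news or my:articles gives the queries for the other two dedicated indexes.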


Install MySQL on OSX

If you need a local mysql database for experiments and don’t really care about security, as it’s not a production system, here’s a quick way to install one on OSX (tested with 10.9+)

  1. Download the latest native OSX installation package from the
    download page. It should be a dmg.
  2. Install it as any other OSX application. It will create the
    entire directory structure needed by mysql under
    /usr/local/mysql-x.y.z, with a symlink to it at /usr/local/mysql.
    From now on I will refer to it as the /usr/local/mysql directory.
  3. sudo rm -r /usr/local/mysql/data
  4. sudo mkdir /usr/local/mysql/data
  5. sudo chown -R <youruser> /usr/local/mysql/data
  6. /usr/local/mysql/scripts/mysql_install_db

If everything worked out fine you should be able to start it with

$ mysqld_safe &

to stop it

$ mysqladmin -u root shutdown

I’ve created a couple of aliases and exports in my ~/.profile to ease the tasks

 export PATH=$PATH:/usr/local/mysql/bin
 alias mysql-start="mysqld_safe &"
 alias mysql-stop="mysqladmin -u root shutdown"

“Reverting” the auto-fill in emacs

Most of the time I use emacs with auto-fill mode breaking every line at column 70. Sometimes, though, I’d like a paragraph to be reverted to a single line, for example for copy-pasting into other editors where the line breaks could break the formatting. Here’s how

M-x set-variable RET fill-column RET 100000
M-q (on the desired paragraphs)
M-x set-variable RET fill-column RET 70

The first line sets fill-column to a large value, the second re-runs fill-paragraph, and the third resets fill-column to my usual value.

WiFi disconnect while sleeping on Mavericks

I was just given a new laptop with Mac OS X Mavericks on it. Besides thinking it’s a very bad product, one that made me seriously think about moving back to Windows, I tried at all costs to solve the main issue I had: if the laptop goes to sleep, the WiFi disconnects.

I can understand that for the majority of users this can be awesome, but I simply hate it. I can have long-running processes that I want to keep running for the whole night.

I searched around the web and tried many things. I reached the point where I think it’s not possible to disable this feature, so I ended up disabling sleep completely. This did the trick in my situation.

Issue the following command in a terminal:

$ sudo pmset -a sleep 0

Here are my complete pmset settings:

$ pmset -g
Active Profiles:
Battery Power        -1*
AC Power        -1
Currently in use:
 standbydelay         10800
 standby              1
 halfdim              1
 hibernatefile        /var/vm/sleepimage
 darkwakes            0
 gpuswitch            2
 disksleep            0
 sleep                0
 autopoweroffdelay    14400
 hibernatemode        3
 autopoweroff         1
 ttyskeepawake        1
 displaysleep         2
 acwake               0
 lidwake              1


What’s my CQ WCM mode?

CQ offers four different WCM modes: edit, preview, design and disabled. The first three are actual modes, while the fourth applies when none of them is available and, in the end, always represents the situation on the publish site.

Telling them apart on the Java side is very easy, as CQ provides a Java API for it, and it provides a JavaScript API as well. Nevertheless the object that gives you the mode is not (and should not be) available on publish, which always ends in ifs here and there.

This script should help you: it should be possible to drop it into your clientlibs and have it ready to be used for the use case.

Something very simple that hopefully will help your daily life. :)

How many CQ5 concurrent users?

Defining the concept of a concurrent user in the web world is difficult, and it’s even more difficult in CQ as it doesn’t keep any session information. Technically speaking, I define two users as concurrent when a request from user B starts before a request from user A has finished.

I don’t know if it’s possible to get such information just by looking at the CQ logs, but the analyse-access tool helps you analyse the access.log files that CQ produces, giving you some numbers in a very handy markdown format that can then be converted to PDF for presenting to the business.
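To illustrate that definition, here is a minimal Java sketch that computes the peak number of overlapping requests from a list of start and end times. It is not how the analyse-access tool works; the `Request` class is a hypothetical, already-parsed record, and the sketch assumes that every request comes from a different user, so it really counts concurrent requests.

```java
import java.util.ArrayList;
import java.util.List;

final class PeakConcurrency {

    /** Hypothetical, already-parsed request: start and end time in milliseconds. */
    static final class Request {
        final long startMillis;
        final long endMillis;

        Request(long startMillis, long endMillis) {
            this.startMillis = startMillis;
            this.endMillis = endMillis;
        }
    }

    /** Largest number of requests in flight at the same moment. */
    static int peak(List<Request> requests) {
        List<long[]> events = new ArrayList<>();
        for (Request r : requests) {
            events.add(new long[] {r.startMillis, +1}); // a request starts
            events.add(new long[] {r.endMillis, -1});   // a request finishes
        }
        // Sort by time; on ties process finishes before starts, so a request that
        // starts exactly when another one ends is not counted as concurrent with it.
        events.sort((a, b) -> a[0] != b[0] ? Long.compare(a[0], b[0]) : Long.compare(a[1], b[1]));

        int current = 0;
        int peak = 0;
        for (long[] event : events) {
            current += (int) event[1];
            peak = Math.max(peak, current);
        }
        return peak;
    }

    public static void main(String[] args) {
        List<Request> requests = List.of(
                new Request(0, 100), new Request(50, 150), new Request(100, 200));
        System.out.println(peak(requests));
    }
}
```

In the sample data the second request overlaps both the first and the third, but the first and the third never overlap each other, so at most two requests are in flight at once.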


CQ5/OSGi reference a unique service implementation

You should never ever do it. If you find yourself in need of this, there’s something extremely wrong in your code. Nevertheless I found myself needing it for some old legacy code that was almost impossible to fix in a reasonable time.

The question is: how can I reference a specific implementation of a service/component in an OSGi (therefore CQ5 as well) environment?

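In the component where you need to reference the implementation, you can add a target filter on the component name, something like the following

@Reference(target = "(component.name=<name of the implementing component>)")
Bar bar;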

In the component implementing the service you’ll have something like

@Component(immediate=true, metatype=false, name="")
class BarImpl implements Bar{

By default the framework will assign the fully qualified class name as the component name. I prefer to specify it to make the code more readable, and nothing prohibits you from specifying any arbitrary string, like mickey mouse or goofy, as the component name.