Spicule - Data Processing Experts

Tika Packaging With Snaps

During our work with Juju at Spicule we have, on a number of occasions, come into contact with the new Snap packaging format. For those of you who don’t know it, think Debs or RPMs, but cross-platform and with additional bells and whistles.

Snaps were born from Canonical’s need to deploy applications to phones from its app store. Okay, the phones weren’t a great hit, but the packaging format lives on, and more and more Canonical-backed software is being released in the Snap packaging format. So what makes it different?


Okay, so cross-platform support is still a work in progress, but there is of course support for Ubuntu and some support for Arch, Fedora, CentOS, Debian, Gentoo and openSUSE, to name a few. Being able to install the same package onto all these platforms without having to repackage it in each one’s own format is an application developer’s dream. Package once, run everywhere is the holy grail of application packaging. No format has caught on in the mainstream yet; hopefully Snaps become the standard.


When you install software onto your server, you never quite know what it’s going to do, what services it’s going to hit up and what it does under the covers; we just assume it was written to do good, not evil. Even so, zero-day flaws exist, and unless the developers are vastly ahead of the curve, apt update && apt upgrade might not be enough. With snaps, confinement means that the application isn’t going to interfere with any other application, nor does it know of their existence.

There are currently three modes: strict, devmode and classic. Strict and devmode are similar, except that one is… strict and the other just gives you warnings (think of lovely AppArmor). Classic is the closest you’ll get to a DEB or RPM and allows the snap to still see the whole system, whereas a strict snap will only see a small selection of paths that exist within its package. There are, though, interfaces that allow you to plug your snap into other parts of the system. Developers need to declare these to make them accessible, and that’s a good thing: unless the developer simply declares every interface, the app only gains access to the services it’s allowed to see, which keeps it as secure as possible.
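To make the three modes a bit more concrete, here is a rough sketch (ours, for illustration only, not part of the Tika snap) of how the confinement mode changes the install command. `--devmode` and `--classic` are real `snap install` flags; the helper function names and the `my-dev-snap` package are made up for the example. The script only prints commands, it does not install anything.

```shell
#!/bin/sh
# Map a confinement mode to the flag `snap install` needs for it.
# Strict snaps need no flag; devmode and classic must be requested explicitly.
confinement_flag() {
    case "$1" in
        strict)  echo "" ;;
        devmode) echo "--devmode" ;;
        classic) echo "--classic" ;;
        *) echo "unknown confinement: $1" >&2; return 1 ;;
    esac
}

# Build (but do not run) the install command for a snap.
install_command() {
    name=$1
    flag=$(confinement_flag "$2") || return 1
    # ${flag:+ $flag} appends the flag only when it is non-empty.
    echo "snap install ${name}${flag:+ $flag}"
}

install_command tika-server strict    # prints: snap install tika-server
install_command my-dev-snap devmode   # prints: snap install my-dev-snap --devmode
```

Once a snap is installed, `snap connections tika-server` should show which interfaces it has plugged into, which is a quick way to see what a strictly confined app can actually reach.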


Another cool thing about Snaps is the rolling updates. Because a developer can include all the dependencies within the Snap, they don’t have to wait for new dependency X to be released or provide a nasty hack to get the packages installed; they just include the newer library within the Snap. No more lib version hell. Bundling everything can make the packages larger initially, but Snaps also use clever deltas to download only the bits that have changed, which makes updates much lighter in the long run. Snapd also checks the store regularly (four times a day by default) to see if there is a newer version and, if so, installs it. Magic!
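If you would rather not wait for the automatic check, updates can also be driven by hand. The sketch below wraps the relevant commands (`snap refresh --list`, `snap refresh` and `snap revert`, all standard snap subcommands) in small helpers. The helper names and the `DRY_RUN` switch are our own invention for the example; by default it only prints what it would run.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (needs snapd and sudo rights).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List snaps that have a newer revision waiting in the store.
pending_updates() { run snap refresh --list; }

# Refresh a single snap right away instead of waiting for the automatic check.
update_now() { run sudo snap refresh "$1"; }

# Go back to the previously installed revision if an update misbehaves.
roll_back() { run sudo snap revert "$1"; }

pending_updates
update_now tika-server
roll_back tika-server
```

The revert step is worth knowing about: since updates arrive on their own, being able to step back one revision is the safety net.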

So what does this have to do with Apache Tika? Well, I’ve built some Snap packages for it, as we often have to stick Tika on remote servers, and after Tim’s explanation of code escalation it seemed to us that confinement would make sense to try and minimise risks to our servers. The cool thing about snaps is that, at their most basic, they are easy to write. Here is one of our current Snaps:

name: tika-server
version: '1.16'
summary: Tika Server for metadata discovery and extraction # 79 char long summary
description: Apache Tika is a content detection and analysis framework, written 
      in Java, stewarded at the Apache Software Foundation. It detects and extracts 
      metadata and text from over a thousand different file types, and as well as 
      providing a Java library, has server and command-line editions suitable for 
      use from other programming languages.
grade: stable
confinement: strict

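apps:
  tika-server:  # app name was lost from the listing; assumed to match the snap name
    command: java -jar $SNAP/jar/tika-server-1.16-SNAPSHOT.jar

parts:
  tika-server:  # part name was lost from the listing; assumed
    source: ../tika-server
    plugin: maven
    maven-options: [-DskipTests]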

Pretty simple, huh? It builds the Tika server jar and then runs it. Canonical also provides a snap store, so it’s dead simple to get going. If you want to try Tika Server on Ubuntu, Fedora, etc., just install snapd:

apt install snapd 

(that’s for Ubuntu; other distributions have their own install instructions) then

snap install tika-server

and you’ll have the Tika REST server available to throw documents at.
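As a rough idea of what throwing documents at it looks like, here is a small sketch using curl. It assumes the snap leaves Tika Server listening on its default port, 9998, on localhost, which we have not spelled out above, so adjust `TIKA_URL` if yours differs. The `/tika` and `/meta` endpoints are standard Tika Server endpoints; the function names and `report.pdf` are just placeholders.

```shell
#!/bin/sh
# Base URL of the Tika server; 9998 is Tika's default port
# (assumed here to be unchanged by the snap).
TIKA_URL=${TIKA_URL:-http://localhost:9998}

# Join the base URL and an endpoint name, tolerating a trailing slash.
tika_endpoint() {
    echo "${TIKA_URL%/}/$1"
}

# PUT a file to /tika and print the extracted plain text.
extract_text() {
    curl -sS -T "$1" -H "Accept: text/plain" "$(tika_endpoint tika)"
}

# PUT a file to /meta and print its metadata as JSON.
extract_metadata() {
    curl -sS -T "$1" -H "Accept: application/json" "$(tika_endpoint meta)"
}

# Usage, once the server is running:
#   extract_text report.pdf
#   extract_metadata report.pdf
```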

Similarly you can:

snap install tika-app

and get the Tika GUI running on your computer in next to no time.

Have a go and let us know what you think!

