Set up Apache Tika server (optional)
Apache Tika is a content analysis toolkit used to detect and extract metadata and text from different file types. It can be used both as a service and a command line utility.
In the console, create a jars directory in your home directory and change into it:
mkdir ~/jars
cd ~/jars
Tika Server is a standalone runnable jar binary. Download the
appropriate version to the created
jars directory from
Start the Tika server by executing on the command line:
java -jar ~/jars/tika-server-1.24.1.jar
The server will run in the foreground, and you can stop it when needed by pressing Ctrl+C in the same terminal.
The server will be available on 127.0.0.1, port 9998. To find out about other available options, execute on the command line:
java -jar ~/jars/tika-server-1.24.1.jar --help
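As an illustration of using those options, the sketch below wraps the start command in a small shell function that takes the listen port as an argument. The function name start_tika and the TIKA_JAR variable are hypothetical helpers introduced here, not part of Tika; the --host and --port options should appear in the --help output, but check the output of your version before relying on them.

```shell
# Assumed location of the downloaded jar; override TIKA_JAR if yours differs.
TIKA_JAR="${TIKA_JAR:-$HOME/jars/tika-server-1.24.1.jar}"

# Hypothetical helper: start Tika server in the foreground on the given port.
# Falls back to 9998, the default port mentioned above, when no port is given.
start_tika() {
  port="${1:-9998}"
  java -jar "$TIKA_JAR" --host 127.0.0.1 --port "$port"
}
```

With this in place, running start_tika 9999 would be expected to serve on 127.0.0.1, port 9999, which can help if port 9998 is already taken.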
To test whether the Tika server is running, open http://localhost:9998 in a web browser.
This should open a web page describing Tika’s REST API endpoints.
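The same check can be done from the command line. The following is a minimal sketch, assuming curl is installed and the server uses the default address; the function names tika_is_up and tika_extract_text and the TIKA_URL variable are hypothetical and introduced only for this example. It relies on the /version and /tika endpoints of the Tika server REST API.

```shell
# Assumed base URL; matches the default host and port described above.
TIKA_URL="${TIKA_URL:-http://localhost:9998}"

# Hypothetical helper: exits with status 0 only if a server answers on /version.
# An optional first argument overrides the base URL.
tika_is_up() {
  curl --silent --fail --max-time 5 "${1:-$TIKA_URL}/version" >/dev/null 2>&1
}

# Hypothetical helper: upload a file with HTTP PUT to /tika and print
# the extracted plain text to standard output.
tika_extract_text() {
  curl --silent --fail --header "Accept: text/plain" -T "$1" "$TIKA_URL/tika"
}
```

For example, tika_is_up && tika_extract_text report.pdf would send report.pdf (a placeholder file name) for text extraction only when the server responds.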