News

This version of eZ Tika incorporates contributions from Felix Woldt (CJW Network) and an updated tika.jar.

The main new feature is that the extension now works as soon as it is activated. There is no need to copy and modify files on your server unless your setup requires it; a server hosting just one installation of eZ Publish typically does not.

Besides the zero-config option, it is now possible to activate a dedicated eztika debugging setting that logs whether text extraction succeeded or failed and can optionally keep the temporary file containing the extracted text.

Downloads and more: http://projects.ez.no/eztika

Happy indexing!

Paul

Besides many bug fixes, new file types are now supported: iWork and CHM.

Check out http://projects.ez.no/eztika

Many bug fixes and quality enhancements have been made to the existing converters; most notably, CJK PDF documents are now correctly converted to UTF-8.

eZ Tika 1.2, based on Apache Tika 0.6-dev, carries a lot of under-the-hood improvements and can convert most binary file types to plain text for subsequent indexing by any search plugin, most notably eZ Find.