Apache Tika

Tasks that extract text and metadata from files using Apache Tika.

Given input files, these tasks detect the content type and extract text or metadata across many formats (PDF, Office documents, images, and others), then return the parsed content for further processing in workflows.
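
As a rough illustration of the three operations described above (type detection, text extraction, metadata extraction), here is a minimal Python sketch that talks to a standalone Tika Server over its REST API. It is not how the tasks themselves are configured or implemented. It assumes a Tika Server is running locally on its default port 9998; the `TIKA_URL` constant, the helper function names, and the `report.pdf` path are all hypothetical and chosen only for this example.

```python
import json
import urllib.request

# Assumption: a Tika Server instance is reachable here (9998 is its default port).
TIKA_URL = "http://localhost:9998"


def build_request(endpoint: str, data: bytes, accept: str) -> urllib.request.Request:
    """Build a PUT request that sends raw file bytes to a Tika Server endpoint."""
    return urllib.request.Request(
        url=f"{TIKA_URL}{endpoint}",
        data=data,
        method="PUT",
        headers={"Accept": accept},
    )


def detect_type(data: bytes) -> str:
    """Return the media type the server detects, for example 'application/pdf'."""
    req = build_request("/detect/stream", data, "text/plain")
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8").strip()


def extract_text(data: bytes) -> str:
    """Return the plain text the server extracts from the file."""
    req = build_request("/tika", data, "text/plain")
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def extract_metadata(data: bytes) -> dict:
    """Return the file's metadata as a dictionary parsed from the JSON response."""
    req = build_request("/meta", data, "application/json")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical input file; replace with any document you want to parse.
    with open("report.pdf", "rb") as f:
        payload = f.read()

    print(detect_type(payload))
    print(extract_text(payload)[:200])
    print(extract_metadata(payload))
```

The network calls only run when the script is executed directly, so the helpers can be imported and reused elsewhere. Inside a workflow, the plugin's tasks would take the place of these helpers and pass the parsed content to downstream tasks.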