MarkLogic Server 3.0 performs automatic conversion to XML on documents in PDF, HTML and Office formats.
Mark Logic Corp. has made a number of enhancements to its XML content server in a new release, including automatic conversion features and better search functionality.
The MarkLogic Server 3.0 has three main improvements over its predecessor, all designed to give users more speed and flexibility when dealing with XML content, according to David Spenhoff, Mark Logic vice president of marketing.
The most notable new feature is the server's automatic conversion ability, which can turn documents created in Microsoft Office, PDF and HTML into XML files.
This could appeal to desktop publishers and other enterprises because it could reduce the time needed to process each file.
"Most documents today aren't authored in XML," said Spenhoff. "That means that companies wanting to move to XML as a standard way of storing and accessing files will have to expend time and money to get them into a content server."
By contrast, automatic conversion can reduce that pain. The MarkLogic Server loads XML "as is," noted Spenhoff, without requiring any DTD or XML schema, so there is no need for the usual extra step to "shred" or "chunk" documents so they can be put into a relational database.
Another major new feature for the server is its content processing framework, which Spenhoff said is a powerful attribute, rather than a nugget of industry jargon.
"That's not some 'markitecture' buzzphrase," he said. "It's a trigger-based system that we think customers will appreciate."
The framework lets users create custom content processing pipelines (trigger-based sequences of content processing steps) composed of native XQuery statements and external applications that are Web services-enabled.
For example, if a PDF version of a medical journal is processed through the server, and then automatically converted to XML, the content processing framework would tag all the medical terms and create an index that's appended to the XML document.
With this capability, the publisher could render the result as a Web page, combine the document with other related articles, or create an index for a book.
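To make the trigger-based idea concrete, here is a minimal sketch in Python rather than XQuery. It is not MarkLogic's API: the Pipeline class, the "converted" event name, the step functions and the two-word term list are all hypothetical, chosen only to mirror the medical journal example above, in which a converted document has its terms tagged and an index appended.

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Hypothetical names throughout; this illustrates the concept, not MarkLogic's API.
Step = Callable[[ET.Element], None]


class Pipeline:
    """Maps an event name (the trigger) to an ordered list of processing steps."""

    def __init__(self) -> None:
        self._steps: Dict[str, List[Step]] = {}

    def on(self, event: str, step: Step) -> None:
        """Register a step to run, in registration order, when the event fires."""
        self._steps.setdefault(event, []).append(step)

    def fire(self, event: str, doc: ET.Element) -> ET.Element:
        """Run every step registered for the event against the document."""
        for step in self._steps.get(event, []):
            step(doc)
        return doc


MEDICAL_TERMS = {"aspirin", "hypertension"}  # assumed toy vocabulary


def tag_terms(doc: ET.Element) -> None:
    """Record known terms found in each <p> as a space-separated 'terms' attribute."""
    for para in doc.iter("p"):
        words = set(re.findall(r"[a-z]+", (para.text or "").lower()))
        found = sorted(words & MEDICAL_TERMS)
        if found:
            para.set("terms", " ".join(found))


def append_index(doc: ET.Element) -> None:
    """Append an <index> element listing every tagged term once, alphabetically."""
    terms = sorted({t for p in doc.iter("p") for t in p.get("terms", "").split()})
    index = ET.SubElement(doc, "index")
    for term in terms:
        ET.SubElement(index, "entry").text = term


pipeline = Pipeline()
pipeline.on("converted", tag_terms)
pipeline.on("converted", append_index)

article = ET.fromstring(
    "<article><p>Aspirin may lower hypertension risk.</p>"
    "<p>No terms here.</p></article>"
)
pipeline.fire("converted", article)
index_entries = [entry.text for entry in article.find("index")]
```

In this sketch the steps run in the order they were registered, so the index step can rely on the tagging step having already run. A real deployment would express such steps as XQuery or Web service calls, as the framework description indicates.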
An internal search enhancement rounds out the trio of major updates. It lets users run more refined searches across multiple areas of text than the previous version of the server allowed.
The changes should give MarkLogic Server users a better grasp of content control, and save processing effort at the same time, Spenhoff noted.