When you read a journal article online, you might view it as a webpage, download it as a PDF, or export its citation to a reference manager. Behind these different formats often lies a single source: XML. Specifically, scholarly publishing increasingly relies on JATS (Journal Article Tag Suite) XML as the foundation for modern article production. Understanding XML and JATS helps publishers appreciate how contemporary publishing workflows create versatile, discoverable content.
XML (eXtensible Markup Language) is a way of structuring information so both humans and computers can understand it. Unlike formats designed purely for visual presentation, XML describes what content means, not just how it looks.
In XML, content is wrapped in descriptive tags that identify its purpose. For example, rather than simply making text bold (a visual instruction), XML might identify text as an "article-title" (a meaning instruction). This semantic approach allows the same content to be presented differently across contexts while maintaining its underlying structure.
Think of XML as a detailed outline that captures not just the words of an article but also identifies every component—title, authors, affiliations, abstract, sections, references, figures, tables—in a machine-readable way.
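As a minimal sketch of what "tags that identify purpose" means in practice, the Python below parses a small fragment using JATS element names and asks for content by role. The article title and abstract text are invented for illustration, and the fragment is far from a complete JATS file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the tag names describe meaning, not appearance.
SNIPPET = """<article-meta>
  <title-group>
    <article-title>Soil Carbon in Alpine Meadows</article-title>
  </title-group>
  <abstract><p>We measure carbon stocks across 12 sites.</p></abstract>
</article-meta>"""

root = ET.fromstring(SNIPPET)

# Request content by its role; nothing is inferred from bold text or font size.
title = root.findtext("title-group/article-title")
abstract = root.findtext("abstract/p")

print(title)
print(abstract)
```

Because the title is labelled as a title, any system reading the file can find it the same way, whatever the article eventually looks like on screen.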
JATS (Journal Article Tag Suite) is a specific XML vocabulary developed for scholarly journal articles. Maintained as NISO standard Z39.96, JATS provides a standardised set of elements and attributes for describing journal article content.
JATS emerged from earlier standards (NLM DTD) used by PubMed Central and has become the dominant XML format for scholarly articles worldwide. When publishers, archives, and databases need to exchange article content, JATS provides a common language they can all understand.
JATS includes elements for everything a scholarly article contains, from journal and article metadata through body content to structured references.
From a single JATS XML source, publishers can generate multiple output formats—HTML for web viewing, PDF for download and print, EPUB for e-readers, and formats for specific databases. Changes to the XML source automatically propagate to all derived formats, ensuring consistency.
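The single-source idea can be illustrated with a short Python sketch that derives two outputs, an HTML heading and a plain-text citation, from one XML string. The sample article, the function names, and the output layouts are all assumptions made for this example; production pipelines typically use dedicated transformation tools rather than hand-written code like this.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, pared-down source; not a complete valid JATS file.
SOURCE = """<article>
  <front>
    <article-meta>
      <title-group><article-title>Soil Carbon in Alpine Meadows</article-title></title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name><surname>Rossi</surname><given-names>Maria</given-names></name>
        </contrib>
      </contrib-group>
      <pub-date><year>2023</year></pub-date>
    </article-meta>
  </front>
</article>"""

def load_fields(xml_text):
    """Read the fields both outputs need from the one XML source."""
    meta = ET.fromstring(xml_text).find("front/article-meta")
    given = meta.findtext("contrib-group/contrib/name/given-names")
    surname = meta.findtext("contrib-group/contrib/name/surname")
    return {
        "title": meta.findtext("title-group/article-title"),
        "author": "{} {}".format(given, surname),
        "year": meta.findtext("pub-date/year"),
    }

def to_html(fields):
    """Render a web heading and byline (illustrative layout)."""
    return '<h1>{}</h1><p class="byline">{} ({})</p>'.format(
        escape(fields["title"]), escape(fields["author"]), escape(fields["year"])
    )

def to_citation(fields):
    """Render a plain-text citation (illustrative style)."""
    return "{} ({}). {}.".format(fields["author"], fields["year"], fields["title"])

fields = load_fields(SOURCE)
print(to_html(fields))
print(to_citation(fields))
```

Correcting the title in `SOURCE` changes both outputs on the next run, which is the consistency benefit described above.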
JATS XML provides rich, structured metadata that search engines and academic databases can parse effectively. Properly tagged content is more discoverable than unstructured formats because systems can extract and index specific elements—author names, keywords, funding sources, cited references—with precision.
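To show how an indexing system might pull specific elements out of tagged content, here is a rough Python sketch. The element names (`name`, `kwd`, `funding-source`) come from JATS; the sample record, the funder name, and the `index_record` function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical article metadata using JATS element names.
ARTICLE_META = """<article-meta>
  <contrib-group>
    <contrib contrib-type="author">
      <name><surname>Rossi</surname><given-names>Maria</given-names></name>
    </contrib>
    <contrib contrib-type="author">
      <name><surname>Chen</surname><given-names>Wei</given-names></name>
    </contrib>
  </contrib-group>
  <kwd-group><kwd>soil carbon</kwd><kwd>alpine ecology</kwd></kwd-group>
  <funding-group>
    <award-group><funding-source>Example Science Foundation</funding-source></award-group>
  </funding-group>
</article-meta>"""

def index_record(xml_text):
    """Collect the fields an index would typically store, by tag name."""
    meta = ET.fromstring(xml_text)
    authors = [
        "{}, {}".format(name.findtext("surname"), name.findtext("given-names"))
        for name in meta.iter("name")
    ]
    return {
        "authors": authors,
        "keywords": [kwd.text for kwd in meta.iter("kwd")],
        "funders": [src.text for src in meta.iter("funding-source")],
    }

record = index_record(ARTICLE_META)
print(record)
```

No text pattern matching is involved: each value is found because it sits inside an element that states what it is.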
Major archives like PubMed Central require JATS XML for deposit. Content submitted in standardised XML can be preserved, migrated, and rendered consistently over time, even as display technologies change. A PDF's appearance depends on the viewer software, whereas XML content remains interpretable regardless of the presentation layer.
Structured XML enables better accessibility features. Screen readers can navigate by semantic sections rather than visual layout. Content can reflow for different screen sizes. Alternative formats for users with disabilities become easier to generate from structured sources.
When references are structured in JATS XML with proper citation elements, automated systems can match citations to source articles, add DOI links, and enable citation tracking services. Unstructured references require manual intervention or error-prone parsing.
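The difference between structured and unstructured references can be sketched in Python as follows. The `pub-id` element with `pub-id-type="doi"` is standard JATS; the two sample references, the DOI, and the `doi_links` function are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical reference list: r1 is fully tagged, r2 is free text only.
REF_LIST = """<ref-list>
  <ref id="r1">
    <element-citation publication-type="journal">
      <article-title>Carbon stocks in mountain soils</article-title>
      <source>Journal of Examples</source>
      <year>2020</year>
      <pub-id pub-id-type="doi">10.1234/example.2020.001</pub-id>
    </element-citation>
  </ref>
  <ref id="r2">
    <mixed-citation>Chen W. Field notes on alpine meadows. 2019.</mixed-citation>
  </ref>
</ref-list>"""

def doi_links(xml_text):
    """Map each reference id to a DOI link, or None when no DOI is tagged."""
    links = {}
    for ref in ET.fromstring(xml_text).iter("ref"):
        doi = ref.findtext(".//pub-id[@pub-id-type='doi']")
        links[ref.get("id")] = ("https://doi.org/" + doi) if doi else None
    return links

links = doi_links(REF_LIST)
print(links)
```

The tagged reference yields a link directly, while the free-text one returns `None` and would need the manual work or text parsing mentioned above.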
A JATS XML document divides into three major sections:
Front (<front>): Contains journal metadata (title, ISSN, publisher) and article metadata (title, contributors, abstract, keywords, publication dates, permissions). This section provides the "who, what, when" information about the article.
Body (<body>): Contains the article's main content: sections, paragraphs, figures, tables, equations, and other elements that constitute the scholarly contribution itself.
Back (<back>): Contains supporting material, including the reference list, appendices, glossaries, and acknowledgments. The reference list (<ref-list>) uses structured citation elements that enable automated linking.
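A heavily trimmed skeleton makes this layout easier to picture. In the Python sketch below, the `<front>`, `<body>`, and `<back>` elements and their children use real JATS names, while the journal, ISSN, and article text are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed article; real files carry far more metadata.
SKELETON = """<article article-type="research-article">
  <front>
    <journal-meta>
      <journal-title-group><journal-title>Journal of Examples</journal-title></journal-title-group>
      <issn>0000-0000</issn>
    </journal-meta>
    <article-meta>
      <title-group><article-title>Soil Carbon in Alpine Meadows</article-title></title-group>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>Alpine soils store carbon.</p></sec>
  </body>
  <back>
    <ack><p>We thank the field crew.</p></ack>
    <ref-list>
      <ref id="r1"><mixed-citation>Chen W. Field notes. 2019.</mixed-citation></ref>
    </ref-list>
  </back>
</article>"""

root = ET.fromstring(SKELETON)

# The direct children of <article> are the major sections.
top_level = [child.tag for child in root]

# Each section answers a different question about the article.
summary = {
    "journal": root.findtext("front/journal-meta/journal-title-group/journal-title"),
    "sections": len(root.findall("body/sec")),
    "references": len(root.findall("back/ref-list/ref")),
}

print(top_level)
print(summary)
```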
Open Journal Systems (OJS) supports XML workflows through several mechanisms:
OJS includes plugins that can generate basic JATS XML from article metadata. These exports provide structured data for archiving and indexing purposes.
Plugins like Texture provide XML editing capabilities within OJS, and viewers like Lens can display JATS XML articles in enhanced web interfaces with features like inline reference viewing and figure browsing.
Some journals adopt "XML-first" production where articles are created in JATS XML and other formats derive from that source. OJS can accommodate these workflows though they require additional tooling and expertise.
OJS allows uploading XML files as article galleys alongside or instead of PDFs, enabling journals to provide structured content to readers who benefit from it.
Despite its benefits, XML publishing presents challenges:
Creating valid JATS XML requires understanding the standard's elements and attributes. Converting manuscripts (typically Word documents) to proper JATS XML involves either manual markup or sophisticated automated conversion tools.
JATS XML must validate against the standard's schema. Invalid XML—missing required elements, improper nesting, incorrect attribute values—fails validation and may be rejected by archives and databases.
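Full validation needs the official JATS schema files and a validating parser, which the Python standard library does not provide. As a lighter first pass, a sketch like the one below can catch files that are not even well-formed, plus a few missing elements. The `REQUIRED_PATHS` checklist is a hypothetical house rule for this example, not the schema's own requirements, and passing it does not mean a file is valid JATS.

```python
import xml.etree.ElementTree as ET

# Hypothetical house checklist; the real rules live in the JATS schema.
REQUIRED_PATHS = [
    "front",
    "front/article-meta",
    "front/article-meta/title-group/article-title",
]

def precheck(xml_text):
    """Return a list of problems; an empty list means the pre-check passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Improper nesting or unclosed tags stop parsing entirely.
        return ["not well-formed: {}".format(err)]

    problems = []
    if root.tag != "article":
        problems.append("root element is <{}>, expected <article>".format(root.tag))
    for path in REQUIRED_PATHS:
        if root.find(path) is None:
            problems.append("missing " + path)
    return problems

GOOD = (
    "<article><front><article-meta><title-group>"
    "<article-title>T</article-title>"
    "</title-group></article-meta></front></article>"
)
BAD_NESTING = "<article><front></article>"
MISSING_META = "<article><front/></article>"

print(precheck(GOOD))
print(precheck(BAD_NESTING))
print(precheck(MISSING_META))
```

A pre-check of this kind is useful for catching obvious problems early, but archives and databases will still run the file against the actual schema.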
Effective XML workflows require appropriate tools for creation, editing, validation, and transformation. Free tools exist but come with learning curves; commercial solutions offer convenience at a cost.
Editorial and production staff need training to work with XML effectively. The transition from document-based to structured content workflows requires adjustment.
Converting existing PDF archives to JATS XML is laborious. Many journals maintain XML workflows for new content while older content remains in legacy formats.
Several important destinations require or strongly prefer JATS XML:
PubMed Central: The U.S. National Library of Medicine's free archive requires JATS XML for article deposits. Journals included in PMC must provide properly structured XML.
Europe PMC: The European equivalent similarly requires structured XML for archived content.
Institutional Repositories: Some repositories can ingest and display JATS XML, providing enhanced presentation compared to PDF-only deposits.
Aggregators and Databases: Various indexing services prefer or require structured content for optimal processing.
JATS comes in multiple versions and variants:
JATS Versions: The standard has evolved through versions (1.0, 1.1, 1.2, 1.3). Newer versions add elements for emerging needs while maintaining backward compatibility. Most systems accept multiple versions.
Tag Sets: JATS offers different "tag sets" for different purposes—Archiving (comprehensive, for preservation), Publishing (for publisher production), and Authoring (simplified, for manuscript creation).
Publishers typically use the Archiving or Publishing tag sets for production and deposit purposes.
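A JATS file usually declares its version in the `dtd-version` attribute on the root `<article>` element, so a receiving system can check it before processing. The Python sketch below assumes that attribute is present; the `ACCEPTED_VERSIONS` set simply mirrors the versions listed above, and the function names are hypothetical. The tag set itself is normally named in the DOCTYPE declaration, which this sketch does not inspect.

```python
import xml.etree.ElementTree as ET

# Versions named in the text above; a real system would keep its own list.
ACCEPTED_VERSIONS = {"1.0", "1.1", "1.2", "1.3"}

def declared_version(xml_text):
    """Return the dtd-version attribute of the root element, or None."""
    return ET.fromstring(xml_text).get("dtd-version")

def is_accepted(xml_text):
    """True when the file declares a version this system handles."""
    return declared_version(xml_text) in ACCEPTED_VERSIONS

SAMPLE = '<article dtd-version="1.3" article-type="research-article"><front/></article>'
UNDECLARED = "<article><front/></article>"

print(declared_version(SAMPLE), is_accepted(SAMPLE))
print(declared_version(UNDECLARED), is_accepted(UNDECLARED))
```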
Journals interested in XML workflows might consider:
Start with Metadata: Even without full-text XML, ensuring rich, structured metadata improves discoverability. OJS captures metadata that can export in XML formats.
Evaluate Requirements: Determine whether your target indexes and archives require JATS XML. Not all journals need full XML workflows—requirements vary by discipline and goals.
Explore Tools: Free tools like the JATS4R validator check XML compliance. Production tools range from open-source options to commercial services.
Consider Services: XML conversion services can transform manuscripts to JATS XML without requiring in-house expertise. Costs vary based on complexity and volume.
Plan Gradually: Transitioning to XML-first workflows is significant. Many journals begin with basic XML exports and evolve toward fuller implementation over time.
Altechmind helps journals establish professional publishing infrastructure that supports contemporary standards. From basic OJS setup to advanced workflow configuration, we build foundations for quality publishing.