What is XML/JATS Publishing? Understanding Structured Content

When you read a journal article online, you might view it as a webpage, download it as a PDF, or export its citation to a reference manager. Behind these different formats often lies a single source: XML. Specifically, scholarly publishing increasingly relies on JATS (Journal Article Tag Suite) XML as the foundation for modern article production. Understanding XML and JATS helps publishers appreciate how contemporary publishing workflows create versatile, discoverable content.

What is XML? #

XML (eXtensible Markup Language) is a way of structuring information so both humans and computers can understand it. Unlike formats designed purely for visual presentation, XML describes what content means, not just how it looks.

In XML, content is wrapped in descriptive tags that identify its purpose. For example, rather than simply making text bold (a visual instruction), XML might identify text as an "article-title" (a semantic label). This semantic approach allows the same content to be presented differently across contexts while maintaining its underlying structure.

Think of XML as a detailed outline that captures not just the words of an article but also identifies every component—title, authors, affiliations, abstract, sections, references, figures, tables—in a machine-readable way.
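
To make the idea of semantic tags concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The fragment, its title, and its abstract are invented for illustration; only the element names (article-meta, title-group, article-title, abstract) come from JATS.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment (invented content): the tag names describe
# what each piece of text is, not how it should look.
FRAGMENT = """
<article-meta>
  <title-group>
    <article-title>Soil Carbon in Alpine Meadows</article-title>
  </title-group>
  <abstract><p>We measure carbon stocks at twelve sites.</p></abstract>
</article-meta>
"""

root = ET.fromstring(FRAGMENT)

# Ask for content by its role rather than by position or styling.
title = root.findtext("title-group/article-title")
abstract = "".join(root.find("abstract").itertext()).strip()

print(title)
print(abstract)
```

Because the title is labelled as a title, any program can find it without guessing from font size or position on the page.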

What is JATS? #

JATS (Journal Article Tag Suite) is an XML vocabulary developed for scholarly journal articles. Published as the ANSI/NISO standard Z39.96, JATS provides a standardised set of elements and attributes for describing journal article content.

JATS grew out of the earlier NLM DTDs used by PubMed Central and has become the most widely used XML format for scholarly articles. When publishers, archives, and databases need to exchange article content, JATS provides a common language they can all understand.

JATS includes elements for everything scholarly articles contain:

Front matter: title, authors, affiliations, abstract, keywords, funding information
Body content: sections, paragraphs, figures, tables, equations, lists
Back matter: references, appendices, acknowledgments
Metadata: DOIs, publication dates, permissions, article types

Why JATS Matters for Journals #

Format Flexibility #

From a single JATS XML source, publishers can generate multiple output formats: HTML for web viewing, PDF for download and print, EPUB for e-readers, and formats for specific databases. Because every output is derived from the same source, a correction made in the XML carries through to all formats once they are regenerated, which keeps them consistent.
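
The single-source idea can be sketched in a few lines of Python. This is an illustration only: production pipelines typically rely on XSLT or dedicated rendering tools, and the functions to_html and to_plain_text, along with the sample article, are hypothetical names and content invented here.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented sample article, reduced to a title and two paragraphs.
SOURCE = """
<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>Soil Carbon in Alpine Meadows</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <p>Carbon stocks varied with elevation.</p>
    <p>Grazing reduced stocks at low sites.</p>
  </body>
</article>
"""


def to_html(article):
    """Render the title and body paragraphs as a simple HTML fragment."""
    title = article.findtext(".//article-title")
    parts = [f"<h1>{escape(title)}</h1>"]
    parts += [f"<p>{escape(p.text)}</p>" for p in article.iterfind("./body/p")]
    return "\n".join(parts)


def to_plain_text(article):
    """Render the same content as plain text with an upper-case title."""
    title = article.findtext(".//article-title")
    paragraphs = [p.text for p in article.iterfind("./body/p")]
    return "\n\n".join([title.upper()] + paragraphs)


article = ET.fromstring(SOURCE)
html_output = to_html(article)
text_output = to_plain_text(article)

print(html_output)
print(text_output)
```

Editing a paragraph in SOURCE and rerunning both functions updates both outputs, which is the consistency benefit described above.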

Enhanced Discoverability #

JATS XML provides rich, structured metadata that search engines and academic databases can parse effectively. Properly tagged content is more discoverable than unstructured formats because systems can extract and index specific elements—author names, keywords, funding sources, cited references—with precision.
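
As a rough illustration of how an indexing system might pull specific fields from tagged content, the sketch below reads authors and keywords from a JATS front section. The author names, keywords, and the function name index_record are invented for this example; the element names (contrib, name, surname, given-names, kwd-group, kwd) are JATS elements.

```python
import xml.etree.ElementTree as ET

# Invented front matter with two authors and two keywords.
FRONT = """
<front>
  <article-meta>
    <contrib-group>
      <contrib contrib-type="author">
        <name><surname>Okafor</surname><given-names>Ada</given-names></name>
      </contrib>
      <contrib contrib-type="author">
        <name><surname>Lindqvist</surname><given-names>Jonas</given-names></name>
      </contrib>
    </contrib-group>
    <kwd-group>
      <kwd>soil carbon</kwd>
      <kwd>alpine ecology</kwd>
    </kwd-group>
  </article-meta>
</front>
"""


def index_record(front):
    """Collect author names and keywords into a plain dictionary."""
    authors = [
        f"{c.findtext('name/given-names')} {c.findtext('name/surname')}"
        for c in front.iterfind(".//contrib[@contrib-type='author']")
    ]
    keywords = [k.text for k in front.iterfind(".//kwd-group/kwd")]
    return {"authors": authors, "keywords": keywords}


record = index_record(ET.fromstring(FRONT))
print(record)
```

No text parsing or guesswork is involved: each field is found by the tag that declares what it is.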

Archival Preservation #

Major archives such as PubMed Central require full-text XML for deposit, with JATS as the preferred format. Content submitted in standardised XML can be preserved, migrated, and rendered consistently over time, even as display technologies change. A PDF fixes one visual rendering of an article, whereas XML keeps content separate from presentation, so it remains interpretable whatever rendering layer is used.

Accessibility #

Structured XML enables better accessibility features. Screen readers can navigate by semantic sections rather than visual layout. Content can reflow for different screen sizes. Alternative formats for users with disabilities become easier to generate from structured sources.
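
Navigation by semantic section depends on the nested sec elements in the article body. The sketch below, which assumes an invented body and a hypothetical helper named outline, walks those sections and produces the kind of heading outline that a table of contents or an assistive tool could use.

```python
import xml.etree.ElementTree as ET

# Invented body with one nested level of sections.
BODY = """
<body>
  <sec><title>Introduction</title><p>Text.</p></sec>
  <sec>
    <title>Methods</title>
    <sec><title>Sampling</title><p>Text.</p></sec>
    <sec><title>Analysis</title><p>Text.</p></sec>
  </sec>
  <sec><title>Results</title><p>Text.</p></sec>
</body>
"""


def outline(parent, depth=0):
    """Return (depth, title) pairs for each <sec>, in document order."""
    entries = []
    for sec in parent.findall("sec"):
        entries.append((depth, sec.findtext("title", default="(untitled)")))
        entries.extend(outline(sec, depth + 1))
    return entries


toc = outline(ET.fromstring(BODY))
for depth, title in toc:
    print("  " * depth + title)
```

The outline comes from the document structure itself, so it stays correct however the article is styled or reflowed.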

Citation Linking #

When references are structured in JATS XML with proper citation elements, automated systems can match citations to source articles, add DOI links, and enable citation tracking services. Unstructured references require manual intervention or error-prone parsing.
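
A small sketch shows why structured references are easier to link. It assumes an invented reference list containing a made-up DOI and a hypothetical function named doi_links; the pub-id element with pub-id-type="doi" is the JATS convention for recording a DOI inside a citation.

```python
import xml.etree.ElementTree as ET

# Invented reference list: one structured citation with a made-up DOI,
# one unstructured citation without any identifier.
REF_LIST = """
<ref-list>
  <ref id="r1">
    <element-citation publication-type="journal">
      <article-title>First cited work</article-title>
      <pub-id pub-id-type="doi">10.1234/example.001</pub-id>
    </element-citation>
  </ref>
  <ref id="r2">
    <mixed-citation>Second cited work, with no identifier.</mixed-citation>
  </ref>
</ref-list>
"""


def doi_links(ref_list):
    """Map each reference id to a DOI link, or None when no DOI is tagged."""
    links = {}
    for ref in ref_list.iterfind("ref"):
        doi = ref.findtext(".//pub-id[@pub-id-type='doi']")
        links[ref.get("id")] = f"https://doi.org/{doi.strip()}" if doi else None
    return links


links = doi_links(ET.fromstring(REF_LIST))
print(links)
```

The first reference gets a link with a single lookup, while the second would need text matching against an external service, which is where errors tend to creep in.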

JATS XML Structure #

A JATS XML document divides into three major sections:

Front Matter (<front>) #

Contains journal metadata (title, ISSN, publisher) and article metadata (title, contributors, abstract, keywords, publication dates, permissions). This section provides the "who, what, when" information about the article.

Body (<body>) #

Contains the article's main content—sections, paragraphs, figures, tables, equations, and other elements that constitute the scholarly contribution itself.

Back Matter (<back>) #

Contains supporting material including the reference list, appendices, glossaries, and acknowledgments. The reference list (<ref-list>) uses structured citation elements enabling automated linking.
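
The three-part layout can be sketched by building a skeleton with Python's standard library. This is a structural illustration, not a schema-valid article: the empty journal-meta and ref-list elements and the placeholder text would need real content before the file could pass JATS validation.

```python
import xml.etree.ElementTree as ET

# Root element; "research-article" is a commonly used article-type value.
article = ET.Element("article", {"article-type": "research-article"})

# Front matter: journal-level and article-level metadata.
front = ET.SubElement(article, "front")
ET.SubElement(front, "journal-meta")
article_meta = ET.SubElement(front, "article-meta")
title_group = ET.SubElement(article_meta, "title-group")
ET.SubElement(title_group, "article-title").text = "Placeholder Title"

# Body: the article content, organised into sections.
body = ET.SubElement(article, "body")
sec = ET.SubElement(body, "sec")
ET.SubElement(sec, "title").text = "Introduction"
ET.SubElement(sec, "p").text = "Placeholder paragraph."

# Back matter: the reference list and other supporting material.
back = ET.SubElement(article, "back")
ET.SubElement(back, "ref-list")

ET.indent(article)  # pretty-printing helper, requires Python 3.9+
print(ET.tostring(article, encoding="unicode"))
```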

XML/JATS in OJS #

OJS supports XML workflows through various mechanisms:

JATS Template Plugin #

Plugins such as the JATS Template Plugin can generate basic JATS XML from the article metadata held in OJS. These exports provide structured data for archiving and indexing purposes.

Texture/Lens Viewers #

Plugins like Texture provide XML editing capabilities within OJS, and viewers like Lens can display JATS XML articles in enhanced web interfaces with features like inline reference viewing and figure browsing.

XML-First Workflows #

Some journals adopt "XML-first" production where articles are created in JATS XML and other formats derive from that source. OJS can accommodate these workflows though they require additional tooling and expertise.

Galley Options #

OJS allows uploading XML files as article galleys alongside or instead of PDFs, enabling journals to provide structured content to readers who benefit from it.

Challenges of XML Publishing #

Despite its benefits, XML publishing presents challenges:

Production Complexity #

Creating valid JATS XML requires understanding the standard's elements and attributes. Converting manuscripts (typically Word documents) to proper JATS XML involves either manual markup or sophisticated automated conversion tools.

Validation Requirements #

JATS XML must validate against the standard's schema. Invalid XML—missing required elements, improper nesting, incorrect attribute values—fails validation and may be rejected by archives and databases.
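
Full validation needs the JATS DTD or schema files and a validating parser, which the Python standard library does not provide. As a lightweight first pass, the sketch below checks only that a file is well-formed and that a few elements are present. The function name precheck and the specific checks are assumptions chosen for illustration, and passing this check does not mean a file is valid JATS.

```python
import xml.etree.ElementTree as ET


def precheck(xml_text):
    """Return a list of problems; an empty list means the pre-check passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Improper nesting and unclosed tags are caught here.
        return [f"not well-formed: {exc}"]

    problems = []
    if root.tag != "article":
        problems.append(f"root element is <{root.tag}>, expected <article>")
    if root.find("front") is None:
        problems.append("missing <front>")
    elif root.find("front/article-meta") is None:
        problems.append("missing <article-meta> inside <front>")
    return problems


good = "<article><front><article-meta/></front><body/></article>"
bad_nesting = "<article><front><article-meta></front></article-meta></article>"
no_front = "<article><body/></article>"

for sample in (good, bad_nesting, no_front):
    print(precheck(sample))
```

A check like this can catch obvious problems before a file is sent on for schema validation or deposit.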

Tooling Investment #

Effective XML workflows require appropriate tools for creation, editing, validation, and transformation. Free tools exist but have learning curves; commercial solutions offer convenience at a cost.

Staff Training #

Editorial and production staff need training to work with XML effectively. The transition from document-based to structured content workflows requires adjustment.

Legacy Content #

Converting existing PDF archives to JATS XML is laborious. Many journals maintain XML workflows for new content while older content remains in legacy formats.

Who Requires JATS XML? #

Several important destinations require or strongly prefer JATS XML:

PubMed Central: The U.S. National Library of Medicine's free archive requires JATS XML for article deposits. Journals included in PMC must provide properly structured XML.

Europe PMC: The European equivalent similarly requires structured XML for archived content.

Institutional Repositories: Some repositories can ingest and display JATS XML, providing enhanced presentation compared to PDF-only deposits.

Aggregators and Databases: Various indexing services prefer or require structured content for optimal processing.

JATS Versions and Variants #

JATS comes in multiple versions and variants:

JATS Versions: The standard has evolved through versions (1.0, 1.1, 1.2, 1.3). Newer versions add elements for emerging needs while maintaining backward compatibility. Most systems accept multiple versions.

Tag Sets: JATS offers different "tag sets" for different purposes—Archiving (comprehensive, for preservation), Publishing (for publisher production), and Authoring (simplified, for manuscript creation).

Publishers typically use the Archiving or Publishing tag sets for production and deposit purposes.
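
JATS files normally record their version in the dtd-version attribute on the root article element, so a receiving system can check it before processing. The sketch below reads that attribute. The function name declared_version and the sample string are invented for illustration. The tag set itself is usually identified in the DOCTYPE declaration, which this sketch does not read.

```python
import xml.etree.ElementTree as ET


def declared_version(xml_text):
    """Return the dtd-version declared on the root element, or None."""
    root = ET.fromstring(xml_text)
    return root.get("dtd-version")


# Invented one-line sample; a real file would carry full front matter.
sample = '<article dtd-version="1.3" article-type="research-article"><front/></article>'

print(declared_version(sample))
print(declared_version("<article/>"))
```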

Getting Started with XML Publishing #

Journals interested in XML workflows might consider:

Start with Metadata: Even without full-text XML, ensuring rich, structured metadata improves discoverability. OJS captures metadata that can export in XML formats.

Evaluate Requirements: Determine whether your target indexes and archives require JATS XML. Not all journals need full XML workflows—requirements vary by discipline and goals.

Explore Tools: Free tools like the JATS4R validator check JATS XML against the JATS4R recommendations, which helps catch compliance problems early. Production tools range from open-source options to commercial services.

Consider Services: XML conversion services can transform manuscripts to JATS XML without requiring in-house expertise. Costs vary based on complexity and volume.

Plan Gradually: Transitioning to XML-first workflows is significant. Many journals begin with basic XML exports and evolve toward fuller implementation over time.
