This specification defines the Mathematical Markup Language, or MathML. MathML is a markup language for describing mathematical notation and capturing both its structure and content. The goal of MathML is to enable mathematics to be served, received, and processed on the World Wide Web, just as [[HTML]] has enabled this functionality for text.
This specification of the markup language MathML is intended primarily for a readership consisting of those who will be developing or implementing renderers or editors using it, or software that will communicate using MathML as a protocol for input or output. It is not a User's Guide but rather a reference document.
MathML can be used to encode both mathematical notation and mathematical content. About thirty-eight of the MathML tags describe abstract notational structures, while about another one hundred and seventy provide a way of unambiguously specifying the intended meaning of an expression. Additional chapters discuss how the MathML content and presentation elements interact, and how MathML renderers might be implemented and should interact with browsers. Finally, this document addresses the issue of special characters used for mathematics, their handling in MathML, their presence in Unicode, and their relation to fonts.
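As a minimal sketch of the distinction between notation and content, the expression "x squared" can be written in both vocabularies. The choice of expression is only illustrative; the first fragment describes how it is laid out, the second what it means.

```xml
<!-- Presentation markup: a base "x" with a superscript "2" -->
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <msup>
    <mi>x</mi>
    <mn>2</mn>
  </msup>
</math>

<!-- Content markup: the power function applied to x and 2 -->
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <power/>
    <ci>x</ci>
    <cn>2</cn>
  </apply>
</math>
```

A renderer would typically display both fragments identically, but only the second states that the superscript denotes exponentiation rather than, say, an index.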
While MathML is human-readable, authors typically will use equation editors, conversion programs, and other specialized software tools to generate MathML. Several versions of such MathML tools exist, both freely available software and commercial products, and more are under development.
MathML was originally specified as an XML application, and most of the examples in this specification assume that syntax. Other syntaxes are possible; most notably, [[HTML5]] specifies the syntax for MathML in HTML. Unless explicitly noted, the examples in this specification are also valid HTML syntax.
For MathML 4, the MathML refresh community group plans to split the current MathML 3 spec into three separate documents:
MathML core is a distillation of the commonly used parts of MathML, rewritten to align with current W3C standards such as HTML and CSS. The full MathML spec maintains (mostly) backward compatibility with MathML 3. Examples of elements in MathML core include the elements for fractions, roots, scripts, limits, rows, and token elements (identifiers, numbers, operators, etc.). Problematic, harder-to-implement, and less commonly used MathML features will be implemented by polyfills (likely relying in part on Shadow DOM). Examples of features left out of MathML core but part of full MathML are mfenced, menclose, the elementary math elements, and linebreaking. Some features, such as linebreaking, will likely be part of a MathML Core Level 2 recommendation in the future.
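To illustrate why a feature such as mfenced lends itself to a polyfill, the sketch below shows an mfenced expression next to the expanded mrow form that MathML 3 describes as equivalent for the default fences and separator. This shows only the kind of rewriting a polyfill might perform; it is not a description of any particular implementation.

```xml
<!-- Full MathML: fences and separators are implied by mfenced -->
<mfenced>
  <mi>x</mi>
  <mi>y</mi>
</mfenced>

<!-- Equivalent expanded form using only rows and operators -->
<mrow>
  <mo fence="true">(</mo>
  <mrow>
    <mi>x</mi>
    <mo separator="true">,</mo>
    <mi>y</mi>
  </mrow>
  <mo fence="true">)</mo>
</mrow>
```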
MathML 3 is a very lengthy spec. We expect the Full MathML 4 document to be considerably shorter because we intend to pull most of the informative sections into a notes document. Additionally, much of the chapter on presentation MathML concerns layout, and the full spec will reference MathML Core for details on layout.
The changes to the MathML 3 spec have yet to be made. However, many changes are a result of the above decision to split the spec into three parts. Hence we expect MathML Full to alter the MathML 3 spec in the following ways:
math element. It will likely have a similar structure to that of MathML 3. The committee is discussing adding some semantic disambiguation to presentation MathML, and a new "subject" attribute (name not finalized) might be added to give some context to the math.
The remainder of this document has not been updated to MathML 4.
The basic chapter structure of this document is based on the earlier MathML 2.0 Recommendation [MathML2]. MathML 2.0 was itself a revision of the earlier W3C Recommendation MathML 1.01 [MathML1]; MathML 3.0 is a revision of the W3C Recommendation MathML 2.0. It differs from MathML 2.0 in that all previous chapters have been updated, some new elements and attributes have been added, and some have been deprecated. Much has been moved to separate documents containing explanatory material, material on characters and entities, and material on the MathML DOM. The discussion of character entities has led to the document XML Entity Definitions for Characters [Entities], which is now a W3C Recommendation. The concern with use of CSS with MathML has led to the document A MathML for CSS Profile [MathMLforCSS], which was a W3C Recommendation accompanying MathML 3.0.
The biggest differences from MathML 2.0 (Second Edition) are in Chapters 4 and 5, although there have been smaller improvements throughout the specification. A more detailed description of changes from the previous Recommendation follows.
Much of the non-normative explication that formerly was found in Chapters 1 and 2, and many examples from elsewhere in the previous MathML specifications, were removed from the MathML3 specification, with the plan of incorporating them into a MathML Primer to be prepared as a separate document. It is expected that this will make the formal MathML3 specification easier to use as a reference document in implementations, and offer the new user better help in understanding MathML's deployment. The remaining content of Chapters 1 and 2 has been edited to reflect the changes elsewhere in the document, and in the rapidly evolving Web environment. Some of the text in them went back to the early days of the Web and XML, and its explanations are now commonplace.
Chapter 3, on presentation-oriented markup, adds new material on linebreaking, and on markup for elementary math notations used in many countries (mstack, mlongdiv and other associated elements). Other changes include revisions to the mglyph, mpadded and maction elements and significant unification and cleanup of attribute values. Earlier work, as recorded in the W3C Note Arabic mathematical notation, has allowed clarification of the relationship with bidirectional text, and examples with RTL text have been added.
Chapter 4, on content-oriented markup, contains major changes and additions. The meaning of the actual content remains as before in principle, but a lot of work has been done on expressing it better. A few new elements have been added.
Chapter 5 has been refined as its purpose has been further clarified to deal with the mixing of markup languages. This chapter deals, in particular, with interrelations of parts of the MathML specification, especially with presentation and content markup.
Chapter 6 is a new addition which deals with the issues of interaction of MathML with a host environment. This chapter deals with interrelations of the MathML specification with XML and HTML, but in the context of deployment on the Web. In particular there is a discussion of the interaction of CSS with MathML.
Chapter 7 replaces the previous Chapter 6, and has been rewritten and reorganized to reflect the new situation in regard to Unicode, and the changed W3C context with regard to named character entities. The new W3C specification XML Entity Definitions for Characters [Entities], which incorporates those used for mathematics, has become a W3C Recommendation.
The Appendices, of which there are eight shown, have been reworked. Appendix A now contains the new RelaxNG schema for MathML3 as well as discussion of MathML3 DTD issues. Appendix B addresses media types associated with MathML and implicitly constitutes a request for the registration of three new ones, as is now standard for work from the W3C. Appendix C contains a new simplified and reconsidered Operator Dictionary. Appendices D, E, F, G and H contain similar non-normative material to that in the previous specification, now appropriately updated.
A fuller discussion of the document's evolution can be found in Appendix F Changes.
MathML documents should be validated using the RelaxNG Schema for MathML, either in the XML encoding (http://www.w3.org/Math/RelaxNG/mathml4/mathml4.rng) or in compact notation (http://www.w3.org/Math/RelaxNG/mathml4/mathml4.rnc) which is also shown below.
In contrast to DTDs there is no in-document method to associate a RelaxNG schema with a document.
We provide the RelaxNG schema for MathML 4 in five parts:
The grammar for full MathML
The grammar for elements common to Content and Presentation
The grammar for Presentation MathML
The grammar for Strict Content MathML
The grammar for Content MathML 4
The RelaxNG schema for full MathML builds on the schema describing the various parts of the language which are given in the following sections. It can be found at http://www.w3.org/Math/RelaxNG/mathml4/mathml4.rnc.
The grammar for Strict Content MathML 4 can be found at http://www.w3.org/Math/RelaxNG/mathml4/mathml4-strict-content.rnc.
The grammar for Content MathML 4 builds on the grammar for the Strict Content MathML subset, and can be found at http://www.w3.org/Math/RelaxNG/mathml4/mathml4-content.rnc.
Normally, a MathML expression does not constitute an entire XML document. MathML is designed to be used as the mathematics fragment of larger markup languages. In particular it is designed to be used as a module in documents marked up with the XHTML family of markup languages. As RelaxNG directly supports modular development, this is usually very easy: an XHTML+MathML schema can be specified as simply as
    # A RelaxNG Schema for XHTML+MathML
    include "xhtml.rnc"
    math = external "mathml4.rnc"
    Inline.class |= math
    Block.class |= math
This assumes a modular RelaxNG schema for XHTML that uses the names Inline.class and Block.class to collect the content models for inline and block-level elements.
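As a rough illustration of the kind of document such a combined schema is meant to accept, the sketch below embeds a math element inline within an XHTML paragraph. The surrounding XHTML content and the formula are invented for the example.

```xml
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Inline MathML in XHTML</title>
  </head>
  <body>
    <p>The area of a circle is
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <mrow>
          <mi>π</mi>
          <mo>&#x2062;</mo>
          <msup>
            <mi>r</mi>
            <mn>2</mn>
          </msup>
        </mrow>
      </math>.
    </p>
  </body>
</html>
```

Because math is added to the inline content model, it may appear wherever inline elements are permitted; adding it to the block-level model likewise allows it to appear as a direct child of body.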
Customizing the MathML 4 schema so that we can restrict the content of annotation-xml elements is similarly simple, for example:
    # A RelaxNG Schema for MathML with OpenMath3 annotations
    omobj = external "openmath3.rnc"
    include "mathml4.rnc" {annotation-xml.model = omobj}
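A document fragment that such a customized schema would be expected to accept might look like the sketch below. It assumes that the start pattern of openmath3.rnc is the OMOBJ element in the usual OpenMath namespace, and that the annotation is labelled with the application/openmath+xml encoding; both are assumptions made for illustration.

```xml
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mi>x</mi>
    <!-- The annotation carries a single OpenMath object -->
    <annotation-xml encoding="application/openmath+xml">
      <OMOBJ xmlns="http://www.openmath.org/OpenMath">
        <OMV name="x"/>
      </OMOBJ>
    </annotation-xml>
  </semantics>
</math>
```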
The MathML 4 schema is organized so that subsetting to one of the sublanguages specified here is easy. To include strict content MathML 4 in a schema just include
    include "mathml4-common.rnc"
    include "mathml4-strict-content.rnc"
include mathml4.rnc.
For details about RelaxNG grammars and modularization see [[RELAX-NG]] or [[RelaxNGBook]].
