MIaS (Math Indexer and Searcher) is a maths-aware full-text search engine. It is built on the state-of-the-art system Apache Lucene; however, its maths processing capabilities are standalone and can be easily integrated into any Lucene/Solr-based system, as in the EuDML search service. MIaS processes documents containing mathematics encoded in MathML. In several steps it transforms the problem of matching XML structures into regular full-text searching, which makes formula similarity search possible. The principles are described in our publications. Currently, our MREC collection and other collections, such as the NTCIR Math Task data set, are being used to evaluate the system.
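
To illustrate the general idea of turning XML structure matching into full-text searching, here is a minimal Python sketch. It is not the MIaS implementation: the function names (`serialize`, `subformula_tokens`), the token syntax, and the use of tree depth as a possible weighting signal are all assumptions made for this example. It flattens every subtree of a Presentation MathML formula into a plain string, so that each subformula could be indexed as an ordinary term by a full-text engine.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the XML namespace from a tag name, if present."""
    return tag.split("}", 1)[1] if tag.startswith("{") else tag


def serialize(node: ET.Element) -> str:
    """Flatten a MathML subtree into a compact string (hypothetical token syntax)."""
    name = local_name(node.tag)
    children = list(node)
    if not children:
        # Leaf element such as <mi>x</mi> becomes "mi(x)".
        return f"{name}({(node.text or '').strip()})"
    return f"{name}[{','.join(serialize(c) for c in children)}]"


def subformula_tokens(mathml: str) -> list:
    """Return (token, depth) pairs for every subtree, in document order.

    The depth is kept only as an assumed signal that an index could use
    to weight whole formulae above deeply nested fragments.
    """
    root = ET.fromstring(mathml)
    tokens = []
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        tokens.append((serialize(node), depth))
        # Push children reversed so that popping yields document order.
        stack.extend((child, depth + 1) for child in reversed(list(node)))
    return tokens


formula = (
    "<math><mrow><msup><mi>x</mi><mn>2</mn></msup>"
    "<mo>+</mo><mi>y</mi></mrow></math>"
)

if __name__ == "__main__":
    for token, depth in subformula_tokens(formula):
        print(depth, token)
```

With tokens of this kind, a query formula that shares a subformula with an indexed formula (for example `msup[mi(x),mn(2)]`) produces a matching term, which is how a structural match can be reduced to a term match in this sketch.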

We are working hard to deliver a substantially refactored and updated version 2 of the system, which will be more precise and efficient. It supports both Presentation and Content MathML for indexing and searching and uses our own MathML Canonicalizer for better precision. It will also allow access through web services and support the OpenSearch standard for better accessibility, among other improvements.

NTCIR Math Task Evaluation Competitions

Last year the MIR@MU group took part with MIaS in the first ever official math information retrieval evaluation task held under the umbrella of the international IR evaluation initiative NTCIR.

MIaS was successful with what was then a development version. This year we are going to compete against other systems again, with an improved current MIaS, in the NTCIR-12 Math Task.

Cite as


SOJKA, Petr and Martin LÍŠKA. The Art of Mathematics Retrieval. In Matthew R. B. Hardy, Frank Wm. Tompa. Proceedings of the 2011 ACM Symposium on Document Engineering. Mountain View, CA, USA: ACM, 2011. p. 57–60. ISBN 978-1-4503-0863-2. doi:10.1145/2034691.2034703.


     author = "Petr Sojka and Martin L{\'\i}{\v s}ka",
      title = "{The Art of Mathematics Retrieval}",
  booktitle = "{Proceedings of the ACM Conference on Document Engineering,
  		DocEng 2011}",
  publisher = "{Association for Computing Machinery}",
    address = "{Mountain View, CA}",
       year = 2011,
      month = Sep,
       isbn = "978-1-4503-0863-2",
      pages = "57--60",
        url = {http://doi.acm.org/10.1145/2034691.2034703},
        doi = {10.1145/2034691.2034703},
   abstract = {The design and architecture of MIaS (Math Indexer and Searcher), 
	       a system for mathematics retrieval is presented, and design 
	       decisions are discussed. We argue for an approach based on 
	       Presentation MathML using a similarity of math subformulae. The 
	       system was implemented as a math-aware search engine based on the 
	       state-of-the-art system Apache Lucene. Scalability issues were 
	       checked against more than 400,000 arXiv documents with 158 
	       million mathematical formulae. Almost three billion MathML 
	       subformulae were indexed using a Solr-compatible Lucene.},

Selected Publications

Relevant projects

