Mímir
Mímir is a Multi-paradigm Information Management Index and Repository. It supports indexing of (and searching over) the text, annotations, and semantics of documents.
From this page you can access several deployments of Mímir that demonstrate some of the novel search facilities provided. The corpora used in these demos have been annotated for part of speech and morphological root, which are accessible using the category: and root: modifiers (see the examples in the document linked below). They also contain Sentence annotations.
The Demos
Patents
The "Patent Search" demo shows Mímir running over a corpus of 300,000 patent documents. The corpus has been annotated for document structure (document metadata and document sections), references, and measurements.
Web Pages
The "Web Archive Search" demo runs over a corpus of about 1 million web pages. The documents are annotated for measurements and for typical named entities (Address, CabinetMinister, Date, Money, Percent, Organization, Location, Person).
BBC News
The BBC News demo uses just over 8,000 news web pages crawled from the BBC website to demonstrate some elements of the GATE Process.
Examples of possible queries include:
- People born in Sheffield:
{Person sparql = "SELECT ?inst WHERE { ?inst :birthPlace <http://dbpedia.org/resource/Sheffield>}"}
- The Location of Steel Industries (this query finds all mentions of organizations involved in the steel industry where their location is mentioned in the text):
{Organization sparql = "SELECT ?inst WHERE { ?inst :industry <http://dbpedia.org/resource/Steel>}"} [0..4] in {Location}
- A Labour Party member being quoted in a document written since the start of 2011 and classified as Scotland by the BBC:
( {Person sparql = "SELECT ?inst WHERE { ?inst :party <http://dbpedia.org/resource/Labour_Party_%28UK%29>}"} root:say ) IN ( {Document date > 20110000} OVER {DocumentClassification sparql = "SELECT ?inst WHERE { ?inst a bbc:Classification . FILTER (?inst = bbc:Scotland)}"} )
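The SPARQL-constrained annotation queries above share a common shape, so they can be assembled programmatically. The following is a minimal sketch in Python, not part of Mímir itself: the helper names dbpedia_uri and sparql_annotation are hypothetical, and the sketch only builds query strings of the form shown in the examples; it does not contact a Mímir server. It also shows why the Labour Party resource appears as Labour_Party_%28UK%29: the parentheses in the DBpedia resource name are percent-encoded.

```python
from urllib.parse import quote

# Base URI used by the DBpedia resources in the example queries above.
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def dbpedia_uri(name: str) -> str:
    """Percent-encode a DBpedia resource name (hypothetical helper).

    Parentheses are encoded as %28 and %29; underscores are left as-is.
    """
    return DBPEDIA_RESOURCE + quote(name, safe="")


def sparql_annotation(ann_type: str, predicate: str, resource: str) -> str:
    """Build an annotation query constrained by a SPARQL sub-query.

    Hypothetical helper: it mirrors the textual form of the example
    queries and does nothing beyond string formatting.
    """
    sparql = f"SELECT ?inst WHERE {{ ?inst :{predicate} <{dbpedia_uri(resource)}>}}"
    return f'{{{ann_type} sparql = "{sparql}"}}'


# "People born in Sheffield"
born_in_sheffield = sparql_annotation("Person", "birthPlace", "Sheffield")

# A Labour Party member followed by a form of the verb "say"
labour_quoted = sparql_annotation("Person", "party", "Labour_Party_(UK)") + " root:say"

print(born_in_sheffield)
print(labour_quoted)
```

The first printed string reproduces the "People born in Sheffield" query; the second reproduces the left-hand operand of the Labour Party query, without its enclosing parentheses.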
People in the (BBC) News
This demo is an example of a specialised front-end that uses the same underlying index as the BBC News demo above. It provides a user-friendly interface for searching the news archive for mentions of people's names. Queries similar to the ones above are thus much easier to formulate, and the end user needs no specialised knowledge.
Query Examples
A quick introduction-by-example to the query language is provided here.
An example of a possible interactive query session is shown in this PDF file.