
Information Extraction in Python


IEPY is an open source tool for `Information Extraction <>`_ focused on Relation Extraction.

To give an example of Relation Extraction, if we are trying to find a birth date in:

`"John von Neumann (December 28, 1903 – February 8, 1957) was a Hungarian and
American pure and applied mathematician, physicist, inventor and polymath."`

then IEPY's task is to identify "John von Neumann" and "December 28, 1903" as the subject and object entities of the "was born in" relation.
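
As a rough illustration of what such an extraction produces, the sketch below pulls the (subject, relation, object) triple out of the example sentence with a single regular expression. It does not use IEPY's API or reflect how IEPY works internally; the pattern ``BIRTH_PATTERN`` and the function ``extract_birth_relation`` are hypothetical names introduced only for this example.

.. code-block:: python

    import re

    SENTENCE = (
        "John von Neumann (December 28, 1903 – February 8, 1957) was a Hungarian and "
        "American pure and applied mathematician, physicist, inventor and polymath."
    )

    # Hypothetical pattern: leading text up to " (", followed by a date such as
    # "December 28, 1903". A real relation extractor does not rely on one regex.
    BIRTH_PATTERN = re.compile(
        r"^(?P<person>[^()]+?) \((?P<date>[A-Z][a-z]+ \d{1,2}, \d{4})"
    )


    def extract_birth_relation(text):
        """Return (subject, relation, object), or None if the pattern does not match."""
        match = BIRTH_PATTERN.match(text)
        if match is None:
            return None
        return (match.group("person"), "was born in", match.group("date"))


    # Prints: ('John von Neumann', 'was born in', 'December 28, 1903')
    print(extract_birth_relation(SENTENCE))

A hand-written pattern like this breaks as soon as the sentence structure changes, which is the kind of brittleness that learned or rule-based relation extractors are meant to address.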

It's aimed at:

- `users <>`_ needing to perform Information Extraction on a large dataset.
- `scientists <>`_ wanting to experiment with new IE algorithms.


- `A corpus annotation tool <>`_
  with a `web-based UI <>`_
- `An active learning relation extraction tool <>`_
  pre-configured with convenient defaults.
- `A rule based relation extraction tool <>`_
  for cases where the documents are semi-structured or high precision is required.
- A web-based user interface that:
    - Allows non-expert users to control some aspects of IEPY.
    - Allows decentralization of human input.
- A shallow entity ontology with coreference resolution via `Stanford CoreNLP <>`_
- `An easily hackable active learning core <>`_,
  ideal for scientists wanting to experiment with new algorithms.


Install the required packages:

.. code-block:: bash

    sudo apt-get install build-essential python3-dev liblapack-dev libatlas-dev gfortran openjdk-7-jre

Then simply install with pip:

.. code-block:: bash

    pip install iepy

Full installation details are available on the `Read the Docs <>`__ page.

Running the tests

If you are contributing to the project and want to run the tests, all you have to do is:

- Make sure your ``JAVA_HOME`` environment variable is correctly set. `Read more about it here <>`_
- In the root of the project run `nosetests`
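
Before running the tests, it can help to confirm that the Java home variable (conventionally ``JAVA_HOME``) points at a usable Java installation. The following is a minimal sketch using only the standard library; ``check_java_home`` is a hypothetical helper, not part of IEPY, and it assumes that a valid Java home contains ``bin/java`` (or ``bin/java.exe`` on Windows).

.. code-block:: python

    import os


    def check_java_home(environ=os.environ):
        """Return the path of the java executable under JAVA_HOME, or None."""
        java_home = environ.get("JAVA_HOME")
        if not java_home:
            return None
        # Assumption: a usable JAVA_HOME contains bin/java (bin/java.exe on Windows).
        for name in ("java", "java.exe"):
            candidate = os.path.join(java_home, "bin", name)
            if os.path.isfile(candidate):
                return candidate
        return None


    if __name__ == "__main__":
        java = check_java_home()
        if java:
            print("JAVA_HOME looks usable:", java)
        else:
            print("JAVA_HOME is unset or does not contain bin/java")

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary instead of the real process environment.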

Learn more

The full documentation is available on `Read the Docs <>`__.


IEPY is © 2014 `Machinalis <>`_ in collaboration with the `NLP Group at UNC-FaMAF <>`_. Its primary authors are:

You can follow the development of this project and report issues at

You can join the mailing list `here <>`__