Research project at TU Dortmund
Using Stanford's DeepDive

Discovering the Machine Learning Genome

Machine reading of machine learning papers to create
the first Machine Learning Database.
Machine Learning Genome Extractor

The research project The Machine Learning Genome attempts to create a computer system that reads scientific machine learning papers in order to build the first Machine Learning Database. In other words, it uses machine learning and other techniques to build a knowledge base about machine learning itself.
Our computer system, called MALGx (Machine Learning Genome Extractor), aims to read, analyze and structure all machine learning papers ever published. MALGx proceeds as follows:

  • Extracting relevant content from scientific papers (PDF) and transferring it to a database
  • Analyzing the content with Natural Language Processing (NLP) and tagging machine learning phrases
  • Finding relations between machine learning phrases. Example: a comparison relation between ML algorithms (e.g. compared(SVM,ANN))
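The tagging and relation-finding steps above can be illustrated with a small Python sketch. This is not MALGx's implementation, which builds on DeepDive; it is a toy stand-in that assumes a hand-written lexicon (`ALGORITHMS`) and a list of comparison cue words (`COMPARISON_CUES`), both hypothetical, and emits facts in the compared(SVM,ANN) notation used above.

```python
import re
from itertools import combinations

# Hypothetical mini-lexicon: canonical algorithm name -> lowercase aliases.
ALGORITHMS = {
    "SVM": ["svm", "support vector machine"],
    "ANN": ["ann", "artificial neural network"],
    "kNN": ["knn", "k-nearest neighbor"],
}

# Hypothetical cue words that hint at a comparison in a sentence.
COMPARISON_CUES = ("compared", "outperform", "versus", "better than")


def tag_algorithms(sentence):
    """Return canonical algorithm names in order of first appearance."""
    lowered = sentence.lower()
    found = []
    for name, aliases in ALGORITHMS.items():
        positions = []
        for alias in aliases:
            # Whole-word match, allowing a plural "s" (e.g. "SVMs").
            match = re.search(r"\b" + re.escape(alias) + r"s?\b", lowered)
            if match:
                positions.append(match.start())
        if positions:
            found.append((min(positions), name))
    return [name for _, name in sorted(found)]


def extract_comparisons(sentence):
    """Emit compared(A,B) facts when a cue word and two or more algorithms co-occur."""
    lowered = sentence.lower()
    if not any(cue in lowered for cue in COMPARISON_CUES):
        return []
    names = tag_algorithms(sentence)
    return [f"compared({a},{b})" for a, b in combinations(names, 2)]


print(extract_comparisons("We compared the SVM with an ANN on the same dataset."))
# ['compared(SVM,ANN)']
```

A co-occurrence heuristic like this one produces many false positives on real papers. A system such as the one described here would instead treat such matches as candidates and score them with learned inference.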

MALGx has been in development since mid-2016 and has so far analyzed over 6500 papers. With each new paper, MALGx learns and understands more machine learning phrases and facts. You can track MALGx's progress on this website. Later, we will build an online knowledge base that lets you browse the Machine Learning Genome.

THE VISION. Two main motivations drive us to build MALGx and The Machine Learning Genome. They can be seen as a short-term and a long-term goal:

Autonomous Machine Learning

Researching instead of Searching

Some DB Statistics

Last update: 17 January 2017, 01:12:43 am

over 6500 papers

already converted from dark data into structured data

More than 24k (6k unique)

machine learning algorithm mentions recognized

800k sentences

analyzed, structured and prepared for further inference

