  • graph-tool
    • interactive GUI, fast, nice drawing engine
    • C++ backend (using the Boost Graph Library), Python bindings → harder to install
    • Kirell’s preferred tool
  • NetworkX: creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
    • more algorithms than graph-tool
    • Pure Python, quite slow (see the sketch after this list)
  • NetworKit
  • igraph
  • GraphLab (then Dato, now Turi, bought by Apple)
  • GraphX (Apache Spark)
    • Used by Volodymir
    • Needs to be programmed in Scala
    • Two approaches: Pregel and aggregateMessages.
  • Apache Giraph
    • Used by Facebook
    • Utilizes Apache Hadoop’s MapReduce
    • Based on Pregel (developed by Google)
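
A minimal NetworkX sketch of the pure-Python workflow noted above (the toy graph and node names are invented for illustration):

```python
# Minimal NetworkX sketch: build a toy graph and run two standard algorithms.
import networkx as nx

G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")])

print(nx.shortest_path(G, "a", "d"))   # ['a', 'c', 'd']
print(nx.degree_centrality(G))         # centrality score per node
```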


Graph visualization

  • Gephi: The Open Graph Viz Platform
  • Graphviz: implementation of the dot language (see the sketch after this list)
  • Sigma.js: JavaScript library; graphs can be exported to it from Gephi
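
To give a feel for the dot language, a small sketch using the graphviz Python package (node ids and labels are arbitrary); printing the source shows the generated dot text, and render() would call Graphviz to draw it:

```python
# Generate dot via the graphviz Python package.
from graphviz import Digraph

g = Digraph("toy")
g.node("A", "start")        # node id and label are arbitrary examples
g.node("B", "end")
g.edge("A", "B")

print(g.source)             # the generated dot source text
# g.render("toy.gv")        # would invoke Graphviz to draw the graph
```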


Graph file formats

  • GraphML (XML, large files; see the sketch after this list)
  • gt (graph-tool binary format)
  • GML
  • dot (Graphviz)
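
A quick sketch of writing and reading these formats with NetworkX, using its built-in karate club graph purely as an example:

```python
# Writing and reading graph file formats with NetworkX.
import networkx as nx

G = nx.karate_club_graph()              # small built-in example graph

nx.write_graphml(G, "karate.graphml")   # XML-based, verbose but widely supported
nx.write_gml(G, "karate.gml")           # plain-text GML

H = nx.read_graphml("karate.graphml")   # round-trip back into memory
print(H.number_of_nodes(), H.number_of_edges())
```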

Approximate nearest neighbour (ANN) search (benchmark)

Natural Language Processing

  • NLTK: most complete, historically
  • TextBlob
  • Stanford CoreNLP
  • gensim: word2vec embeddings (see the sketch after this list)
  • spaCy: newcomer, industrial-strength
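
A minimal gensim word2vec sketch; the tiny corpus below is invented for illustration and far too small for meaningful embeddings:

```python
# Train a tiny word2vec model with gensim (toy corpus, for illustration only).
from gensim.models import Word2Vec

sentences = [
    ["natural", "language", "processing", "in", "python"],
    ["word", "embeddings", "capture", "word", "similarity"],
    ["python", "libraries", "for", "text", "processing"],
]

model = Word2Vec(sentences, min_count=1)         # defaults to 100-dimensional vectors
print(model.wv["python"][:5])                    # first dimensions of one word vector
print(model.wv.most_similar("python", topn=2))   # nearest words in the toy space
```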

Comparison: 5 Heroic Python NLP Libraries

Deep learning frameworks / libraries

Backends (perform actual computations)

Low level (build and execute dataflow graphs of tensor data): good for research on neural networks

  • Theano (Bengio group): Python library, low-level (symbolic math) API, slow at compiling models (see the sketch after this list)
    • CGT (Berkeley robot learning lab): replicates Theano’s API, very short compilation time, supports multithreading, GPU not yet ready
  • Torch (Facebook AI, Google DeepMind): Lua, middle-level API
  • TensorFlow (Google Brain): Python / C++, automatic differentiation, good at managing multiple nodes
  • Brainstorm (Schmidhuber, IDSIA): focused on RNN, LSTM.
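
To illustrate the low-level, symbolic style of these backends, a classic Theano sketch: define a symbolic expression, differentiate it automatically, and compile the dataflow graph into a callable function (the compilation step is where Theano is slow):

```python
# Classic Theano pattern: symbolic expression -> automatic gradient -> compiled function.
import theano
import theano.tensor as T

x = T.dscalar("x")               # symbolic scalar
y = x ** 2                       # symbolic expression
dy_dx = T.grad(y, x)             # symbolic differentiation

f = theano.function([x], dy_dx)  # compile the dataflow graph
print(f(3.0))                    # 6.0
```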

Middle level (building blocks for deep learning): good for fast experimentation (while still allowing low-level modifications)

  • Caffe (Berkeley Vision and Learning Center): models specified in declarative configuration files (framework), most used (also in scientific research), geared toward applying existing models; experimenting with new models requires modifying the framework’s C++ code
  • Libraries built on Theano (Python)
    • Lasagne: works in conjunction with Theano (assumes knowledge of it), focused on feed-forward networks only, examples
      • nolearn: sklearn-like abstraction of Lasagne
    • Blocks (Bengio group): lower level (closer to pure Theano code) than Lasagne, support for RNNs, examples
      • Fuel (Bengio group): interface to data formats and datasets (MNIST, CIFAR-10, ImageNet, etc.)
    • Keras: TensorFlow backend (in addition to Theano), Torch/sklearn-like API (hides the backend; see the sketch after this list)
    • Pylearn2 (LISA lab): discontinued
  • Libraries built on TensorFlow
    • TFlearn: by the TensorFlow team, API similar to scikit-learn
    • Keras
  • PyBrain (Schmidhuber group): mostly discontinued
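
A minimal Keras sketch of the middle-level style; the layer sizes and the 20-feature input are arbitrary, and the backend (Theano or TensorFlow) stays hidden behind the same API:

```python
# Define and compile a small feed-forward model with Keras (arbitrary toy dimensions).
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(64, activation="relu", input_shape=(20,)))  # hidden layer
model.add(Dense(1, activation="sigmoid"))                   # binary output

model.compile(optimizer="sgd", loss="binary_crossentropy")
model.summary()
# model.fit(X, y) would train it, given arrays X of shape (n, 20) and y of shape (n,)
```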

High level (production / industrial grade)

  • VELES (Samsung): Python, OpenCL and CUDA backends
  • DL4J: Java
  • DeepDetect: server architecture, Caffe backend, C++, architecture templates
  • neon (Nervana Systems): CPU / GPU / custom hardware, reportedly the fastest implementation


Computing resources