josephmisiti/awesome-machine-learning

ucto by LanguageMachines

Unicode tokeniser. Ucto tokenises text files: it separates words from punctuation and splits sentences. It also offers several basic preprocessing steps, such as changing case, that make your text suitable for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules for several languages and can be easily extended to suit other languages. It has been incorporated for tokenising Dutch text in Frog, our Dutch morpho-syntactic processor. http://ilk.uvt.nl/ucto

created at March 26, 2013, 11:16 a.m.

C++

13 +0

65 +0

13 +0

GitHub

MITIE by mit-nlp

MITIE: library and tools for information extraction

created at April 1, 2014, 10:47 p.m.

C++

193 +0

2,916 +1

534 +0

GitHub

meta by meta-toolkit

A Modern C++ Data Sciences Toolkit

created at Feb. 2, 2014, 11:54 p.m.

C++

63 +0

693 +0

235 +0

GitHub

libfolia by LanguageMachines

FoLiA library for C++

created at March 26, 2013, 12:46 p.m.

C++

10 +0

15 +0

7 +0

GitHub

frog by LanguageMachines

Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.

created at June 5, 2014, 1:39 p.m.

C++

16 +0

73 +0

11 +0

GitHub

LKYDeepNN by mosdeo

Low dependency (C++11 STL only), good portability, header-only, deep neural networks for embedded

created at Dec. 31, 2016, 12:05 p.m.

C++

8 +0

49 +0

13 +0

GitHub

Fido by FidoProject

A lightweight C++ machine learning library for embedded electronics and robotics.

created at Aug. 7, 2015, 2:54 a.m.

C++

37 +0

432 +0

81 +0

GitHub

dynet by clab

DyNet: The Dynamic Neural Network Toolkit

created at Feb. 8, 2015, 11:09 p.m.

C++

184 +0

3,417 +1

704 +0

GitHub

banditlib by jkomiyama

Multi-armed bandit simulation library

created at March 11, 2014, 4:09 a.m.

C++

8 +0

137 +0

44 +0

GitHub

caffe by BVLC

Caffe: a fast open framework for deep learning.

created at Sept. 12, 2013, 6:39 p.m.

C++

2,094 -1

34,035 +3

18,697 -4

GitHub

opencv by opencv

Open Source Computer Vision Library

created at July 19, 2012, 9:40 a.m.

C++

2,653 -1

77,982 +131

55,692 +1

GitHub

deepdetect by jolibrain

Deep Learning API and Server in C++14 with support for Caffe, PyTorch, TensorRT, Dlib, NCNN, TensorFlow, XGBoost and TSNE

created at May 22, 2015, 2:45 p.m.

C++

132 +0

2,513 +1

561 +0

GitHub

xlearn by aksnzhy

High-performance, easy-to-use, and scalable machine learning (ML) package that includes linear models (LR), factorization machines (FM), and field-aware factorization machines (FFM), with Python and command-line interfaces.

created at June 10, 2017, 8:09 a.m.

C++

110 +0

3,084 +2

518 +0

GitHub

rgf by RGF-team

Home repository for the Regularized Greedy Forest (RGF) library. It includes the original implementation from the paper and a multithreaded one written in C++, along with various language-specific wrappers.

created at June 8, 2016, 3:48 p.m.

C++

18 +0

373 +0

57 +0

GitHub

thundersvm by Xtra-Computing

ThunderSVM: A Fast SVM Library on GPUs and CPUs

created at Dec. 11, 2014, 4:24 a.m.

C++

56 +0

1,561 +1

217 +1

GitHub

vowpal_wabbit by VowpalWabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

created at July 31, 2009, 7:36 p.m.

C++

352 +0

8,461 +4

1,927 +2

GitHub

libfm by srendle

Library for factorization machines

created at Sept. 15, 2014, 12:36 a.m.

C++

72 +0

1,487 +0

415 +0

GitHub

thundergbm by Xtra-Computing

ThunderGBM: Fast GBDTs and Random Forests on GPUs

created at Nov. 11, 2016, 9:58 a.m.

C++

25 +0

690 +0

87 +0

GitHub

LightGBM by Microsoft

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

created at Aug. 5, 2016, 5:45 a.m.

C++

433 -1

16,533 +18

3,819 -2

GitHub

CNTK by Microsoft

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

created at Nov. 26, 2015, 9:52 a.m.

C++

1,251 +0

17,499 -2

4,287 +0

GitHub
