josephmisiti/awesome-machine-learning

zpar by frcchang

ZPar statistical parser. Universal language support (depending on the availability of training data), with language-specific features for Chinese and English. Currently supports word segmentation, POS tagging, and dependency and phrase-structure parsing.

created at June 30, 2015, 1:55 p.m.

C++

13 +0

134 +0

33 +0

GitHub

python-zpar by EducationalTestingService

A Python wrapper around the ZPar parser for English.

created at Sept. 8, 2014, 1:41 p.m.

Python

21 +0

49 +1

19 +0

GitHub

python-frog by proycon

Python bindings to the Dutch NLP tool Frog (POS tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser)

created at Sept. 7, 2014, 8:32 p.m.

Cython

6 +0

47 +0

10 +0

GitHub

python-ucto by proycon

This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears to be. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).

created at May 21, 2014, 5:28 p.m.

Cython

4 +0

29 +0

5 +0

GitHub
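
To illustrate what "regular-expression-based" tokenisation means in the python-ucto entry above, here is a minimal sketch using only Python's standard `re` module. It does not use Ucto or python-ucto; the rule names and patterns (`TOKEN_RULES`, `tokenize`) are hypothetical and far simpler than Ucto's real, language-specific rule sets.

```python
import re

# Hypothetical, ordered rule set for illustration only (not Ucto's rules).
# Earlier rules win, so URLs and numbers are tried before generic words.
TOKEN_RULES = [
    ("URL", r"https?://\S+"),
    ("NUMBER", r"\d+(?:[.,]\d+)*"),
    ("WORD", r"\w+(?:['’-]\w+)*"),
    ("PUNCT", r"[^\w\s]"),
]

# Combine the rules into one pattern with a named group per rule.
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_RULES)
)


def tokenize(text):
    """Return a list of (rule_name, token) pairs found in text."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


tokens = tokenize("Pay 3.50 at http://example.com now!")
```

Because the rules are plain data, extending the tokeniser amounts to adding or reordering entries in the list, which is the kind of extensibility a rule-based design allows.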

pynlpl by proycon

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

created at July 6, 2010, 11:42 a.m.

Python

31 +0

479 +0

67 +0

GitHub
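
The pynlpl entry above mentions n-gram extraction and frequency lists as basic tasks. The sketch below shows what those two operations compute, using only the Python standard library. It is not PyNLPl's API; the helper names `ngrams` and `frequency_list` are assumptions made for this illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequency_list(items):
    """Return (item, count) pairs sorted from most to least frequent."""
    return Counter(items).most_common()


tokens = "the cat sat on the mat the cat slept".split()
bigram_counts = frequency_list(ngrams(tokens, 2))

# Print each bigram with its count, most frequent first.
for bigram, count in bigram_counts:
    print(" ".join(bigram), count)
```

Counts like these are the raw material for a simple n-gram language model, where a bigram's probability is estimated from how often it occurs relative to its first word.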

rosetta by columbia-applied-data-science

Tools, wrappers, etc. for data science, with a concentration on text processing

created at Nov. 3, 2013, 4:13 p.m.

Jupyter Notebook

22 +0

206 +0

47 +0

GitHub

nut by pprett

Natural Language Understanding Toolkit

created at Oct. 14, 2010, 8:08 a.m.

8 +0

118 +0

25 +0

GitHub

genius by duanhongyi

A Chinese word segmenter based on CRF (conditional random fields)

created at Aug. 20, 2013, 6:54 a.m.

Python

26 +0

234 +0

65 +0

GitHub

snownlp by isnowfy

Python library for processing Chinese text

created at Nov. 26, 2013, 11:46 a.m.

Python

350 +0

6,439 +9

1,367 +2

GitHub

jieba by fxsjy

Jieba (结巴) Chinese word segmentation

created at Sept. 29, 2012, 7:52 a.m.

Python

1,279 +0

33,350 +25

6,725 +3

GitHub

yalign by machinalis

A sentence aligner for comparable corpora

created at Aug. 26, 2013, 3:46 p.m.

Python

16 +0

127 +0

31 +0

GitHub

quepy by machinalis

A Python framework to transform natural language questions into queries in a database query language.

created at Dec. 3, 2012, 3:46 p.m.

Python

95 +0

1,255 +0

295 -1

GitHub

Detectron by facebookresearch

FAIR's research platform for object detection, implementing popular algorithms like Mask R-CNN and RetinaNet.

created at Oct. 5, 2017, 5:32 p.m.

Python

940 +0

26,271 +7

5,452 +0

GitHub

dockerface by natanielruiz

Face detection using deep learning.

created at June 5, 2017, 10:25 p.m.

Dockerfile

12 +0

190 +0

32 +0

GitHub

face_recognition by ageitgey

The world's simplest facial recognition API for Python and the command line

created at March 3, 2017, 9:52 p.m.

Python

1,567 +1

53,455 +73

13,492 +10

GitHub

vigra by ukoethe

A generic C++ library for image analysis

created at July 6, 2011, 8:34 a.m.

C++

41 +1

412 +1

192 +0

GitHub

scikit-image by scikit-image

Image processing in Python

created at July 7, 2011, 10:07 p.m.

Python

186 +0

6,091 +9

2,235 +4

GitHub

prediction-builder by denissimon

A library for machine learning that builds predictions using linear regression.

created at June 8, 2015, 4:52 p.m.

PHP

8 +0

111 +0

13 +0

GitHub

jieba-php by fukuball

"Jieba" (結巴, Chinese for "to stutter") Chinese text segmentation: built to be the best PHP Chinese word segmentation module.

created at April 23, 2015, 8:29 a.m.

PHP

56 +0

1,323 -1

260 +0

GitHub

data-science-ipython-notebooks by donnemartin

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

created at Jan. 23, 2015, 7:38 p.m.

Python

1,616 +0

27,477 +31

7,882 +2

GitHub
