Unicode tokeniser. Ucto tokenises text files: it separates words from punctuation and splits sentences. It also offers other basic preprocessing steps, such as changing case, all of which you can use to prepare your text for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules for several languages and can easily be extended to suit other languages. It is used to tokenise Dutch text in Frog, our Dutch morpho-syntactic processor. http://ilk.uvt.nl/ucto
created at March 26, 2013, 11:16 a.m.
Cybertron: the home planet of the Transformers in Go
created at June 21, 2022, 1:45 p.m.
Open solution to the Toxic Comment Classification Challenge
created at Jan. 4, 2018, 4:29 p.m.
Pure C ONNX runtime with zero dependencies for embedded devices
created at Oct. 5, 2019, 8:16 p.m.
A native Go clean-room implementation of the Porter stemming algorithm.
created at June 2, 2013, 5:18 p.m.
Lightweight Python library for fast and reproducible experimentation
created at Jan. 15, 2018, 9:40 a.m.
Kernel density estimators for Julia
created at April 13, 2014, 7:14 p.m.
Examples of Machine Learning code using Comet.ml
created at Nov. 21, 2018, 6 p.m.
Kaggle Submission for "Detecting Insults in Social Commentary"
created at Sept. 22, 2012, 2:16 p.m.
Julia package for loading many of the data sets available in R
created at Nov. 24, 2012, 5:16 a.m.
A Julia package for Gaussian Processes
created at April 30, 2015, 2:46 p.m.
TensorFlow C API Class Wrapper in Server Side Swift.
created at June 14, 2017, 2:06 a.m.