A fast and accurate rule-based sentence segmentation tool for Ruby.
created at Aug. 15, 2012, 5:14 a.m.
Ruby port of UEALite Stemmer - a conservative stemmer for search and indexing
created at July 15, 2009, 1 p.m.
Distance Measurements are Awesome!
created at Sept. 18, 2014, 7:38 a.m.
A wrapper module for using the spaCy natural language processing library from the Ruby programming language via PyCall
created at June 19, 2021, 2:04 a.m.
Machine Learning & Data Mining with JRuby
created at Dec. 2, 2015, 6:58 p.m.
A flexible and general-purpose ngrams library written in Ruby. Raingrams supports ngram sizes greater than 1, text/non-text grams, multiple parsing styles and open/closed vocabulary models.
created at March 8, 2009, 10:54 a.m.
Find many kinds of common information in a string. A CommonRegex port for Ruby.
created at Jan. 23, 2015, 12:39 p.m.
Accurate Bayesian sentence tokenizer in Ruby.
created at March 10, 2010, 3:17 a.m.
Official Ruby client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Ruby apps.
created at Oct. 16, 2015, 4:30 p.m.
Unicode normalization library. (Mirror of Yoshida-san's code base to maintain the RubyGem.)
created at March 1, 2010, 12:26 p.m.
Fast Ruby FFI string edit distance algorithms
created at Feb. 25, 2013, 6:55 a.m.
This is the Ruby interface to LIBLINEAR (much more efficient than LIBSVM for text classification and other large-scale linear classification tasks)
created at Feb. 28, 2009, 7:17 p.m.
A multilingual tokenizer to split a string into tokens
created at Jan. 5, 2016, 7:30 a.m.
Ruby bindings to the OpenNLP Java toolkit.
created at Dec. 19, 2012, 2:44 a.m.