Find a needle (a document or record) in a haystack using string similarity and (optionally) regular expression rules. Uses Dice's Coefficient (aka Pair Similarity) and Levenshtein Distance internally.
created at Jan. 13, 2012, 4:46 p.m.
Ruby gem to calculate the similarity between texts using tf*idf
created at Sept. 10, 2012, 1:29 a.m.
Natural language processing framework for Ruby.
created at Jan. 24, 2012, 2:07 a.m.
ID3-based implementation of the ML Decision Tree algorithm
created at Feb. 23, 2009, 4:52 a.m.
Sphinx/Manticore plugin for ActiveRecord/Rails
created at April 14, 2008, 1:28 a.m.
REST client for Google APIs
created at Jan. 26, 2012, 9:54 p.m.
Elasticsearch integrations for ActiveModel/Record and Ruby on Rails
created at Nov. 8, 2013, 5 p.m.
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
created at June 27, 2013, 9:13 p.m.
Tesseract Open Source OCR Engine (main repository)
created at Aug. 12, 2014, 6:04 p.m.