Tesseract Open Source OCR Engine (main repository)
updated at May 5, 2024, 1:52 a.m.
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
updated at May 4, 2024, 11:38 p.m.
Fuzzy string matching library for Ruby
updated at May 4, 2024, 1:41 p.m.
Elasticsearch integrations for ActiveModel/Record and Ruby on Rails
updated at May 3, 2024, 2:56 p.m.
Syntax tree generator for linguistic research
updated at May 2, 2024, 10:49 p.m.
Find a needle (a document or record) in a haystack using string similarity and (optionally) regular expression rules. Uses Dice's Coefficient (aka Pair Similarity) and Levenshtein Distance internally.
updated at May 2, 2024, 8:41 p.m.
A simple Ruby natural language parser for elapsed time
updated at April 29, 2024, 4:13 a.m.
Calculates edit distance using the Damerau-Levenshtein algorithm
updated at April 27, 2024, 9:17 a.m.
Machine Learning & Data Mining with JRuby
updated at April 26, 2024, 12:28 p.m.
A wrapper module for using the spaCy natural language processing library from the Ruby programming language via PyCall
updated at April 26, 2024, 6:57 a.m.
Ruby gem to calculate the similarity between texts using tf*idf
updated at April 26, 2024, 2:41 a.m.