words_counted by abitdodgy

A Ruby natural language processor.

created at April 30, 2014, 3:07 a.m.

Ruby

12 +0

159 +0

29 +0

GitHub
yomu by yomurb

Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf)

created at March 25, 2012, 10:03 a.m.

Ruby

12 +0

492 +1

122 +0

GitHub
ruby-nlp by tiendung

Ruby binding for the Stanford POS Tagger and Named Entity Recognizer

created at Aug. 11, 2008, 10:50 a.m.

Ruby

11 +0

92 +0

14 +0

GitHub
fuzzy_match by seamusabshere

Find a needle (a document or record) in a haystack using string similarity and (optionally) regular expression rules. Uses Dice's Coefficient (aka Pair Similarity) and Levenshtein Distance internally.

created at Jan. 13, 2012, 4:46 p.m.

Ruby

11 +0

667 +1

47 +0

GitHub
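For readers unfamiliar with the two measures named in the fuzzy_match description, the following is a minimal, standalone sketch of Dice's coefficient over character bigrams and of Levenshtein distance. It is not the gem's implementation or API; the method names `bigrams`, `dice_coefficient`, and `levenshtein` are hypothetical and chosen only for illustration, and the normalization (lowercasing, stripping whitespace) is an assumption.

```ruby
# Illustrative only: not fuzzy_match's API. All method names are hypothetical.

# Character bigrams of a string, lowercased, with whitespace removed (assumed normalization).
def bigrams(str)
  str.downcase.gsub(/\s+/, "").chars.each_cons(2).map(&:join)
end

# Dice's coefficient: 2 * |shared bigrams| / (|bigrams(a)| + |bigrams(b)|).
# Shared bigrams are counted as a multiset intersection.
def dice_coefficient(a, b)
  x = bigrams(a)
  y = bigrams(b)
  return 0.0 if x.empty? || y.empty?

  remaining = Hash.new(0)
  x.each { |gram| remaining[gram] += 1 }

  shared = 0
  y.each do |gram|
    next unless remaining[gram] > 0
    remaining[gram] -= 1
    shared += 1
  end

  2.0 * shared / (x.size + y.size)
end

# Levenshtein distance: minimum number of single-character insertions,
# deletions, and substitutions needed to turn a into b (row-by-row DP).
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end
```

A higher Dice score means more similar strings (1.0 for identical bigram sets), while a lower Levenshtein distance means fewer edits separate them; a matcher can combine the two, for instance by using one to break ties in the other.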
ferret by dbalmain

Ferret: the extensible information retrieval library for Ruby.

created at Sept. 14, 2008, 12:31 p.m.

C

10 +0

279 +1

59 +0

GitHub
amatch by flori

Approximate String Matching library

created at Aug. 26, 2009, 12:18 a.m.

C

9 +0

370 +0

35 +0

GitHub
open-nlp by louismullie

Ruby bindings to the OpenNLP Java toolkit.

created at Dec. 19, 2012, 2:44 a.m.

Ruby

9 +0

91 +0

11 +0

GitHub
phobos by phobos

Simplifying Kafka for Ruby apps

created at Aug. 13, 2016, 6:14 p.m.

Ruby

9 +0

218 +0

40 +0

GitHub
lemmatizer by yohasebe

Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy

created at Oct. 27, 2012, 11:16 p.m.

Ruby

8 +0

108 +0

15 +0

GitHub
att_speech by adhearsion

A Ruby library for consuming the AT&T Speech API for speech to text.

created at Aug. 15, 2012, 4:02 p.m.

Ruby

8 +0

20 +0

6 +0

GitHub
berkeleyparser by slavpetrov

Automatically exported from code.google.com/p/berkeleyparser

created at July 7, 2015, 8:35 a.m.

Java

8 +0

181 +0

49 +0

GitHub
ruby-spacy by yohasebe

A wrapper module for using the spaCy natural language processing library from Ruby via PyCall

created at June 19, 2021, 2:04 a.m.

Ruby

8 +0

52 +1

4 +0

GitHub
scalpel by louismullie

A fast and accurate rule-based sentence segmentation tool for Ruby.

created at Aug. 15, 2012, 5:14 a.m.

Ruby

8 +0

50 +0

5 +0

GitHub
TF-IDF by reddavis

Term Frequency - Inverse Document Frequency in Ruby

created at Dec. 18, 2009, 3:23 p.m.

Ruby

7 +0

35 +0

6 +0

GitHub
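As a reminder of what term frequency - inverse document frequency computes, here is a minimal sketch in plain Ruby. It does not use or reproduce the TF-IDF gem's API; the method name `tf_idf` is hypothetical, documents are assumed to be pre-tokenized arrays of strings, and the weighting shown (raw relative term frequency times the natural log of N over document frequency) is only one common variant.

```ruby
# Illustrative only: not the TF-IDF gem's API. `tf_idf` is a hypothetical name.
# docs: an array of documents, each an array of string tokens (assumed pre-tokenized).
# Returns one Hash per document mapping term => tf-idf weight.
def tf_idf(docs)
  total = docs.size.to_f

  # Document frequency: the number of documents that contain each term.
  df = Hash.new(0)
  docs.each { |doc| doc.uniq.each { |term| df[term] += 1 } }

  docs.map do |doc|
    # Raw term counts within this document.
    counts = Hash.new(0)
    doc.each { |term| counts[term] += 1 }

    counts.each_with_object({}) do |(term, count), weights|
      tf  = count.to_f / doc.size       # relative term frequency
      idf = Math.log(total / df[term])  # inverse document frequency
      weights[term] = tf * idf
    end
  end
end
```

Under this variant a term that appears in every document gets a weight of zero, while a term concentrated in a few documents is weighted up.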
summarize by ssoper

A Ruby C wrapper for Open Text Summarizer

created at Nov. 30, 2010, 8:21 p.m.

Ruby

7 +0

208 +0

14 +0

GitHub
rwordnet by doches

A pure Ruby interface to the WordNet database

created at Nov. 10, 2008, 7:21 p.m.

Ruby

7 +0

88 +0

25 +0

GitHub
ruby-nlp by nathankleyn

Various NLP tools for Ruby

created at Sept. 14, 2013, 5:27 p.m.

Ruby

7 +0

33 +0

2 +0

GitHub
ruby-stemmer by aurelian

Expose libstemmer_c to Ruby

created at Oct. 23, 2008, 8:02 p.m.

C

7 +0

251 +0

22 +0

GitHub
raingrams by postmodern

A flexible and general-purpose ngrams library written in Ruby. Raingrams supports ngram sizes greater than 1, text/non-text grams, multiple parsing styles and open/closed vocabulary models.

created at March 8, 2009, 10:54 a.m.

Ruby

7 +0

69 +0

7 +0

GitHub
rsyntaxtree by yohasebe

Syntax tree generator for linguistic research

created at Nov. 29, 2009, 2:14 p.m.

Ruby

7 +0

95 +0

16 +0

GitHub