LatinamericanTextResources by dav009

A collection of Latin American corpora and dictionaries to serve as resources for text processing and text mining.

updated at Jan. 9, 2023, 12:47 p.m.

Unknown languages

4 +0

6 +0

4 +0

ixa-pipe-pos by ixa-ehu

IXA pipes part-of-speech tagger and lemmatizer (http://ixa2.si.ehu.es/ixa-pipes)

updated at March 23, 2023, 11:39 a.m.

Java

9 +0

17 +0

15 +0

Multilingual-Latent-Dirichlet-Allocation-LDA by ArtificiAI

A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python.

updated at May 10, 2024, 10:29 a.m.

Python

10 +0

82 +0

29 +0

wiki2vec by idio

Generating Vectors for DBpedia Entities via Word2Vec and Wikipedia Dumps. Questions? https://gitter.im/idio-opensource/Lobby

updated at Oct. 25, 2024, 8:33 a.m.

Java

46 +0

601 +0

137 +0

estem by MaG21

Spanish stemming

updated at Oct. 25, 2024, 11:09 a.m.

Ruby

4 +0

4 +0

0 +0
