id | 20030361 |
name | python-ucto |
full_name | proycon/python-ucto |
html_url | https://github.com/proycon/python-ucto |
description | This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto). |
created_at | May 21, 2014, 5:28 p.m. |
updated_at | Sept. 12, 2024, 2:02 p.m. |
pushed_at | Sept. 12, 2024, 2:01 p.m. |
size | 75 |
stargazers_count | 29 |
watchers_count | 4 |
forks_count | 5 |
open_issues | 5 |
language | Cython |
awesome_list |
https://github.com/josephmisiti/awesome-machine-learning
|