Editor: @lockwo
Reviewers: @atantos, @akki2825
Author: Alexander V. Mantzaris (ORCID: 0000-0002-0026-5725)
Mantzaris, A. V. (2026). KeemenaPreprocessing.jl: Unicode-Robust Cleaning, Multi-Level Tokenisation and Streaming Offset Bundling for Julia NLP. Journal of Open Source Software, 11(118), 9348. https://doi.org/10.21105/joss.09348
Tags: NLP, Text Processing, Tokenization, Corpus Cleaning
Authors of JOSS papers retain copyright.
This work is licensed under a Creative Commons Attribution 4.0 International License.
ISSN 2475-9066