Shirabe

Credits

Where Shirabe gets its data

Shirabe stands on decades of open-source lexicography. Every entry, reading, and tag here originates from the projects below - please visit them, support them, and check their licences before redistributing.

Dictionary data

JMdict - Japanese↔multilingual word entries
Compiled by the Electronic Dictionary Research and Development Group (EDRDG, James Breen et al.) and distributed under the EDRDG licence. We use the jmdict-simplified JSON conversion by Dmitry Shpika.
JMnedict - proper-noun (names) dictionary
Also from EDRDG, same licence terms; consumed via the jmdict-simplified JMnedict release.
KANJIDIC2 - kanji dictionary
EDRDG, same licence; sourced from jmdict-simplified's KANJIDIC2 release.
Kangxi radical names & meanings
The Japanese reading, English gloss, stroke count, and positional category for each of the 214 classical radicals are sourced from Kanji alive (kanjialive.com), licensed under CC BY 4.0.
Wikipedia abstracts (via DBpedia)
Lead-paragraph summaries of Wikipedia articles in every supported language come from the DBpedia project's long_abstracts dataset. The text remains the property of the Wikipedia contributors who wrote it and is dual-licensed under CC BY-SA 3.0 and the GNU Free Documentation License. Each abstract card on a word page links back to the source article.

Tokenisation & analysis

Sudachi
Japanese morphological analyser by Works Applications. We use the Rust port, sudachi.rs, plus its SudachiDict dictionary releases. Apache-2.0.
kabosu
Ruby bindings for sudachi.rs maintained alongside Shirabe - github.com/davafons/kabosu. Apache-2.0.

Software

Shirabe is built on Ruby on Rails, Hotwire, and a stack of other open-source gems. Typeset in Inter Tight, Newsreader, Noto Sans JP, and JetBrains Mono. Source on GitHub.

Found something missing? Open an issue on the repository.