Clean, structured JSON API for Romanian word lookups with SQLite caching. It fetches from:
- DOOM (doom.lingv.ro)
- DEXonline (dexonline.ro)
- m.dex.ro (DEX)
No raw HTML in responses. Definitions are normalized, deduplicated, and enriched with word type, gender, examples, and etymology.
- 🔎 Multiple sources: DOOM, DEXonline, m.dex.ro
- 🧼 Clean JSON: no `<sup>`, `<span>`, or raw markup
- 💾 SQLite caching (fast, offline-friendly)
- 🧠 Smart parsing and duplicate consolidation
- 📊 Search and stats endpoints
- 🌐 CORS-enabled REST API
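
To illustrate what "duplicate consolidation" can mean in practice, here is a minimal sketch. The helper `dedupeDefinitions` is hypothetical and is not taken from `parser.js`; it assumes that two definitions count as duplicates when they match after whitespace collapsing and lower-casing.

```javascript
// Hypothetical helper, for illustration only (not the project's actual parser code).
// Collapses runs of whitespace, skips empty strings, and drops
// case-insensitive duplicates while keeping the first occurrence.
function dedupeDefinitions(definitions) {
  const seen = new Set();
  const unique = [];
  for (const raw of definitions) {
    const text = raw.replace(/\s+/g, " ").trim();
    if (text === "") continue;
    const key = text.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(text);
    }
  }
  return unique;
}
```

The real consolidation rules may be stricter or looser (for example, merging near-duplicates across sources).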
# Install dependencies
npm install
# Start the API (default: http://localhost:3000)
npm start
# Or run in dev mode with auto-restart
npm run dev
# Change port (e.g., 3001) for the current shell session (PowerShell syntax)
$env:PORT = 3001; npm start
Query the dictionaries for a word.
Query params:
- `source` (optional): `doom` | `dexonline` | `mdex` (default: all)
- `refresh` (optional): `true` forces a refetch (ignores the cache)
Examples:
- `/api/word/casă`
- `/api/word/casă?source=dexonline`
- `/api/word/casă?refresh=true`
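
Because Romanian headwords often contain diacritics, clients may want to percent-encode the word when building request URLs. The sketch below assumes the default base URL `http://localhost:3000`; `buildWordUrl` is a hypothetical client-side helper, not part of this repository.

```javascript
// Assumption: the API runs at the default address from the install notes.
const BASE_URL = "http://localhost:3000";

// Hypothetical helper: builds a word-lookup URL with the optional
// `source` and `refresh` query parameters.
function buildWordUrl(word, { source, refresh } = {}) {
  const url = new URL(`/api/word/${encodeURIComponent(word)}`, BASE_URL);
  if (source) url.searchParams.set("source", source);
  if (refresh) url.searchParams.set("refresh", "true");
  return url.toString();
}

// Example:
// buildWordUrl("casă", { source: "dexonline" })
//   → "http://localhost:3000/api/word/cas%C4%83?source=dexonline"
```

On Node 18+ or in a browser, the result can be passed straight to `fetch`.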
Response (sanitized example):
{
  "word": "casă",
  "results": [
    {
      "word": "casă",
      "source": "dexonline",
      "definitions": [
        {
          "type": "dexonline_definition",
          "word": "casă",
          "wordType": "substantiv",
          "gender": "feminin",
          "grammaticalInfo": { "plural": "case" },
          "definitions": [
            "Clădire care servește drept locuință."
          ],
          "examples": [
            "A cumpărat o casă la țară."
          ],
          "etymology": "Lat. casa.",
          "notes": [],
          "source": "DEX '09 (2009)",
          "index": 0
        }
      ],
      "url": "https://dexonline.ro/definitie/casă",
      "parsedAt": "2025-09-21T10:00:00.000Z",
      "cached": false
    }
  ],
  "cached": false,
  "timestamp": "2025-09-21T10:00:00.000Z"
}
Notes:
- No `html` fields are returned.
- Headwords are normalized (e.g., `CASĂ1` → `casă`).
- Definitions and examples are plain text.
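
As a rough illustration of the headword normalization mentioned in the notes, the sketch below lower-cases the headword and strips a trailing homonym index. `normalizeHeadword` is a hypothetical name, and the assumption that the index is a run of trailing ASCII or superscript digits is mine; the actual rules in `parser.js` may differ.

```javascript
// Illustrative only: not the project's actual normalization code.
// Assumption: the homonym index is a trailing run of ASCII digits
// or Unicode superscript digits (e.g. "CASĂ1", "CASĂ²").
function normalizeHeadword(raw) {
  return raw
    .trim()
    .replace(/[0-9\u00B9\u00B2\u00B3\u2070\u2074-\u2079]+$/, "")
    .toLowerCase();
}

// Example: normalizeHeadword("CASĂ1") → "casă"
```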
Search cached words in SQLite.
Example: /api/search/cas
Database stats (totals, by source, recent activity).
Test parsing with local HTML snapshots in the repository (return-*.html).
Discover endpoints and usage.
Simple health check.
- First request for a word → fetch from sources → parse & clean → save to SQLite → return JSON.
- Subsequent requests for the same word → served from the cache without refetching (unless `refresh=true`).
- Normalized definition objects include:
  - `word`, `wordType`, `gender`, `grammaticalInfo`
  - `definitions[]`, `examples[]`, `etymology`, `notes[]`
  - `source`, `index` (position within the page)
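
The request flow above is a cache-aside pattern. The sketch below shows the idea with an in-memory `Map` standing in for SQLite; `lookupWord` and `fetchAndParse` are hypothetical names, and the real logic in `server.js` and `database.js` may be organized differently.

```javascript
// In-memory stand-in for the SQLite cache (assumption for illustration).
const cache = new Map();

// `fetchAndParse` is a hypothetical async function: (word) => cleaned results array.
// Returns an object shaped like the top level of the sample response.
async function lookupWord(word, fetchAndParse, { refresh = false } = {}) {
  const timestamp = new Date().toISOString();

  // Cache hit: skip the network unless the caller forces a refresh.
  if (!refresh && cache.has(word)) {
    return { word, results: cache.get(word), cached: true, timestamp };
  }

  // Cache miss or forced refresh: fetch, parse, store, return.
  const results = await fetchAndParse(word);
  cache.set(word, results);
  return { word, results, cached: false, timestamp };
}
```

Passing the fetcher in as a parameter keeps the sketch testable without network access; a stub that returns fixed results is enough to exercise both the hit and miss paths.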
- `server.js` – Express server and endpoints
- `database.js` – SQLite helper and schema
- `parser.js` – Parsers for DOOM, DEXonline, m.dex.ro
- `return-*.html` – Local snapshots for parser testing
- `vocabulary.db` – Generated SQLite cache (gitignored)
# Run tests (lightweight harness)
npm test
# Dev mode with auto-restart
npm run dev
- SQLite file: `vocabulary.db` (created at project root)
- This file is not committed to git (see `.gitignore`).
Issues and PRs are welcome. If you add a source or tweak parsing, please:
- Keep responses HTML-free and normalized.
- Add a brief note in the README (sources/features).
- Avoid committing local DB files or secrets.
MIT (see LICENSE).