Educational Word documents
88,676 documents across 75 languages and 9 topics. See /classification for what this label covers.
Documents classified as teaching or learning materials. Examples include syllabi, lesson plans, course outlines, study guides, worksheets, theses, and dissertations.
Useful for: education-domain NLP, curriculum classification, learning-objective extraction, multilingual academic text analysis.
| Topic | Count |
|---|---|
| Education | 48,092 |
| General | 14,183 |
| Healthcare | 8,644 |
| Technology | 5,889 |
| Environment | 3,549 |
| Finance | 2,733 |
| Government | 2,432 |
| Legal / Judicial | 2,235 |
| Nonprofit | 919 |
| Lang | Count | Share |
|---|---|---|
| en | 37,061 | 44.0% |
| ru | 7,510 | 8.9% |
| fr | 6,008 | 7.1% |
| es | 5,618 | 6.7% |
| pt | 3,823 | 4.5% |
| tr | 3,507 | 4.2% |
| + 69 more | | |
Share is computed against the top 20 languages for this type (84,233 docs), matching what the API returns. The remaining 4,443 documents fall outside the top 20 or have no detected language.
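As a quick check of the Share column, the English row can be reproduced from the counts quoted above. This is a minimal sketch: `share` is a hypothetical helper name for illustration, not something the API provides.

```shell
# share COUNT TOTAL: print COUNT as a percentage of TOTAL, one decimal place.
# `share` is an illustrative helper, not part of the API.
share() {
  awk -v count="$1" -v total="$2" 'BEGIN { printf "%.1f%%\n", count / total * 100 }'
}

# English documents against the top-20 denominator (84,233)
share 37061 84233   # prints 44.0%
```

Dividing by all 88,676 documents instead would give about 41.8% for English, so the choice of denominator matters when comparing these figures with counts computed elsewhere.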
| ID | Filename | Topic | Lang | Conf |
|---|---|---|---|---|
| 8ba371b0a7fe | radio_production_2_-_curriculum_map_v2.docx | Education | en | 0.99 |
| f0052677c12b | unknown.docx | Education | en | 0.99 |
| c1a44af9f9d3 | download.php | Education | pl | 0.99 |
| 6f3d263c3eaf | viewcontent.cgi | Education | en | 0.99 |
| b6370f677e3b | jahrgang-13-g9-q2-phase | Education | de | 0.99 |
ID column shows the first 12 characters of the SHA-256 content hash; the full hash is the stable reference. Real public-web filenames vary widely: descriptive, numeric, or URL-fragment shaped.
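To match a local file against the ID column, one option is to hash it and keep the first 12 hex characters. The sketch below assumes the content hash is taken over the raw file bytes and that GNU coreutils `sha256sum` is available; `short_id` is a hypothetical helper name.

```shell
# short_id FILE: print the first 12 hex characters of FILE's SHA-256 digest.
# Assumption: the corpus hashes the raw file bytes.
# Assumes GNU coreutils `sha256sum`; on macOS, `shasum -a 256` is the usual substitute.
short_id() {
  sha256sum "$1" | cut -c1-12
}

# Example: compare a downloaded file against the ID column
# short_id radio_production_2_-_curriculum_map_v2.docx
```

Because a 12-character prefix is only a display form, any stored reference should keep the full digest.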
# All educational documents
curl "https://api.docxcorp.us/manifest?type=educational" -o educational-manifest.txt
# High-confidence English subset
curl "https://api.docxcorp.us/manifest?type=educational&lang=en&min_confidence=0.8"
See /download for full access patterns.
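The two requests above differ only in their query string, so a small wrapper can keep scripts readable. This is a sketch under assumptions: `manifest_url` is a hypothetical helper, it uses only the `type`, `lang`, and `min_confidence` parameters shown above, and it does no URL encoding, so it suits simple values such as language codes and decimals.

```shell
# manifest_url TYPE [LANG] [MIN_CONFIDENCE]: print a manifest URL.
# Illustrative helper only; parameter names are taken from the examples above.
manifest_url() {
  url="https://api.docxcorp.us/manifest?type=$1"
  if [ -n "${2:-}" ]; then url="${url}&lang=$2"; fi
  if [ -n "${3:-}" ]; then url="${url}&min_confidence=$3"; fi
  printf '%s\n' "$url"
}

# High-confidence English subset, saved to a file (the output filename is arbitrary)
# curl "$(manifest_url educational en 0.8)" -o educational-en-manifest.txt
```

Printing the URL and leaving the `curl` call to the caller makes it possible to inspect the request before anything is fetched.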