Topic · general

General Word documents

59,877 documents across 76 languages and 10 document types. See /classification for what this label covers.

Documents that did not fit cleanly into one of the other eight topics. Includes general-interest publications, multi-domain content, and organizational documents without a single dominant subject.

Useful for: general-purpose text classification, cross-sector retrieval, and fallback analysis when the domain is unknown.

Type            Count
Educational     14,183
Forms            9,921
Creative         7,848
Reference        6,959
Administrative   6,037
Correspondence   5,772
Policies         3,057
Technical        2,695
Legal            1,945
Reports          1,460

Lang      Count   Share
en       18,692   33.4%
fr        7,052   12.6%
cs        4,415    7.9%
unknown   4,155    7.4%
de        3,242    5.8%
es        3,041    5.4%
+ 70 more

Share is computed against the top 20 languages for this topic (55,908 documents), matching what the API returns. The remaining 3,969 documents fall outside the top 20 or have no detected language.
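
As a worked check of the Share column, the sketch below recomputes a percentage from a language count and a denominator. The helper name `share` is hypothetical; the counts are the ones listed on this page. The last call shows the same English count against the full topic total, which is presumably why it does not match the table.

```shell
# Hypothetical helper: percentage of n out of total, to one decimal place.
share() {
  # $1 = document count for one language, $2 = denominator
  awk -v n="$1" -v total="$2" 'BEGIN { printf "%.1f%%\n", 100 * n / total }'
}

share 18692 55908   # en against the top-20 total, prints 33.4%
share 7052 55908    # fr against the top-20 total, prints 12.6%
share 18692 59877   # en against all 59,877 documents, prints 31.2%
```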

ID            Filename                       Type            Lang  Conf
58b3b96fd02e  Inscription-1.docx             Forms           fr    0.98
dca81eef8914  Vereinsvorstand_mit_Text.docx  Administrative  de    0.98
3e5bdee50017  49549                          Technical       en    0.98
f38771953325  52381                          Technical       en    0.98
3df8dafff14c  49485                          Technical       fr    0.98

ID column shows the first 12 characters of the SHA-256 content hash; the full hash is the stable reference. Real public-web filenames vary widely: descriptive, numeric, or URL-fragment shaped.
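
A minimal sketch of how the short ID could be derived for a local file, assuming the content hash is the SHA-256 of the file's raw bytes (the page does not spell out exactly what is hashed). The function name `short_id` is hypothetical, and `sha256sum` is the GNU coreutils tool; on macOS, `shasum -a 256` produces the same digest.

```shell
# Hypothetical helper: first 12 hex characters of a file's SHA-256 digest.
# Assumption: the corpus hashes the raw file bytes.
short_id() {
  # sha256sum prints "<64 hex chars>  <filename>"; keep characters 1-12.
  sha256sum "$1" | cut -c1-12
}

# Usage (with a file you have downloaded):
#   short_id Inscription-1.docx
```

If the result matches the ID column for the same document, the assumption about raw bytes holds for that file. Use the full 64-character digest, not the prefix, when storing references.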

# All general documents
curl "https://api.docxcorp.us/manifest?topic=general" -o general-manifest.txt

# High-confidence English subset
curl "https://api.docxcorp.us/manifest?topic=general&lang=en&min_confidence=0.8"
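
For scripted use, the two requests above can be generated from one small function. This is only a sketch: `manifest_url` is a hypothetical helper, it uses only the `topic`, `lang`, and `min_confidence` parameters shown above, and it assumes that `lang` and `min_confidence` are each optional on their own, which this page does not confirm.

```shell
# Hypothetical helper: build a manifest URL from the parameters shown above.
manifest_url() {
  # $1 = topic, $2 = language code (optional), $3 = minimum confidence (optional)
  murl="https://api.docxcorp.us/manifest?topic=$1"
  if [ -n "${2:-}" ]; then murl="$murl&lang=$2"; fi
  if [ -n "${3:-}" ]; then murl="$murl&min_confidence=$3"; fi
  printf '%s\n' "$murl"
}

manifest_url general          # same URL as the first request
manifest_url general en 0.8   # same URL as the second request

# To download: curl "$(manifest_url general en 0.8)" -o general-en.txt
```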

See /download for full access patterns.
