Correspondence Word documents
51,204 documents across 71 languages and 9 topics. See /classification for what this label covers.
Documents classified as correspondence or announcements. Examples include letters, memos, press releases, notices, and newsletters.
Useful for: stylometric analysis, correspondence classification, named-entity recognition on organizational text, and training automated reply generation models.
| Topic | Count |
|---|---|
| Government | 15,547 |
| Education | 10,481 |
| Healthcare | 5,837 |
| General | 5,772 |
| Finance | 4,626 |
| Nonprofit | 3,284 |
| Environment | 2,748 |
| Legal / Judicial | 1,778 |
| Technology | 1,131 |

| Lang | Count | Share |
|---|---|---|
| en | 15,844 | 32.5% |
| cs | 7,206 | 14.8% |
| es | 4,186 | 8.6% |
| de | 3,360 | 6.9% |
| ru | 3,134 | 6.4% |
| pl | 2,456 | 5.0% |
| + 65 more | | |
Share is computed against the top 20 languages for this type (48,688 docs), matching what the API returns. The remaining 2,516 of the 51,204 documents fall outside the top 20 or have no detected language.
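As a quick check of how the Share column is derived, the sketch below divides a language's count by the 48,688-document top-20 total. The `share` helper is a hypothetical name used for illustration, not part of any published tooling; the counts are copied from the table above.

```shell
#!/bin/sh
# Denominator: documents in the top 20 languages for this type (from the note above).
TOP20_TOTAL=48688

# share COUNT -> percentage of the top-20 total, rounded to one decimal place.
# Hypothetical helper, for illustration only.
share() {
  awk -v n="$1" -v d="$TOP20_TOTAL" 'BEGIN { printf "%.1f%%\n", 100 * n / d }'
}

share 15844   # en
share 7206    # cs
share 2456    # pl
```

Dividing by the full 51,204 documents instead would give smaller percentages than the table shows, so use the top-20 total when comparing against API output.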
| ID | Filename | Topic | Lang | Conf |
|---|---|---|---|---|
| fa416810866f | 202 | Healthcare | en | 0.98 |
| 8d0b0e24f173 | CPE_update_4th_August_2025.docx | Healthcare | en | 0.98 |
| f2abdc93658b | download.php | Healthcare | cs | 0.98 |
| b3a360567a99 | file.php | Education | cs | 0.98 |
| d7549de46225 | @@download | Government | ca | 0.98 |
The ID column shows the first 12 characters of the SHA-256 content hash; use the full hash as the stable reference. Real public-web filenames vary widely: descriptive (CPE_update_4th_August_2025.docx), numeric (202), or URL-fragment shaped (download.php, @@download).
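To match a local file against the ID column, a minimal sketch follows. It assumes the content hash is the lowercase hex SHA-256 digest of the raw file bytes, which this page does not state explicitly, and that GNU `sha256sum` is installed (on macOS, `shasum -a 256` is the usual substitute). `short_id` is a hypothetical helper name.

```shell
#!/bin/sh
# short_id FILE -> first 12 hex characters of the file's SHA-256 digest.
# Assumption: the corpus hashes the raw file bytes and hex-encodes the digest.
short_id() {
  sha256sum "$1" | cut -c1-12
}

# When a path is passed to the script, print its short ID.
if [ "$#" -gt 0 ]; then
  short_id "$1"
fi
```

Twelve hex characters are convenient for display but can collide in principle, so compare full hashes when deduplicating or citing a document.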
# All correspondence documents
curl "https://api.docxcorp.us/manifest?type=correspondence" -o correspondence-manifest.txt
# High-confidence English subset
curl "https://api.docxcorp.us/manifest?type=correspondence&lang=en&min_confidence=0.8"
See /download for full access patterns.