Forms Word documents

165,489 documents across 74 languages and 9 topics. See /classification for what this label covers.

Documents classified as forms or other structured data-collection artifacts. Examples include fillable forms, applications, registrations, surveys, ballots, and questionnaires.

Useful for: form-field extraction, structured-document parsing, intake automation training data, multilingual form understanding, accessibility audits of public forms.

Topic Count
Government 54,927
Education 49,356
Healthcare 13,285
Finance 12,557
General 9,921
Nonprofit 9,733
Legal / Judicial 7,737
Environment 5,746
Technology 2,227

Lang Count Share
en 42,226 27.6%
pl 19,019 12.4%
cs 13,060 8.5%
zh 12,023 7.8%
es 7,438 4.9%
ja 6,769 4.4%
+ 68 more

Share is computed against the top 20 languages for this type (153,170 docs), matching what the API returns. The remaining 12,319 of the 165,489 documents fall outside the top 20 or have no detected language.
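
The share calculation described above can be reproduced from the counts on this page. The following is a minimal sketch, not the site's own code: the names `TOP_LANGS`, `TOP20_DOCS`, `ALL_DOCS`, and `share` are hypothetical, and only the six languages listed in the table are included.

```python
# Counts copied from the language table on this page.
TOP_LANGS = {
    "en": 42_226,
    "pl": 19_019,
    "cs": 13_060,
    "zh": 12_023,
    "es": 7_438,
    "ja": 6_769,
}
TOP20_DOCS = 153_170  # denominator the page says it uses (top 20 languages)
ALL_DOCS = 165_489    # every document of this type


def share(count, denominator=TOP20_DOCS):
    """Format count as a percentage of denominator with one decimal place."""
    return f"{100 * count / denominator:.1f}%"


# Shares against the top-20 denominator, as in the table.
for lang, count in TOP_LANGS.items():
    print(lang, share(count))

# The same count against all documents of this type gives a smaller share.
print("en vs. all docs:", share(TOP_LANGS["en"], ALL_DOCS))
```

Choosing the top-20 denominator keeps the table consistent with the API, at the cost of shares that run slightly higher than they would against the full 165,489 documents.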

ID Filename Topic Lang Conf
0e19a76eb2bf open Healthcare en 0.99
a4f00f48a951 mf.ashx Healthcare en 0.99
50cd070cb1c3 Klimbos-invullijst-2025.docx Education nl 0.99
7db5995bf0bf 232-au-kyt-frm-171-yatay-gecis-basvuru-formu.docx Education tr 0.99
275ec91b9b43 EHU-ITE-Trainee-Profile-Secondary.docx Education en 0.99

ID column shows the first 12 characters of the SHA-256 content hash; the full hash is the stable reference. Real public-web filenames vary widely: descriptive, numeric, or URL-fragment shaped.
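
To check a downloaded file against a listed ID, the short ID can be recomputed locally. This is a sketch under two assumptions the page does not spell out: that the hash is taken over the raw file bytes, and that it is written as lowercase hex. The function names `content_hash`, `short_id`, and `matches_listing` are hypothetical.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Full SHA-256 of the content as lowercase hex (assumed encoding)."""
    return hashlib.sha256(data).hexdigest()


def short_id(data: bytes, length: int = 12) -> str:
    """First `length` characters of the hash, as shown in the ID column."""
    return content_hash(data)[:length]


def matches_listing(path: str, listed_id: str) -> bool:
    """Compare a local file's short ID with an ID copied from the table."""
    with open(path, "rb") as fh:
        return short_id(fh.read()) == listed_id.lower()
```

A 12-character prefix is convenient for display, but the full hash from `content_hash` is what should be stored as a reference.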

# All forms documents
curl "https://api.docxcorp.us/manifest?type=forms" -o forms-manifest.txt

# High-confidence English subset
curl "https://api.docxcorp.us/manifest?type=forms&lang=en&min_confidence=0.8"

See /download for full access patterns.
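
The manifest queries above can also be built programmatically. The sketch below uses only the endpoint and the `type`, `lang`, and `min_confidence` parameters that appear in the curl commands; the helper names `manifest_url` and `fetch_manifest` are hypothetical, and because the response format is not described here, the download is saved as raw bytes without parsing.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.docxcorp.us/manifest"


def manifest_url(doc_type, lang=None, min_confidence=None):
    """Build a manifest URL, adding only the filters that were given."""
    params = {"type": doc_type}
    if lang is not None:
        params["lang"] = lang
    if min_confidence is not None:
        params["min_confidence"] = min_confidence
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_manifest(url, dest):
    """Save the response body to dest unchanged (format is not assumed)."""
    with urlopen(url) as resp, open(dest, "wb") as out:
        out.write(resp.read())


# Building the URLs needs no network access.
print(manifest_url("forms"))
print(manifest_url("forms", lang="en", min_confidence=0.8))
```

Calling `fetch_manifest(manifest_url("forms"), "forms-manifest.txt")` would mirror the first curl command.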
