Dataset | | Base Model’ | | Notes | PubLayNet | [38] F/M | Layouts of modern scientific documents |
PRImA [3] | M | Layouts of scanned modern magazines and scientific reports |
Newspaper | F | Layouts of scanned US newspapers from the 20th century |
TableBank | F | Table region on modern scientific and business document |
HJDataset [31] | F/M | Layouts of history Japanese documents |
```
### Data connector metadata fields
Documents processed through source connectors include additional document metadata. These additional fields only ever
appear if the source document was processed by a connector.
#### Common data connector metadata fields
* Data Source metadata (on json output):
  * url
  * version
  * date created
  * date modified
  * date processed
  * record locator
* Record locator is specific to each connector
#### Additional metadata fields by connector type (via record locator)
| Source connector | Additional metadata |
| --------------------- | -------------------------------- |
| airtable | base id, table id, view id |
| azure (from fsspec) | protocol, remote file path |
| box (from fsspec) | protocol, remote file path |
| confluence | url, page id |
| discord | channel |
| dropbox (from fsspec) | protocol, remote file path |
| elasticsearch | url, index name, document id |
| fsspec | protocol, remote file path |
| google drive | drive id, file id |
| gcs (from fsspec) | protocol, remote file path |
| jira | base url, issue key |
| onedrive | user pname, server relative path |
| outlook | message id, user email |
| s3 (from fsspec) | protocol, remote file path |
| sharepoint | server path, site url |
| wikipedia | page title, page url |
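
To illustrate how these fields might be read from the JSON output, here is a minimal Python sketch. It assumes that the fields sit under a `data_source` key inside each element's `metadata`, and that the key names are the snake_case spellings of the field names listed above. Neither assumption is confirmed on this page. The helper `data_source_summary` and the sample element are hypothetical.

```python
from typing import Any, Dict, Optional

# Assumed key names: snake_case spellings of the common fields listed above.
DATA_SOURCE_FIELDS = (
    "url",
    "version",
    "date_created",
    "date_modified",
    "date_processed",
    "record_locator",
)


def data_source_summary(element: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the data source fields of one element, or None if it has none.

    Elements that were not processed by a source connector carry no
    data source metadata, so a missing key is treated as a normal case.
    """
    data_source = (element.get("metadata") or {}).get("data_source")
    if not data_source:
        return None
    # Fields absent from the output are reported as None.
    return {field: data_source.get(field) for field in DATA_SOURCE_FIELDS}


# Hypothetical element, shaped like one entry of the JSON output.
example_element = {
    "type": "NarrativeText",
    "text": "Example text.",
    "metadata": {
        "data_source": {
            "url": "s3://example-bucket/report.pdf",
            "version": "1",
            "record_locator": {
                "protocol": "s3",
                "remote_file_path": "example-bucket/report.pdf",
            },
        }
    },
}

summary = data_source_summary(example_element)
print(summary["record_locator"])
```

Because the record locator is specific to each connector, code that consumes it should branch on the connector type rather than expect a fixed set of keys.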
# Examples
Source: https://docs.unstructured.io/api-reference/partition/examples
This page provides some examples of accessing the Unstructured Partition Endpoint via different methods.
To use these examples, you'll first need to set an environment variable named `UNSTRUCTURED_API_KEY`,
representing your Unstructured API key. [Get your API key](/api-reference/partition/overview).
Also, you'll need to set an environment variable named `UNSTRUCTURED_API_URL` to the
value of the Unstructured API URL for your account. This API URL was provided to you when your Unstructured account was created.
If you do not have this API URL, contact Unstructured Sales at [sales@unstructured.io](mailto:sales@unstructured.io).
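
Before running any of the examples, it can help to confirm that both environment variables are set. The following is a minimal Python sketch that uses only the standard library. The helper name `load_partition_settings` and the keys of the dictionary it returns are hypothetical and are not part of any Unstructured SDK.

```python
import os
from typing import Dict, Mapping

# The two environment variables named on this page.
REQUIRED_VARIABLES = ("UNSTRUCTURED_API_KEY", "UNSTRUCTURED_API_URL")


def load_partition_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the API key and API URL, failing early if either is missing or empty."""
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variable(s): " + ", ".join(missing)
        )
    return {
        "api_key": env["UNSTRUCTURED_API_KEY"],
        "api_url": env["UNSTRUCTURED_API_URL"],
    }


try:
    settings = load_partition_settings()
    # Print only the URL; the API key is a secret and should not be logged.
    print("Using API URL:", settings["api_url"])
except RuntimeError as err:
    print(err)
```

Passing the environment as a parameter keeps the check easy to test with a plain dictionary, without touching the real process environment.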