Data Enrichment Overview
Attach SAP business metadata (vendor, amount, posting date, line items) to every document FileRelay relays — automatically, on ingest.
By default, FileRelay forwards the bytes of a document and very little else. Customers asking "which vendor was on doc X?" or "did invoice 4500001234 land?" have no answer beyond the SAP document ID.
Data Enrichment closes that gap. You define one or more enricher functions per repository — saved RFC / BAPI calls into SAP — and FileRelay runs them automatically after every new document is ingested. The structured result is stored against the document and surfaces on the document detail page, in exports, and (soon) the public read API.
How it fits together
There are three pieces:
- The library — a workspace-scoped catalogue of saved RFC functions, each with its own SAP connection details, input mapping, and whitelist/blacklist rules. Manage it on the RFC Console.
- Bindings — each function in the library can be bound to one or more content repositories. Bound + active = it auto-fires on every new document in that repo. Manage bindings on the Data Enrichment page.
- Per-document results: every successful run creates a document_enrichments row keyed by (document, function). Stored data survives even if the function is later deleted (snapshot fields preserve the function name + module).
What an enricher looks like
An enricher is a self-contained "saved RFC call" with five parts:
| Part | Example |
|---|---|
| Name | "Invoice header lookup" |
| Function module | BAPI_DOCUMENT_GETDETAIL2 |
| SAP connection | host, system number, client, user, password (encrypted at rest) |
| Input mapping | { "DOCUMENTNUMBER": "{{SAP_OBJECT_ID}}", "FISCALYEAR": "{{TODAY:Y}}" } |
| Whitelist / Blacklist | dot-paths controlling which response fields are kept or hidden |
Placeholders in the input mapping resolve against the document being enriched. The full list:
- {{DOC_ID}}: FileRelay document id
- {{SAP_OBJECT_ID}}: the SAP business object key (invoice number, PO, etc.)
- {{SAP_OBJECT_TYPE}}: SAP object type code (e.g. BUS2081)
- {{ARCHIVE_DATE}}: when the doc was archived
- {{CONT_REP}}: the content repository id
- {{EXTERNAL_ID}}: caller-supplied external id, if any
- {{TODAY}} / {{NOW}}: current date / timestamp (supports {{TODAY:Y}}, {{TODAY:Ymd}} patterns)
- {{USER}} / {{WORKSPACE}}: current user / workspace identifiers
The legacy lowercase form ({doc_id}) still works alongside the {{UPPER}} syntax.
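To make the placeholder rules concrete, here is a minimal Python sketch of how an input mapping could be resolved. It is an illustration only, not FileRelay's resolver: the function names, the PHP-style date letters (Y, m, d), the default date format, and the choice to leave unknown placeholders untouched are all assumptions.

```python
import re
from datetime import date

# Hypothetical resolver; names and behaviour are illustrative assumptions.
# Matches {{UPPER}} / {{UPPER:format}} and the legacy {lower} form.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)(?::([A-Za-z]+))?\}\}|\{([a-z_]+)\}")

# Assumed mapping from PHP-style date letters to strftime codes.
DATE_CODES = {"Y": "%Y", "m": "%m", "d": "%d"}


def format_date(value: date, pattern: str) -> str:
    """Format a date using letters such as 'Y' or 'Ymd'; other characters pass through."""
    return "".join(
        value.strftime(DATE_CODES[ch]) if ch in DATE_CODES else ch for ch in pattern
    )


def resolve(template: str, context: dict, today: date) -> str:
    """Replace every placeholder in one template string."""

    def substitute(match: re.Match) -> str:
        name = (match.group(1) or match.group(3)).upper()
        if name == "TODAY":
            # Default format is an assumption made for this sketch.
            return format_date(today, match.group(2) or "Y-m-d")
        # Unknown placeholders are left as written in this sketch.
        return str(context.get(name, match.group(0)))

    return PLACEHOLDER.sub(substitute, template)


def resolve_mapping(mapping: dict, context: dict, today: date) -> dict:
    """Resolve each value of an input mapping against the document context."""
    return {key: resolve(value, context, today) for key, value in mapping.items()}
```

Calling resolve_mapping with the mapping from the table above and a context containing SAP_OBJECT_ID would yield the document number and the current year as plain strings, ready to pass to the function module.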
Whitelist & blacklist
Both rules use dot-paths with * wildcards. They are applied server-side before the result is serialized to the browser — blacklisted fields never traverse the wire.
whitelist:
- DOC_HEADER.VENDOR
- DOC_HEADER.AMOUNT
- ITEMS[*].MATERIAL
blacklist:
- "*.PASSWORD"
- DOC_HEADER.INTERNAL_NOTES
Empty whitelist = keep everything. Blacklist always wins.
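The filtering rules can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the server-side implementation: it assumes * matches exactly one path segment (including a list index), that a rule matching a parent also covers its children, and that containers left empty after filtering are dropped. All function names are hypothetical.

```python
import re

_DROP = object()  # sentinel for "remove this node"


def parse(pattern: str) -> list:
    """Split 'ITEMS[*].MATERIAL' into ['ITEMS', '*', 'MATERIAL']."""
    return [seg for seg in re.split(r"[.\[\]]+", pattern) if seg]


def matches(pattern: list, path: list) -> bool:
    """True if the pattern matches the path itself or one of its ancestors."""
    if len(pattern) > len(path):
        return False
    return all(p == "*" or p == s for p, s in zip(pattern, path))


def _filter(value, path, allow, deny):
    # Blacklist is checked first, so it wins over any whitelist match.
    if any(matches(p, path) for p in deny):
        return _DROP
    if isinstance(value, dict):
        kept = {}
        for key, child in value.items():
            result = _filter(child, path + [key], allow, deny)
            if result is not _DROP:
                kept[key] = result
        return kept or _DROP
    if isinstance(value, list):
        kept = []
        for index, child in enumerate(value):
            result = _filter(child, path + [str(index)], allow, deny)
            if result is not _DROP:
                kept.append(result)
        return kept or _DROP
    # Leaf: an empty whitelist keeps everything.
    if allow and not any(matches(p, path) for p in allow):
        return _DROP
    return value


def apply_rules(data: dict, whitelist: list, blacklist: list) -> dict:
    """Return a filtered copy of data according to the whitelist and blacklist."""
    result = _filter(data, [], [parse(p) for p in whitelist], [parse(p) for p in blacklist])
    return {} if result is _DROP else result
```

With the rules shown above, a response containing DOC_HEADER, ITEMS, and a nested PASSWORD field would be reduced to the vendor, the amount, and the material of each item.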
What happens at ingest
- The connector writes the document bytes and inserts the documents row.
- ProcessIncomingDocument fans out two parallel tracks:
  - ForwardToDestination jobs (one per binding): the existing delivery path. Enrichment does not block delivery.
  - RunMetadataEnricher jobs (one per active enricher bound to the repo), dispatched onto a separate enrichment queue.
- Each enricher job: resolves the input mapping → calls SAP via the JCo bridge → applies whitelist/blacklist → upserts a document_enrichments row with status='fetched'.
- On failure, the row is upserted with status='failed' + the error text. The enricher row records last_error for the library view.
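The life of a single enricher job can be sketched as follows. This is a simplified Python model, not the actual RunMetadataEnricher job: the in-memory store stands in for the document_enrichments table, call_sap and filter_fields are hypothetical callables supplied by the caller, and the input mapping is reduced to "SAP parameter → document field" with placeholder syntax left out.

```python
from typing import Callable


def run_enricher(
    document: dict,
    function: dict,
    call_sap: Callable[[str, dict], dict],   # hypothetical stand-in for the JCo bridge
    filter_fields: Callable[[dict], dict],   # hypothetical whitelist/blacklist step
    store: dict,                             # stand-in for document_enrichments
) -> dict:
    """Run one enricher for one document and upsert the outcome."""
    key = (document["id"], function["id"])  # one row per (document, function)
    row = {
        # Snapshot fields, so the row stays readable if the function is deleted.
        "function_name": function["name"],
        "function_module": function["module"],
    }
    try:
        # Simplified mapping: SAP parameter name -> document field name.
        inputs = {
            param: document[field]
            for param, field in function["input_mapping"].items()
        }
        result = call_sap(function["module"], inputs)
        row.update(status="fetched", data=filter_fields(result), error=None)
    except Exception as exc:
        row.update(status="failed", data=None, error=str(exc))
        function["last_error"] = str(exc)  # surfaced in the library view
    store[key] = row  # upsert: a re-run replaces the earlier row
    return row
```

Because the row is keyed by (document, function), a failed run followed by a successful re-fetch leaves exactly one row for that pair.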
Run a worker with both queues:
php artisan queue:work --queue=default,enrichment
Where results show up
- Document detail — a SAP Metadata section renders one collapsible card per bound enricher with the fetched fields as a key:value tree. Per-card "Fetch" lets you re-run on demand; if the latest result differs from what's stored you'll see an Override button before anything is saved.
- Document export: successful enrichments are nested under a metadata object keyed by enricher name. Pending/failed rows are omitted.
- Deleted functions: if you delete the enricher in the library, stored rows remain visible read-only with a strike-through name and a "Deleted" badge (no Fetch CTA, since the function is gone).
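As a rough picture of the export shape described above, the following Python sketch nests fetched results under a metadata object keyed by enricher name and skips everything else. The row fields (function_name, status, data) and both function names are assumptions made for the example; the real export schema may differ.

```python
def build_export_metadata(rows: list) -> dict:
    """Nest successful enrichment results under their enricher name."""
    return {
        row["function_name"]: row["data"]
        for row in rows
        if row["status"] == "fetched"  # pending and failed rows are omitted
    }


def export_document(document: dict, rows: list) -> dict:
    """Return the document fields plus a 'metadata' object."""
    return {**document, "metadata": build_export_metadata(rows)}
```

A document with one fetched and one failed enrichment would therefore export a metadata object containing a single key, the name of the enricher that succeeded.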
Scope today, scope later
In scope:
- Define + duplicate + rename + activate/deactivate enrichers
- Auto-fire on every new document for the bound repository
- Per-document fetch / re-fetch / override
- Whitelist + blacklist filtering
- Export inclusion
Deferred (not yet built):
- Backfill — running an enricher across already-ingested documents
- Query API: ?metadata.DOC_HEADER.VENDOR=ACME filtering on listing endpoints
- Failure notifications: emailing the workspace when an enricher fails repeatedly
- Public metadata API — exposing structured metadata to customer apps
Next: see the RFC Console guide for the playground + library UI.