The data-driven thesis
xly is sold to many printing-industry customers, each of whom wants the ERP to behave a little differently — different forms, different reports, different approval rules, sometimes different stored procedures. The naive solution is a fork per customer: copy the codebase, modify, deploy. That is unmaintainable past two or three customers.
xly's solution is the opposite: a single codebase, a single deployment,
and per-customer behaviour expressed as data. The application's modules,
forms, fields, dropdowns, permissions, document numbering, even the URL
slugs are all rows in metadata tables (gdsmodule, gdsconfigformmaster,
gdsconfigformslave, gdsroute, gdsjurisdiction, gdsformconst, …).
The runtime is an interpreter. When a request comes in, the framework loads the relevant rows, joins the user's tenant context onto them, and renders the resulting form / list / report on demand. The Java code is generic; the application's behaviour is in the database. PMs (not engineers) own the metadata and therefore own the application.
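A minimal sketch of the interpreter idea, assuming an in-memory store standing in for the metadata tables. The record shapes, the tenantId column, the use of sParentId as the slave-to-master link, and the renderForm method are all hypothetical illustrations, not xly's actual schema or API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: an in-memory stand-in for the metadata tables.
final class MetadataInterpreterSketch {
    // Hypothetical projection of a gdsconfigformmaster row (one form).
    record FormMaster(String sId, String tenantId, String title) {}

    // Hypothetical projection of a gdsconfigformslave row (one field on a form).
    record FormSlave(String sParentId, String tenantId, String fieldName, int displayOrder) {}

    private final List<FormMaster> masters = new ArrayList<>();
    private final List<FormSlave> slaves = new ArrayList<>();

    void addMaster(FormMaster row) { masters.add(row); }
    void addSlave(FormSlave row) { slaves.add(row); }

    // Generic render path: nothing here names a specific form or customer.
    String renderForm(String formId, String tenantId) {
        FormMaster master = masters.stream()
                .filter(m -> m.sId().equals(formId) && m.tenantId().equals(tenantId))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException(
                        "no form " + formId + " for tenant " + tenantId));
        String fields = slaves.stream()
                .filter(s -> s.sParentId().equals(formId) && s.tenantId().equals(tenantId))
                .sorted(Comparator.comparingInt(FormSlave::displayOrder))
                .map(FormSlave::fieldName)
                .collect(Collectors.joining(", "));
        return master.title() + " [" + fields + "]";
    }

    public static void main(String[] args) {
        MetadataInterpreterSketch runtime = new MetadataInterpreterSketch();
        runtime.addMaster(new FormMaster("F1", "tenantA", "Sales Order"));
        runtime.addSlave(new FormSlave("F1", "tenantA", "amount", 2));
        runtime.addSlave(new FormSlave("F1", "tenantA", "customer", 1));
        // Same form id, different tenant, different field set.
        runtime.addMaster(new FormMaster("F1", "tenantB", "Sales Order"));
        runtime.addSlave(new FormSlave("F1", "tenantB", "customer", 1));
        System.out.println(runtime.renderForm("F1", "tenantA")); // Sales Order [customer, amount]
        System.out.println(runtime.renderForm("F1", "tenantB")); // Sales Order [customer]
    }
}
```

The two tenants hit the same renderForm code and get different forms purely because their rows differ, which is the property the thesis relies on.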
The cost
Three costs are baked into this design and worth being explicit about:
- Per-request metadata reads. On a cache miss, every page load runs five queries: gdsconfigformmaster (with personalize/customslave overlays for the matching slave rows), gdsformconst, sysjurisdiction (per-user grants; the map key is named gdsjurisdiction but the table actually read is sysjurisdiction; skipped for ADMIN), sysbillnosettings, and sysreport. The runtime caches aggressively, but those reads are unavoidable on cache miss.
- A schema that won't stop growing. A new module means a row in gdsmodule plus 1-50 rows in gdsconfigformslave plus a backing physical table (often per document type). The base-table count climbs as more business modules are introduced; production tenants typically carry more tables than a clean dev schema, since every customer-bespoke module survives in the shared schema.
- Relationships are conventions, not constraints. With FKs disabled for performance and migration agility, every join from gdsconfigformmaster.sParentId to gdsmodule.sId (and a hundred similar joins) is a semantic FK. Orphan rows are possible.
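Because nothing in the database enforces the parent link, orphan checks have to come from tooling. A minimal sketch, assuming the relevant ids have already been loaded into memory; the class, record, and method names are hypothetical, and only sParentId and sId come from the text above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class OrphanCheckSketch {
    // Hypothetical projection of a gdsconfigformmaster row:
    // its own id and the gdsmodule id it claims as parent.
    record FormMasterRef(String sId, String sParentId) {}

    // Returns the form rows whose sParentId matches no known gdsmodule.sId.
    // Assumption: a null sParentId also counts as an orphan.
    static List<FormMasterRef> findOrphans(Collection<String> moduleIds,
                                           Collection<FormMasterRef> forms) {
        Set<String> known = new HashSet<>(moduleIds);
        List<FormMasterRef> orphans = new ArrayList<>();
        for (FormMasterRef form : forms) {
            if (!known.contains(form.sParentId())) {
                orphans.add(form);
            }
        }
        return orphans;
    }

    public static void main(String[] args) {
        List<FormMasterRef> forms = List.of(
                new FormMasterRef("F1", "M1"),
                new FormMasterRef("F2", "M9")); // M9 does not exist
        System.out.println(findOrphans(Set.of("M1", "M2"), forms));
    }
}
```

The same check written in SQL would be an anti-join (a LEFT JOIN from the child table, keeping rows where the parent id is NULL); either way it is something that has to be run deliberately, since the schema will not reject the bad row.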
What the design enables (and what each enabler still costs)
- One codebase serves dozens of customers. Each customer's tenant has its own metadata rows; the Java is identical. Limit: it doesn't serve all customers. The 18 directories under script/客户/ (see Slice 5) are the wall the data-driven design hits: when a customer needs different procedural logic, "single codebase" stops being true and becomes "single Java codebase + a fan-out of customer-specific SQL the database carries silently".
- PMs evolve the application without engineering time. They open BACK, add a module, define a form, set permissions, and the next user load shows the change. Limit: the PM's effective vocabulary is whatever the gdsconfigformmaster / gdsconfigformslave columns expose. Anything genuinely new (a custom calculation, a non-standard validation, a different save path) requires a stored procedure, which takes engineering time again, just in SQL instead of Java. And PMs without DB access can't reason about why their metadata change produced wrong output, because the procedural side is invisible from BACK.
- Customizations are layered "cleanly" (Slice 4): per-tenant overrides sit on top of the shared base without forking. Limit: the cleanliness is a Java-side property. The runtime merge logic in BusinessBaseServiceImpl is non-trivial (3,900+ lines), and debugging "why does this tenant see field X but not Y" involves chasing through gdsconfigformpersonalize + gdsconfigformcustomslave + gdsconfigformuserslave interactions. And the overlay model can't ALTER TABLE: adding a real new column still needs a coordinated schema migration.
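The layering in the third point can be pictured as a per-field attribute merge. This is a deliberately small sketch, not the logic in BusinessBaseServiceImpl: the attribute names, the map-of-maps representation, and the "later layer wins" precedence are all assumptions made for illustration, since the real precedence between the overlay tables is not described here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OverlayMergeSketch {
    // Field name -> attribute name -> value (e.g. "qty" -> {"label": "Qty"}).
    // Starts from a copy of the base and applies each overlay in order;
    // for any attribute set in several layers, the later layer wins (assumed).
    static Map<String, Map<String, String>> merge(
            Map<String, Map<String, String>> base,
            List<Map<String, Map<String, String>>> overlays) {
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        base.forEach((field, attrs) -> result.put(field, new LinkedHashMap<>(attrs)));
        for (Map<String, Map<String, String>> layer : overlays) {
            // An overlay may mention a field the base lacks; that only adds
            // metadata, it does not create a column in any physical table.
            layer.forEach((field, attrs) ->
                    result.computeIfAbsent(field, k -> new LinkedHashMap<>()).putAll(attrs));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> base =
                Map.of("qty", Map.of("label", "Qty", "visible", "true"));
        Map<String, Map<String, String>> tenantLayer =
                Map.of("qty", Map.of("label", "Quantity"));
        Map<String, Map<String, String>> userLayer =
                Map.of("qty", Map.of("visible", "false"));
        System.out.println(merge(base, List.of(tenantLayer, userLayer)));
    }
}
```

Even in this toy form, answering "why is qty hidden for this user" means inspecting every layer in order, which hints at why the real debugging path through three overlay tables is laborious.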
A more candid reading: the data-driven design shifts complexity out of Java and into the database and the PM-built metadata. The total complexity isn't lower; it's redistributed to people and tools the framework can't compile-check.
When it breaks down
Data-driven works until a customer needs behaviour that can't be expressed
as metadata — different SQL, different procedure body, an aggregation rule
that doesn't fit the framework's vocabulary. xly's response is the
per-customer SQL override channel:
hand-written SQL committed to script/客户/<customer>/ and applied
directly to that customer's schema, bypassing the framework entirely.
It's worth being blunt about what this means. "Bypassing the framework"
means the data-driven thesis holds for only part of the system.
For the 18 customers under script/客户/ the runtime is no longer
single-codebase — the Java is shared but the actual proc bodies
running on each customer's DB diverge, with no automated way to
detect drift. A reviewer reading Sp_SalSalesCheck in source has no
guarantee it's what runs in production for any given customer. The
"escape hatch" framing is generous; in practice the override channel
has become the standard answer for material business-logic
differences, which is the failure mode the data-driven design was
supposed to prevent.
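To make the drift problem concrete, here is a sketch of what a detection check could look like if one were built. It is not an existing xly tool: the class and method names are hypothetical, and it assumes the procedure bodies have already been fetched as strings (how to read them from each customer's database is out of scope).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HexFormat;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ProcDriftSketch {
    // SHA-256 of the body after collapsing whitespace runs, so that
    // formatting-only differences do not register as drift.
    static String fingerprint(String procBody) {
        String normalized = procBody.trim().replaceAll("\\s+", " ");
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(
                    sha.digest(normalized.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Returns, in sorted order, the customers whose deployed body differs
    // from the reference body kept in source control.
    static List<String> driftedCustomers(String referenceBody,
                                         Map<String, String> deployedByCustomer) {
        String expected = fingerprint(referenceBody);
        List<String> drifted = new ArrayList<>();
        for (Map.Entry<String, String> e : new TreeMap<>(deployedByCustomer).entrySet()) {
            if (!fingerprint(e.getValue()).equals(expected)) {
                drifted.add(e.getKey());
            }
        }
        return drifted;
    }
}
```

Whitespace collapsing is crude: it would also hide differences inside SQL string literals, so a real check would need a more careful normalization. The point of the sketch is only that the comparison is cheap once both bodies are in hand; the hard part is that nothing currently collects them.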
What this means for reading the wiki
Every slice in this wiki documents one application of the thesis. Slice 1 is the metadata read on a CRUD module — the canonical instance. Slice 2 is multi-tenant scoping through every layer. Slice 3 is the read-only / view-backed variant. Slice 4 is the customization overlay. Slice 5 is the escape hatch when the overlay isn't enough. Together they cover the data-driven design from its centre out.