The data-driven thesis

xly is sold to many printing-industry customers, each of whom wants the ERP
to behave a little differently — different forms, different reports,
different approval rules, sometimes different stored procedures. The naive
solution is a fork per customer: copy the codebase, modify, deploy. That
is unmaintainable past two or three customers.

xly's solution is the opposite: a single codebase, a single deployment,
and per-customer behaviour expressed as data. The application's modules,
forms, fields, dropdowns, permissions, document numbering, even the URL
slugs are all rows in metadata tables (gdsmodule, gdsconfigformmaster,
gdsconfigformslave, gdsroute, gdsjurisdiction, gdsformconst, …).

The runtime is an interpreter. When a request comes in, the framework
loads the relevant rows, joins the user's tenant context onto them, and
renders the resulting form / list / report on demand. The Java code is
generic; the application's behaviour is in the database. PMs (not
engineers) own the metadata and therefore own the application.
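
The load-then-render loop described above can be sketched in a few lines. This is an illustrative Python sketch, not the framework's Java: only the table names and the sId / sParentId columns come from this page, while sTenantId, sTitle, sFieldName, sLabel, iOrder, and the sample rows are assumptions made up for the example.

```python
import sqlite3

# Hypothetical, heavily simplified schema: only the table names and the
# sId / sParentId columns come from the text; every other column is assumed.
SCHEMA = """
CREATE TABLE gdsconfigformmaster (sId TEXT, sParentId TEXT, sTenantId TEXT, sTitle TEXT);
CREATE TABLE gdsconfigformslave  (sId TEXT, sParentId TEXT, sFieldName TEXT, sLabel TEXT, iOrder INTEGER);
"""

def render_form(conn, module_id, tenant_id):
    """Load one tenant's metadata rows for a module and return a render plan."""
    master = conn.execute(
        "SELECT sId, sTitle FROM gdsconfigformmaster "
        "WHERE sParentId = ? AND sTenantId = ?",
        (module_id, tenant_id),
    ).fetchone()
    if master is None:
        return None  # this tenant has no form configured for the module
    form_id, title = master
    fields = conn.execute(
        "SELECT sFieldName, sLabel FROM gdsconfigformslave "
        "WHERE sParentId = ? ORDER BY iOrder",
        (form_id,),
    ).fetchall()
    return {"title": title,
            "fields": [{"name": name, "label": label} for name, label in fields]}

def demo():
    """Build an in-memory database with two tenants sharing one module."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO gdsconfigformmaster VALUES (?, ?, ?, ?)",
        [("f1", "m_sales", "tenantA", "Sales order"),
         ("f2", "m_sales", "tenantB", "Sales order (B)")],
    )
    conn.executemany(
        "INSERT INTO gdsconfigformslave VALUES (?, ?, ?, ?, ?)",
        [("s1", "f1", "sCustomer", "Customer", 1),
         ("s2", "f1", "dQty", "Quantity", 2),
         ("s3", "f2", "sCustomer", "Client", 1)],
    )
    return conn
```

The same render_form code produces a two-field form for tenantA and a one-field form for tenantB; the difference lives entirely in the rows, which is the point of the design.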


The cost

Three costs are baked into this design and worth being explicit about:


Per-request metadata reads. On a cache miss, a page load runs five
queries: gdsconfigformmaster (with personalize/customslave overlays
for the matching slave rows), gdsformconst, sysjurisdiction,
sysbillnosettings, and sysreport. The sysjurisdiction read fetches
per-user grants and is skipped for ADMIN; the map key is named
gdsjurisdiction, but the table actually read is sysjurisdiction. The
runtime caches aggressively, but these reads are unavoidable whenever
the cache misses.

A schema that won't stop growing. Each new module adds a row in
gdsmodule, 1 to 50 rows in gdsconfigformslave, and a backing physical
table (often one per document type). The base-table count climbs as
more business modules are introduced, and production tenants
typically carry more tables than a clean dev schema, because every
customer-bespoke module survives in the shared schema.

Relationships are conventions, not constraints. With FKs disabled for
performance and migration agility, every join from
gdsconfigformmaster.sParentId to gdsmodule.sId (and a hundred similar
joins) is a foreign key in meaning only: nothing in the database
enforces it, so orphan rows are possible.
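
Because the database does not enforce these relationships, orphans have to be looked for explicitly. A minimal sketch of such a check, assuming nothing beyond the gdsconfigformmaster.sParentId to gdsmodule.sId convention named above (the sample rows and the function names are hypothetical, and SQLite stands in for the real database):

```python
import sqlite3

def find_orphan_forms(conn):
    """Return the sId of every form row whose sParentId matches no gdsmodule.sId."""
    rows = conn.execute(
        "SELECT f.sId FROM gdsconfigformmaster AS f "
        "LEFT JOIN gdsmodule AS m ON m.sId = f.sParentId "
        "WHERE m.sId IS NULL ORDER BY f.sId"
    ).fetchall()
    return [row[0] for row in rows]

def demo_orphans():
    """Seed one valid form and one form pointing at a module that no longer exists."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE gdsmodule (sId TEXT);
        CREATE TABLE gdsconfigformmaster (sId TEXT, sParentId TEXT);
        INSERT INTO gdsmodule VALUES ('m1');
        INSERT INTO gdsconfigformmaster VALUES ('f1', 'm1');
        INSERT INTO gdsconfigformmaster VALUES ('f2', 'm_deleted');
    """)
    return find_orphan_forms(conn)
```

The same LEFT JOIN ... IS NULL pattern would apply to each of the other convention-only joins, one query per relationship.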


What the design enables (and what each enabler still costs)


One codebase serves dozens of customers. Each customer's tenant
has its own metadata rows; the Java is identical. — Limit: it
doesn't serve all customers. The 18 directories under
script/客户/ (see Slice 5)
are the wall the data-driven design hits — when a customer needs
different procedural logic, "single codebase" stops being true and
becomes "single Java codebase + a fan-out of customer-specific SQL
the database carries silently".

PMs evolve the application without engineering time. They open
BACK, add a module, define a form, set permissions, and the next user
load shows the change. — Limit: the PM's effective vocabulary is
whatever gdsconfigformmaster / gdsconfigformslave columns
expose. Anything genuinely new (a custom calculation, a non-standard
validation, a different save path) requires a stored procedure —
which takes engineering time again, just in SQL instead of Java. And
PMs without DB access can't reason about why their metadata change
produced wrong output, because the procedural side is invisible from
BACK.

Customizations are layered "cleanly" (Slice 4):
per-tenant overrides sit on top of the shared base without forking.
— Limit: the cleanliness is a Java-side property. The runtime
merge logic in BusinessBaseServiceImpl is non-trivial (3,900+
lines), and debugging "why does this tenant see field X but not Y"
means tracing the interactions between gdsconfigformpersonalize,
gdsconfigformcustomslave, and gdsconfigformuserslave.
And the overlay model can't ALTER TABLE — adding a real new
column still needs a coordinated schema migration.
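
To make the overlay idea concrete, here is a small Python sketch of a layered merge plus a helper that answers the "which layer set this?" debugging question. It is not the logic in BusinessBaseServiceImpl: the precedence order (later layers win), the attribute names label and visible, and the sample fields are all assumptions for illustration.

```python
def merge_field(base, *overlays):
    """Apply overlays in order; later layers win, None means 'no override'."""
    merged = dict(base)
    for layer in overlays:
        for key, value in layer.items():
            if value is not None:
                merged[key] = value
    return merged

def effective_fields(base_rows, *overlay_layers):
    """Merge each base field with its overrides.

    base_rows and every layer map a field name to an attribute dict.
    Fields that exist only in an overlay are ignored here, a simplification
    loosely mirroring the point that overlays cannot add a real column.
    """
    return {
        name: merge_field(base, *[layer.get(name, {}) for layer in overlay_layers])
        for name, base in base_rows.items()
    }

def explain(field, attr, base_rows, named_layers):
    """Name the last layer that set `attr` on `field`, or None if nothing did."""
    source = "base" if attr in base_rows.get(field, {}) else None
    for layer_name, layer in named_layers:
        if layer.get(field, {}).get(attr) is not None:
            source = layer_name
    return source

# Hypothetical sample data.
BASE = {"sCustomer": {"label": "Customer", "visible": True},
        "dDiscount": {"label": "Discount", "visible": True}}
CUSTOM = {"dDiscount": {"visible": False}}   # stands in for gdsconfigformcustomslave
USER = {"sCustomer": {"label": "Client"}}    # stands in for gdsconfigformuserslave
```

Even in this toy version, answering why dDiscount is hidden requires knowing every layer and its order; explain("dDiscount", "visible", BASE, [...]) is the kind of trace that otherwise has to be done by hand.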


A more candid reading: the data-driven design shifts complexity
out of Java and into the database and the PM-built metadata. The
total complexity isn't lower; it's redistributed to people and tools
the framework can't compile-check.


When it breaks down

Data-driven works until a customer needs behaviour that can't be expressed
as metadata — different SQL, different procedure body, an aggregation rule
that doesn't fit the framework's vocabulary. xly's response is the
per-customer SQL override channel:
hand-written SQL committed to script/客户/<customer>/ and applied
directly to that customer's schema, bypassing the framework entirely.

It's worth being blunt about what this means. "Bypassing the framework"
makes the entire data-driven thesis a partial property of the system.
For the 18 customers under script/客户/ the runtime is no longer
single-codebase — the Java is shared but the actual proc bodies
running on each customer's DB diverge, with no automated way to
detect drift. A reviewer reading Sp_SalSalesCheck in source has no
guarantee it's what runs in production for any given customer. The
"escape hatch" framing is generous; in practice the override channel
has become the standard answer for material business-logic
differences, which is the failure mode the data-driven design was
supposed to prevent.
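
For a sense of what automated drift detection could look like, the sketch below fingerprints procedure bodies and compares the copy in source with the copy deployed for one customer. Nothing like this is described as existing in xly; the function names, the whitespace-only normalization, and the Sp_Other sample procedure are assumptions, and how the deployed bodies would be fetched from each customer's database is deliberately left out.

```python
import hashlib
import re

def fingerprint(sql_text):
    """Hash a procedure body after collapsing runs of whitespace.

    Whitespace-only normalization is a crude assumption: it ignores
    reformatting but treats any other textual change as drift.
    """
    normalized = re.sub(r"\s+", " ", sql_text).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_drift(source_procs, deployed_procs):
    """Compare {name: body} maps; return names missing or different when deployed."""
    drifted = []
    for name, body in source_procs.items():
        deployed = deployed_procs.get(name)
        if deployed is None or fingerprint(deployed) != fingerprint(body):
            drifted.append(name)
    return sorted(drifted)

# Hypothetical sample bodies; Sp_Other is an invented name.
SOURCE = {"Sp_SalSalesCheck": "BEGIN\n  SELECT 1;\nEND",
          "Sp_Other": "BEGIN SELECT 2; END"}
DEPLOYED = {"Sp_SalSalesCheck": "BEGIN SELECT 1; END",
            "Sp_Other": "BEGIN SELECT 3; END"}
```

Run per customer, a report like find_drift(SOURCE, DEPLOYED) would at least tell a reviewer which procedures in source cannot be trusted to match production.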


What this means for reading the wiki

Every slice in this wiki documents one application of the thesis. Slice 1
is the metadata read on a CRUD module, the canonical instance. Slice 2 is
multi-tenant scoping through every layer. Slice 3 is the read-only /
view-backed variant. Slice 4 is the customization overlay. Slice 5 is the
escape hatch when the overlay isn't enough. Together they cover the
data-driven design from its centre out.