Migrating from Excel to a Custom ERP: Data Mapping, Validation, and Cutover Strategy

Photo by Unsplash

Photo by Unsplash
For most Indonesian SMEs, the starting point for an ERP project is a collection of Excel files — one for customers, one for vendors, one for each year's invoices, and probably several versions of each with names like 'Final_v3_FINAL_USE_THIS.xlsx'. Migrating this data into a new ERP system is one of the most underestimated and most critical tasks in the entire project. Get it wrong and you go live with corrupted data that erodes user trust immediately. Get it right and users arrive on day one to find their familiar customer list, invoice history, and open balances already in the system.
Before writing a single line of migration code, spend at least one week auditing the source Excel files. The audit should answer: how many unique customers exist across all files (with deduplication), what date formats are used, what fields are required vs optional, what values appear in 'status' columns, and what cross-references exist between sheets.
The most common data quality issues in Indonesian business Excel files are: NPWP numbers stored as text with dots and dashes needing normalization to 15 digits; dates entered as Indonesian text strings ('15 Maret 2023') that standard date libraries can't parse; duplicate customer records with slightly different names ('PT Maju Bersama' vs 'PT. Maju Bersama'); currency amounts stored as text with 'Rp' prefix; and merged cells causing rows to appear empty.
Create a mapping document that shows every Excel column, its target ERP field, the transformation required, and validation rules. This document becomes the specification for your migration script and the agreement between the technical team and business stakeholders about what data is moving and how. For each target field, document whether it's required or optional, its data type, allowed values (for enums), and what happens when the source value is missing or invalid.
Excel → Custom ERP Migration Pipeline
┌─────────────────┐
│ SOURCE: Excel │ Multiple sheets, ad-hoc formulas,
│ (.xlsx files) │ merged cells, inconsistent formats
└────────┬────────┘
│ Step 1: Extract & Parse
▼
┌─────────────────────────────────────┐
│ EXTRACT (Node.js / exceljs) │
│ • Read all sheets │
│ • Strip formatting, get raw values │
│ • Map columns → field names │
└────────────────┬────────────────────┘
│ Step 2: Validate
▼
┌─────────────────────────────────────┐
│ VALIDATE (Zod schemas) │
│ • Required fields present? │
│ • Date formats parseable? │
│ • FK references exist? (customer) │
│ • Duplicate invoice numbers? │
│ • Amount totals match? │
└──────┬──────────────────────┬───────┘
│ PASS │ FAIL
▼ ▼
┌─────────────┐ ┌─────────────────┐
│ TRANSFORM │ │ ERROR REPORT │
│ & LOAD │ │ (.xlsx output) │
│ PostgreSQL │ │ Row + reason │
└─────────────┘ └─────────────────┘
Step 3: Cutover
┌────────────────────────────────────┐
│ Parallel run: Excel + ERP (2 wk) │
│ → Reconcile totals daily │
│ → Hard cutover on month-end │
└────────────────────────────────────┘Run your migration script against the real Excel files at least three times before go-live: once at the start of data preparation to establish a baseline, once after data cleaning to measure improvement, and once on go-live day with the final data export. Each run should produce an error report — the goal is to drive that report to zero rows.
We use ExcelJS for reading Excel files in Node.js — it handles merged cells, rich text, formula values, and all the quirks of real-world Excel files. Zod provides schema validation with TypeScript type inference — you define the expected shape of each row, and Zod tells you exactly which field failed and why for each invalid row.
The migration script should be idempotent — you should be able to run it multiple times against the same database without creating duplicates. Use TypeORM's upsert with a unique conflict key so that reruns update existing records rather than creating new ones. Wrap the bulk insert in a database transaction so that if any row fails, the entire batch is rolled back.
// NestJS: Excel import service using exceljs + Zod validation
import * as ExcelJS from 'exceljs';
import { z } from 'zod';
const CustomerRowSchema = z.object({
customerCode: z.string().min(1),
name: z.string().min(1),
taxId: z.string().regex(/^d{15,16}$/, 'NPWP must be 15-16 digits').optional(),
creditLimit: z.coerce.number().min(0),
paymentTerms: z.enum(['NET_7', 'NET_14', 'NET_30', 'NET_45', 'NET_60']),
phone: z.string().optional(),
email: z.string().email().optional(),
});
@Injectable()
export class ExcelImportService {
async importCustomers(buffer: Buffer): Promise<ImportResult> {
const workbook = new ExcelJS.Workbook();
await workbook.xlsx.load(buffer);
const sheet = workbook.getWorksheet('Customers');
const errors: ImportError[] = [];
const valid: CustomerRow[] = [];
const HEADER_ROW = 1;
sheet.eachRow({ includeEmpty: false }, (row, rowNumber) => {
if (rowNumber === HEADER_ROW) return; // skip header
const rawData = {
customerCode: row.getCell('A').text.trim(),
name: row.getCell('B').text.trim(),
taxId: row.getCell('C').text.trim() || undefined,
creditLimit: row.getCell('D').value,
paymentTerms: row.getCell('E').text.trim().toUpperCase(),
phone: row.getCell('F').text.trim() || undefined,
email: row.getCell('G').text.trim() || undefined,
};
const result = CustomerRowSchema.safeParse(rawData);
if (result.success) {
valid.push(result.data);
} else {
errors.push({
row: rowNumber,
data: rawData,
issues: result.error.issues.map(i => i.message),
});
}
});
// Bulk insert valid rows in a transaction
if (valid.length > 0) {
await this.dataSource.transaction(async manager => {
await manager.upsert(Customer, valid, ['customerCode']);
});
}
return { imported: valid.length, failed: errors.length, errors };
}
}Historical data comes in two flavors: closed records (paid invoices, completed leave requests) and open records (unpaid invoices, pending approvals). Open records need to be imported at exactly the right state so that the automation picks them up correctly. The most dangerous migration mistake is importing an overdue invoice as DRAFT status — it won't trigger any reminders.
Never delete the source Excel files after migration. Archive them in read-only storage and keep them for at least 12 months after go-live. Users will inevitably find discrepancies between the ERP and what they remember from Excel, and you'll need the original files to investigate. Also, before go-live, take a full PostgreSQL database dump and store it offline — if something catastrophic happens in the first 48 hours, you need a clean rollback point.
After the initial migration, run the old Excel system and the new ERP in parallel for at least 2 weeks before the hard cutover. During parallel running, transactions are entered in both systems. At the end of each week, reconcile key totals between the two systems. Any discrepancy triggers an investigation before cutover.
Schedule go-live for a Friday afternoon or a quiet mid-month day. Final steps: export the last Excel data on Thursday evening, run the migration script one final time Friday morning, verify record counts match, run reconciliation totals, have the finance manager sign off, disable Excel file editing, send company-wide announcement, and have the implementation team on-site or on call for the full first week.