Microsoft Access databases are everywhere in Indonesian SMEs. They were built in the late 2000s by someone who 'knew a bit about computers,' they run critical business functions, and nobody is entirely sure how they work anymore. The person who built them often left years ago. The VBA code has no documentation. The data model has evolved organically into something that was never designed but just grew. Yet the business depends on it every day. Access databases have a 2GB file size limit, no concurrent multi-user support beyond a handful of users, no API, no audit trail, and security that amounts to a file system password. Migrating to a modern custom ERP is overdue for most businesses still running Access — and it's a project with a specific, manageable set of challenges.
Before planning the migration, audit the Access system in detail. What tables exist and what do they contain? What forms and reports are used, and by whom, how often? What queries exist and what business logic do they implement? Are there any VBA modules with custom calculations or validation rules? What integrates with or depends on the Access database — other spreadsheets that reference it, print templates, external reports? This audit takes 1–2 weeks and is non-negotiable. Without it, you'll discover critical dependencies halfway through the migration when they're expensive to handle.
Extracting data from Access is technically straightforward — Access databases are MDB or ACCDB files that can be read by any ODBC-compatible tool. In Python, use pyodbc or the msaccessdb library to connect and read tables. Export all tables to CSV or directly to a PostgreSQL staging database. The challenge is not the extraction — it's what the data looks like once extracted. Access data commonly has: text fields used for numeric values (amounts stored as text with currency symbols), date fields with multiple formats, yes/no fields stored as -1/0 (the Access convention), memo fields with embedded line breaks, and null values that are actually empty strings.
Access VBA code often contains critical business logic that must be replicated in the new system. Calculation formulas, validation rules, data transformation functions, report generation logic — all of this needs to be understood and re-implemented. The challenge is that VBA code is often undocumented and written in an ad-hoc style that reflects the original developer's thought process, not a clear business rule. For each VBA module or function, document what it does in plain language, get validation from a business user ('does this function correctly calculate the late payment penalty?'), and then re-implement it in the new system with proper tests.
Access Database → Modern ERP Migration Pipeline
┌─────────────────────────────────────────────────────────────┐
│ STAGE 1: AUDIT │
│ ● List all tables, forms, reports, VBA modules │
│ ● Interview 5+ users: "What do you do in Access every day?"│
│ ● Map each form → business workflow │
│ ● Identify custom VBA logic (pricing, tax, calculations) │
└────────────────────────────┬────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ STAGE 2: EXTRACT (Python + pyodbc) │
│ ● Read all .mdb/.accdb tables into staging PostgreSQL DB │
│ ● Preserve original column names for traceability │
│ ● Profile: NULL counts, duplicates, type mismatches │
└────────────────────────────┬────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ STAGE 3: TRANSFORM │
│ ● Standardize date formats (dd/mm/yyyy → ISO 8601) │
│ ● Convert Yes/No → boolean (Access uses -1/0) │
│ ● Deduplicate records (keep most recent by date) │
│ ● Map old codes → new ERP codes │
│ ● Validate referential integrity │
│ ● Flag rows needing business owner review │
└────────────────────────────┬────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ STAGE 4: VALIDATE │
│ ● Business data steward reviews flagged records │
│ ● Record count check: source = target (±intentional drops) │
│ ● Financial balance check: AR, AP, inventory totals match │
│ ● Spot check: 20+ records per entity type manually verified│
└────────────────────────────┬────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ STAGE 5: PARALLEL RUN (4 weeks minimum) │
│ ● Both Access and ERP running simultaneously │
│ ● Compare identical transactions from both systems daily │
│ ● Access = read-only after go-live decision │
└────────────────────────────┬────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ STAGE 6: ARCHIVE + SHUTDOWN │
│ ● Archive Access .mdb file in secure storage │
│ ● Set hard shutdown date (Day 30–60 post go-live) │
│ ● Remove Access from shared drives │
└─────────────────────────────────────────────────────────────┘From my experience implementing ERPs at Commsult: when migrating from Access, run the old system and new ERP in parallel for at least 4 weeks, comparing outputs for the same set of transactions. Generate the same reports from both systems and diff them line by line. Any discrepancy is either a migration bug or a case where the old system was wrong — and you need a business decision on which interpretation is correct before go-live.
A solid Access-to-ERP migration pipeline has these stages. Stage 1 — Extract: Python script reads all Access tables via pyodbc, exports to a staging PostgreSQL database with exact column names and types preserved. Stage 2 — Profile: SQL queries identify data quality issues in the staging database (null counts, distinct value counts, min/max ranges, type violations). Stage 3 — Transform: data cleansing scripts fix quality issues and map Access schema to ERP schema (renaming columns, splitting combined fields, converting types, merging duplicate records). Stage 4 — Validate: business data stewards verify transformed data against the Access source. Stage 5 — Load: migration script inserts validated data into the production ERP database via the application import API.
# Python: Extract all tables from Microsoft Access .mdb/.accdb
import pyodbc
import pandas as pd
from sqlalchemy import create_engine
# Connect to Access database (Windows ODBC driver required)
ACCESS_CONN = (
r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
r"DBQ=C:\legacy\commsult_legacy.accdb;"
)
conn = pyodbc.connect(ACCESS_CONN)
cursor = conn.cursor()
# List all user tables (exclude system tables)
tables = [row.table_name for row in cursor.tables(tableType='TABLE')
if not row.table_name.startswith('MSys')]
print(f"Found {len(tables)} tables: {tables}")
# PostgreSQL staging database
pg_engine = create_engine(
"postgresql://user:pass@localhost:5432/erp_staging"
)
for table in tables:
df = pd.read_sql(f"SELECT * FROM [{table}]", conn)
# Fix Access-specific type issues
for col in df.columns:
if df[col].dtype == object:
# Handle Yes/No fields stored as True/False
if set(df[col].dropna().unique()).issubset({True, False, -1, 0}):
df[col] = df[col].map({True: True, -1: True, False: False, 0: False})
# Load to staging (preserve Access schema for reference)
df.to_sql(f"access_{table.lower()}", pg_engine,
if_exists='replace', index=False)
print(f"✓ Loaded {len(df):,} rows from [{table}]")
# Profile data quality issues
with pg_engine.connect() as pg_conn:
for table in tables:
tbl = f"access_{table.lower()}"
result = pg_conn.execute(f"""
SELECT
'{table}' as table_name,
COUNT(*) as total_rows,
COUNT(*) - COUNT(DISTINCT *) as potential_duplicates
FROM {tbl}
""").fetchone()
print(f"{result}")
conn.close()The people who use Access every day have built their expertise around it. They know the keyboard shortcuts, the form layouts, the report locations. Replacing that knowledge base is a real loss, and it deserves acknowledgment. The transition communication should be honest: the new system is more capable, more secure, and more maintainable. The first weeks will be slower as you learn the new workflows. Your current expertise is still valuable — the business logic you know will inform how we build the new system. Involving key Access power users in UAT — letting them test the new system and compare it to Access — converts potential resisters into advocates.
The temptation after a successful ERP go-live is to keep the Access database running 'just in case.' This is how shadow systems persist for years. Set a hard shutdown date for the Access database — 30–60 days after ERP go-live — and communicate it clearly to all users before go-live. After shutdown, archive the Access MDB file in secure storage. Keeping it accessible as a working system means users will use it when the ERP workflow seems slower or unfamiliar, and parallel data entry begins. Parallel data entry is the end of your data quality and the beginning of reconciliation hell.
Access reports are often the most business-critical output of the system, and they're the hardest to migrate because report formats are highly specific to business needs. Inventory every Access report: name, who uses it, how often, what decisions it supports. For each report, determine whether it can be replicated in the ERP's built-in reporting, whether it needs a custom report built in the ERP, or whether it should be replaced by a better report that the ERP makes possible. Don't try to replicate every Access report exactly in the new system — this is an opportunity to consolidate 20 ad-hoc reports into 5 well-designed dashboards.
The first 90 days after migrating from Access to ERP reveal any migration gaps that testing didn't catch. Run the Access system in read-only mode for the first 30 days (no new entries, but the data is still accessible for reference). Monitor for data discrepancies — if a user asks 'why does the ERP show 15 units but Access shows 17?' that's a migration bug that needs investigation. Track all data discrepancy reports in a log and resolve them within 48 hours. By day 60, all known discrepancies should be resolved and the Access system can be safely archived.