No ERP exists in isolation. Every business has an ecosystem of systems that must exchange data with the ERP: a payment gateway for AR collections, an email provider for automated notifications, a government tax portal for e-faktur submission, a banking API for payment processing, a logistics API for delivery tracking, and a legacy accounting system during the migration period. Getting integrations wrong causes data corruption, duplicate transactions, and missed events, all of which are expensive to diagnose and even more expensive to fix in a production system. This post covers how to design ERP integration architecture that's reliable, maintainable, and easy to debug.
Enterprise integration has evolved from point-to-point APIs to event-driven architectures to modern iPaaS platforms. For a custom ERP serving an Indonesian SME, three patterns cover the vast majority of integration needs.
Direct API integration: the ERP calls an external API synchronously and waits for the response. Suitable for real-time queries (checking payment status, verifying a tax ID) and actions where the user expects an immediate response (payment initiation).
Event-driven integration: the ERP publishes an event (invoice_created, payment_received) and one or more downstream systems consume it. Suitable for notifications, logging, analytics, and any integration where the downstream system doesn't need to respond to the ERP.
Queue-based async processing: the ERP puts a job in a queue, and a worker processes it and handles retries on failure. Suitable for email sending, PDF generation, and external API calls that might fail.
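To make the event-driven pattern concrete, here is a minimal in-process sketch. The `EventBus` class and its method names are hypothetical, invented for illustration; a production ERP would more likely publish to a message broker. The point it shows is that the publisher does not know or wait for its consumers.

```typescript
// Events the ERP publishes; the names come from the examples in the text.
type ErpEvent =
  | { type: "invoice_created"; invoiceId: string }
  | { type: "payment_received"; invoiceId: string; amount: number };

type ErpEventHandler = (event: ErpEvent) => void;

// Hypothetical in-memory bus, for illustration only.
class EventBus {
  private readonly handlers = new Map<ErpEvent["type"], ErpEventHandler[]>();

  // Register a downstream consumer for one event type.
  subscribe(type: ErpEvent["type"], handler: ErpEventHandler): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  // Deliver the event to every subscriber; returns how many were notified.
  publish(event: ErpEvent): number {
    const list = this.handlers.get(event.type) ?? [];
    for (const handler of list) {
      handler(event);
    }
    return list.length;
  }
}

const bus = new EventBus();
const auditLog: string[] = [];
bus.subscribe("invoice_created", (e) => auditLog.push(`audit:${e.invoiceId}`));
bus.subscribe("invoice_created", (e) => auditLog.push(`notify:${e.invoiceId}`));
const notified = bus.publish({ type: "invoice_created", invoiceId: "INV-001" });
```

Adding a third consumer later means one more `subscribe` call; the code that publishes `invoice_created` does not change.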
Direct API integration is the most common pattern and the most fragile. If the external API is down, your ERP request fails. If the network is slow, your user waits. If you call the external API twice due to a timeout retry, you may create a duplicate transaction. Reliable integration design solves these problems with three techniques: idempotency keys (each API request includes a unique ID that prevents duplicate processing), circuit breakers (stop retrying a failing API after N consecutive failures, fail fast and alert), and async fallback (convert synchronous calls to async queued jobs when latency or reliability is a concern).
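As an illustration of the circuit breaker technique, the sketch below tracks consecutive failures and fails fast while the circuit is open. The class name, the cooldown parameter, and the half-open trial request are assumptions of this example, not something prescribed above; the clock is injectable only so the behavior is easy to test.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Minimal circuit breaker sketch; thresholds and names are illustrative.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";

  constructor(
    private readonly failureThreshold: number, // N consecutive failures
    private readonly cooldownMs: number, // how long to fail fast before a trial call
    private readonly now: () => number = () => Date.now(),
  ) {}

  getState(): BreakerState {
    // Once the cooldown has elapsed, allow a single trial request.
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  canRequest(): boolean {
    return this.getState() !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial request, or N consecutive failures, opens the circuit.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  // Wrap an external call: fail fast when open, otherwise record the outcome.
  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (!this.canRequest()) {
      throw new Error("Circuit open: failing fast");
    }
    try {
      const result = await fn();
      this.recordSuccess();
      return result;
    } catch (err) {
      this.recordFailure();
      throw err;
    }
  }
}
```

In practice the place where the circuit opens is also where you would raise the alert, so the team learns about the outage from the breaker rather than from users.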
For an ERP built on NestJS, BullMQ (Redis-backed job queue) is the standard choice for async integration processing. The pattern: when the ERP needs to call an external API (send an email, generate a PDF, submit an e-faktur), instead of calling the API directly, the ERP creates a job in the BullMQ queue and returns immediately to the user. A worker process picks up the job, executes the API call, handles failures with configurable retry logic (exponential backoff, maximum retry count), and logs the result to the audit trail. The user gets immediate confirmation that the action was accepted; the external call happens in the background.
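To see what a retry configuration means in wall-clock terms, the helper below computes the waits between attempts. It assumes BullMQ's built-in exponential strategy, which its documentation describes as 2^(attempts - 1) × delay; the function name is hypothetical and the helper is not part of BullMQ.

```typescript
// Waits (in ms) between attempts for an exponential backoff configuration.
// Assumes the delay after the k-th failed attempt is 2^(k - 1) * baseDelayMs.
function exponentialBackoffDelays(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  // There is no wait after the final attempt, so stop at attempts - 1.
  for (let attemptsMade = 1; attemptsMade < attempts; attemptsMade++) {
    delays.push(Math.round(Math.pow(2, attemptsMade - 1) * baseDelayMs));
  }
  return delays;
}

// attempts: 3, delay: 1000 -> [1000, 2000]
const threeAttemptDelays = exponentialBackoffDelays(3, 1000);
// attempts: 5, delay: 2000 -> [2000, 4000, 8000, 16000]
const fiveAttemptDelays = exponentialBackoffDelays(5, 2000);
// Worst-case time spent waiting before the job is marked failed: 30000 ms
const fiveAttemptTotalMs = fiveAttemptDelays.reduce((sum, d) => sum + d, 0);
```

Summing the delays is a quick way to check that a job cannot sit in retry longer than the business process can tolerate.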
ERP Integration Architecture — Queue-Based Async
┌──────────────────────────────────────────────────────────────┐
│                   ERP Application (NestJS)                   │
│                                                              │
│   User Action → Service → BullMQ Queue → Response to User    │
│                                ↓                             │
│                         ┌──────────┐                         │
│                         │  Redis   │ (job storage)           │
│                         └──────┬───┘                         │
└────────────────────────────────┼─────────────────────────────┘
                                 ↓
┌──────────────────────────────────────────────────────────────┐
│                       Worker Processes                       │
│                                                              │
│   ┌──────────────┐   ┌──────────────┐    ┌──────────────────┐│
│   │ Email Worker │   │  PDF Worker  │    │ e-Faktur Worker  ││
│   │  (SendGrid)  │   │ (Puppeteer)  │    │    (DJP API)     ││
│   └──────┬───────┘   └──────┬───────┘    └────────┬─────────┘│
│          │                  │                     │          │
│   ┌──────▼──────────────────▼─────────────────────▼─────────┐│
│   │              Circuit Breaker + Retry Logic              ││
│   │  Attempt 1 → wait 1s → Attempt 2 → wait 2s → Attempt 3  ││
│   │  On 3 failures: route to Dead Letter Queue (DLQ)        ││
│   └─────────────────────────────────────────────────────────┘│
└──────────────────────────────────────────────────────────────┘
EXTERNAL SYSTEMS:
┌─────────────┐ ┌─────────────┐ ┌──────────────┐ ┌────────┐
│  Midtrans   │ │  SendGrid   │ │  DJP e-Faktur│ │  BPJS  │
│ (payments)  │ │   (email)   │ │  (tax portal)│ │(social)│
└─────────────┘ └─────────────┘ └──────────────┘ └────────┘
Dead Letter Queue review: check daily.
DLQ items = integration broken or data quality issue.
From my experience implementing ERPs at Commsult: always implement a dead letter queue (DLQ) for failed integration jobs. Jobs that exhaust their retry attempts should go into a DLQ with full context: the job payload, the error messages from each attempt, and the timestamp. Review the DLQ daily. A pattern of failures in the DLQ often reveals an API contract change, an authentication expiry, or a data quality issue that isn't visible in application logs.
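One way to make the daily DLQ review faster is to group entries by their final error, so that a recurring failure stands out from one-off noise. The sketch below is library-independent; the `DeadLetterEntry` shape and the function name are assumptions chosen to mirror the "full context" described above.

```typescript
// Hypothetical shape of a DLQ record: payload, per-attempt errors, timestamp.
interface DeadLetterEntry {
  queue: string;
  jobId: string;
  payload: Record<string, unknown>;
  errors: string[]; // one message per attempt, oldest first
  failedAt: string; // ISO timestamp of the final failure
}

// Count DLQ entries per (queue, last error message) pair.
function summarizeDeadLetters(entries: DeadLetterEntry[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const entry of entries) {
    const lastError = entry.errors[entry.errors.length - 1] ?? "unknown error";
    const key = `${entry.queue}: ${lastError}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Example data is made up for illustration.
const sampleDlq: DeadLetterEntry[] = [
  { queue: "efaktur-queue", jobId: "1", payload: { invoiceId: "A" }, errors: ["timeout", "401 Unauthorized"], failedAt: "2024-01-01T01:00:00Z" },
  { queue: "efaktur-queue", jobId: "2", payload: { invoiceId: "B" }, errors: ["401 Unauthorized"], failedAt: "2024-01-01T01:05:00Z" },
  { queue: "email-queue", jobId: "3", payload: { invoiceId: "C" }, errors: ["invalid recipient"], failedAt: "2024-01-01T02:00:00Z" },
];
const dlqSummary = summarizeDeadLetters(sampleDlq);
```

In this made-up sample, two e-Faktur jobs end with the same authentication error, which points to an expired credential rather than two unrelated failures.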
Your ERP's integration API defines how external systems can read data from and write data to the ERP. Design it with these principles: versioning (prefix all API endpoints with /api/v1/ to allow breaking changes without disrupting existing integrations), idempotency (POST endpoints for creating resources should accept a client-provided idempotency key that prevents duplicate creation), pagination (never return unbounded lists — all list endpoints must support pagination parameters), and authentication (machine-to-machine integrations use API keys or service account JWTs, not user credentials). Document every endpoint in an OpenAPI specification before implementation — this allows parallel development by both the ERP team and integration partners.
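Two of these principles, idempotency and pagination, are easy to get subtly wrong, so here is a minimal in-memory sketch of both. The `InvoiceStore` class, its field names, and the page-size cap of 100 are assumptions for illustration; a real implementation would persist idempotency keys in the database, ideally in the same transaction as the created resource.

```typescript
interface Invoice {
  id: string;
  customerId: string;
  amount: number;
}

// Hypothetical store showing idempotent creation and bounded listing.
class InvoiceStore {
  private readonly invoices: Invoice[] = [];
  private readonly byIdempotencyKey = new Map<string, Invoice>();

  // A repeated request with the same key returns the original invoice
  // instead of creating a duplicate.
  create(idempotencyKey: string, input: Omit<Invoice, "id">): { invoice: Invoice; replayed: boolean } {
    const existing = this.byIdempotencyKey.get(idempotencyKey);
    if (existing) {
      return { invoice: existing, replayed: true };
    }
    const invoice: Invoice = { id: `inv-${this.invoices.length + 1}`, ...input };
    this.invoices.push(invoice);
    this.byIdempotencyKey.set(idempotencyKey, invoice);
    return { invoice, replayed: false };
  }

  // Never return an unbounded list: clamp the page size (cap is an assumption).
  list(page: number, pageSize: number) {
    const size = Math.min(Math.max(pageSize, 1), 100);
    const current = Math.max(page, 1);
    const start = (current - 1) * size;
    return {
      items: this.invoices.slice(start, start + size),
      page: current,
      pageSize: size,
      total: this.invoices.length,
    };
  }
}
```

A client that times out and retries `create` with the same key gets `replayed: true` and the original invoice back, which is the behavior that prevents duplicate transactions.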
// NestJS: BullMQ integration service with a dead letter queue (DLQ)
import { createHash, timingSafeEqual } from 'node:crypto';
import {
  Body,
  Controller,
  Injectable,
  Logger,
  Post,
  UnauthorizedException,
} from '@nestjs/common';
import { InjectQueue, Processor, WorkerHost } from '@nestjs/bullmq';
import { Job, Queue } from 'bullmq';
// InvoiceService, PdfService, EmailProvider, EmailTemplates, AuditService,
// PaymentService and MidtransWebhookPayload are application-specific and are
// assumed to be defined elsewhere in the ERP codebase.

// Integration queue definitions
const QUEUES = {
  EMAIL: 'email-queue',
  PDF: 'pdf-queue',
  EFAKTUR: 'efaktur-queue',
  WEBHOOK: 'webhook-queue',
} as const;

// Service: adds jobs to the queue (async, returns immediately to the user)
@Injectable()
export class IntegrationService {
  constructor(
    @InjectQueue(QUEUES.EMAIL) private readonly emailQueue: Queue,
    @InjectQueue(QUEUES.EFAKTUR) private readonly efakturQueue: Queue,
  ) {}

  async sendInvoiceEmail(invoiceId: string, recipientEmail: string) {
    await this.emailQueue.add(
      'send-invoice-email',
      { invoiceId, recipientEmail },
      {
        attempts: 3,
        backoff: { type: 'exponential', delay: 1000 }, // waits 1s, then 2s
        removeOnComplete: { age: 86400 }, // keep 24 hours for debugging
        removeOnFail: false, // DLQ: keep failed jobs until they are reviewed
      },
    );
    // Returns immediately. User gets "Email queued" response.
  }

  async submitEFaktur(invoiceId: string) {
    await this.efakturQueue.add(
      'submit-efaktur',
      { invoiceId },
      {
        // The DJP e-Faktur API is slow, so allow more retries.
        attempts: 5,
        backoff: { type: 'exponential', delay: 2000 },
        // BullMQ has no per-job `timeout` option; enforce the 30-second
        // limit inside the e-Faktur worker's HTTP call instead.
      },
    );
  }
}

// Worker: processes jobs. BullMQ retries the job whenever process() throws.
@Processor(QUEUES.EMAIL)
export class EmailWorker extends WorkerHost {
  constructor(
    private readonly invoiceService: InvoiceService,
    private readonly pdfService: PdfService,
    private readonly emailProvider: EmailProvider,
    private readonly templates: EmailTemplates,
    private readonly auditService: AuditService,
  ) {
    super();
  }

  async process(job: Job<{ invoiceId: string; recipientEmail: string }>) {
    const { invoiceId, recipientEmail } = job.data;
    const invoice = await this.invoiceService.findOne(invoiceId);
    const pdfUrl = await this.pdfService.generateInvoicePdf(invoiceId);
    await this.emailProvider.send({
      to: recipientEmail,
      subject: `Invoice ${invoice.number} from Commsult Indonesia`,
      html: this.templates.invoiceEmail(invoice),
      attachments: [{ url: pdfUrl, filename: `invoice-${invoice.number}.pdf` }],
    });
    // Log to audit trail on success
    await this.auditService.log({
      action: 'EMAIL_SENT',
      resource: 'invoice',
      resourceId: invoiceId,
      metadata: { recipient: recipientEmail, jobId: job.id },
    });
  }
}

// Webhook validation: ALWAYS verify signatures
@Controller()
export class MidtransWebhookController {
  private readonly logger = new Logger(MidtransWebhookController.name);

  constructor(private readonly paymentService: PaymentService) {}

  @Post('/webhooks/midtrans')
  async handleMidtransWebhook(@Body() payload: MidtransWebhookPayload) {
    // Midtrans sends the signature in the notification body as `signature_key`:
    // SHA-512 of order_id + status_code + gross_amount + server key.
    const expected = createHash('sha512')
      .update(`${payload.order_id}${payload.status_code}${payload.gross_amount}${process.env.MIDTRANS_SERVER_KEY}`)
      .digest();
    const received = Buffer.from(payload.signature_key ?? '', 'hex');
    // timingSafeEqual throws on a length mismatch, so compare lengths first.
    if (received.length !== expected.length || !timingSafeEqual(received, expected)) {
      this.logger.warn(`Invalid Midtrans webhook signature: ${payload.order_id}`);
      throw new UnauthorizedException('Invalid webhook signature');
    }
    // Safe to process
    await this.paymentService.processCallback(payload);
  }
}

Indonesian ERP implementations typically require integrations that are not part of any standard platform's built-in connector library.
E-Faktur (DJP): the tax authority's electronic invoice system requires submitting invoice data in a specific CSV format and receiving an NSFP (Nomor Seri Faktur Pajak) in return. This integration is mandatory for PKP-status businesses and requires careful error handling for validation failures.
BPJS Kesehatan and BPJS Ketenagakerjaan: monthly reporting of employee health and employment insurance contributions requires submitting files in specific formats to government portals.
Midtrans or Xendit: payment gateway integrations for AR collection, including virtual account creation, real-time payment notification webhooks, and reconciliation of payment records against AR invoices.
Webhooks, HTTP callbacks from external systems to your ERP, are a common integration pattern for payment confirmations, delivery status updates, and approval results. But webhooks are also a security risk: any system that knows your webhook URL can send you arbitrary payloads. Always validate that a webhook really comes from the provider. Payment gateways give you a way to do this: Midtrans includes a signature_key field in the notification body, a SHA-512 hash of the order ID, status code, gross amount, and your server key, while Xendit sends a verification token in the x-callback-token header. Verify this value before processing the webhook, and reject any webhook that fails the check. Log all webhook receipts, including invalid ones, for audit and debugging purposes.
Integration failures are invisible unless you're actively monitoring for them. Build an integration health dashboard that shows: job queue depth (a growing queue means workers can't keep up), job failure rate by integration type, dead letter queue item count, external API error rate and latency, and webhook delivery success rate. Alert on: DLQ depth exceeding threshold (integration breaking), external API error rate above 5% (external system degraded), queue depth growing for 15+ minutes (worker capacity issue). These alerts should reach the team within 5 minutes of a threshold breach — integration failures often compound rapidly if not caught early.
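The alert rules above can be expressed as a small pure function, which makes them easy to unit test before wiring them to a real metrics source. The metric and threshold field names are hypothetical, and the sketch assumes queue depth is sampled once per minute, so 15 minutes of growth means 16 strictly increasing samples.

```typescript
interface IntegrationMetrics {
  dlqDepth: number;
  apiErrorRate: number; // fraction between 0 and 1
  queueDepthSamples: number[]; // one sample per minute, oldest first (assumed)
}

interface AlertThresholds {
  maxDlqDepth: number;
  maxApiErrorRate: number;
  growthWindowMinutes: number;
}

type IntegrationAlert = "DLQ_DEPTH" | "API_ERROR_RATE" | "QUEUE_GROWING";

function evaluateAlerts(m: IntegrationMetrics, t: AlertThresholds): IntegrationAlert[] {
  const alerts: IntegrationAlert[] = [];
  if (m.dlqDepth > t.maxDlqDepth) {
    alerts.push("DLQ_DEPTH"); // integration breaking
  }
  if (m.apiErrorRate > t.maxApiErrorRate) {
    alerts.push("API_ERROR_RATE"); // external system degraded
  }
  // N minutes of growth needs N + 1 samples, each larger than the previous one.
  const needed = t.growthWindowMinutes + 1;
  const recent = m.queueDepthSamples.slice(-needed);
  const growing =
    recent.length === needed &&
    recent.every((depth, i, all) => i === 0 || depth > all[i - 1]);
  if (growing) {
    alerts.push("QUEUE_GROWING"); // worker capacity issue
  }
  return alerts;
}

// Thresholds from the text: 5% error rate, 15 minutes of growth.
// The DLQ depth limit of 10 is a placeholder.
const thresholds: AlertThresholds = { maxDlqDepth: 10, maxApiErrorRate: 0.05, growthWindowMinutes: 15 };
```

Requiring strictly increasing samples is a deliberately strict definition of "growing"; comparing the first and last sample of the window is a looser alternative that tolerates brief dips.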
Every integration should be documented in an integration contract: the external system name and version, the integration pattern (API, webhook, file, queue), the data exchanged (request/response schemas or file format specification), the authentication method, the error handling behavior (retry count, backoff strategy, DLQ behavior), the expected throughput and latency, and the owner responsible for maintaining the integration. When an external API provider changes their API (which happens without warning more often than you'd like), the integration contract tells you exactly what code needs updating. Without it, you're debugging in the dark.
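An integration contract can live next to the code as a typed object, so a missing field is caught in review rather than during an incident. The sketch below mirrors the fields listed above; the type names, the `findContractGaps` helper, and every value in the example (version, throughput, latency, owner) are placeholders for illustration.

```typescript
type IntegrationPattern = "api" | "webhook" | "file" | "queue";

interface IntegrationContract {
  system: { name: string; version: string };
  pattern: IntegrationPattern;
  dataExchanged: string; // schema reference or file format specification
  authentication: string;
  errorHandling: { retryCount: number; backoff: string; deadLetterQueue: boolean };
  expected: { throughputPerHour: number; latencyMs: number };
  owner: string;
}

// Example contract for the invoice email integration; values are placeholders.
const invoiceEmailContract: IntegrationContract = {
  system: { name: "SendGrid", version: "v3" },
  pattern: "queue",
  dataExchanged: "invoiceId and recipientEmail in; provider message ID out",
  authentication: "API key",
  errorHandling: { retryCount: 3, backoff: "exponential, 1000 ms base", deadLetterQueue: true },
  expected: { throughputPerHour: 500, latencyMs: 2000 },
  owner: "integration-team",
};

// Report which required fields are empty or implausible.
function findContractGaps(contract: IntegrationContract): string[] {
  const gaps: string[] = [];
  if (!contract.system.name.trim()) gaps.push("system.name");
  if (!contract.system.version.trim()) gaps.push("system.version");
  if (!contract.dataExchanged.trim()) gaps.push("dataExchanged");
  if (!contract.authentication.trim()) gaps.push("authentication");
  if (contract.errorHandling.retryCount < 1) gaps.push("errorHandling.retryCount");
  if (!contract.owner.trim()) gaps.push("owner");
  return gaps;
}
```

Running `findContractGaps` over every contract in CI is one lightweight way to keep the documentation from drifting out of date.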