Technical SEO for Developers: Canonical, hreflang, JSON-LD

Photo by Agence Olloweb

Technical SEO has a reputation problem among developers: it sounds like marketing, so we ignore it until someone asks why the Indonesian version of a page ranks in Germany, or why Google indexed the staging domain. But canonical tags, hreflang, structured data, and sitemaps are not marketing — they are protocol-level contracts between your HTML and a crawler. Getting them wrong is a bug, with the same root causes as any other bug.
This checklist comes from shipping this very portfolio — a bilingual English-Indonesian Next.js site with a couple hundred blog posts — plus a handful of client builds. Everything here is the developer half of SEO: things you fix in code, verify with curl, and regression-test in CI. No keyword research, no content strategy, just the plumbing that decides whether your content is even eligible to rank.
Every page on your site is reachable through more URLs than you think: with and without a trailing slash, with UTM parameters, via http before the redirect, sometimes through both a vanity route and a real one. To Google these are separate pages competing against each other, splitting ranking signals and burning crawl budget. The canonical tag is how you declare which address is the real one. Per Google's documentation, rel="canonical" and redirects are both strong signals, while sitemap inclusion is a weak one.
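To make the "more URLs than you think" point concrete, here is a minimal sketch of the normalization a canonical tag is effectively declaring. The helper name toCanonical and the list of stripped parameters are assumptions for this example, not something Google or Next.js prescribes.

```typescript
// Hypothetical helper: collapse common URL variants into one canonical form.
// The parameter list is an assumption; extend it for your own tracking params.
const TRACKING_PARAMS = ["utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"];

function toCanonical(input: string): string {
  const url = new URL(input);
  url.protocol = "https:"; // http and https are the same page
  url.hash = "";           // fragments never reach the server
  for (const key of TRACKING_PARAMS) url.searchParams.delete(key);
  // Treat /blog/post/ and /blog/post as one address, but keep the root "/".
  if (url.pathname.length > 1 && url.pathname.endsWith("/")) {
    url.pathname = url.pathname.slice(0, -1);
  }
  return url.toString();
}
```

Every variant that maps to the same output is a duplicate the canonical tag (or better, a redirect) has to consolidate.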
In the App Router the clean implementation is the alternates field of the Metadata API, computed in generateMetadata so every dynamic route declares itself. Two rules I enforce: canonicals are always absolute URLs, and every locale version is its own canonical. Pointing the Indonesian page's canonical at the English page is a classic mistake — it tells Google the Indonesian content is a duplicate to be dropped, which is the exact opposite of what a bilingual site wants.

    // app/[locale]/blog/[slug]/page.tsx: canonical + hreflang in one place
    import type { Metadata } from "next"
    // Hypothetical loader: swap in your own data source.
    import { getPost } from "@/lib/posts"

    type Props = { params: Promise<{ locale: string; slug: string }> }

    export async function generateMetadata({ params }: Props): Promise<Metadata> {
      const { locale, slug } = await params
      const post = await getPost(locale, slug)
      const base = "https://example.com"
      return {
        title: post.title,
        description: post.description,
        alternates: {
          // ONE canonical per language version, not one shared
          // canonical pointing every locale at English.
          canonical: `${base}/${locale}/blog/${slug}`,
          languages: {
            en: `${base}/en/blog/${slug}`,
            id: `${base}/id/blog/${slug}`,
            // safety net for every unmatched visitor:
            "x-default": `${base}/en/blog/${slug}`,
          },
        },
      }
    }

hreflang tells Google which language versions of a page exist so it can route Indonesian searchers to /id/ and everyone else to /en/. The implementation rules from Google's docs are strict enough that most sites get at least one wrong.
The mistake I made on an early version of this site: my hreflang annotations were generated in one component but a refactor left one template emitting only the current locale's link. No error, no warning — just hreflang silently void on those pages because the bidirectional contract broke. The fix that sticks is structural: generate canonical and the full languages map from a single helper, so a page cannot declare one without the other.
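A minimal sketch of such a single helper might look like the following. The names BASE_URL, LOCALES, DEFAULT_LOCALE and buildAlternates are assumptions for illustration; the returned object is shaped like the alternates field of the Next.js Metadata object.

```typescript
// Hypothetical constants; adjust to your own domain and locale list.
type Locale = "en" | "id";
const BASE_URL = "https://example.com";
const LOCALES: Locale[] = ["en", "id"];
const DEFAULT_LOCALE: Locale = "en";

// Canonical and the full languages map come from the same inputs,
// so a page cannot emit one without the other.
function buildAlternates(locale: Locale, path: string) {
  const languages: Record<string, string> = {};
  for (const l of LOCALES) languages[l] = `${BASE_URL}/${l}${path}`;
  languages["x-default"] = `${BASE_URL}/${DEFAULT_LOCALE}${path}`;
  return { canonical: `${BASE_URL}/${locale}${path}`, languages };
}
```

In generateMetadata this would be used as alternates: buildAlternates(locale, `/blog/${slug}`), which leaves no template free to hand-roll a partial set of links.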
hreflang failures are invisible in the browser, and Lighthouse only checks that the language codes are valid, not that the links are reciprocal. Search Console retired its International Targeting report in 2022, so the only dependable place they show up is raw HTML inspection. If you have not curled your production pages and read the link tags with your own eyes, you do not know that your hreflang works.
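Reading the tags by eye does not scale past a few pages, so here is a rough sketch of the same inspection in code. It assumes you have already fetched the raw HTML of each language version (with curl or fetch); the function names are hypothetical, and the regex is a smoke test for your own markup, not a general HTML parser.

```typescript
// Extracts hreflang -> href pairs from the <link> tags in raw HTML.
function extractHreflang(html: string): Record<string, string> {
  const found: Record<string, string> = {};
  for (const tag of html.match(/<link\b[^>]*>/gi) ?? []) {
    const lang = /\shreflang=["']([^"']+)["']/i.exec(tag)?.[1];
    const href = /\shref=["']([^"']+)["']/i.exec(tag)?.[1];
    if (lang && href) found[lang] = href;
  }
  return found;
}

// pages maps each fetched URL to its raw HTML.
// Returns one message per alternate that fails to link back.
function findMissingReturnLinks(pages: Record<string, string>): string[] {
  const problems: string[] = [];
  for (const [url, html] of Object.entries(pages)) {
    for (const [lang, target] of Object.entries(extractHreflang(html))) {
      const targetHtml = pages[target];
      if (targetHtml === undefined) continue; // not fetched, cannot judge
      const back = Object.values(extractHreflang(targetHtml));
      if (!back.includes(url)) {
        problems.push(`${target} (${lang}) does not link back to ${url}`);
      }
    }
  }
  return problems;
}
```

An empty result means every fetched page that is named as an alternate also names the page that pointed at it, which is the reciprocity rule in executable form.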
Structured data is how you tell Google what a page is, not just what it says — this is an Article, by this Person, published on this date. Google recommends the JSON-LD format over microdata, and in a Server Component it costs nothing: build a plain object, serialize it into a script tag, done. No client JavaScript, no library.

    // JSON-LD in a Server Component: no client JS needed
    // Hypothetical loader: swap in your own data source.
    import { getPost } from "@/lib/posts"

    type Props = { params: Promise<{ locale: string; slug: string }> }

    export default async function BlogPost({ params }: Props) {
      const { locale, slug } = await params
      const post = await getPost(locale, slug)
      const jsonLd = {
        "@context": "https://schema.org",
        "@type": "Article",
        headline: post.title,
        datePublished: post.datePublished,
        author: { "@type": "Person", name: "Matthews Wong" },
        image: `https://example.com${post.image}`,
      }
      return (
        <>
          <script
            type="application/ld+json"
            // Escape "<" so a "</script>" inside post data cannot close the tag.
            dangerouslySetInnerHTML={{
              __html: JSON.stringify(jsonLd).replace(/</g, "\\u003c"),
            }}
          />
          <article>...</article>
        </>
      )
    }

Be honest in the markup. Rich-result eligibility depends on the required properties of each schema type, and Google validates aggressively: marking content with ratings it does not have or authorship it cannot show is how sites earn manual actions. I keep it boring: Article for posts, Person for the about page, BreadcrumbList where the UI actually shows breadcrumbs. Validate every template in the Rich Results Test before shipping, because a malformed JSON-LD block fails silently and the page simply loses its rich-result eligibility.
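One way to keep BreadcrumbList markup honest is to build it from the same array the breadcrumb component renders. The sketch below assumes a hypothetical Crumb type and helper name; the property names (itemListElement, ListItem, position, name, item) are the schema.org ones.

```typescript
// Hypothetical shape of one breadcrumb as the UI already knows it.
type Crumb = { name: string; path: string };

// Builds schema.org BreadcrumbList JSON-LD from the crumbs the UI renders,
// so the markup cannot claim a trail the page does not show.
function breadcrumbJsonLd(baseUrl: string, crumbs: Crumb[]) {
  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: crumbs.map((crumb, index) => ({
      "@type": "ListItem",
      position: index + 1, // positions are 1-based
      name: crumb.name,
      item: `${baseUrl}${crumb.path}`,
    })),
  };
}
```

Pass the same crumbs array to the visible breadcrumb component and to this function, and the two cannot drift apart.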
The App Router turned sitemaps from an annoying build artifact into a typed function: app/sitemap.ts exports your URL list, generated from the same data source that renders the pages — on this site, the blog registry feeds both, so a new post cannot exist without a sitemap entry. Include lastModified honestly from real content dates; crawlers use it to prioritize re-crawls, and a sitemap where everything changed today reads as noise.
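As a sketch of that pattern: the POSTS array below is placeholder data standing in for the blog registry, and SitemapEntry is a local type that mirrors the url and lastModified fields of Next.js's MetadataRoute.Sitemap so the example runs on its own.

```typescript
// Local stand-in for the entries of MetadataRoute.Sitemap.
type SitemapEntry = { url: string; lastModified?: Date };

// Placeholder registry; in a real app this is the same source that renders the pages.
type Post = { slug: string; updatedAt: string };
const POSTS: Post[] = [
  { slug: "technical-seo", updatedAt: "2025-01-10" },
  { slug: "core-web-vitals", updatedAt: "2024-11-02" },
];
const BASE_URL = "https://example.com";
const LOCALES = ["en", "id"];

// In app/sitemap.ts this function would be the default export.
function sitemap(): SitemapEntry[] {
  return POSTS.flatMap((post) =>
    LOCALES.map((locale) => ({
      url: `${BASE_URL}/${locale}/blog/${post.slug}`,
      lastModified: new Date(post.updatedAt), // the content date, not build time
    })),
  );
}
```

Because the entries are derived rather than listed by hand, adding a post to the registry adds both of its language versions to the sitemap.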
robots.txt via app/robots.ts follows the same pattern. Keep it minimal: point at the sitemap, block genuinely useless paths like API routes, and never use robots.txt to hide duplicate content — a blocked page cannot be crawled, which means Google can never see the canonical tag you put on it. Disallow and canonical are mutually exclusive tools, a subtlety straight out of Google's consolidation docs.
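A minimal version of that file could look like this. The Robots type is a local stand-in mirroring the rules and sitemap fields of Next.js's MetadataRoute.Robots, and the /api/ path and domain are assumptions for the example.

```typescript
// Local stand-in for MetadataRoute.Robots.
type Robots = {
  rules: { userAgent: string; allow?: string | string[]; disallow?: string | string[] };
  sitemap?: string;
};

// In app/robots.ts this function would be the default export.
function robots(): Robots {
  return {
    rules: {
      userAgent: "*",
      allow: "/",
      // Only paths with no value to a crawler. Duplicate pages do not belong
      // here: they need to stay crawlable so their canonical tag can be read.
      disallow: ["/api/"],
    },
    sitemap: "https://example.com/sitemap.xml",
  };
}
```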
Four items that are absolutely the developer's job, found broken on most sites I audit:
Honest status codes
A missing page must return 404, not a styled error component with HTTP 200. Soft 404s pollute the index, and redirect chains bleed signal: permanent moves get a single permanent redirect (301, or the 308 that Next.js emits for permanent redirects), not a 302 left over from testing.
One H1 and a real heading tree
Crawlers reconstruct document structure from headings, same as screen readers. The accessibility audit and the SEO audit converge here: heading hierarchy fixes serve both masters for free.
Metadata for the link preview economy
OpenGraph and Twitter card tags decide how every share on WhatsApp and LinkedIn renders. For Indonesian audiences, where WhatsApp link sharing dominates traffic, a missing og:image measurably depresses click-through.
Core Web Vitals as a ranking input
Page experience signals feed ranking. The perf work — LCP, CLS, INP — is not separate from SEO; it is the part of SEO that lives entirely in your codebase.
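The status-code item is the easiest of the four to automate. Below is a sketch under a few assumptions: it runs in Node, where fetch with redirect: "manual" exposes the 3xx status and Location header; the function names and the ten-hop cap are arbitrary choices for the example.

```typescript
// One observed response in a redirect chain.
type Hop = { url: string; status: number };

// Pure check over a recorded chain. An empty result means the chain is honest.
function findStatusProblems(hops: Hop[], expectExists: boolean): string[] {
  if (hops.length === 0) return ["no response recorded"];
  const problems: string[] = [];
  const final = hops[hops.length - 1];
  const redirects = hops.slice(0, -1);
  if (redirects.length > 1) problems.push(`redirect chain of ${redirects.length} hops`);
  for (const hop of redirects) {
    if (hop.status === 302 || hop.status === 307) {
      problems.push(`temporary redirect (${hop.status}) at ${hop.url}`);
    }
  }
  if (expectExists && final.status !== 200) {
    problems.push(`expected 200, got ${final.status}`);
  }
  if (!expectExists && final.status !== 404) {
    problems.push(`expected 404 for a missing page, got ${final.status}`);
  }
  return problems;
}

// Records the chain by following Location headers by hand.
async function traceRedirects(start: string): Promise<Hop[]> {
  const hops: Hop[] = [];
  let url = start;
  for (let i = 0; i < 10; i++) {
    const res = await fetch(url, { redirect: "manual" });
    hops.push({ url, status: res.status });
    const location = res.headers.get("location");
    if (res.status < 300 || res.status >= 400 || !location) break;
    url = new URL(location, url).toString();
  }
  return hops;
}
```

Calling findStatusProblems(await traceRedirects(url), false) on a deliberately nonexistent path is a quick soft-404 probe: anything other than an empty array means the error page is lying about its status.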
Technical SEO is a contract surface, and contracts are what we are good at. Canonical declares identity, hreflang declares language routing with a strict reciprocity rule, JSON-LD declares meaning, and the sitemap declares inventory. None of it requires marketing intuition — it requires the same discipline as an API: generate from one source of truth, validate in CI, verify in production with curl. Do the boring plumbing once, and your content competes on its actual merit.
Add a CI step that fetches three rendered pages — home, one localized post in each language — and asserts on canonical, hreflang completeness, and JSON-LD parseability. It takes an hour to write and has caught every SEO regression on this site since.
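A sketch of what that CI assertion could look like, written as a pure function over fetched HTML so it is easy to test. The names auditPage and Expectation are hypothetical, and the regex-based tag matching is an assumption that fits simple server-rendered head markup rather than arbitrary HTML.

```typescript
type Expectation = { canonical: string; locales: string[] };

// Reads one attribute value out of a single tag string.
function attr(tag: string, name: string): string | undefined {
  return new RegExp(`\\s${name}=["']([^"']*)["']`, "i").exec(tag)?.[1];
}

// Returns a list of problems; an empty list means the page passes.
function auditPage(html: string, expected: Expectation): string[] {
  const problems: string[] = [];
  const links = html.match(/<link\b[^>]*>/gi) ?? [];

  // 1. Exactly one canonical, pointing at the expected URL.
  const canonicals = links
    .filter((tag) => attr(tag, "rel") === "canonical")
    .map((tag) => attr(tag, "href"));
  if (canonicals.length !== 1) {
    problems.push(`expected 1 canonical, found ${canonicals.length}`);
  } else if (canonicals[0] !== expected.canonical) {
    problems.push(`canonical is ${canonicals[0]}`);
  }

  // 2. Every expected locale, plus x-default, has an hreflang link.
  const langs = links.map((tag) => attr(tag, "hreflang"));
  for (const locale of [...expected.locales, "x-default"]) {
    if (!langs.includes(locale)) problems.push(`missing hreflang ${locale}`);
  }

  // 3. Every JSON-LD block parses as JSON.
  const blocks =
    html.match(/<script\b[^>]*application\/ld\+json[^>]*>[\s\S]*?<\/script>/gi) ?? [];
  for (const block of blocks) {
    const body = block.replace(/^<script\b[^>]*>/i, "").replace(/<\/script>$/i, "");
    try {
      JSON.parse(body);
    } catch {
      problems.push("JSON-LD block does not parse");
    }
  }
  return problems;
}
```

In the CI job, fetch each of the three URLs, pass the response text to auditPage, print any problems, and exit with a non-zero code if the list is not empty.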