Building a Small Simple Comment System
Overview
I built this comments service for my own site. The scope stays small on purpose. A reader signs in, opens a thread, writes a comment, replies, and leaves a like. I still want moderation, session control, and browser security work that holds up over time.
Small does not mean weak. Small means I can hold the whole service in my head, audit the request path, and fix problems without hunting across six systems.
What Small Means Here
I set a few rules early. Each blog post listed in the site RSS feed gets exactly one thread. Replies are ordinary comments that carry a parentCommentId and a depth. Markdown is allowed. Raw HTML is blocked.
Every write goes through Origin checks, CSRF validation, and auth.
Those limits keep the service maintainable. They cut hidden scope, which matters more than feature count for a comments tool.
Request Flow
The schema matters, but the request flow explains the service better.
A reader opens a post. The frontend asks the service to map the post slug to a thread. The client then loads comments for that thread. For signed-in readers, the response also says whether that reader has liked each comment.
Login uses GitHub OAuth with PKCE. The service stores temporary OAuth state, receives the callback, upserts the user, creates a session row, and sets a session cookie.
Every write follows the same gate:
- Check the Origin.
- Check the CSRF token.
- Check the session.
- Apply rate limits.
- Run the write.
Cheap rejection comes first. Stateful work comes later.
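The ordering above can be sketched as a list of checks that stops at the first failure. This is a rough illustration, not the service's actual code: `GateResult`, `GateCheck`, `runWriteGate`, and the `CSRF_INVALID` code are hypothetical names.

```typescript
// Hypothetical types: the real service may shape its results differently.
type GateResult = { ok: true } | { ok: false; code: string }
type GateCheck = { name: string; run: () => GateResult | Promise<GateResult> }

// Run checks in order and stop at the first failure, so cheap header
// checks reject a request before any stateful work starts.
export async function runWriteGate(checks: GateCheck[]): Promise<GateResult> {
  for (const check of checks) {
    const result = await check.run()
    if (!result.ok) return result
  }
  return { ok: true }
}

// Usage: the order mirrors the list above. Here the CSRF check fails,
// so the session check never runs.
const calls: string[] = []
const demoChecks: GateCheck[] = [
  { name: 'origin', run: (): GateResult => { calls.push('origin'); return { ok: true } } },
  { name: 'csrf', run: (): GateResult => { calls.push('csrf'); return { ok: false, code: 'CSRF_INVALID' } } },
  { name: 'session', run: (): GateResult => { calls.push('session'); return { ok: true } } },
]
```

Keeping the checks in one ordered array makes the sequence easy to audit in a single place.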
Data Model
Each table has a narrow job.
User stores GitHub identity and moderation flags. Session stores server-side session state, including expiresAt, revokedAt, and lastUsedAt.
Thread maps a (siteKey, resourceType, resourceId) tuple to one thread. Comment stores the parent pointer, depth, markdown body, rendered HTML, and edit or delete timestamps.
CommentReaction stores likes with a unique (commentId, userId, reaction) key. OAuthState stores short-lived PKCE state with codeVerifier and returnTo. PrebannedUser blocks identities before login.
That shape covers the current feature set. More tables do not make this service safer or easier to run.
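For illustration, the Comment row and its reply placement might look like the sketch below. Only parentCommentId, depth, bodyHtml, deletedAt, and deletedBy appear in this post; the other field names, the `placement` helper, and the choice of depth 0 for top-level comments are assumptions.

```typescript
// Sketch of a comment row. Fields not named in the post are assumptions.
interface CommentRow {
  id: string
  threadId: string
  userId: string
  parentCommentId: string | null // null for a top-level comment
  depth: number                  // assumed: 0 at top level, parent depth + 1 for replies
  bodyMarkdown: string           // assumed name for the markdown source
  bodyHtml: string               // rendered, sanitized HTML
  editedAt: Date | null          // assumed name for the edit timestamp
  deletedAt: Date | null
  deletedBy: string | null
}

// Derive the parent pointer and depth for a new comment.
export function placement(
  parent: CommentRow | null
): Pick<CommentRow, 'parentCommentId' | 'depth'> {
  if (parent === null) return { parentCommentId: null, depth: 0 }
  return { parentCommentId: parent.id, depth: parent.depth + 1 }
}
```

Storing depth on the row means a list view can indent replies without walking the parent chain.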
Soft Delete And Cleanup
Deleting a comment starts as a soft delete. Each row stores deletedAt and deletedBy. A cron job hard deletes rows that were soft deleted more than 72 hours ago.
That gives moderators room to react without instant data loss. It keeps the active tables clean over time too.
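The cleanup rule can be sketched as a pure function that picks the rows a cron run would remove. This assumes the 72 hours are measured from deletedAt; `SoftDeletable` and `dueForHardDelete` are hypothetical names.

```typescript
const RETENTION_MS = 72 * 60 * 60 * 1000 // 72 hours

interface SoftDeletable {
  id: string
  deletedAt: Date | null // null means the row is still live
}

// Return the ids of rows whose soft delete is older than the retention window.
export function dueForHardDelete(rows: SoftDeletable[], now: Date): string[] {
  const cutoff = now.getTime() - RETENTION_MS
  return rows
    .filter((row) => row.deletedAt !== null && row.deletedAt.getTime() < cutoff)
    .map((row) => row.id)
}
```

In a real job the same cutoff would more likely go into a single DELETE statement than into an in-memory filter.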
RSS-Gated Thread Resolution
The resolve endpoint takes siteKey, resourceType, and resourceId, then returns a threadId. The upsert is ordinary. The guardrail does the real work.
The service fetches the site RSS feed, pulls valid slugs, and creates threads only for posts that exist. Random slugs cannot fill the database with junk rows.
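A minimal sketch of the guardrail, assuming posts live under a path prefix such as /blog/ and that resourceId is the post slug. `slugsFromRss` and `canCreateThread` are hypothetical names, and the regex parsing is deliberately naive.

```typescript
// Pull post slugs out of RSS <link> elements. A regex is enough for a sketch;
// a real feed parser handles CDATA, namespaces, and Atom feeds properly.
export function slugsFromRss(xml: string, pathPrefix: string): Set<string> {
  const slugs = new Set<string>()
  // Only look inside <item> blocks so the channel-level <link> is skipped.
  const itemPattern = /<item>([\s\S]*?)<\/item>/g
  let item: RegExpExecArray | null
  while ((item = itemPattern.exec(xml)) !== null) {
    const link = /<link>([^<]+)<\/link>/.exec(item[1])
    if (!link) continue
    let path: string
    try {
      path = new URL(link[1].trim()).pathname
    } catch {
      continue // skip malformed URLs
    }
    if (!path.startsWith(pathPrefix)) continue
    const slug = path.slice(pathPrefix.length).replace(/\/+$/, '')
    if (slug && !slug.includes('/')) slugs.add(slug)
  }
  return slugs
}

// A thread may be created only for a slug that the feed actually lists.
export function canCreateThread(resourceId: string, validSlugs: Set<string>): boolean {
  return validSlugs.has(resourceId)
}
```

Caching the slug set for a short time would keep the resolve endpoint from fetching the feed on every request.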
Why PKCE Fits
PKCE fits this public web login flow. The browser does not hold a secret. The callback still proves that the flow matches the request that started earlier.
The start route generates state, codeVerifier, and codeChallenge. It stores the verifier and return path in OAuthState, then redirects to GitHub with the state and challenge.
The browser stays simple. The service keeps the sensitive exchange on the server.
:::note
PKCE lets a public client prove it started the flow without shipping a secret to the browser. The server holds the codeVerifier and checks it on the callback.
:::
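As a sketch of what the start route generates, the snippet below builds state, codeVerifier, and codeChallenge with Node's crypto module, using the S256 method from RFC 7636. The function names `challengeFor` and `startPkce` and the byte lengths are my assumptions for illustration.

```typescript
import { createHash, randomBytes } from 'node:crypto'

// S256 challenge: base64url(SHA-256(verifier)), as defined in RFC 7636.
export function challengeFor(codeVerifier: string): string {
  return createHash('sha256').update(codeVerifier).digest('base64url')
}

// Values for one login attempt. The verifier stays on the server
// (in OAuthState); only state and codeChallenge go into the redirect.
export function startPkce() {
  const state = randomBytes(16).toString('base64url')
  const codeVerifier = randomBytes(32).toString('base64url') // 43 characters
  return { state, codeVerifier, codeChallenge: challengeFor(codeVerifier) }
}
```

On the callback, the stored verifier goes into the token exchange, and the provider recomputes the challenge to confirm that the same party started the flow.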
Return Path Validation
Every OAuth flow needs a safe return path after login. I validate returnTo against known blog origins plus the service origin.
Without that check, login can turn into an open redirect risk.
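One way to sketch that validation is to parse returnTo as an absolute URL and compare its origin against a fixed set. The origins, the fallback URL, and the name `safeReturnTo` are placeholders, not the service's real configuration.

```typescript
// Hypothetical allowlist: known blog origins plus the service origin.
const ALLOWED_RETURN_ORIGINS = new Set([
  'https://blog.example.com',
  'https://comments.example.com',
])
const FALLBACK_RETURN_TO = 'https://blog.example.com/'

// Return a URL that is safe to redirect to after login.
export function safeReturnTo(returnTo: string | null): string {
  if (!returnTo) return FALLBACK_RETURN_TO
  try {
    // new URL() without a base throws on relative or malformed input,
    // which also rejects protocol-relative values like "//evil.example".
    const url = new URL(returnTo)
    if (url.protocol === 'https:' && ALLOWED_RETURN_ORIGINS.has(url.origin)) {
      return url.toString()
    }
  } catch {
    // fall through to the fallback
  }
  return FALLBACK_RETURN_TO
}
```

Comparing the parsed origin, rather than checking a string prefix, avoids lookalike hosts such as blog.example.com.evil.test.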
Server-Side Sessions
Sessions live in Postgres. The cookie is only a pointer: lh_comments_session=<uuid>.
On each authenticated request, the service reads the cookie, loads the session row, checks revocation, checks expiry, and returns the user. lastUsedAt updates in the background.
I picked this over JWTs. Revocation stays simple. Ban enforcement stays simple. Session invalidation does not need extra token rules.
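The per-request check can be sketched as a pure function over the loaded row. `SessionRow` follows the columns named earlier; `SessionCheck`, `checkSession`, and the reason strings are hypothetical.

```typescript
interface SessionRow {
  id: string
  userId: string
  expiresAt: Date
  revokedAt: Date | null
  lastUsedAt: Date
}

type SessionCheck =
  | { ok: true; userId: string }
  | { ok: false; reason: 'missing' | 'revoked' | 'expired' }

// Decide whether a session row (or its absence) authenticates the request.
export function checkSession(row: SessionRow | null, now: Date): SessionCheck {
  if (!row) return { ok: false, reason: 'missing' }
  if (row.revokedAt !== null) return { ok: false, reason: 'revoked' }
  if (row.expiresAt.getTime() <= now.getTime()) return { ok: false, reason: 'expired' }
  return { ok: true, userId: row.userId }
}
```

Because the row is the source of truth, revoking a session or banning a user is one update, and it takes effect on the next request.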
Mutation Gating
Every write checks Origin, even with CORS configured. In production, the service demands an Origin header and rejects requests outside the allowlist. The server owns write boundaries.
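// Pseudocode shaped like the real route guard.
// `env` and `isAllowedOrigin` are defined elsewhere in the service.
import type { NextRequest } from 'next/server'

export async function mutationAllowed(request: NextRequest) {
  const origin = request.headers.get('origin')
  if (env.NODE_ENV === 'production') {
    // Production: the Origin header is mandatory and must be on the allowlist.
    if (!origin) return { ok: false, code: 'MUTATION_ORIGIN_REQUIRED' }
    if (!isAllowedOrigin(origin)) return { ok: false, code: 'MUTATION_ORIGIN_NOT_ALLOWED' }
  } else {
    // Development: tolerate a missing Origin, but still reject a wrong one.
    if (origin && !isAllowedOrigin(origin)) {
      return { ok: false, code: 'MUTATION_ORIGIN_NOT_ALLOWED' }
    }
  }
  // CSRF check happens here too
  return { ok: true }
}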
CSRF Mechanism
The service uses a CSRF cookie and a request header. The cookie stores csrf_token=<random>. The client sends the same value in X-CSRF-Token.
The server checks presence, equal length, and constant-time equality. Failure timing should not leak token shape.
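import { timingSafeEqual } from 'node:crypto'
import { cookies } from 'next/headers'
import type { NextRequest } from 'next/server'

const CSRF_COOKIE = 'csrf_token'

export async function verifyCsrf(request: NextRequest) {
  const cookieToken = (await cookies()).get(CSRF_COOKIE)?.value
  const headerToken = request.headers.get('x-csrf-token')
  if (!cookieToken || !headerToken) return false
  const cookieBytes = Buffer.from(cookieToken)
  const headerBytes = Buffer.from(headerToken)
  // timingSafeEqual throws on unequal byte lengths, so compare bytes, not string length
  if (cookieBytes.length !== headerBytes.length) return false
  // Constant-time compare avoids timing leaks
  return timingSafeEqual(cookieBytes, headerBytes)
}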
How The Client Gets The Token
The /v1/me endpoint returns the current user, plus csrfToken. The client calls that endpoint on load, keeps the token in memory, and attaches it to each write request.
That keeps the write path explicit. Random components do not depend on hidden state.
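A rough sketch of that client, assuming a JSON API and a /v1/me response that has a csrfToken field. `createCommentsClient`, `FetchLike`, and the write path used in the usage note are hypothetical. The fetch function is injected so the sketch can run against a fake.

```typescript
// Minimal fetch shape, so a fake can stand in for the browser's fetch.
type FetchInit = {
  method?: string
  headers?: Record<string, string>
  body?: string
  credentials?: 'include'
}
type FetchLike = (
  url: string,
  init?: FetchInit
) => Promise<{ ok: boolean; json(): Promise<any> }>

export function createCommentsClient(baseUrl: string, fetchFn: FetchLike) {
  let csrfToken: string | null = null // kept in memory only

  // Load the current user and remember the CSRF token for later writes.
  async function loadMe() {
    const res = await fetchFn(`${baseUrl}/v1/me`, { credentials: 'include' })
    const me = await res.json()
    csrfToken = typeof me.csrfToken === 'string' ? me.csrfToken : null
    return me
  }

  // Send a write with cookies and the CSRF header attached.
  async function write(path: string, payload: unknown) {
    const token = csrfToken
    if (token === null) throw new Error('Call loadMe() before writing')
    return fetchFn(`${baseUrl}${path}`, {
      method: 'POST',
      credentials: 'include',
      headers: { 'Content-Type': 'application/json', 'X-CSRF-Token': token },
      body: JSON.stringify(payload),
    })
  }

  return { loadMe, write }
}
```

In the browser this would be created once, for example `createCommentsClient(serviceUrl, (url, init) => fetch(url, init))`, and shared by every component that writes.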
Where Failures Come From
Most failures are ordinary browser problems. A new origin is missing from the allowlist. One request forgets the CSRF header. Cookies stop crossing the boundary after http and https get mixed. A write starts before the /v1/me request.
The narrow flow makes debugging plain. Once the request path is fixed, the work gets dull in the best way.
Read Latency
One chart carries both the median (p50) and the tail (p95) for each day.
Most read wins came from doing less work. The service returns bodyHtml instead of rendering on each client load. Likes are grouped in one query. Response shapes stay consistent, so the frontend does not need follow-up fetches for normal list views.
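To illustrate the single-query idea for likes, the sketch below folds the reaction rows for a page of comments into per-comment summaries. It assumes the rows come from one query over the visible comment ids; `ReactionRow`, `LikeSummary`, and `summarizeLikes` are hypothetical names, and the real service may do the grouping in SQL instead.

```typescript
interface ReactionRow {
  commentId: string
  userId: string
}

interface LikeSummary {
  count: number
  likedByViewer: boolean
}

// Fold reaction rows into one summary per comment.
// viewerId is null for signed-out readers.
export function summarizeLikes(
  rows: ReactionRow[],
  viewerId: string | null
): Map<string, LikeSummary> {
  const byComment = new Map<string, LikeSummary>()
  for (const row of rows) {
    const summary = byComment.get(row.commentId) ?? { count: 0, likedByViewer: false }
    summary.count += 1
    if (viewerId !== null && row.userId === viewerId) summary.likedByViewer = true
    byComment.set(row.commentId, summary)
  }
  return byComment
}
```

Comments with no rows are simply absent from the map, so the list view can default them to zero likes.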
The tail tells the truth. Spikes usually point to cold starts, slow database setup, or a large thread without a proper limit. Watch those numbers. Readers notice tail latency first.
Write Latency
Writes cost more. They include markdown render, sanitizing, and the full set of checks.
That is fine. Reads should stay cheap on a blog. A heavier write path is acceptable when the common path stays fast.
Error Rate
The day 4 bump matches failures I expect from an RSS fetch problem, an allowlist mismatch after a domain change, or a client request missing credentials: 'include'.
Problems I Hit In Practice
The hard parts were not SQL. Most issues came from browser state and cross-origin rules.
Cookies Across Multiple Origins
The core checks were simple. Is the blog on HTTPS? Is the service on HTTPS? Is the browser sending credentials?
Most random 401s are not random. A cookie failed to cross the boundary.
CSRF Token Ordering
A write that the frontend sends before /v1/me returns has no CSRF token yet, so it should fail. That failure feels surprising until the strict write path is clear.
Allowlist Drift
Preview domains come and go. A missing allowlist update makes writes fail.
The fix is small. Origin policy still needs one source of truth.
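One possible shape for that single source of truth is a comma-separated setting parsed once at startup. This is a sketch under that assumption; `parseAllowedOrigins` and `makeIsAllowedOrigin` are hypothetical names.

```typescript
// Parse one comma-separated setting into a set of normalized origins.
export function parseAllowedOrigins(raw: string): Set<string> {
  const origins = new Set<string>()
  for (const entry of raw.split(',')) {
    const trimmed = entry.trim()
    if (!trimmed) continue
    // URL.origin drops paths and trailing slashes; new URL() throws on a
    // malformed entry, so a typo fails at startup rather than at request time.
    origins.add(new URL(trimmed).origin)
  }
  return origins
}

// Build the predicate that every Origin check shares.
export function makeIsAllowedOrigin(raw: string) {
  const allowed = parseAllowedOrigins(raw)
  return (origin: string): boolean => allowed.has(origin)
}
```

With CORS, the write guard, and returnTo validation all reading the same parsed set, adding a preview domain becomes a one-line config change.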
Rate Limits In Multi-Instance Setups
The current rate limiter uses an in-memory map. That works on one instance.
:::caution
The in-memory rate limiter only counts requests on the box that handled them. Run two instances and each keeps its own tally, so the real limit doubles. A shared store such as Redis is the next step after the service runs across more than one instance.
:::
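For reference, a fixed-window limiter over an in-memory map can be sketched as below. The windowing strategy and the name `createRateLimiter` are assumptions; this post does not say which algorithm the service uses.

```typescript
// Fixed-window limiter: at most `limit` hits per key in each window.
// State lives in this process only, which is exactly the limitation above.
export function createRateLimiter(limit: number, windowMs: number) {
  const windows = new Map<string, { start: number; count: number }>()

  return function allow(key: string, now: number = Date.now()): boolean {
    const current = windows.get(key)
    // No window yet, or the old one has ended: start a fresh window.
    if (!current || now - current.start >= windowMs) {
      windows.set(key, { start: now, count: 1 })
      return true
    }
    if (current.count >= limit) return false
    current.count += 1
    return true
  }
}
```

Two limiters created this way share nothing, so a key that is blocked by one is still allowed by the other, which is how the effective limit doubles across two instances.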
Closing
This project stays small through simple identity, server-side sessions, guarded writes, sanitized markdown, and built-in moderation.
That is the goal: a comments service that does the job, stays understandable, and does not turn routine maintenance into a weekend project.