Building a Small, Simple Comment System
Overview
I built this comments service for my own site. The scope stays small on purpose. A reader signs in, opens a thread, writes a comment, replies, and leaves a like. I still need moderation, session control, and browser security work that holds up over time.
Small does not mean weak. Small means I can understand the whole service, audit the request path, and fix problems without hunting across six systems.
What Small Means Here
I set a few rules early. Each blog post listed in the site RSS feed gets exactly one thread. Replies are ordinary comments that carry a parentCommentId and a depth. Markdown is allowed. Raw HTML is blocked.
Every write goes through Origin checks, CSRF validation, and auth.
Those limits keep the service maintainable. They remove hidden scope, which matters more than feature count for a comments tool.
Request Flow
The schema matters, but the request flow explains the service better.
A reader opens a post. The frontend asks the service to map the post slug to a thread. The client then loads comments for that thread. For signed-in readers, the response includes whether that reader liked each comment.
Login uses GitHub OAuth with PKCE. The service stores temporary OAuth state, receives the callback, upserts the user, creates a session row, and sets a session cookie.
Every write follows the same gate:
- Check the Origin.
- Check the CSRF token.
- Check the session.
- Apply rate limits.
- Run the write.
Cheap rejection comes first. Stateful work comes later.
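The ordering above can be sketched as a chain of checks that stops at the first failure. This is a minimal illustration, not the service's actual code: the GateResult and GateCheck types, the runGate helper, and the CSRF_INVALID code are names I made up for the example.

```typescript
// Hypothetical types for illustration; the real service may shape these differently.
type GateResult = { ok: true } | { ok: false; code: string }
type GateCheck = { name: string; run: () => Promise<GateResult> }

// Runs the checks in order and stops at the first failure, so cheap
// stateless checks reject a request before any database work happens.
async function runGate(checks: GateCheck[]): Promise<{ result: GateResult; ran: string[] }> {
  const ran: string[] = []
  for (const check of checks) {
    ran.push(check.name)
    const result = await check.run()
    if (!result.ok) return { result, ran }
  }
  return { result: { ok: true }, ran }
}

// Usage: the CSRF check fails, so the session lookup, rate limit, and write never run.
const gateOutcome = runGate([
  { name: 'origin', run: async () => ({ ok: true }) },
  { name: 'csrf', run: async () => ({ ok: false, code: 'CSRF_INVALID' }) },
  { name: 'session', run: async () => ({ ok: true }) },
  { name: 'rateLimit', run: async () => ({ ok: true }) },
  { name: 'write', run: async () => ({ ok: true }) },
])
```

Keeping the order in one array makes it hard for a new route to skip a step by accident.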
Data Model
Each table has a narrow job.
User stores GitHub identity and moderation flags. Session stores server-side session state, including expiresAt, revokedAt, and lastUsedAt.
Thread maps a (siteKey, resourceType, resourceId) tuple to one thread. Comment stores the parent pointer, depth, markdown body, rendered HTML, and edit or delete timestamps.
CommentReaction stores likes with a unique (commentId, userId, reaction) key. OAuthState stores short-lived PKCE state with codeVerifier and returnTo. PrebannedUser blocks identities before they ever log in.
That shape covers the current feature set. More tables do not make this service safer or easier to run.
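As a rough sketch, three of those tables could be typed like this. Fields marked "assumed" are not named in this post and are only there to make the example complete. The threadKey and replyFields helpers are hypothetical too.

```typescript
interface Thread {
  id: string // assumed
  siteKey: string
  resourceType: string
  resourceId: string
}

interface Comment {
  id: string // assumed
  threadId: string // assumed
  userId: string // assumed
  parentCommentId: string | null // null for top-level comments
  depth: number // assumed convention: 0 at top level, parent depth + 1 for replies
  bodyMarkdown: string // assumed name for the markdown body
  bodyHtml: string
  editedAt: Date | null // assumed name for the edit timestamp
  deletedAt: Date | null
  deletedBy: string | null
}

interface CommentReaction {
  commentId: string
  userId: string
  reaction: string
}

// Mirrors the unique (siteKey, resourceType, resourceId) tuple as a lookup key.
function threadKey(t: Pick<Thread, 'siteKey' | 'resourceType' | 'resourceId'>): string {
  return JSON.stringify([t.siteKey, t.resourceType, t.resourceId])
}

// A reply points at its parent and sits one level deeper.
function replyFields(parent: Pick<Comment, 'id' | 'depth'> | null) {
  return parent === null
    ? { parentCommentId: null, depth: 0 }
    : { parentCommentId: parent.id, depth: parent.depth + 1 }
}
```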
Soft Delete And Cleanup
Deleting a comment starts as a soft delete. Each row stores deletedAt and deletedBy. A cron job hard-deletes rows that were soft-deleted more than 72 hours ago.
That gives moderators room to react without instant data loss. It keeps the active tables clean over time too.
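The two phases can be sketched as a pair of small functions. This assumes the 72 hours are measured from deletedAt. The SoftDeletable type and both helper names are hypothetical.

```typescript
const PURGE_AFTER_MS = 72 * 60 * 60 * 1000

interface SoftDeletable {
  id: string
  deletedAt: Date | null
  deletedBy: string | null
}

// Phase one: mark the row instead of removing it.
function softDelete<T extends SoftDeletable>(row: T, deletedBy: string, now: Date): T {
  return { ...row, deletedAt: now, deletedBy }
}

// Phase two: the cleanup job only touches rows whose grace period has passed.
function isPurgeable(row: SoftDeletable, now: Date): boolean {
  if (row.deletedAt === null) return false
  return now.getTime() - row.deletedAt.getTime() > PURGE_AFTER_MS
}
```

In SQL terms the cleanup is a single DELETE with a deletedAt cutoff. The function form just makes the rule easy to test.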
RSS-Gated Thread Resolution
The resolve endpoint takes siteKey, resourceType, and resourceId, then returns a threadId. The upsert is ordinary. The guardrail does the real work.
The service fetches the site RSS feed, extracts valid slugs, and creates threads only for posts that exist. Random slugs cannot fill the database with junk rows.
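Here is a minimal sketch of that guardrail. It assumes the slug is the last path segment of each item's link, and it uses a regex where a real XML parser would be safer. The extractSlugs and canCreateThread names and the example.com feed are made up for illustration.

```typescript
// Collects the last path segment of every <item><link> in an RSS document.
function extractSlugs(rssXml: string): Set<string> {
  const slugs = new Set<string>()
  const itemLink = /<item>[\s\S]*?<link>([^<]+)<\/link>/g
  let match: RegExpExecArray | null
  while ((match = itemLink.exec(rssXml)) !== null) {
    const path = new URL(match[1].trim()).pathname
    const slug = path.split('/').filter(Boolean).pop()
    if (slug) slugs.add(slug)
  }
  return slugs
}

// Threads are only created for slugs the feed actually lists.
function canCreateThread(resourceId: string, knownSlugs: Set<string>): boolean {
  return knownSlugs.has(resourceId)
}

const sampleFeed = `<rss><channel><link>https://example.com/</link>
<item><title>A</title><link>https://example.com/blog/hello-world</link></item>
<item><title>B</title><link>https://example.com/blog/second-post/</link></item>
</channel></rss>`

const knownSlugs = extractSlugs(sampleFeed)
```

With this feed, canCreateThread('hello-world', knownSlugs) passes and a made-up slug does not. The channel-level link is skipped because it sits outside any item.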
Why PKCE Fits
PKCE fits this public web login flow. The browser does not hold a secret. The callback still proves that the flow matches the request that started earlier.
The start route generates state, codeVerifier, and codeChallenge. It stores the verifier and return path in OAuthState, then redirects to GitHub with the state and challenge.
The browser stays simple. The service keeps the sensitive exchange on the server.
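The values from the start route can be generated with Node's crypto module. This is a sketch of the S256 method from RFC 7636, not a copy of my route. The function names, the byte lengths, and the client ID and redirect URI arguments are assumptions.

```typescript
import { createHash, randomBytes } from 'node:crypto'

// S256: the challenge is the base64url-encoded SHA-256 hash of the verifier.
function deriveChallenge(codeVerifier: string): string {
  return createHash('sha256').update(codeVerifier).digest('base64url')
}

function createPkceParams() {
  const state = randomBytes(16).toString('base64url')
  // 32 random bytes encode to 43 characters, the minimum verifier length in RFC 7636.
  const codeVerifier = randomBytes(32).toString('base64url')
  return { state, codeVerifier, codeChallenge: deriveChallenge(codeVerifier) }
}

// Only state and the challenge go to GitHub. The verifier stays on the server.
function buildAuthorizeUrl(clientId: string, redirectUri: string, state: string, codeChallenge: string): string {
  const url = new URL('https://github.com/login/oauth/authorize')
  url.searchParams.set('client_id', clientId)
  url.searchParams.set('redirect_uri', redirectUri)
  url.searchParams.set('state', state)
  url.searchParams.set('code_challenge', codeChallenge)
  url.searchParams.set('code_challenge_method', 'S256')
  return url.toString()
}
```

On the callback, the server looks up the stored verifier by state and sends it with the code exchange, so a stolen authorization code is useless on its own.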
Return Path Validation
Every OAuth flow needs a safe return path after login. I validate returnTo against known blog origins plus the service origin.
Without that check, login can become an open redirect risk.
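A minimal version of that check compares the parsed origin against an allowlist and falls back to a safe default. The origins, the safeReturnTo name, and the fallback behavior are placeholders for this sketch.

```typescript
// Hypothetical origins: the blog plus the comments service itself.
const ALLOWED_RETURN_ORIGINS = new Set([
  'https://blog.example.com',
  'https://comments.example.com',
])

function safeReturnTo(returnTo: string | null, fallback: string): string {
  if (!returnTo) return fallback
  try {
    // Parsing first means tricks like "https://blog.example.com@evil.net"
    // are judged by their real origin, not by a string prefix.
    const url = new URL(returnTo)
    return ALLOWED_RETURN_ORIGINS.has(url.origin) ? url.toString() : fallback
  } catch {
    // Relative or malformed values never reach the redirect.
    return fallback
  }
}
```

Comparing url.origin instead of using startsWith is the important part. Prefix checks are the usual way this kind of validation gets bypassed.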
Server-Side Sessions
Sessions live in Postgres. The cookie is only a pointer: lh_comments_session=<uuid>.
On each authenticated request, the service reads the cookie, loads the session row, checks revocation, checks expiry, and returns the user. lastUsedAt updates in the background.
I picked this over JWTs. Revocation stays simple. Ban enforcement stays simple. Session invalidation does not need extra token rules.
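The per-request check is small enough to sketch in full. The SessionRow fields beyond expiresAt, revokedAt, and lastUsedAt are assumed, and the row lookup is left out so the logic stays independent of the database client.

```typescript
const SESSION_COOKIE = 'lh_comments_session'

interface SessionRow {
  id: string // assumed
  userId: string // assumed
  expiresAt: Date
  revokedAt: Date | null
  lastUsedAt: Date
}

type SessionCheck =
  | { ok: true; userId: string }
  | { ok: false; reason: 'missing' | 'revoked' | 'expired' }

// Pulls the session id out of a raw Cookie header.
function readSessionId(cookieHeader: string | null): string | null {
  if (!cookieHeader) return null
  for (const part of cookieHeader.split(';')) {
    const [name, ...rest] = part.trim().split('=')
    if (name === SESSION_COOKIE) return rest.join('=') || null
  }
  return null
}

// Revocation is checked before expiry so a banned user's session
// fails the same way whether or not it has also timed out.
function checkSession(row: SessionRow | undefined, now: Date): SessionCheck {
  if (!row) return { ok: false, reason: 'missing' }
  if (row.revokedAt !== null) return { ok: false, reason: 'revoked' }
  if (row.expiresAt.getTime() <= now.getTime()) return { ok: false, reason: 'expired' }
  return { ok: true, userId: row.userId }
}
```

Revoking a session or banning a user is then a single UPDATE that sets revokedAt. Nothing has to wait for a token to expire.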
Mutation Gating
Every write checks Origin, even with CORS configured. In production, the service requires an Origin header and rejects requests outside the allowlist. The server owns write boundaries.
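```typescript
// Pseudocode shaped like the real route guard.
// `env` and `isAllowedOrigin` are assumed to be defined elsewhere in the service.
import type { NextRequest } from 'next/server'

export async function mutationAllowed(request: NextRequest) {
  const origin = request.headers.get('origin')

  if (env.NODE_ENV === 'production') {
    // Production: the Origin header is mandatory and must be allowlisted.
    if (!origin) return { ok: false, code: 'MUTATION_ORIGIN_REQUIRED' }
    if (!isAllowedOrigin(origin)) return { ok: false, code: 'MUTATION_ORIGIN_NOT_ALLOWED' }
  } else {
    // Elsewhere: a missing Origin is tolerated, a wrong one is not.
    if (origin && !isAllowedOrigin(origin)) {
      return { ok: false, code: 'MUTATION_ORIGIN_NOT_ALLOWED' }
    }
  }

  // CSRF check happens here too
  return { ok: true }
}
```
CSRF Mechanism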
The service uses a CSRF cookie and a request header. The cookie stores csrf_token=<random>. The client sends the same value in X-CSRF-Token.
The server checks presence, equal length, and constant-time equality. Failure timing should not reveal token shape.
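```typescript
import { timingSafeEqual } from 'node:crypto'
import { cookies } from 'next/headers'
import type { NextRequest } from 'next/server'

const CSRF_COOKIE = 'csrf_token'

export async function verifyCsrf(request: NextRequest) {
  const cookieToken = (await cookies()).get(CSRF_COOKIE)?.value
  const headerToken = request.headers.get('x-csrf-token')

  if (!cookieToken || !headerToken) return false

  // Compare byte lengths, not string lengths: timingSafeEqual throws
  // when the two buffers differ in size.
  const cookieBytes = Buffer.from(cookieToken)
  const headerBytes = Buffer.from(headerToken)
  if (cookieBytes.length !== headerBytes.length) return false

  // Constant-time compare avoids timing leaks
  return timingSafeEqual(cookieBytes, headerBytes)
}
```
How The Client Gets The Token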
The /v1/me endpoint returns the current user, plus csrfToken. The client calls that endpoint on load, keeps the token in memory, and attaches it to each write request.
That keeps the write path explicit. Random components do not depend on hidden state.
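On the client, that flow fits in a small wrapper. This is a sketch under assumptions: the { user, csrfToken } response shape, the createCommentsClient name, and the injected fetch function are mine, not the real frontend code.

```typescript
// Minimal fetch shape so the sketch runs in a browser or in a test with a fake.
type FetchLike = (
  url: string,
  init?: { method?: string; credentials?: 'include'; headers?: Record<string, string>; body?: string }
) => Promise<{ ok: boolean; json(): Promise<unknown> }>

function createCommentsClient(baseUrl: string, fetchFn: FetchLike) {
  // The token lives in memory only, never in localStorage.
  let csrfToken: string | null = null

  async function loadMe() {
    const res = await fetchFn(`${baseUrl}/v1/me`, { credentials: 'include' })
    const data = (await res.json()) as { user: unknown; csrfToken: string }
    csrfToken = data.csrfToken
    return data.user
  }

  async function write(path: string, body: unknown) {
    const token = csrfToken
    // Fail loudly on the client instead of sending a request the server will reject.
    if (!token) throw new Error('Call loadMe() before sending writes')
    return fetchFn(`${baseUrl}${path}`, {
      method: 'POST',
      credentials: 'include',
      headers: { 'Content-Type': 'application/json', 'X-CSRF-Token': token },
      body: JSON.stringify(body),
    })
  }

  return { loadMe, write }
}
```

Every component that writes goes through write(), so there is exactly one place where the header gets attached.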
Where Failures Come From
Most failures are ordinary browser problems. A new origin is missing from the allowlist. One request forgets the CSRF header. Cookies stop crossing the boundary when http and https get mixed. A write fires before the /v1/me request has returned.
The narrow flow makes debugging plain. Once the request path is fixed, the work gets dull in the best way.
Read Latency, p50
Most read wins came from doing less work. The service returns bodyHtml instead of rendering on each client load. Likes are grouped in one query. Response shapes stay consistent, so the frontend does not need follow-up fetches for normal list views.
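The like grouping can be sketched as one fetch of reaction rows for every comment on the page, merged in memory. The row shapes and the attachLikes name are hypothetical, and the likeCount field is an assumption on top of the liked flag.

```typescript
interface CommentRow {
  id: string
  bodyHtml: string
}

interface ReactionRow {
  commentId: string
  userId: string
}

// `reactions` is the result of a single query covering all comment ids in the
// list, rather than one query per comment.
function attachLikes(comments: CommentRow[], reactions: ReactionRow[], viewerId: string | null) {
  const counts = new Map<string, number>()
  const likedByViewer = new Set<string>()
  for (const r of reactions) {
    counts.set(r.commentId, (counts.get(r.commentId) ?? 0) + 1)
    if (viewerId !== null && r.userId === viewerId) likedByViewer.add(r.commentId)
  }
  return comments.map((c) => ({
    ...c,
    likeCount: counts.get(c.id) ?? 0, // assumed field
    likedByViewer: likedByViewer.has(c.id),
  }))
}
```

Signed-out readers pass null as the viewer and get false for every comment, so the response shape stays the same either way.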
Read Latency, p95
The tail tells the truth. Spikes usually point to cold starts, slow database setup, or a large thread without a proper limit.
Watch those numbers. Readers notice tail latency first.
Write Latency
Writes cost more. They include markdown render, sanitizing, and the full set of checks.
That is fine. Reads should stay cheap on a blog. A heavier write path is acceptable when the common path stays fast.
Error Rate
The day 4 bump matches failures I expect from an RSS fetch problem, an allowlist mismatch after a domain change, or a client request missing credentials: 'include'.
Problems I Hit In Practice
The hard parts were not SQL. Most issues came from browser state and cross-origin rules.
Cookies Across Multiple Origins
The core checks were simple. Is the blog on HTTPS? Is the service on HTTPS? Is the browser sending credentials?
Most random 401s are not random. A cookie failed to cross the boundary.
CSRF Token Ordering
A frontend POST sent before /v1/me has returned should fail, because the client has no CSRF token to attach yet. That failure feels surprising until the strict write path is clear.
Allowlist Drift
Preview domains come and go. A missing allowlist update makes writes fail.
The fix is small. Origin policy still needs one source of truth.
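One way to get that single source of truth is to parse one configuration value and build every origin check from it. The comma-separated format, the function names, and the example domains are assumptions for this sketch.

```typescript
// Parses a comma-separated list such as an ALLOWED_ORIGINS env value (hypothetical name).
function parseAllowedOrigins(raw: string): Set<string> {
  const origins = new Set<string>()
  for (const entry of raw.split(',')) {
    const trimmed = entry.trim()
    // URL normalizes case and drops paths and trailing slashes.
    if (trimmed) origins.add(new URL(trimmed).origin)
  }
  return origins
}

// Both the CORS layer and the mutation guard should call the same function.
function makeIsAllowedOrigin(allowed: Set<string>) {
  return (origin: string): boolean => allowed.has(origin)
}

const isAllowedOrigin = makeIsAllowedOrigin(
  parseAllowedOrigins('https://Blog.Example.com/, https://preview-123.example.dev')
)
```

Adding a preview domain then means editing one value, and a typo in it fails at startup when new URL throws instead of at the first write.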
Rate Limits In Multi-Instance Setups
The current rate limiter uses an in-memory map. That works on one instance.
A shared store such as Redis is the next step once the service runs across multiple instances, because each instance would otherwise count requests on its own.
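For reference, an in-memory limiter of this kind can be as small as the sketch below. It assumes a fixed-window policy, which this post does not spell out, and the createRateLimiter name and its parameters are illustrative.

```typescript
interface Bucket {
  count: number
  windowStart: number
}

// Fixed-window limiter: at most `limit` hits per key in each `windowMs` window.
function createRateLimiter(limit: number, windowMs: number) {
  // This map lives in one process, which is why it stops being accurate
  // as soon as requests are spread over several instances.
  const buckets = new Map<string, Bucket>()

  return function allow(key: string, now: number = Date.now()): boolean {
    const bucket = buckets.get(key)
    if (!bucket || now - bucket.windowStart >= windowMs) {
      buckets.set(key, { count: 1, windowStart: now })
      return true
    }
    if (bucket.count >= limit) return false
    bucket.count += 1
    return true
  }
}
```

Usage would look like const allowWrite = createRateLimiter(5, 60_000), followed by allowWrite(userId) in the write gate. Keeping the allow(key) signature means only the storage changes when the counter moves to a shared store.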
Closing
This project stays small through simple identity, server-side sessions, guarded writes, sanitized markdown, and built-in moderation.
That is the goal: a comments service that does the job, stays understandable, and does not turn routine maintenance into a weekend project.