Maintaining JWS in Production
Testing, Migration, and Troubleshooting
This article is part of a series on understanding the hows and whys of JSON Web Signatures (JWS).
There's accompanying code: it's referred to and linked throughout the content. But if you'd rather just read raw code, head over here.
You’ve read the theory. You’ve implemented JWS signing and verification. Your code looks good in the PR. But before you deploy this to production, you need answers to three questions:
- How do I know this actually works? (Testing)
- How do I roll this out without breaking production? (Migration)
- When it breaks, what do I do? (Troubleshooting)
This post covers the operational discipline that separates toy implementations from production-grade systems.
Testing Strategies
The Critical Test Cases
These tests verify the core security guarantees of JWS verification. Each test case corresponds to a specific attack vector or edge case that must be handled correctly.
- valid signature from partner
- expired token
- tampered payload
- wrong signature (different private key)
- algorithm substitution attack (both "none" and substituted algorithm)
- clock skew tolerance (but rejects if excessive)
- whitespace in JSON (canonicalization)
- unknown kid (handled by the JWKS cache miss mechanism)
- key rotation during request (checked via tests for the JWKS cache)
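Several of these cases can be exercised with nothing but OTP. The following is a minimal sketch, not the demo project's Verifier: MiniJWS and its functions are hypothetical names, the header is pinned to exactly {"alg":"ES256"} (so any other algorithm, including none, is rejected as malformed), and it assumes :public_key can encode and decode the ECDSA-Sig-Value structure on your OTP version.

```elixir
defmodule MiniJWS do
  # Hypothetical helper for illustration only: a bare-bones ES256 compact JWS
  # built on OTP's :crypto and :public_key, with no JSON parsing.
  @header ~s({"alg":"ES256"})

  def generate_keypair, do: :crypto.generate_key(:ecdh, :secp256r1)

  def sign(payload, private_key) when is_binary(payload) do
    signing_input = b64(@header) <> "." <> b64(payload)
    der = :crypto.sign(:ecdsa, :sha256, signing_input, [private_key, :secp256r1])
    # JWS carries ECDSA signatures as fixed-width R || S rather than DER.
    {:"ECDSA-Sig-Value", r, s} = :public_key.der_decode(:"ECDSA-Sig-Value", der)
    signing_input <> "." <> b64(<<r::256, s::256>>)
  end

  def verify(jws, public_key) do
    with [header_b64, payload_b64, sig_b64] <- String.split(jws, "."),
         # Strict pin: the header must be byte-for-byte the one we expect.
         {:ok, @header} <- Base.url_decode64(header_b64, padding: false),
         {:ok, payload} <- Base.url_decode64(payload_b64, padding: false),
         {:ok, <<r::256, s::256>>} <- Base.url_decode64(sig_b64, padding: false) do
      der = :public_key.der_encode(:"ECDSA-Sig-Value", {:"ECDSA-Sig-Value", r, s})
      signing_input = header_b64 <> "." <> payload_b64

      if :crypto.verify(:ecdsa, :sha256, signing_input, der, [public_key, :secp256r1]) do
        {:ok, payload}
      else
        {:error, :invalid_signature}
      end
    else
      _ -> {:error, :malformed}
    end
  end

  defp b64(bin), do: Base.url_encode64(bin, padding: false)
end

{pub, priv} = MiniJWS.generate_keypair()
{other_pub, _other_priv} = MiniJWS.generate_keypair()
jws = MiniJWS.sign(~s({"amount":50000}), priv)
[h, p, s] = String.split(jws, ".")

# Valid signature from partner
{:ok, ~s({"amount":50000})} = MiniJWS.verify(jws, pub)

# Tampered payload: same header and signature, different payload bytes
tampered = Enum.join([h, Base.url_encode64(~s({"amount":1}), padding: false), s], ".")
{:error, :invalid_signature} = MiniJWS.verify(tampered, pub)

# Wrong signature: verified against a key that did not sign it
{:error, :invalid_signature} = MiniJWS.verify(jws, other_pub)

# Algorithm substitution: an unsigned token claiming "none"
none_jws = Base.url_encode64(~s({"alg":"none"}), padding: false) <> "." <> p <> "."
{:error, :malformed} = MiniJWS.verify(none_jws, pub)
```

Each pattern match raises if the verifier returns anything else, so the script doubles as a smoke test. In a real suite the same four cases would presumably live in ExUnit tests against the production verifier rather than this stand-in.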
Contract Testing with Partners
The problem: Your tests use keys you generated. But in production, partners have their own keys. How do you verify interoperability?
Operational discipline: When onboarding a new partner:
- Exchange test JWKS endpoints
- Both sides generate sample signed requests
- Both sides verify they can validate the other's signatures
- Only then proceed to production
Migration Strategies
The Challenge
You have an existing API without JWS. Partners are sending authorization requests today. You need to add JWS without:
- Breaking existing partners
- Deploying to all partners simultaneously
- Risking production availability
Phased Rollout
Phase 1: Detection Mode (Week 1-2)
Deploy verification code but don’t enforce. Log results.
What you learn:
- Which partners are already sending signatures
- What verification failures occur (clock skew issues, algorithm mismatches, etc.)
- Performance impact of verification (typical: <10ms with cached JWKS)
Phase 2: Warning Mode (Week 3-4)
Add warnings to API responses for partners not sending signatures.
Send emails to partner technical contacts: “Your integration is missing JWS signatures. These will be required in 30 days.”
Phase 3: Enforcement Mode with Allowlist (Week 5-6)
Start rejecting requests without signatures, but maintain allowlist for partners who need more time.
Phase 4: Full Enforcement (Week 7+)
Remove allowlist. All requests must have valid signatures.
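One way to keep the four phases honest is to express them as a single decision function whose mode comes from configuration. The sketch below is an illustration under assumptions: RolloutPolicy, the mode atoms, and the result shapes (:valid, :missing, {:invalid, reason}) are hypothetical names, and treating invalid signatures the same as missing ones during the detection and warning phases is a choice the phases above do not spell out.

```elixir
defmodule RolloutPolicy do
  # Hypothetical sketch: the four rollout phases as a pure decision function.
  # `result` is whatever your verifier reported for the request:
  # :valid, :missing, or {:invalid, reason}.

  def decide(mode, result, partner_id, allowlist \\ MapSet.new())

  # A valid signature is accepted in every phase.
  def decide(_mode, :valid, _partner_id, _allowlist), do: :allow

  # Phase 1 (detection): never reject, only record what would have failed.
  def decide(:detect, result, _partner_id, _allowlist), do: {:allow_and_log, result}

  # Phase 2 (warning): still accept, but surface a warning to the partner.
  def decide(:warn, result, _partner_id, _allowlist), do: {:allow_with_warning, result}

  # Phase 3 (enforcement with allowlist): reject unless the partner has an extension.
  def decide(:enforce_with_allowlist, result, partner_id, allowlist) do
    if MapSet.member?(allowlist, partner_id) do
      {:allow_with_warning, result}
    else
      {:reject, result}
    end
  end

  # Phase 4 (full enforcement): no exceptions.
  def decide(:enforce, result, _partner_id, _allowlist), do: {:reject, result}
end

slow_partners = MapSet.new(["partner_a"])
IO.inspect(RolloutPolicy.decide(:detect, :missing, "partner_b"))
IO.inspect(RolloutPolicy.decide(:enforce_with_allowlist, :missing, "partner_a", slow_partners))
IO.inspect(RolloutPolicy.decide(:enforce_with_allowlist, :missing, "partner_b", slow_partners))
```

Because the mode is just a value, moving between phases (or rolling back from enforcement to detection) becomes a configuration change rather than a code deploy.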
Troubleshooting Guide
“Signature Verification Failed” - The Debugging Protocol
CODE LINK: Verification error handling:
lib/jws_demo/jws/verifier.ex:85 - Validation pipeline with detailed error messages
When you see signature_verification_failed in logs:
Step 1: Verify the JWS structure
defmodule JWSDebug do
  def diagnose(jws_string) do
    IO.puts("=== JWS Diagnostic Report ===\n")

    # Check basic structure: header.payload.signature
    parts = String.split(jws_string, ".")
    IO.puts("Parts count: #{length(parts)} (should be 3)")

    case parts do
      [header_b64, payload_b64, signature_b64] ->
        diagnose_header(header_b64)
        diagnose_payload(payload_b64)

        # Check signature
        IO.puts("\nSignature (Base64URL): #{String.slice(signature_b64, 0, 50)}...")
        IO.puts("Signature length: #{byte_size(signature_b64)} characters")

      _ ->
        # Elixir has no early return, so the malformed case is its own branch
        IO.puts("❌ Invalid JWS structure")
    end
  end

  # Decode header
  defp diagnose_header(header_b64) do
    case Base.url_decode64(header_b64, padding: false) do
      {:ok, header_json} ->
        header = Jason.decode!(header_json)
        IO.puts("✓ Header decoded successfully")
        IO.puts(" Algorithm: #{header["alg"]}")
        IO.puts(" Key ID: #{header["kid"]}")

        # Check algorithm is allowed
        if header["alg"] not in ["ES256", "ES384", "ES512"] do
          IO.puts("⚠ Algorithm #{header["alg"]} not in allowlist")
        end

      # Base.url_decode64/2 returns a bare :error atom on invalid input
      :error ->
        IO.puts("❌ Header Base64 decoding failed")
    end
  end

  # Decode payload
  defp diagnose_payload(payload_b64) do
    case Base.url_decode64(payload_b64, padding: false) do
      {:ok, payload_json} ->
        claims = Jason.decode!(payload_json)
        IO.puts("✓ Payload decoded successfully")
        IO.puts(" Claims: #{inspect(claims, pretty: true)}")

        # Check timestamps
        now = System.system_time(:second)

        if claims["exp"] do
          if claims["exp"] < now do
            IO.puts("❌ Token expired #{now - claims["exp"]} seconds ago")
          else
            IO.puts("✓ Token expires in #{claims["exp"] - now} seconds")
          end
        end

        if claims["iat"] do
          if claims["iat"] > now + 300 do
            IO.puts("⚠ Token issued #{claims["iat"] - now} seconds in future (clock skew?)")
          else
            IO.puts("✓ Token issued #{now - claims["iat"]} seconds ago")
          end
        end

      :error ->
        IO.puts("❌ Payload Base64 decoding failed")
    end
  end
end
# In IEx:
iex> JWSDebug.diagnose(jws_from_request)
=== JWS Diagnostic Report ===
Parts count: 3 (should be 3)
✓ Header decoded successfully
Algorithm: ES256
Key ID: 2025-01-15
✓ Payload decoded successfully
Claims: %{
"amount" => 50000,
"exp" => 1736950800,
"iat" => 1736950500
}
✓ Token expires in 289 seconds
✓ Token issued 11 seconds ago
Signature (Base64URL): MEUCIQDEx7...
Signature length: 86 characters
Step 2: Verify you have the right public key
# partner_domain is passed in explicitly; look it up from your own partner
# configuration if you store it elsewhere
def check_key_availability(kid, partner_id, partner_domain) do
  # Check JWKS cache
  case JWKSCache.get_key(partner_id, kid) do
    {:ok, key} ->
      IO.puts("✓ Key #{kid} found in cache for #{partner_id}")
      IO.inspect(key, label: "Cached key")

    {:error, :not_found} ->
      IO.puts("❌ Key #{kid} not in cache")

      # Try fetching fresh JWKS
      IO.puts("Fetching fresh JWKS...")

      case HTTPoison.get("https://#{partner_domain}/jwks.json") do
        {:ok, %{status_code: 200, body: body}} ->
          jwks = Jason.decode!(body)
          kids = Enum.map(jwks["keys"], & &1["kid"])
          IO.puts("Available keys in JWKS: #{inspect(kids)}")

          if kid in kids do
            IO.puts("⚠ Key exists in JWKS but not in cache - cache stale?")
          else
            IO.puts("❌ Key #{kid} not in partner's JWKS")
          end

        error ->
          IO.puts("❌ Failed to fetch JWKS: #{inspect(error)}")
      end
  end
end
Step 3: Check for whitespace issues
def check_payload_reconstruction(jws_string, body_from_request) do
  # Extract payload from JWS
  [_, payload_b64, _] = String.split(jws_string, ".")
  {:ok, payload_from_jws} = Base.url_decode64(payload_b64, padding: false)

  IO.puts("Payload from JWS signature:")
  IO.puts(payload_from_jws)
  IO.puts("\nPayload from request body:")
  IO.puts(body_from_request)

  if payload_from_jws == body_from_request do
    IO.puts("\n✓ Payloads match exactly")
  else
    IO.puts("\n❌ Payloads differ!")
    IO.puts("This is the problem - signature was computed over different bytes")

    # Show differences
    diff = String.myers_difference(payload_from_jws, body_from_request)
    IO.inspect(diff, label: "Difference")
  end
end
Step 4: Manually verify with OpenSSL
# Extract components
echo "eyJhbGc..." > header.b64
echo "eyJhbW91..." > payload.b64
echo "MEUCIQDEx..." > signature.b64
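# Decode header and payload. JWS segments are unpadded Base64URL, so map the
# alphabet back to standard Base64 and restore padding before decoding.
b64url_decode() {
  local s
  s=$(tr '_-' '/+' < "$1")
  while [ $(( ${#s} % 4 )) -ne 0 ]; do s="${s}="; done
  printf '%s' "$s" | base64 -d
}
b64url_decode header.b64 > header.json
b64url_decode payload.b64 > payload.json
# Create signing input (printf avoids a trailing newline, which would change the signed bytes)
printf '%s.%s' "$(cat header.b64)" "$(cat payload.b64)" > signing_input.txt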
# Get public key from partner JWKS
curl https://partner.example/jwks.json | jq -r '.keys[0]' > key.jwk
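# Convert JWK to PEM (using script from "Audit Trails" Post)
# Run inside the Mix project: Jason is only on the code path there, and
# jwk_to_pem must be defined or imported wherever this snippet runs.
mix run -e '
{:ok, jwk_json} = File.read("key.jwk")
jwk = Jason.decode!(jwk_json)
pem = jwk_to_pem(jwk)
File.write!("public_key.pem", pem)
'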
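# Decode signature from Base64URL and make sure it is DER for openssl
elixir -e '
{:ok, sig_b64} = File.read("signature.b64")
{:ok, sig} = sig_b64 |> String.trim() |> Base.url_decode64(padding: false)
# An ES256 signature in a JWS is the raw 64-byte R || S pair (RFC 7518, section 3.4),
# while openssl expects ASN.1 DER, so re-encode when the raw form is detected.
sig_der =
  case sig do
    <<r::256, s::256>> -> :public_key.der_encode(:"ECDSA-Sig-Value", {:"ECDSA-Sig-Value", r, s})
    already_der -> already_der
  end
File.write!("signature.der", sig_der)
'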
# Verify
openssl dgst -sha256 -verify public_key.pem -signature signature.der signing_input.txt
# Output: "Verified OK" or "Verification Failure"
Common Issues and Fixes
Unknown key ID
Error: kid "2025-01-15" not found in JWKS
Diagnosis:
- Partner rotated keys but your cache hasn't refreshed
- Partner's JWKS endpoint is down
- Kid is misspelled
Fix: Implement an automatic cache refresh whenever an unknown kid is encountered.
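The refresh-on-miss behaviour can be sketched as a pure function. This is an illustration, not the demo's JWKS cache: KidLookup, the map-based cache, and the injected fetch function are hypothetical, chosen so the logic can be tested without a network.

```elixir
defmodule KidLookup do
  # Hypothetical sketch of refresh-on-miss.
  # cache:      %{kid => key} for one partner
  # fetch_jwks: zero-arity function returning {:ok, %{kid => key}} or {:error, reason}
  # Returns the (possibly refreshed) cache as the last tuple element.
  def get_key(cache, kid, fetch_jwks) do
    case Map.fetch(cache, kid) do
      {:ok, key} ->
        {:ok, key, cache}

      :error ->
        # Unknown kid: the partner may have rotated keys, so refetch once.
        case fetch_jwks.() do
          {:ok, fresh} ->
            case Map.fetch(fresh, kid) do
              {:ok, key} -> {:ok, key, fresh}
              :error -> {:error, :unknown_kid, fresh}
            end

          {:error, reason} ->
            # Keep serving the old cache if the endpoint is down.
            {:error, {:jwks_unavailable, reason}, cache}
        end
    end
  end
end

stale = %{"2024-12-01" => :old_key}
rotated = fn -> {:ok, %{"2024-12-01" => :old_key, "2025-01-15" => :new_key}} end

{:ok, :new_key, refreshed} = KidLookup.get_key(stale, "2025-01-15", rotated)
IO.inspect(Map.keys(refreshed), label: "kids after refresh")
```

Since the kid comes from an unverified header, a production version would likely want to rate-limit refreshes per partner so that a stream of made-up kid values cannot turn into a stream of outbound JWKS requests.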
Clock skew - token not yet valid
Error: iat 1736951000 is 400 seconds in the future
Diagnosis:
- Partner's server clock is ahead
- Your server clock is behind
- Timezone confusion (using local time instead of UTC)
Fix:
# Check your server time
System.system_time(:second) |> DateTime.from_unix!() |> IO.inspect()
# Check partner's time (from their token)
claims["iat"] |> DateTime.from_unix!() |> IO.inspect()
# If difference > 5 minutes, investigate
# Are you both using UTC? Is NTP configured?
Temporary fix: Increase clock skew tolerance for this partner:
defmodule PartnerConfig do
  def get_clock_skew_tolerance(partner_id) do
    case partner_id do
      "partner_with_clock_issues" -> 600 # 10 minutes
      _ -> 300 # 5 minutes default
    end
  end
end
Permanent fix: Contact partner to fix their clock synchronization.
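To make the effect of the tolerance concrete, here is a minimal sketch of a timestamp check that takes the tolerance as a parameter. TimeClaims is a hypothetical module, and applying the same tolerance to exp as to iat is an assumption; your verifier may treat the two differently.

```elixir
defmodule TimeClaims do
  # Hypothetical sketch: validate iat/exp against `now` (Unix seconds, UTC)
  # with a configurable skew tolerance in seconds.
  def check(claims, now, skew \\ 300) do
    exp = claims["exp"]
    iat = claims["iat"]

    cond do
      # Expired, even after allowing for the partner's clock running behind
      is_integer(exp) and exp + skew < now -> {:error, :expired}
      # Issued further in the future than the tolerance allows
      is_integer(iat) and iat - skew > now -> {:error, :issued_in_future}
      true -> :ok
    end
  end
end

# The error above: iat is 400 seconds ahead of our clock.
now = 1_736_950_600
claims = %{"iat" => 1_736_951_000, "exp" => 1_736_951_300}

IO.inspect(TimeClaims.check(claims, now))       # default 300s tolerance
IO.inspect(TimeClaims.check(claims, now, 600))  # widened tolerance for this partner
```

With the default tolerance the 400-second offset is rejected as :issued_in_future; with the widened 600-second tolerance the same token passes.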
Algorithm not allowed
Error: Algorithm RS256 not in allowlist
Diagnosis:
- Partner is using RSA instead of ECDSA
- Your allowlist is too restrictive
Fix:
# Check what algorithm they're using
{:ok, header} = peek_header(jws)
IO.inspect(header["alg"])
# If RS256/RS384/RS512, decide:
# - Are you willing to accept RSA? (slower, larger keys)
# - Or should partner migrate to ES256?
# If accepting RSA, update allowlist:
@allowed_algorithms ["ES256", "ES384", "RS256"]
"Signature verification failed" but everything looks right
Diagnosis:
- Payload canonicalization mismatch
- Partner signing one thing, you're verifying another
The smoking gun:
# What partner signed:
partner_signed = ~s({"amount":50000,"merchant_id":"merch_789"})
# What you're verifying:
you_received = ~s({"merchant_id":"merch_789","amount":50000})
# DIFFERENT JSON key order = different signature!
Fix: Never reconstruct the JSON payload yourself. Verify the JWS structure as received, then extract the verified payload from it. The payload embedded in the JWS is what was signed: that's your source of truth.
If you try to re-parse or re-serialize JSON from the request body, you'll introduce canonicalization differences (whitespace, key order, number formatting) that break signature verification. Always verify the JWS first, then use the payload that comes out of successful verification.
See the demo's verification flow: Verifier.verify/3 verifies the JWS structure and returns the decoded payload, and VerifyJWSPlug shows the full request handling pattern.
When to Escalate
Some failures require partner involvement. Escalate to the partner when:
- Their JWKS endpoint is down for >1 hour
- You consistently see unknown kid values
- Clock skew exceeds 10 minutes
- Algorithm used is not agreed upon
The Complete Testing Checklist
Before deploying JWS to production:
- Valid signature from partner test environment passes
- Expired signature is rejected
- Tampered payload is rejected
- Wrong signature (different key) is rejected
- none algorithm is rejected
- Clock skew within tolerance is accepted
- Excessive clock skew is rejected
- Unknown kid triggers cache refresh
- Key rotation during request succeeds
- Audit trail can be re-verified after storage
- Contract tests pass with all partners
- Migration plan documented and communicated
- Rollback plan tested (can you disable enforcement quickly?)
- Monitoring alerts configured for verification failures
- On-call team trained on troubleshooting steps
Summary
Testing proves your implementation works. Migration proves you can deploy it safely. Troubleshooting proves you can maintain it when it breaks.
JWS isn't complex, but it's unforgiving. A single mistake in Base64 encoding, JSON canonicalization, or clock skew tolerance will cause production incidents. The difference between teams that succeed and teams that struggle is operational discipline:
- Test the failures, not just the happy path
- Migrate gradually with detection and warning phases
- Build debugging tools before you need them
- Document the troubleshooting protocol
Once you can answer "How do I know this works?", "How do I deploy this?", and "What do I do when it breaks?", you’re ready for production.