This is part 2 of the AI Recruiting Pipeline Epic.
The traditional approach to LLM extraction is prompting for prose and parsing it:
Extract the shift from this job description. Return "1st", "2nd", "3rd", or "Varies".
This fails in production because LLMs don't follow instructions precisely. They add context, hedging, or explanation:
"Based on the description mentioning 'day shift hours', this appears to be a 1st shift position."
Now you're parsing natural language output to extract structured data — exactly what you were trying to avoid.
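The brittleness is easy to demonstrate. A hypothetical regex over the prose above works today and breaks the moment the model rephrases:

```ruby
# Hypothetical pre-schema parsing: scrape the shift label out of prose.
text = "Based on the description mentioning 'day shift hours', " \
       "this appears to be a 1st shift position."

shift = text[/\b(1st|2nd|3rd|Varies)\b/]
# Matches "1st" here — but returns nil if the model says
# "first shift" or "days" instead.
```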
JSON Schemas Fix This
OpenAI's structured outputs enforce a schema on the response. The model can only return valid JSON matching your specification:
def response_format
  {
    type: "json_schema",
    json_schema: {
      name: "job_description_schema",
      strict: true,
      schema: {
        type: "object",
        properties: {
          shift: {
            type: "string",
            enum: ["1st", "2nd", "3rd", "Varies", "Unknown"]
          },
          hours_per_week: {
            type: ["integer", "null"]
          },
          benefit_eligible: {
            type: "boolean"
          }
        },
        required: ["shift", "hours_per_week", "benefit_eligible"],
        additionalProperties: false
      }
    }
  }
end
The response is guaranteed to be valid JSON with exactly the structure you need. Parse it directly into your domain:
result = JSON.parse(provider.content)

job_listing.update!(
  shift: result["shift"],
  hours_per_week: result["hours_per_week"],
  benefit_eligible: result["benefit_eligible"]
)
Designing the Schema
Schema design determines extraction quality. Key principles:
1. Use Enums for Categorical Fields
{
  type: "string",
  enum: ["Full-time", "Part-time", "PRN", "None", "Unknown"]
}
The model must pick from your options. No creative interpretations.
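Even so, it costs little to guard at the boundary before persisting. A hypothetical normalizer (the constant and method names are mine, not from the pipeline):

```ruby
# Hypothetical guard: only persist values the schema's enum allows.
EMPLOYMENT_TYPES = ["Full-time", "Part-time", "PRN", "None", "Unknown"].freeze

def normalize_employment_type(value)
  EMPLOYMENT_TYPES.include?(value) ? value : "Unknown"
end
```

Anything outside the contract, including a value from a future schema revision, degrades to "Unknown" instead of raising downstream.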
2. Allow Nulls for Optional Information
{
  type: ["integer", "null"],
  description: "Hours per week if explicitly stated, null otherwise"
}
This prevents the model from inventing data when the source is ambiguous.
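On the Ruby side, a JSON `null` parses to `nil`, so absence flows naturally into nullable columns. A quick illustration:

```ruby
require "json"

payload = JSON.parse('{"hours_per_week": null}')

payload["hours_per_week"]      # => nil
payload.key?("hours_per_week") # => true: the field is present, the value is absent
```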
3. Use Arrays for Multi-valued Fields
{
  required_credentials: {
    type: "array",
    items: {
      type: "object",
      properties: {
        name: { type: "string" },
        evidence: { type: ["string", "null"] }
      },
      required: ["name", "evidence"],
      additionalProperties: false
    }
  }
}
The evidence field captures why the model made the extraction — useful for debugging.
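A sketch of how the evidence field earns its keep: when auditing misclassifications, dump each extraction alongside its justification (the payload here is illustrative):

```ruby
require "json"

# Illustrative extraction payload with per-item evidence.
raw = <<~PAYLOAD
  {"required_credentials": [
    {"name": "RN", "evidence": "Current RN license required"},
    {"name": "BLS", "evidence": null}
  ]}
PAYLOAD

JSON.parse(raw)["required_credentials"].each do |cred|
  puts "#{cred["name"]}: #{cred["evidence"] || "(no supporting text captured)"}"
end
```

An extraction with `null` evidence is the first place to look when a credential seems invented.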
4. Descriptions Guide Interpretation
{
  education_requirements: {
    type: "array",
    items: {
      type: "object",
      properties: {
        name: {
          type: "string",
          enum: ["HS", "Associate", "BA/BS", "MA/MS", "PhD", "None"]
        }
      },
      required: ["name"],
      additionalProperties: false
    },
    description: "Required education levels explicitly stated in the posting. Leave empty if not specified."
  }
}
The description tells the model when to include entries versus leaving the array empty.
The System Prompt
The system prompt sets extraction rules:
def system_prompt
  <<~PROMPT
    You extract structured data from healthcare job descriptions.

    Rules:
    - Return only facts stated in the posting. Do not infer or guess.
    - Extract raw credential mentions exactly as written (e.g., "RN", "BLS", "CPR").
    - Use null for fields without explicit information.
    - Leave arrays empty when the posting does not provide relevant data.
  PROMPT
end
The rules prevent common failure modes:
- No inference — Models love to "help" by guessing
- Preserve original text — Don't normalize until you have explicit mappings
- Explicit nulls — Make absence of data explicit
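Putting the pieces together, the request body pairs the system prompt with the schema. A hypothetical assembly (the model name is an assumption, and the two helpers are stubbed minimally here to stay self-contained):

```ruby
# Minimal stand-ins for the methods defined earlier in the post.
def system_prompt
  "You extract structured data from healthcare job descriptions."
end

def response_format
  { type: "json_schema", json_schema: { name: "job_description_schema", strict: true } }
end

def extraction_request(job_text)
  {
    model: "gpt-4o-2024-08-06", # assumption: pin a dated model version
    messages: [
      { role: "system", content: system_prompt },
      { role: "user", content: job_text }
    ],
    response_format: response_format
  }
end
```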
Handling Extraction Failures
Even with schemas, extraction can fail:
def call_with_result
  provider = build_provider
  provider.call

  return rate_limited_result if provider.rate_limited?
  return error_result(provider.error) if provider.error

  content = provider.content
  return empty_result("empty_response") if content.blank?

  Result.new(
    data: JSON.parse(content),
    status: "success"
  )
rescue JSON::ParserError => e
  Result.new(
    data: empty_payload,
    status: "invalid_json",
    error_message: e.message
  )
end
The Result struct lets callers distinguish between successful extraction and various failure modes.
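The post doesn't show the Result struct itself; a minimal version consistent with how it's used above might look like this (the `retry_after` member anticipates the rate-limit handling below):

```ruby
# Sketch of the Result struct assumed by call_with_result.
Result = Struct.new(:data, :status, :error_message, :retry_after, keyword_init: true) do
  def success?
    status == "success"
  end
end
```

`keyword_init: true` lets callers pass `data:` and `status:` by name, matching the constructor calls in the extraction code.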
Retry Logic
Rate limits and transient failures need retries:
def call_with_retry(attempts: 3)
  result = nil

  attempts.times do |i|
    result = call_with_result
    return result if result.status == "success"

    if result.status == "rate_limited"
      sleep(result.retry_after || [i + 1, 5].min)
      next
    end

    return result if i == attempts - 1
  end

  # Fall through with the last result when the final attempt was rate-limited.
  result
end
Each retry waits longer. Rate limits respect the API's requested delay.
Production Patterns
After running this system on ~10,000 job descriptions:
What works:
- Strict schemas eliminate parsing code entirely
- Evidence fields help debug misclassifications
- System prompt rules reduce hallucination significantly
What to watch:
- Token costs scale with input length — truncate verbose descriptions
- Rate limits hit hard with concurrent scraping — use semaphores
- Model updates can change behavior — pin versions and test regularly
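For the token-cost point, a naive character budget is often enough; the cutoff below is a made-up starting point, not a recommendation:

```ruby
# Hypothetical input cap before sending text to the API. Job postings tend
# to front-load the relevant facts, so a hard character cut rarely loses signal.
MAX_CHARS = 12_000

def truncate_description(text, limit: MAX_CHARS)
  text.length > limit ? text[0, limit] : text
end
```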
The structured output approach turns LLM extraction from art into engineering. You define the contract, the model fills it, and you write zero parsing code.