Building an AI-Powered Health Facility Planner for Australian PHNs
By Ching Chew
Personal project. Views and code are my own and do not represent any past or current employer.
A health planner is asked: "Where should we commission two new GP clinics to maximise coverage for rural communities in Western NSW?" They have a budget and a workforce shortage classification map, but no way to answer the question independently.
So they queue a GIS request. The analyst gets to it in two weeks. The recommendation lands in an email thread. No one is certain the assumptions match current policy. The planning cycle moves on.
This is the problem Meridian is built to solve. It is a two-mode spatial decision support tool for Australian Primary Health Networks (PHNs): ask a plain-English question about general practitioner (GP) coverage gaps, get a map and a briefing-quality narrative in seconds.

Architecture
Seven components. Two Claude API calls. One classical optimisation solver.
- Streamlit UI: text input, mode toggle, clickable suggested queries
- NL Input Layer (Claude tool use): translates natural language into typed QueryParams
- Spatial Data Layer (GeoPandas + DuckDB/Parquet): population centres, existing GP locations, PHN boundaries, Distribution Priority Area (DPA) classifications
- Routing Layer (ArcGIS Online Network Analyst REST API): travel time matrix for demand points vs facility locations; pre-computed and disk-cached as Parquet to minimise app response times
- Optimisation Layer (PuLP, Mode 2 only): Maximal Coverage Location Problem solver finds the k candidate sites that maximise covered population
- Output Layer (Claude API): generates briefing-quality narrative from structured solver results
- Visualisation (Folium/Plotly): choropleth coverage map, proposed site markers, before/after statistics panel
The LLM does two things: parse the query and write the narrative. It does not do the optimisation. A formulation first published in 1974 does that.
Implementation
1. Tool use for query parsing
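Natural language queries map cleanly to structured parameters. Rather than asking the model to reason spatially, I forced it to fill a typed schema via tool use with tool_choice set to {"type": "any"}. That setting requires the model to call one of the supplied tools, and with a single tool defined it always calls extract_query_params.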
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# _PARSE_SYSTEM (system prompt) and QueryParams (typed params) are defined elsewhere in the project.
_PARSE_TOOL = {
    "name": "extract_query_params",
    "description": "Extract structured health planning query parameters from natural language.",
    "input_schema": {
        "type": "object",
        "properties": {
            "mode": {"type": "string", "enum": ["diagnostic", "prescriptive"]},
            "region": {"type": "string", "description": "PHN region name, e.g. 'Western NSW'"},
            "facility_type": {"type": "string", "enum": ["gp"]},
            "threshold_min": {"type": "integer", "description": "Travel time threshold in minutes (10-120)"},
            "k": {"type": "integer", "description": "Number of new facilities to place (prescriptive mode only, 1-10)"},
            "pop_min": {"type": "integer", "description": "Minimum population for a locality to be included"},
        },
        "required": ["mode", "region", "facility_type", "threshold_min"],
    },
}

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    system=_PARSE_SYSTEM,
    tools=[_PARSE_TOOL],
    tool_choice={"type": "any"},  # the model must respond with a tool call
    messages=[{"role": "user", "content": user_input}],
)

# The response content is a list of blocks; take the first tool call.
tool_use = next(block for block in response.content if block.type == "tool_use")
params = QueryParams(**tool_use.input)
The model fills the schema directly, so there is no regex and no free-text post-processing. Schema validation on QueryParams catches the edge cases.
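The post does not show QueryParams itself. A minimal sketch of what such a type could look like, assuming a plain dataclass with range checks taken from the schema descriptions (the class body, the optional defaults and the error messages are assumptions, not the project's code):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QueryParams:
    """Hypothetical typed container for the parsed tool input."""

    mode: str
    region: str
    facility_type: str
    threshold_min: int
    k: Optional[int] = None        # prescriptive mode only
    pop_min: Optional[int] = None  # optional locality population floor

    def __post_init__(self) -> None:
        if self.mode not in ("diagnostic", "prescriptive"):
            raise ValueError(f"unknown mode: {self.mode!r}")
        if self.facility_type != "gp":
            raise ValueError(f"unsupported facility type: {self.facility_type!r}")
        if not 10 <= self.threshold_min <= 120:
            raise ValueError("threshold_min must be between 10 and 120 minutes")
        if self.mode == "prescriptive":
            if self.k is None or not 1 <= self.k <= 10:
                raise ValueError("prescriptive mode needs k between 1 and 10")


# Example tool input, shaped like the schema (values are illustrative).
tool_input = {
    "mode": "prescriptive",
    "region": "Western NSW",
    "facility_type": "gp",
    "threshold_min": 45,
    "k": 2,
}
params = QueryParams(**tool_input)
```

Doing the range checks in the type rather than in the prompt means an out-of-range value fails loudly before it reaches the solver.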
2. MCLP for facility optimisation
Facility location is a well-studied operations research problem. The Maximal Coverage Location Problem (MCLP) asks: given a set of candidate sites, which k sites maximise the population covered within a threshold travel time?
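Written out, the formulation looks roughly like this. The symbol names are chosen here for illustration: $a_i$ is the population at demand point $i$, $N_i$ is the set of candidate sites within the threshold travel time of $i$, $x_j$ is 1 if site $j$ is selected and $y_i$ is 1 if demand point $i$ is covered. The site budget is written as an inequality to match the solver code in this post, whereas the classical statement fixes the number of sites exactly.

```latex
\begin{aligned}
\max \quad & \sum_{i \in I} a_i \, y_i \\
\text{s.t.} \quad & y_i \le \sum_{j \in N_i} x_j && \forall i \in I \\
& \sum_{j \in J} x_j \le k \\
& x_j \in \{0, 1\} && \forall j \in J \\
& y_i \in \{0, 1\} && \forall i \in I
\end{aligned}
```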
PuLP formulates this as an integer linear program. The solver runs in seconds for PHN-scale instances (roughly 50 demand points and 100 candidate sites).
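from pulp import LpMaximize, LpProblem, LpVariable, lpSum, PULP_CBC_CMD

# pops, demand_ids, covered_by, n_demand, n_candidates and k come from the spatial and routing layers.
prob = LpProblem("MCLP", LpMaximize)
x = [LpVariable(f"x_{j}", cat="Binary") for j in range(n_candidates)]  # 1 if candidate j is selected
y = [LpVariable(f"y_{i}", cat="Binary") for i in range(n_demand)]  # 1 if demand point i is covered
prob += lpSum(pops[i] * y[i] for i in range(n_demand))  # maximise covered population
prob += lpSum(x) <= k  # select at most k sites
for i, did in enumerate(demand_ids):
    if covered_by[did]:
        # y[i] can be 1 only if at least one covering candidate is selected
        prob += y[i] <= lpSum(x[j] for j in covered_by[did])
    else:
        prob += y[i] == 0  # no candidate can reach this demand point
prob.solve(PULP_CBC_CMD(msg=0))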
Here covered_by[did] is the set of candidate indices that cover demand point did within threshold_min minutes. Demand points that no candidate can reach are fixed at y[i] = 0. Demand points already covered by existing facilities are pinned to y[i] = 1 and excluded from the optimisation budget; that step is not shown in the excerpt.
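For intuition, here is a small sketch of how covered_by could be built from a travel-time matrix, with a brute-force check of the objective on a toy instance. The matrix values, town names and both helper functions are made up for illustration. Exhaustive enumeration is a stand-in that is only practical for tiny instances; it is not the integer programme used in the project.

```python
from itertools import combinations


def build_covered_by(travel_min, threshold_min):
    """Map each demand id to the set of candidate indices within the threshold."""
    return {
        did: {j for j, minutes in enumerate(row) if minutes <= threshold_min}
        for did, row in travel_min.items()
    }


def brute_force_mclp(pops, covered_by, n_candidates, k):
    """Try every k-subset of candidates and keep the one covering most people."""
    best_sites, best_pop = (), 0
    for sites in combinations(range(n_candidates), k):
        chosen = set(sites)
        covered = sum(pops[did] for did, cands in covered_by.items() if cands & chosen)
        if covered > best_pop:
            best_sites, best_pop = sites, covered
    return best_sites, best_pop


# Toy data: travel minutes from each town (row) to three candidate sites (columns).
travel_min = {
    "town_a": [20, 70, 90],
    "town_b": [50, 30, 95],
    "town_c": [85, 40, 25],
    "town_d": [99, 99, 99],  # out of reach of every candidate
}
pops = {"town_a": 1200, "town_b": 800, "town_c": 1500, "town_d": 300}

covered_by = build_covered_by(travel_min, threshold_min=45)
sites, covered_pop = brute_force_mclp(pops, covered_by, n_candidates=3, k=2)
```

Because adding a site never reduces coverage, searching subsets of exactly k sites finds the same optimum as the "at most k" constraint whenever at least k candidates exist.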
3. Routing cache
The ArcGIS Online Network Analyst REST API provides accurate Australian road network travel times. For the demo, the full travel time matrix (demand points vs all facilities and candidates) is pre-computed offline and stored as a Parquet file. The live demo reads from disk, so there is no API call overhead during the presentation.
An open-source alternative is implemented under ROUTING_PROVIDER=ors using OpenRouteService for anyone without an ArcGIS licence.
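The cache-or-compute pattern can be sketched in a few lines. This version uses a JSON file from the standard library as a stand-in for the Parquet file the demo uses, and fetch_matrix is a hypothetical placeholder for whichever routing call (ArcGIS or OpenRouteService) fills the matrix.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

Matrix = Dict[str, List[float]]  # demand id -> travel minutes per facility


def load_or_compute(cache_path: Path, fetch_matrix: Callable[[], Matrix]) -> Matrix:
    """Return the cached matrix if present, otherwise compute and store it."""
    if cache_path.exists():
        return json.loads(cache_path.read_text())
    matrix = fetch_matrix()  # the slow routing call happens only on a cache miss
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(matrix))
    return matrix


# Usage with a fake routing call that records how often it runs.
calls = []


def fake_fetch() -> Matrix:
    calls.append("fetch")
    return {"town_a": [20.0, 70.0]}


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "cache" / "travel_matrix.json"
    first = load_or_compute(cache_file, fake_fetch)   # computes and writes
    second = load_or_compute(cache_file, fake_fetch)  # served from disk
```

Passing the routing call in as a function keeps the cache logic independent of the provider, which is one way a switch such as ROUTING_PROVIDER could plug in.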
4. Narrative generation
The output layer receives structured results and generates briefing-quality narrative. Critically, raw user input never reaches the narrative prompt: the prompt is built from structured data only.
lines = [
    f"Region: {ctx.region}",
    f"Mode: {ctx.mode}",
    f"Travel time threshold: {ctx.threshold_min} minutes",
    f"Population within threshold: {ctx.covered_population:,} ({ctx.coverage_pct:.1f}%)",
    f"Towns lacking coverage: {', '.join(ctx.uncovered_towns)}",
]
# For prescriptive mode:
if ctx.proposed_sites:
    lines += [
        f"Proposed new clinic locations: {', '.join(ctx.proposed_sites)}",
        f"Coverage before: {ctx.covered_before:,} ({ctx.coverage_pct_before:.1f}%)",
        f"Coverage after: {ctx.covered_after:,} ({ctx.coverage_pct_after:.1f}%)",
    ]
The system prompt instructs the model to write as a health policy analyst for PHN executives and Department of Health, Disability and Ageing (DHDA) officials, referencing current policy context (e.g. the November 2025 Bulk Billing Practice Incentive Program) where relevant to facility viability.
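One way to make the structured-data-only rule explicit is to type the context object so that it has no field for free text at all. A sketch, assuming a dataclass whose field names mirror the ctx attributes used in the excerpt (the NarrativeContext class, the build_user_message helper and the figures are hypothetical):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class NarrativeContext:
    """Solver outputs only; there is deliberately no field for the raw query."""

    region: str
    mode: str
    threshold_min: int
    covered_population: int
    coverage_pct: float
    uncovered_towns: List[str] = field(default_factory=list)


def build_user_message(ctx: NarrativeContext) -> str:
    """Render the structured results as the plain-text body of the narrative prompt."""
    lines = [
        f"Region: {ctx.region}",
        f"Mode: {ctx.mode}",
        f"Travel time threshold: {ctx.threshold_min} minutes",
        f"Population within threshold: {ctx.covered_population:,} ({ctx.coverage_pct:.1f}%)",
        f"Towns lacking coverage: {', '.join(ctx.uncovered_towns) or 'none'}",
    ]
    return "\n".join(lines)


# Illustrative figures, not real results.
ctx = NarrativeContext(
    region="Western NSW",
    mode="diagnostic",
    threshold_min=45,
    covered_population=12500,
    coverage_pct=83.3,
    uncovered_towns=["Town A", "Town B"],
)
message = build_user_message(ctx)
```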
Results
Mode 1 returns a choropleth map of coverage gaps. Towns without access within the specified travel time threshold are highlighted. The narrative names them explicitly and summarises the population affected.
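The diagnostic numbers behind that map reduce to a simple aggregation. A sketch, assuming each town record carries a population and the travel time to its nearest existing GP (the Town type, field names and figures are illustrative, not the project's data model):

```python
from typing import List, NamedTuple, Tuple


class Town(NamedTuple):
    name: str
    population: int
    nearest_gp_min: float  # travel time to the closest existing GP, in minutes


def coverage_summary(towns: List[Town], threshold_min: float) -> Tuple[int, float, List[str]]:
    """Return covered population, coverage percentage and uncovered town names."""
    total = sum(t.population for t in towns)
    covered = sum(t.population for t in towns if t.nearest_gp_min <= threshold_min)
    uncovered = [t.name for t in towns if t.nearest_gp_min > threshold_min]
    pct = 100.0 * covered / total if total else 0.0
    return covered, pct, uncovered


towns = [
    Town("Town A", 4000, 12.0),
    Town("Town B", 1000, 75.0),
    Town("Town C", 5000, 30.0),
]
covered, pct, uncovered = coverage_summary(towns, threshold_min=45)
```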

Mode 2 overlays the optimised clinic locations on the same map, with proposed site markers and a before/after statistics panel below.


The narrative output reads as a briefing note, not a raw data dump. A policy officer can take it directly into a planning document.
What Was Hard
Data sourcing took longer than the solver. GP practice locations are sourced from the Geoscience Australia NHSD MapServer REST API (a Nov 2025 snapshot of the National Health Services Directory), filtered to General practice service records. Rural practices without precise coordinates fall back to postcode centroids. There is no single authoritative, geocoded, continuously updated dataset, and the NHSD has known gaps in remote areas.
Candidate site derivation is approximate. There is no public dataset of "proposed" facility sites. Candidates are derived from ABS localities above a population threshold with no existing facility within the coverage radius, intersected with DPA-classified areas. The straight-line distance pre-filter means some candidates are slightly mis-ranked before the routing layer validates them.
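The straight-line pre-filter could look something like the sketch below. It assumes a haversine great-circle distance and a fixed radius in kilometres as a rough proxy for the travel-time threshold. The function names, radius and coordinates are illustrative, and the DPA intersection step is left out.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance between two points, using a 6371 km Earth radius."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def candidate_sites(localities, facilities: List[Point], pop_min: int, radius_km: float):
    """Keep localities above pop_min with no existing facility inside radius_km."""
    return [
        name
        for name, pop, loc in localities
        if pop >= pop_min
        and all(haversine_km(loc, fac) > radius_km for fac in facilities)
    ]


localities = [
    ("Locality A", 900, (-31.0, 147.0)),   # large enough, far from any GP
    ("Locality B", 150, (-31.5, 146.0)),   # below the population floor
    ("Locality C", 2000, (-32.0, 148.0)),  # shares coordinates with a GP
]
facilities = [(-32.0, 148.0)]
candidates = candidate_sites(localities, facilities, pop_min=200, radius_km=40.0)
```

Straight-line distance always understates road distance, so a filter like this can only be a coarse first pass before real travel times are checked.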
The LLM is not the hard part. Tool use for query parsing worked on the first attempt with a clear schema and a tight system prompt. The harder problems were data quality, routing API authorisation and the edge cases in the coverage matrix construction where demand points straddle PHN boundary polygons.
Scope discipline was the most important decision. Western NSW PHN only, GP facilities only, one routing provider as primary. Every time I considered extending scope mid-build I added it to a limitations section instead and kept building.
Where It Goes Next
The architecture generalises to any domain where a powerful analytical tool exists but sits behind a specialist translation layer. The pattern is: LLM for NL-to-structured-params, classical solver or domain tool for the hard computation, LLM for structured-results-to-narrative.
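Stripped of the health-planning detail, the pattern is three functions composed in sequence. A sketch with the three stages injected as callables (all names here are hypothetical, and the stubs stand in for the two model calls and the solver):

```python
from typing import Any, Callable, Dict

Params = Dict[str, Any]
Results = Dict[str, Any]


def run_pipeline(
    question: str,
    parse: Callable[[str], Params],      # LLM: natural language -> typed params
    solve: Callable[[Params], Results],  # classical solver or domain tool
    narrate: Callable[[Results], str],   # LLM: structured results -> prose
) -> str:
    """Chain the stages so the narrative step only ever sees solver output."""
    params = parse(question)
    results = solve(params)
    return narrate(results)


# Stub stages for illustration; real ones would call the model and the solver.
def fake_parse(question: str) -> Params:
    return {"region": "Western NSW", "k": 2}


def fake_solve(params: Params) -> Results:
    return {"region": params["region"], "sites": ["Site 1", "Site 2"][: params["k"]]}


def fake_narrate(results: Results) -> str:
    return f"{len(results['sites'])} sites proposed for {results['region']}."


briefing = run_pipeline("Where should we put two clinics?", fake_parse, fake_solve, fake_narrate)
```

Keeping the stages as plain functions also makes each one testable without a model call.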
For Meridian specifically:
- Broader PHN coverage: the data pipeline and solver are PHN-agnostic. The constraint is data validation across diverse PHN geographies and routing API call volume.
- Richer facility types: allied health, specialists, hospitals. Each adds a new facility type to the schema enum and a new dataset to the spatial layer.
- Demand weighting: MCLP currently maximises raw population coverage. A weighted variant could incorporate health need indicators (chronic disease prevalence, socioeconomic vulnerability) as demand weights.
- National scale: the two blockers are routing API cost at scale (an origin-destination matrix covering all 2,400+ Australian localities means a significant call volume) and PHN boundary edge cases. A pre-computed national matrix resolves both; it is a one-time compute cost.
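To make the demand weighting item concrete: the weighted variant only changes the coefficients in the objective, not the constraints. A sketch, assuming a single need index per demand point scaled to the range 0 to 1 and a multiplicative form chosen purely for illustration (the function name, the index and the formula are all assumptions):

```python
from typing import Dict


def weighted_demand(pops: Dict[str, int], need_index: Dict[str, float]) -> Dict[str, float]:
    """Scale each population by (1 + need), where need is assumed to lie in [0, 1]."""
    return {did: pop * (1.0 + need_index.get(did, 0.0)) for did, pop in pops.items()}


pops = {"town_a": 1000, "town_b": 1000}
need_index = {"town_a": 0.0, "town_b": 0.5}  # illustrative values only
weights = weighted_demand(pops, need_index)
```

The resulting weights would take the place of the raw populations in the objective, so two towns of equal size no longer count equally when one has higher measured need.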
Code and Further Reading
The full source is on GitHub. The stack is Python 3.12, GeoPandas, PuLP, Streamlit and the Anthropic Python SDK.
For the MCLP formulation and its health planning applications, Church and ReVelle (1974) is the original paper. ReVelle and Eiselt (2005) surveys the broader facility location literature.
For Australian PHN geography and health workforce data: AIHW, Geoscience Australia NHSD and DHDA Distribution Priority Areas.
AI Tools
Claude Code was used to plan and build the demo. Claude was used to draft the blog post.