Initial commit from upstream

commit 3a85c3e281
Author: Will Bradley
Date: 2025-12-05 14:29:17 -08:00
21 changed files with 5339 additions and 0 deletions

.dockerignore Normal file

@@ -0,0 +1,21 @@
__pycache__
*.pyc
*.pyo
*.pyd
.Python
*.so
*.egg
*.egg-info
dist
build
.git
.gitignore
.vscode
.idea
*.swp
*.swo
*~
.DS_Store
venv/
env/
.env

.gitignore vendored Normal file

@@ -0,0 +1,5 @@
__pycache__
desktop.ini
*.geojson
osm_cache/
.claude

DOCKER.md Normal file

@@ -0,0 +1,117 @@
# Docker Deployment Guide
## Quick Start
### Using Docker Compose (Recommended)
1. Build and start the container:
```bash
docker-compose up -d --build
```
Note: The `--build` flag ensures the image is rebuilt with the latest code changes.
2. Access the web interface:
- Main interface: http://localhost:5000
- Map viewer: http://localhost:5000/map
3. Stop the container:
```bash
docker-compose down
```
### Using Docker Directly
1. Build the image:
```bash
docker build -t villages-import .
```
2. Run the container:
```bash
docker run -d \
-p 5000:5000 \
-v "$(pwd)/data:/data" \
--name villages-import \
villages-import
```
3. View logs:
```bash
docker logs -f villages-import
```
4. Stop the container:
```bash
docker stop villages-import
docker rm villages-import
```
## Features
### Main Dashboard (/)
- Run data processing scripts for Lake and Sumter counties
- View real-time script output
- Access to:
- Diff Roads
- Diff Addresses
- Diff Multi-Use Paths
- Download OSM Data
### Map Viewer (/map)
- Interactive map viewer for GeoJSON files
- Upload and compare OSM, Diff, and County data
- Filter by removed/added features
- Hide highway=service roads
- Drag-and-drop layer reordering
- Click on features to view properties
- Accept/reject diff features
## Volume Mounts
The Docker container mounts a single data directory:
- `./data` → `/data` - All data files, organized by date
Inside `/data`, the structure is:
- `/data/latest/` - Symlink to the most recent data directory
- `/data/YYMMDD/lake/` - Lake County data for that date
- `/data/YYMMDD/sumter/` - Sumter County data for that date
All changes are persisted on the host in the local `./data` folder.
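For example, a dated directory for December 5, 2025 could be created and made current like this (illustrative only; the processing scripts normally manage this layout themselves):
```bash
mkdir -p data/251205/lake data/251205/sumter
ln -sfn 251205 data/latest  # repoint the "latest" symlink at the new directory
```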
## API Endpoints
- `GET /` - Main dashboard
- `GET /map` - Map viewer
- `POST /api/run-script` - Execute a processing script
- `GET /api/job-status/<job_id>` - Get script status and logs
- `GET /api/list-files` - List available GeoJSON files
- `GET /data/<path>` - Serve GeoJSON files
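For example, a job could be started and polled with `curl` (a sketch; the exact request fields are defined in `web/server.py`, and the body shown here is an assumption):
```bash
# Start a processing script (hypothetical request fields)
curl -X POST http://localhost:5000/api/run-script \
  -H "Content-Type: application/json" \
  -d '{"script": "diff-roads", "county": "lake"}'
# Poll the returned job ID for status and logs
curl http://localhost:5000/api/job-status/<job_id>
```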
## Troubleshooting
### Port already in use
If port 5000 is already in use, edit `docker-compose.yml`:
```yaml
ports:
- "8080:5000" # Change 8080 to any available port
```
### Permission issues
Ensure the data directory has proper permissions:
```bash
mkdir -p data
chmod -R 755 data
```
### View container logs
```bash
docker-compose logs -f
```
### Rebuild after code changes
```bash
docker-compose down
docker-compose build --no-cache
docker-compose up -d
```

Dockerfile Normal file

@@ -0,0 +1,33 @@
FROM python:3.11-slim
# Install system dependencies
RUN apt-get update && apt-get install -y \
libspatialindex-dev \
libgeos-dev \
libproj-dev \
wget \
&& rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy only necessary Python scripts and web files
COPY *.py ./
COPY web/ ./web/
# Create /data directory (will be mounted as volume)
RUN mkdir -p /data
# Expose port
EXPOSE 5000
# Set environment variables
ENV FLASK_APP=web/server.py
ENV PYTHONUNBUFFERED=1
# Run the Flask server
CMD ["python", "web/server.py"]

README.md Normal file

@@ -0,0 +1,169 @@
# The Villages Road and Address Import
See [https://wiki.openstreetmap.org/wiki/The_Villages_Road_and_Address_Import](https://wiki.openstreetmap.org/wiki/The_Villages_Road_and_Address_Import)
See compare-addresses.py for an automated way of running the complete address diff toolchain in one step.
- TODO: fails to split out units
## New Instructions
* NOTE: when downloading OSM data towards the end via JOSM, copy-paste the output of the download script but add `(._;>;);out;` to the end instead of `out geom;` so JOSM picks it up.
* NOTE: also add `way["highway"="construction"](area.searchArea);way["highway"="path"](area.searchArea);way["highway"="cycleway"](area.searchArea);` to the end so that roads under construction and cartpaths show up in JOSM to be analyzed/replaced/modified/etc.
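Putting both notes together, a JOSM-ready query might look like the sketch below (the real highway selectors come from `download-overpass.py`'s output; only the added selectors and the changed ending matter here):
```
[timeout:60];
area["name"="Florida"]->.state;
area["name"="Sumter County"](area.state)->.searchArea;
(
  way["highway"](area.searchArea);
  way["highway"="construction"](area.searchArea);
  way["highway"="path"](area.searchArea);
  way["highway"="cycleway"](area.searchArea);
);
(._;>;);
out;
```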
### Roads
* Get new data from the county and convert it:
* Sumter (change 041125): `python shp-to-geojson.py "original data/Sumter/RoadCenterlines_041125.shp.zip" "original data/Sumter/RoadCenterlines_041125.geojson"`
* Lake (change 2025-06): `python shp-to-geojson.py "original data/Lake/Streets 2025-06.zip" "original data/Lake/Streets 2025-06.geojson"`
* Get new data from OSM:
* Sumter: `python download-overpass.py --type highways "Sumter County" "Florida" "original data/Sumter/osm-sumter-roads-$(date +%y%m%d).geojson"`
* Lake: `python download-overpass.py --type highways "Lake County" "Florida" "original data/Lake/osm-lake-roads-$(date +%y%m%d).geojson"`
* Diff the roads:
* Sumter (change 041125): `python threaded.py --output "processed data/Sumter/diff-sumter-roads-$(date +%y%m%d).geojson" "original data/Sumter/osm-sumter-roads-$(date +%y%m%d).geojson" "original data/Sumter/RoadCenterlines_041125.geojson"`
* Lake (change 2025-06): `python threaded.py --output "processed data/Lake/diff-lake-roads-$(date +%y%m%d).geojson" "original data/Lake/osm-lake-roads-$(date +%y%m%d).geojson" "original data/Lake/Streets 2025-06.geojson"`
## Data
- Lake County Streets and Address Points: https://c.lakecountyfl.gov/ftp/GIS/GisDownloads/Shapefiles/
- Alternately:
- Streets: https://gis.lakecountyfl.gov/lakegis/rest/services/InteractiveMap/MapServer/73
- Addresses: https://gis.lakecountyfl.gov/lakegis/rest/services/InteractiveMap/MapServer/16
- Highways: https://gis.lakecountyfl.gov/lakegis/rest/services/InteractiveMap/MapServer/9
- Sumter GIS:
- Alternately, roads: https://test-sumter-county-open-data-sumtercountygis.hub.arcgis.com/datasets/9177e17c72d3433aa79630c7eda84add/about
- Addresses: https://test-sumter-county-open-data-sumtercountygis.hub.arcgis.com/datasets/c75c5aac13a648968c5596b0665be28b/about
- Email for Multi-Modal Paths.
- Marion (TODO)
## Instructions
* Always do roads first, addresses second, so new subdivisions don't throw address validation errors.
* Open the original data in QGIS
* Format OSM fields with QGIS functions to have proper capitalization and full spellings without extraneous whitespace, based on the original fields. For example, OSM uses names like North Main Street, not N MAIN ST. All fields are of the QGIS type "text" even if they're numbers.
* You can use the Attribute Table's Field Calculator for this; copy-paste the `qgis-functions.py` file into the Function Editor, then use the Expression tab to create new, formatted virtual fields. Don't worry if the field name limit truncates a name; it can be fixed in JOSM.
### For Sumter County:
* Always use the Filter with Form function to Select all entries with `"LIFECYCLE"='Current'`
* For roads:
* `NAME` becomes the virtual `name` via the `title(formatstreet("NAME"))` expression
* `SpeedLimit` becomes the virtual `maxspeed` via `concat("SpeedLimit",' mph')`
* `highway=residential` or similar added manually in JOSM
* `surface=asphalt` added manually in JOSM
* For addresses:
* The Addresses shapefile is recorded in the ESRI:102659 CRS; you may need to convert or reproject to/from the default EPSG:4326 - WGS 84 CRS that OSM uses (a reprojection sketch follows this section).
* `ADD_NUM` becomes the virtual `addr:housenumber` (or `addr:house` temporarily; rename it to the real `addr:housenumber` tag before upload) as an integer
* `UNIT` becomes the virtual `addr:unit` (sometimes the LOT key is used for multiple units in a range, but mostly it's unrelated lot IDs and not useful) as a string
* `SADD` becomes the virtual `addr:street` (or `addr:stree` temporarily) via the `title(getstreetfromaddress("SADD"))` custom expression as a string
* `POST_COMM` becomes the virtual `addr:city` via the `title("POST_COMM")` expression (we care about postal community addresses, not what municipality a place might be governed by) as a string
* `POST_CODE` becomes `addr:postcode` (or `addr:postc` temporarily) as an integer
* Manually add `addr:state` = `'FL'`
* For multi-modal trails (golf cart paths):
* Download all highway=path and highway=cycleway with golf_cart=yes for comparison
* Omit `Part_of_Ro`=`Yes` as separate paths; apply golf cart tagging to the streets directly.
* `bicycle=yes`
* `foot=yes`
* `golf=cartpath`
* `golf_cart=yes`
* `highway=path`
* `motor_vehicle=no`
* `segregated=no`
* `surface=asphalt`
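As a reference for the reprojection notes above and below, this sketch shows roughly what the repo's Python tooling does with geopandas (the file paths are placeholders, not real files):
```python
import geopandas as gpd

# Load county data (placeholder path) and reproject to WGS 84 for OSM
gdf = gpd.read_file("original data/Sumter/addresses.shp")
if gdf.crs and gdf.crs != "EPSG:4326":
    gdf = gdf.to_crs("EPSG:4326")
gdf.to_file("addresses-wgs84.geojson", driver="GeoJSON")
```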
### For Lake County:
* For roads:
* `FullStreet` becomes the virtual `name` via the `title(formatstreet("FullStreet"))` expression
* `SpeedLimit` becomes the virtual `maxspeed` via `concat("SpeedLimit",' mph')`
* `NumberOfLa` becomes the virtual `lanes`
* `surface=asphalt` added manually
* `StreetClas` becomes the virtual `highway` via the `gethighwaytype("StreetClas")` expression
* Could use MaxWeight (1.0 - 20.0)
* For addresses:
* The Addresses shapefile is recorded in the NAD83(HARN) / Florida East (ftUS) CRS; you may need to convert or reproject to/from the default EPSG:4326 - WGS 84 CRS that OSM uses (see the reprojection sketch above).
* `AddressNum` becomes the virtual `addr:housenumber` (or `addr:house` temporarily; rename it to the real `addr:housenumber` tag before upload) as an integer
* `UnitType` becomes the virtual `addr:unit` via `regexp_replace("UnitType",'U ','')` (UnitNumber is blank) as a string
* The virtual `addr:street` (or `addr:stree` temporarily) is created via the `regexp_replace(trim(concat(formatname("PrefixDire"),' ',title(formatstreet("PrefixType")),' ',title(formatstreet("BaseStreet")),' ',formatname("SuffixType"))),'\\s+',' ')` custom expression as a string
* `PostalCity` becomes the virtual `addr:city` via the `title("PostalCity")` expression (we care about postal community addresses, not what municipality a place might be governed by) as a string
* `ZipCode` becomes `addr:postcode` (or `addr:postc` temporarily) as an integer
* Manually add `addr:state` = `'FL'`
### Continuing instructions for both:
* Export to Geojson, only exporting **selected** entries, **selecting only the OSM-formatted fields we want**.
* Here you can rename temporary columns like `addr:house` to `addr:housenumber`.
* Ensure the export file is in the `EPSG:4326 - WGS84` CRS.
* Open in JOSM. It's suggested to begin with roads first, addresses second, so the addresses can be placed in context.
* In the Roads dataset, select and remove all relations from the geojson/shapefile layer: the data often has one relation per road and this is improper for OSM import.
* Select a small region to work on: one neighborhood or smaller. For this import, we are assuming that only newly-constructed small residential areas will be imported, not main roads or commercial areas or areas with significant existing map data.
* Download the area you're working on from OSM, into a new Data Layer (not your geojson layer.)
* Select all features to be imported at this time and leave them selected until the merge step below.
* Select all ways for roads, or all nodes for addresses. Make sure you aren't about to mass-edit the nodes of a road: deselect the nodes if this happens.
* Ensure the tags are correct and good. (QGIS has a character limit and sometimes doesn't like colons, so double check that `addr:house` is `addr:housenumber`, `addr:postc` is `addr:postcode`, `addr:stree` is `addr:street`, etc.)
* Mass-add new tags like `highway=residential`, `surface=asphalt`, etc, as indicated.
* Remove any spurious tags that may have been brought over in the import (if it's not in the OSM Wiki, we don't want it.)
* Press ctrl-shift-M to merge into the OSM data layer. There will be a warning, but click OK; we will be extra careful about validating the merge in the next steps.
* For addresses, remove any address nodes that seem to not reflect reality or be placed far from the street bearing their name: it's better to not have 123 Adams Street mapped at all, than to claim that 123 Adams Street is hovering over someone's newly-built house at 321 Franklin Avenue, 200 feet away from Adams Street. (Cities often won't remove old addresses, leading to confusion when new streets are built.)
* For roads, highlight multiple street segments which have the same name and press C to combine them: the county data has one way per road segment and that's excessive for OSM.
* Check the edges of the imported areas to ensure new roads are merged with any preexisting roads
* Check the import area to ensure no incorrect overlaps
* Use the JOSM validator to ensure no errors in imported data. Warnings about existing data separate from the import can be ignored.
* If there are duplicate house numbers in the data, investigate and remove the more-unlikely node or both nodes. For example `4650 Ramsell Road` is duplicated in the source data, but the easternmost copy is on the "odd" side of the street and between 4653 and 4663 so it's more likely to actually be 4651, 4655, 4657, 4659, or 4661. We have no way of knowing, so we can either delete it entirely or simply delete the housenumber tag and leave it as an address without a number for a future editor to review. (We may submit incomplete data, just not wrong data.) We then leave the westernmost copy alone since 4650 fits neatly in between 4640/4644 and 4654/4660.
* All known duplicates:
* 4886 C 472 (one is a daycare the other is a church)
* 5626 C THOMAS RD has unit numbers in the original data's Notes field
* 301 CURRY ST
* 401 HALL ST
* 340 HEALD WAY has many buildings and many units per building, in the notes
* 1908 LAUREL MANOR DR (one is a CELL TOWER)
* 1950 LAUREL MANOR DR (each one has multiple units in a range, in the notes)
* 6217 MEGGISON RD (one's note is "restroom/storage")
* 6221 MEGGISON RD (one's note is "pavilion")
* 6227 MEGGISON RD (one's note is "recreation center")
* 102 NE 4TH AVE (one is a cell tower, the other a water tower)
* 11750 NE 62ND TER (one's note is Pebble Springs retirement community building)
* 13813 NE 136TH LOOP UNIT 1306
* 8550 NE 138TH LN (each is a different building number)
* 4650 RAMSELL RD
* 400 RUTLAND ST (both say "church owned")
* 308 SHAWN AVE (one says Wildwood Acres, the other Progress Energy Pump 29)
* 413 S PINE ST
* 2605 TRILLIUM RDG (one says bldg 1, the other says meter)
* 2680 TRILLIUM RDG (bldg 3, meter, meter)
* 13940 US 441 (different building names in the notes)
* 702 WEBSTER ST (one is city of ww, the other retention pond)
* Click upload
* Make sure there are no erroneous Relations or other unwanted objects about to be uploaded.
* Use a descriptive changeset message like "Roads/Addresses in The Villages #villagesimport"
* Set the Source to be "Sumter County GIS"
* You can easily copy-paste the below into the Settings tab:
```
comment=Roads/Addresses in The Villages #villagesimport
import=yes
website=https://wiki.openstreetmap.org/wiki/The_Villages_Road_and_Address_Import
source=Sumter County GIS
source:url=https://gitlab.com/zyphlar/the-villages-import
```
* Review imported data in Achavi or Osmcha to ensure it looks proper.
## Useful queries:
```
[timeout:60];
area["name"="Florida"]->.state;
area["name"="Lake County"](area.state)->.searchArea;nwr["addr:housenumber"](area.searchArea);
(._;>;);
out meta;
```
```
[timeout:60];
area["name"="Florida"]->.state;
area["name"="Lake County"](area.state)->.searchArea;way["highway"](area.searchArea);
(._;>;);
out meta;
```

compare-addresses.py Normal file

@@ -0,0 +1,625 @@
#!/usr/bin/env python3
"""
Address Data Comparison Tool for US Counties
Compares local government address data (from ZIP/shapefile) with OpenStreetMap address data.
Downloads OSM data via Overpass API, converts local data to GeoJSON, and performs comprehensive
comparison to identify new, existing, and removed addresses.
Usage:
python compare-addresses.py "Lake" "Florida" --local-zip "original data/Lake/Addresspoints 2025-06.zip"
python compare-addresses.py "Sumter" "Florida" --local-zip "original data/Sumter/Address9_13_2024.zip"
"""
import argparse
import json
import os
import sys
import zipfile
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Tuple, Any, Optional
import urllib.error
import urllib.parse
import urllib.request
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point
from shapely.strtree import STRtree
from shapely.ops import nearest_points
import warnings
# Import local modules
import importlib
qgis_functions = importlib.import_module("qgis-functions")
# Suppress warnings for cleaner output
warnings.filterwarnings('ignore')
class AddressComparator:
def __init__(self, tolerance_meters: float = 50.0, cache_dir: str = "osm_cache"):
"""
Initialize the address comparator.
Args:
tolerance_meters: Distance tolerance for considering addresses as matching
cache_dir: Directory to cache OSM data
"""
self.tolerance_meters = tolerance_meters
self.cache_dir = Path(cache_dir)
self.cache_dir.mkdir(exist_ok=True)
# Convert meters to degrees (approximate)
# 1 degree latitude ≈ 111,000 meters
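        # (longitude degrees shrink with cos(latitude); around 29°N in Florida,
        # east-west distances computed with this single factor are overstated by
        # roughly 12%, which is acceptable for a coarse matching tolerance)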
self.tolerance_deg = tolerance_meters / 111000.0
def download_osm_addresses(self, county: str, state: str, output_file: str = None) -> str:
"""Download address data from OpenStreetMap via Overpass API."""
if output_file is None:
timestamp = datetime.now().strftime("%Y%m%d")
output_file = self.cache_dir / f"osm_addresses_{county.lower()}_{timestamp}.geojson"
else:
output_file = Path(output_file)
# Check if cached file exists and is recent (less than 7 days old)
if output_file.exists():
file_age = datetime.now().timestamp() - output_file.stat().st_mtime
if file_age < 7 * 24 * 3600: # 7 days in seconds
print(f"Using cached OSM data: {output_file}")
return str(output_file)
print(f"Downloading OSM addresses for {county} County, {state}...")
# Build Overpass query for addresses
query = f"""[out:json][timeout:180];
area["name"="{state}"]->.state;
area["name"="{county} County"](area.state)->.searchArea;
nwr["addr:housenumber"](area.searchArea);
out geom;"""
# Query Overpass API
osm_data = self._query_overpass(query)
# Convert to GeoJSON
geojson = self._convert_osm_to_geojson(osm_data)
# Save to file
output_file.parent.mkdir(parents=True, exist_ok=True)
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(geojson, f, indent=2)
print(f"Downloaded {len(geojson['features'])} OSM addresses to {output_file}")
return str(output_file)
def _query_overpass(self, query: str) -> Dict[str, Any]:
"""Send query to Overpass API and return JSON response."""
url = "https://overpass-api.de/api/interpreter"
data = urllib.parse.urlencode({"data": query}).encode("utf-8")
try:
with urllib.request.urlopen(url, data=data, timeout=300) as response:
return json.loads(response.read().decode("utf-8"))
except urllib.error.HTTPError as e:
print(f"HTTP Error {e.code}: {e.reason}", file=sys.stderr)
try:
error_body = e.read().decode("utf-8")
print(f"Error response: {error_body}", file=sys.stderr)
            except Exception:
pass
sys.exit(1)
except Exception as e:
print(f"Error querying Overpass API: {e}", file=sys.stderr)
sys.exit(1)
def _convert_osm_to_geojson(self, overpass_data: Dict[str, Any]) -> Dict[str, Any]:
"""Convert Overpass API response to GeoJSON format."""
features = []
for element in overpass_data.get("elements", []):
properties = element.get("tags", {})
# Extract coordinates based on element type
if element["type"] == "node":
coordinates = [element["lon"], element["lat"]]
geometry = {"type": "Point", "coordinates": coordinates}
elif element["type"] == "way" and "geometry" in element:
# For ways, use the centroid
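                # (computed below as the vertex average, a cheap approximation
                # of the true centroid; adequate for address-point matching)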
coords = [[coord["lon"], coord["lat"]] for coord in element["geometry"]]
if len(coords) > 0:
# Calculate centroid
lon = sum(coord[0] for coord in coords) / len(coords)
lat = sum(coord[1] for coord in coords) / len(coords)
coordinates = [lon, lat]
geometry = {"type": "Point", "coordinates": coordinates}
else:
continue
else:
continue # Skip relations and ways without geometry
feature = {
"type": "Feature",
"properties": properties,
"geometry": geometry
}
features.append(feature)
return {
"type": "FeatureCollection",
"features": features
}
def load_local_addresses(self, zip_path: str, output_geojson: str = None) -> str:
"""Load and convert local address data from ZIP file."""
zip_path = Path(zip_path)
if output_geojson is None:
output_geojson = zip_path.parent / f"{zip_path.stem}_converted.geojson"
else:
output_geojson = Path(output_geojson)
# Check if conversion already exists and is newer than the ZIP
if (output_geojson.exists() and
zip_path.exists() and
output_geojson.stat().st_mtime > zip_path.stat().st_mtime):
print(f"Using existing converted data: {output_geojson}")
return str(output_geojson)
print(f"Converting local address data from {zip_path}...")
# Extract and find shapefile in ZIP
temp_dir = zip_path.parent / "temp_extract"
temp_dir.mkdir(exist_ok=True)
try:
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
zip_ref.extractall(temp_dir)
# Find the shapefile
shp_files = list(temp_dir.glob("*.shp"))
if not shp_files:
raise FileNotFoundError("No shapefile (.shp) found in ZIP")
shp_file = shp_files[0]
# Load and process shapefile
gdf = gpd.read_file(shp_file)
# Convert CRS to WGS84 if needed
if gdf.crs and gdf.crs != 'EPSG:4326':
print(f"Converting from {gdf.crs} to EPSG:4326")
gdf = gdf.to_crs('EPSG:4326')
# Process address fields using existing logic from sumter-address-convert.py
gdf = self._process_address_fields(gdf)
# Filter to only point geometries and valid addresses
gdf = gdf[gdf.geometry.type == 'Point'].copy()
gdf = gdf[gdf['addr:housenumber'].notna()].copy()
# Clean output data - keep only OSM address fields
osm_fields = [
'addr:housenumber', 'addr:unit', 'addr:street',
'addr:city', 'addr:postcode', 'addr:state'
]
existing_fields = [field for field in osm_fields if field in gdf.columns]
gdf = gdf[existing_fields + ['geometry']]
# Save to GeoJSON
output_geojson.parent.mkdir(parents=True, exist_ok=True)
gdf.to_file(output_geojson, driver='GeoJSON')
print(f"Converted {len(gdf)} addresses to {output_geojson}")
finally:
# Clean up temp directory
import shutil
if temp_dir.exists():
shutil.rmtree(temp_dir)
return str(output_geojson)
def _process_address_fields(self, gdf: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
"""Process address fields according to OSM schema (handles multiple formats)."""
processed_gdf = gdf.copy()
address_mapping = {}
# House number - try multiple field names
house_number_fields = ['ADD_NUM', 'AddressNum', 'ADDRESS_NUM', 'HOUSE_NUM']
for field in house_number_fields:
if field in processed_gdf.columns:
add_num_series = processed_gdf[field].copy()
add_num_series = pd.to_numeric(add_num_series, errors='coerce')
address_mapping['addr:housenumber'] = add_num_series.round().astype('Int64')
break
# Unit number - try multiple field names
unit_fields = ['UNIT', 'UnitNumber', 'UNIT_NUM', 'APT']
for field in unit_fields:
if field in processed_gdf.columns:
unit_series = processed_gdf[field].copy()
unit_series = unit_series.replace(['nan', 'None', '', None], None)
unit_series = unit_series.where(unit_series.notna(), None)
address_mapping['addr:unit'] = unit_series
break
# Street name - try multiple approaches
if 'SADD' in processed_gdf.columns:
# Sumter County format - full address in SADD field
street_names = []
for sadd_value in processed_gdf['SADD']:
if pd.notna(sadd_value):
street_from_addr = qgis_functions.getstreetfromaddress(str(sadd_value), None, None)
street_titled = qgis_functions.title(street_from_addr)
street_names.append(street_titled)
else:
street_names.append(None)
address_mapping['addr:street'] = street_names
elif 'FullAddres' in processed_gdf.columns:
# Lake County format - full address in FullAddres field
street_names = []
for full_addr in processed_gdf['FullAddres']:
if pd.notna(full_addr):
street_from_addr = qgis_functions.getstreetfromaddress(str(full_addr), None, None)
street_titled = qgis_functions.title(street_from_addr)
street_names.append(street_titled)
else:
street_names.append(None)
address_mapping['addr:street'] = street_names
elif 'BaseStreet' in processed_gdf.columns:
# Lake County alternative - combine street components
street_names = []
for idx, row in processed_gdf.iterrows():
street_parts = []
# Prefix direction
if 'PrefixDire' in row and pd.notna(row['PrefixDire']):
street_parts.append(str(row['PrefixDire']).strip())
# Prefix type
if 'PrefixType' in row and pd.notna(row['PrefixType']):
street_parts.append(str(row['PrefixType']).strip())
# Base street name
if pd.notna(row['BaseStreet']):
street_parts.append(str(row['BaseStreet']).strip())
# Suffix type
if 'SuffixType' in row and pd.notna(row['SuffixType']):
street_parts.append(str(row['SuffixType']).strip())
if street_parts:
street_name = ' '.join(street_parts)
street_titled = qgis_functions.title(street_name)
street_names.append(street_titled)
else:
street_names.append(None)
address_mapping['addr:street'] = street_names
# City - try multiple field names
city_fields = ['POST_COMM', 'PostalCity', 'CITY', 'Jurisdicti']
for field in city_fields:
if field in processed_gdf.columns:
city_names = []
for city_value in processed_gdf[field]:
if pd.notna(city_value):
city_titled = qgis_functions.title(str(city_value))
city_names.append(city_titled)
else:
city_names.append(None)
address_mapping['addr:city'] = city_names
break
# Postal code - try multiple field names
postcode_fields = ['POST_CODE', 'ZipCode', 'ZIP', 'POSTAL_CODE']
for field in postcode_fields:
if field in processed_gdf.columns:
post_code_series = processed_gdf[field].copy()
post_code_series = pd.to_numeric(post_code_series, errors='coerce')
address_mapping['addr:postcode'] = post_code_series.round().astype('Int64')
break
# Manually add addr:state
address_mapping['addr:state'] = 'FL'
# Add the new address columns to the GeoDataFrame
for key, value in address_mapping.items():
processed_gdf[key] = value
return processed_gdf
def compare_addresses(self, local_file: str, osm_file: str) -> Tuple[List[Dict], List[Dict], List[Dict]]:
"""
Compare local and OSM address data.
Returns:
Tuple of (new_addresses, existing_addresses, removed_addresses)
"""
print(f"Comparing addresses: {local_file} vs {osm_file}")
# Load data
local_gdf = gpd.read_file(local_file)
osm_gdf = gpd.read_file(osm_file)
print(f"Loaded local addresses: {len(local_gdf)}")
print(f"Loaded OSM addresses: {len(osm_gdf)}")
# Apply sampling if requested
if hasattr(self, 'sample_size') and self.sample_size:
if len(local_gdf) > self.sample_size:
local_gdf = local_gdf.sample(n=self.sample_size, random_state=42).reset_index(drop=True)
print(f"Sampled local addresses to: {len(local_gdf)}")
if hasattr(self, 'max_osm') and self.max_osm:
if len(osm_gdf) > self.max_osm:
osm_gdf = osm_gdf.sample(n=self.max_osm, random_state=42).reset_index(drop=True)
print(f"Sampled OSM addresses to: {len(osm_gdf)}")
print(f"Processing local addresses: {len(local_gdf)}")
print(f"Processing OSM addresses: {len(osm_gdf)}")
# Ensure both are in the same CRS
if local_gdf.crs != osm_gdf.crs:
osm_gdf = osm_gdf.to_crs(local_gdf.crs)
# Create spatial indexes
local_index = STRtree(local_gdf.geometry.tolist())
osm_index = STRtree(osm_gdf.geometry.tolist())
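        # NOTE: this assumes Shapely >= 2.0, where STRtree.query() returns
        # integer indices; on Shapely 1.x it returns geometries and the
        # .iloc lookups below would fail.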
# Find matches
existing_addresses = []
new_addresses = []
# For each local address, find closest OSM address
for idx, local_row in local_gdf.iterrows():
local_point = local_row.geometry
# Query nearby OSM addresses
nearby_indices = osm_index.query(local_point.buffer(self.tolerance_deg))
best_match = None
min_distance = float('inf')
for osm_idx in nearby_indices:
osm_row = osm_gdf.iloc[osm_idx]
osm_point = osm_row.geometry
distance = local_point.distance(osm_point)
# Additional verification: check if house numbers match (handle type differences)
local_house_num = str(local_row.get('addr:housenumber', ''))
osm_house_num = str(osm_row.get('addr:housenumber', ''))
# Only consider as potential match if house numbers match
if local_house_num == osm_house_num and distance < min_distance:
min_distance = distance
best_match = osm_idx
# Convert distance to meters for comparison
distance_meters = min_distance * 111000.0
if best_match is not None and distance_meters <= self.tolerance_meters:
# Found a match - this is an existing address
local_props = dict(local_row.drop('geometry'))
osm_props = dict(osm_gdf.iloc[best_match].drop('geometry'))
existing_addresses.append({
'geometry': local_point,
'local_data': local_props,
'osm_data': osm_props,
'distance_meters': distance_meters
})
else:
# No match found - this is a new address to add to OSM
local_props = dict(local_row.drop('geometry'))
local_props['status'] = 'new'
new_addresses.append({
'geometry': local_point,
**local_props
})
# Find OSM addresses that don't have local matches (potentially removed)
removed_addresses = []
        # The first pass above didn't record which OSM feature matched each
        # local address, so re-run the matching here to collect matched OSM indices.
        matched_osm_indices = set()
for idx, local_row in local_gdf.iterrows():
local_point = local_row.geometry
nearby_indices = osm_index.query(local_point.buffer(self.tolerance_deg))
for osm_idx in nearby_indices:
osm_row = osm_gdf.iloc[osm_idx]
osm_point = osm_row.geometry
distance_meters = local_point.distance(osm_point) * 111000.0
# Check if house numbers match (handle type differences)
local_house_num = str(local_row.get('addr:housenumber', ''))
osm_house_num = str(osm_row.get('addr:housenumber', ''))
if (distance_meters <= self.tolerance_meters and
local_house_num == osm_house_num):
matched_osm_indices.add(osm_idx)
break # Only match to first OSM address found
# Find unmatched OSM addresses
for idx, osm_row in osm_gdf.iterrows():
if idx not in matched_osm_indices:
osm_props = dict(osm_row.drop('geometry'))
osm_props['status'] = 'removed'
removed_addresses.append({
'geometry': osm_row.geometry,
**osm_props
})
return new_addresses, existing_addresses, removed_addresses
def save_results(self, new_addresses: List[Dict], existing_addresses: List[Dict],
removed_addresses: List[Dict], output_dir: str):
"""Save comparison results to separate GeoJSON files."""
output_dir = Path(output_dir)
output_dir.mkdir(parents=True, exist_ok=True)
# Save new addresses (to add to OSM)
if new_addresses:
new_gdf = gpd.GeoDataFrame(new_addresses)
new_file = output_dir / "addresses-to-add.geojson"
new_gdf.to_file(new_file, driver='GeoJSON')
print(f"Saved {len(new_addresses)} new addresses to {new_file}")
# Save removed addresses (missing from local data)
if removed_addresses:
removed_gdf = gpd.GeoDataFrame(removed_addresses)
removed_file = output_dir / "addresses-potentially-removed.geojson"
removed_gdf.to_file(removed_file, driver='GeoJSON')
print(f"Saved {len(removed_addresses)} potentially removed addresses to {removed_file}")
# Save existing addresses for reference
if existing_addresses:
# Create simplified format for existing addresses
existing_simple = []
for addr in existing_addresses:
existing_simple.append({
'geometry': addr['geometry'],
'distance_meters': addr['distance_meters'],
'status': 'existing'
})
existing_gdf = gpd.GeoDataFrame(existing_simple)
existing_file = output_dir / "addresses-existing.geojson"
existing_gdf.to_file(existing_file, driver='GeoJSON')
print(f"Saved {len(existing_addresses)} existing addresses to {existing_file}")
def print_summary(self, new_addresses: List[Dict], existing_addresses: List[Dict],
removed_addresses: List[Dict]):
"""Print a summary of the comparison results."""
print("\n" + "="*60)
print("ADDRESS COMPARISON SUMMARY")
print("="*60)
print(f"\nTOTAL ADDRESSES ANALYZED:")
print(f" • New addresses (to add to OSM): {len(new_addresses)}")
print(f" • Existing addresses (matched): {len(existing_addresses)}")
print(f" • Potentially removed addresses: {len(removed_addresses)}")
if existing_addresses:
distances = [addr['distance_meters'] for addr in existing_addresses]
avg_distance = sum(distances) / len(distances)
print(f"\nMATCHING STATISTICS:")
print(f" • Average distance of matches: {avg_distance:.1f} meters")
print(f" • Max distance of matches: {max(distances):.1f} meters")
if new_addresses:
print(f"\nNEW ADDRESSES TO ADD:")
print(f" These addresses exist in local data but not in OSM")
# Group by street name
streets = {}
for addr in new_addresses[:10]: # Show first 10
street = addr.get('addr:street', 'Unknown Street')
if street not in streets:
streets[street] = 0
streets[street] += 1
for street, count in sorted(streets.items()):
print(f"{street}: {count} address(es)")
if len(new_addresses) > 10:
print(f" • ... and {len(new_addresses) - 10} more")
if removed_addresses:
print(f"\nPOTENTIALLY REMOVED ADDRESSES:")
print(f" These addresses exist in OSM but not in local data")
print(f" (May indicate addresses that were removed or demolished)")
def main():
parser = argparse.ArgumentParser(
description="Compare local government address data with OpenStreetMap addresses",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
python compare-addresses.py "Lake" "Florida" --local-zip "original data/Lake/Addresspoints 2025-06.zip"
python compare-addresses.py "Sumter" "Florida" --local-zip "original data/Sumter/Address9_13_2024.zip" --tolerance 30
python compare-addresses.py "Orange" "Florida" --local-zip "addresses.zip" --output-dir "results/orange"
"""
)
parser.add_argument('county', help='County name (e.g., "Lake", "Sumter")')
parser.add_argument('state', help='State name (e.g., "Florida")')
parser.add_argument('--local-zip', required=True, help='Path to local address data ZIP file')
parser.add_argument('--tolerance', '-t', type=float, default=50.0,
help='Distance tolerance in meters for matching addresses (default: 50)')
parser.add_argument('--output-dir', '-o', help='Output directory for results (default: processed data/[County])')
parser.add_argument('--cache-dir', default='osm_cache',
help='Directory to cache OSM downloads (default: osm_cache)')
parser.add_argument('--force-download', action='store_true',
help='Force re-download of OSM data (ignore cache)')
parser.add_argument('--sample', '-s', type=int,
help='Process only a sample of N addresses for testing')
parser.add_argument('--max-osm', type=int, default=50000,
help='Maximum number of OSM addresses to process (default: 50000)')
args = parser.parse_args()
# Validate input file
local_zip = Path(args.local_zip)
if not local_zip.exists():
print(f"Error: Local ZIP file {args.local_zip} does not exist")
return 1
# Set output directory
if args.output_dir:
output_dir = Path(args.output_dir)
else:
output_dir = Path("processed data") / args.county
try:
# Create comparator
comparator = AddressComparator(
tolerance_meters=args.tolerance,
cache_dir=args.cache_dir
)
# Set sampling parameters
if args.sample:
comparator.sample_size = args.sample
if args.max_osm:
comparator.max_osm = args.max_osm
# Download/load OSM data
if args.force_download:
# Remove existing cache for this county
for cache_file in Path(args.cache_dir).glob(f"osm_addresses_{args.county.lower()}_*.geojson"):
cache_file.unlink()
osm_file = comparator.download_osm_addresses(args.county, args.state)
# Convert local data
local_file = comparator.load_local_addresses(args.local_zip)
# Perform comparison
new_addresses, existing_addresses, removed_addresses = comparator.compare_addresses(
local_file, osm_file
)
# Save results
comparator.save_results(new_addresses, existing_addresses, removed_addresses, output_dir)
# Print summary
comparator.print_summary(new_addresses, existing_addresses, removed_addresses)
return 0
except Exception as e:
print(f"Error: {str(e)}")
import traceback
traceback.print_exc()
return 1
if __name__ == "__main__":
sys.exit(main())

diff-highways.py Normal file

@@ -0,0 +1,660 @@
#!/usr/bin/env python3
"""
GeoJSON Road Comparison Script - Optimized Version
Compares two GeoJSON files containing road data and identifies:
1. Roads in file1 that don't have corresponding coverage in file2 (removed roads)
2. Roads in file2 that don't have corresponding coverage in file1 (added roads)
Only reports differences that are significant (above minimum length threshold).
Optimized for performance with parallel processing and spatial indexing.
TODO:
- ignore points outside of lines
- put properties properly on removed roads, so they're visible in JOSM
- handle polygons properly (on previous geojson step?) for circular roads
- ignore roads that aren't LIFECYCLE ACTV or Active
- include OneWay=Y
- handle C 44a -> County Road 44A
- handle Tpke -> Turnpike
- handle Trce -> Trace/Terrace?
- handle Cor -> Corner
- handle Obrien -> O'Brien
- handle Oday -> O'Day
- Ohara -> O'Hara
"""
import json
import argparse
from pathlib import Path
from typing import List, Dict, Any, Tuple
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString, Point, Polygon
from shapely.ops import unary_union
from shapely.strtree import STRtree
import pandas as pd
import warnings
import multiprocessing as mp
from functools import partial
import numpy as np
from concurrent.futures import ProcessPoolExecutor, as_completed
import gc
import importlib
qgisfunctions = importlib.import_module("qgis-functions")
# Suppress warnings for cleaner output
warnings.filterwarnings('ignore')
import re
def titlecase(s):
return re.sub(
r"[A-Za-z0-9]+('[A-Za-z0-9]+)?",
lambda word: word.group(0).capitalize(),
s)
class RoadComparator:
def __init__(self, tolerance_feet: float = 50.0, min_gap_length_feet: float = 100.0,
n_jobs: int = None, chunk_size: int = 1000, exclude_unnamed: bool = False):
"""
Initialize the road comparator.
Args:
tolerance_feet: Distance tolerance for considering roads as overlapping (default: 50 feet)
min_gap_length_feet: Minimum length of gap/extra to be considered significant (default: 100 feet)
n_jobs: Number of parallel processes to use (default: CPU count - 1)
chunk_size: Number of geometries to process per chunk (default: 1000)
exclude_unnamed: Exclude features without name/highway tags from coverage (default: False)
"""
self.tolerance_feet = tolerance_feet
self.min_gap_length_feet = min_gap_length_feet
# Reduce worker count for Windows to prevent memory issues
self.n_jobs = n_jobs or min(2, max(1, mp.cpu_count() // 2))
self.chunk_size = chunk_size
self.exclude_unnamed = exclude_unnamed
# Convert feet to degrees (approximate conversion for continental US)
# 1 degree latitude ≈ 364,000 feet
# 1 degree longitude ≈ 288,000 feet (at 40° latitude)
self.tolerance_deg = tolerance_feet / 364000.0
self.min_gap_length_deg = min_gap_length_feet / 364000.0
print(f"Using {self.n_jobs} parallel processes with chunk size {self.chunk_size}")
if self.exclude_unnamed:
print("Excluding unnamed features from coverage calculation")
def _has_name(self, row) -> bool:
"""Check if a feature has a name tag (for OSM data filtering)."""
# Check for OSM-style tags (stored as JSON string)
if 'tags' in row.index:
tags = row.get('tags')
if isinstance(tags, dict):
return bool(tags.get('name'))
elif isinstance(tags, str):
# Tags stored as JSON string
try:
tags_dict = json.loads(tags)
return bool(tags_dict.get('name'))
except (json.JSONDecodeError, TypeError):
return False
return False
# Check for direct name properties
name = row.get('name') or row.get('NAME') or row.get('FULLNAME')
return bool(name)
def load_geojson(self, filepath: str, filter_unnamed: bool = False) -> gpd.GeoDataFrame:
"""Load and validate GeoJSON file with optimizations."""
try:
            # Use the pyogrio engine for faster loading of large files
gdf = gpd.read_file(filepath, engine='pyogrio')
# Filter only LineString, MultiLineString, and Polygon geometries
line_types = ['LineString', 'MultiLineString', 'Polygon']
gdf = gdf[gdf.geometry.type.isin(line_types)].copy()
if len(gdf) == 0:
raise ValueError(f"No line geometries found in {filepath}")
# Reset index for efficient processing
gdf = gdf.reset_index(drop=True)
# Ensure geometry is valid and fix simple issues
invalid_mask = ~gdf.geometry.is_valid
if invalid_mask.any():
print(f"Fixing {invalid_mask.sum()} invalid geometries...")
gdf.loc[invalid_mask, 'geometry'] = gdf.loc[invalid_mask, 'geometry'].buffer(0)
# Filter unnamed features if requested
if filter_unnamed:
original_count = len(gdf)
named_mask = gdf.apply(self._has_name, axis=1)
gdf = gdf[named_mask].copy()
gdf = gdf.reset_index(drop=True)
filtered_count = original_count - len(gdf)
print(f"Filtered out {filtered_count} unnamed features")
print(f"Loaded {len(gdf)} road features from {filepath}")
return gdf
except Exception as e:
raise Exception(f"Error loading {filepath}: {str(e)}")
def create_buffered_union_optimized(self, gdf: gpd.GeoDataFrame) -> Any:
"""Create a buffered union using chunked processing for memory efficiency."""
print("Creating optimized buffered union...")
# Process in chunks to manage memory - extract geometries as lists
chunks = [gdf.iloc[i:i+self.chunk_size].geometry.tolist() for i in range(0, len(gdf), self.chunk_size)]
chunk_unions = []
# Use partial function for multiprocessing
buffer_func = partial(self._buffer_chunk, tolerance=self.tolerance_deg)
with ProcessPoolExecutor(max_workers=self.n_jobs) as executor:
# Submit all chunk processing jobs
future_to_chunk = {executor.submit(buffer_func, chunk): i
for i, chunk in enumerate(chunks)}
# Collect results as they complete
for future in as_completed(future_to_chunk):
chunk_idx = future_to_chunk[future]
try:
chunk_union = future.result(timeout=300) # 5 minute timeout
if chunk_union and not chunk_union.is_empty:
chunk_unions.append(chunk_union)
print(f"Processed chunk {chunk_idx + 1}/{len(chunks)}")
except Exception as e:
print(f"Error processing chunk {chunk_idx}: {str(e)}")
# Continue processing other chunks instead of failing completely
# Union all chunk results
print("Combining chunk unions...")
if chunk_unions:
final_union = unary_union(chunk_unions)
# Force garbage collection
del chunk_unions, chunks
gc.collect()
return final_union
else:
raise Exception("No valid geometries to create union")
@staticmethod
def _buffer_chunk(geometries: List, tolerance: float) -> Any:
"""Buffer geometries in a chunk and return their union."""
try:
# Buffer all geometries in the chunk
buffered = [geom.buffer(tolerance) for geom in geometries]
# Create union of buffered geometries
if len(buffered) == 1:
return buffered[0]
else:
return unary_union(buffered)
except Exception as e:
print(f"Error in chunk processing: {str(e)}")
return None
def create_spatial_index(self, gdf: gpd.GeoDataFrame) -> STRtree:
"""Create spatial index for fast intersection queries."""
print("Creating spatial index...")
# Create STRtree for fast spatial queries
geometries = gdf.geometry.tolist()
return STRtree(geometries)
def find_removed_segments_optimized(self, source_gdf: gpd.GeoDataFrame,
target_union: Any) -> List[Dict[str, Any]]:
"""
Find segments in source_gdf that are not covered by target_union (removed roads).
Optimized with parallel processing.
"""
print("Finding removed segments...")
# Split into chunks for parallel processing - convert to serializable format
chunks = []
for i in range(0, len(source_gdf), self.chunk_size):
chunk_gdf = source_gdf.iloc[i:i+self.chunk_size]
chunk_data = []
for idx, row in chunk_gdf.iterrows():
chunk_data.append({
'geometry': row.geometry,
'properties': dict(row.drop('geometry'))
})
chunks.append(chunk_data)
all_removed = []
# Use partial function for multiprocessing
process_func = partial(self._process_removed_chunk,
target_union=target_union,
min_length_deg=self.min_gap_length_deg)
with ProcessPoolExecutor(max_workers=self.n_jobs) as executor:
# Submit all chunk processing jobs
future_to_chunk = {executor.submit(process_func, chunk): i
for i, chunk in enumerate(chunks)}
# Collect results as they complete
for future in as_completed(future_to_chunk):
chunk_idx = future_to_chunk[future]
try:
chunk_removed = future.result(timeout=300) # 5 minute timeout
all_removed.extend(chunk_removed)
print(f"Processed removed chunk {chunk_idx + 1}/{len(chunks)}")
except Exception as e:
print(f"Error processing removed chunk {chunk_idx}: {str(e)}")
# Continue processing other chunks instead of failing completely
# Clean up memory
del chunks
gc.collect()
return all_removed
@staticmethod
def _process_removed_chunk(chunk_data: List[Dict], target_union: Any,
min_length_deg: float) -> List[Dict[str, Any]]:
"""Process a chunk of geometries to find removed segments."""
removed_segments = []
for row_data in chunk_data:
geom = row_data['geometry']
properties = row_data['properties']
# Handle MultiLineString by processing each component
if isinstance(geom, MultiLineString):
lines = list(geom.geoms)
else:
lines = [geom] # Polygon and Line can be accessed directly
for line in lines:
try:
# Find parts of the line that don't intersect with target_union
uncovered = line.difference(target_union)
if uncovered.is_empty:
continue
# Handle different geometry types returned by difference
uncovered_lines = []
if hasattr(uncovered, 'geoms'):
for geom_part in uncovered.geoms:
if isinstance(geom_part, LineString):
uncovered_lines.append(geom_part)
elif isinstance(uncovered, LineString):
uncovered_lines.append(uncovered)
# Check each uncovered line segment
for uncovered_line in uncovered_lines:
if uncovered_line.length >= min_length_deg:
# Create properties dict with original metadata plus 'removed: true'
result_properties = properties.copy()
result_properties['removed'] = True
removed_segments.append({
'geometry': uncovered_line,
**result_properties
})
except Exception as e:
continue # Skip problematic geometries
return removed_segments
def find_added_roads_optimized(self, source_gdf: gpd.GeoDataFrame,
target_union: Any) -> List[Dict[str, Any]]:
"""
Find entire roads in source_gdf that don't significantly overlap with target_union.
Optimized with parallel processing.
"""
print("Finding added roads...")
# Split into chunks for parallel processing - convert to serializable format
chunks = []
for i in range(0, len(source_gdf), self.chunk_size):
chunk_gdf = source_gdf.iloc[i:i+self.chunk_size]
chunk_data = []
for idx, row in chunk_gdf.iterrows():
chunk_data.append({
'geometry': row.geometry,
'properties': dict(row.drop('geometry'))
})
chunks.append(chunk_data)
all_added = []
# Use partial function for multiprocessing
process_func = partial(self._process_added_chunk,
target_union=target_union,
min_length_deg=self.min_gap_length_deg)
with ProcessPoolExecutor(max_workers=self.n_jobs) as executor:
# Submit all chunk processing jobs
future_to_chunk = {executor.submit(process_func, chunk): i
for i, chunk in enumerate(chunks)}
# Collect results as they complete
for future in as_completed(future_to_chunk):
chunk_idx = future_to_chunk[future]
try:
chunk_added = future.result(timeout=300) # 5 minute timeout
all_added.extend(chunk_added)
print(f"Processed added chunk {chunk_idx + 1}/{len(chunks)}")
except Exception as e:
print(f"Error processing added chunk {chunk_idx}: {str(e)}")
# Continue processing other chunks instead of failing completely
# Clean up memory
del chunks
gc.collect()
return all_added
@staticmethod
def _process_added_chunk(chunk_data: List[Dict], target_union: Any,
min_length_deg: float) -> List[Dict[str, Any]]:
"""Process a chunk of geometries to find added roads."""
added_roads = []
for row_data in chunk_data:
geom = row_data['geometry']
original_properties = row_data['properties']
try:
# Check what portion of the road is not covered
uncovered = geom.difference(target_union)
if not uncovered.is_empty:
# Calculate what percentage of the original road is uncovered
uncovered_length = 0
if hasattr(uncovered, 'geoms'):
for geom_part in uncovered.geoms:
if isinstance(geom_part, LineString):
uncovered_length += geom_part.length
elif isinstance(uncovered, LineString):
uncovered_length = uncovered.length
original_length = geom.length
uncovered_ratio = uncovered_length / original_length if original_length > 0 else 0
                    # Include the entire road if more than 10% of its length is
                    # uncovered (the minimum-length threshold check is disabled below)
if uncovered_ratio > 0.1:
#uncovered_length >= min_length_deg and
# Include entire original road with all original metadata
properties = {
'surface': 'asphalt'
}
# Detect county format based on available fields
is_lake_county = 'FullStreet' in original_properties
is_sumter_county = 'NAME' in original_properties and 'RoadClass' in original_properties
if is_lake_county:
# Lake County field mappings
for key, value in original_properties.items():
if key == 'FullStreet':
properties['name'] = titlecase(qgisfunctions.formatstreet(value,None,None)) if value is not None else None
elif key == 'SpeedLimit':
properties['maxspeed'] = f"{value} mph" if value is not None else None
elif key == 'NumberOfLa':
try:
num_value = int(float(value)) if value is not None else 0
if num_value > 0:
properties['lanes'] = str(num_value)
except (ValueError, TypeError):
pass
elif key == 'StreetClas':
highway_type = qgisfunctions.gethighwaytype(value, None, None)
properties['highway'] = highway_type if highway_type else 'residential'
elif is_sumter_county:
# Sumter County field mappings
for key, value in original_properties.items():
if key == 'NAME':
properties['name'] = titlecase(qgisfunctions.formatstreet(value,None,None)) if value is not None else None
elif key == 'SpeedLimit':
properties['maxspeed'] = f"{value} mph" if value is not None else None
elif key == 'RoadClass':
if value is None:
properties['highway'] = 'residential'
elif value.startswith('PRIMARY'):
properties['highway'] = 'trunk'
elif value.startswith('MAJOR'):
properties['highway'] = 'primary'
else:
properties['highway'] = 'residential'
else:
# Unknown format - try common field names
name = original_properties.get('NAME') or original_properties.get('FullStreet') or original_properties.get('name')
if name:
properties['name'] = titlecase(qgisfunctions.formatstreet(name,None,None))
speed = original_properties.get('SpeedLimit')
if speed:
properties['maxspeed'] = f"{speed} mph"
properties['highway'] = 'residential'
added_roads.append({
'geometry': geom,
**properties
})
except Exception as e:
print(e)
continue # Skip problematic geometries
return added_roads
def compare_roads(self, file1_path: str, file2_path: str) -> Tuple[List[Dict], List[Dict]]:
"""
Compare two GeoJSON files and find significant differences.
Optimized version with parallel processing.
Returns:
Tuple of (removed_roads, added_roads)
"""
print(f"Comparing {file1_path} and {file2_path}")
print(f"Tolerance: {self.tolerance_feet} feet")
print(f"Minimum significant length: {self.min_gap_length_feet} feet")
print(f"Parallel processing: {self.n_jobs} workers")
print("-" * 50)
# Load both files
# Filter unnamed features from file1 (OSM data) if exclude_unnamed is set
gdf1 = self.load_geojson(file1_path, filter_unnamed=self.exclude_unnamed)
gdf2 = self.load_geojson(file2_path)
# Ensure both are in the same CRS
if gdf1.crs != gdf2.crs:
print(f"Warning: CRS mismatch. Converting {file2_path} to match {file1_path}")
gdf2 = gdf2.to_crs(gdf1.crs)
print("Creating optimized spatial unions...")
# Create buffered unions using optimized method
union1 = self.create_buffered_union_optimized(gdf1)
union2 = self.create_buffered_union_optimized(gdf2)
print("Finding removed and added roads with parallel processing...")
# Find roads using optimized parallel methods
removed_roads = self.find_removed_segments_optimized(gdf1, union2)
added_roads = self.find_added_roads_optimized(gdf2, union1)
# Clean up memory
del gdf1, gdf2, union1, union2
gc.collect()
return removed_roads, added_roads
def save_results(self, removed: List[Dict], added: List[Dict], output_path: str):
"""Save results to GeoJSON file."""
all_results = removed + added
if not all_results:
print("No significant differences found!")
return
# Create GeoDataFrame efficiently
print("Saving results...")
results_gdf = gpd.GeoDataFrame(all_results)
# Save to file with optimization, with fallback for locked files
try:
results_gdf.to_file(output_path, driver='GeoJSON', engine='pyogrio')
print(f"Results saved to: {output_path}")
except (PermissionError, OSError) as e:
# File is locked, try with a timestamp suffix
from datetime import datetime
timestamp = datetime.now().strftime("%H%M%S")
base = Path(output_path)
fallback_path = str(base.parent / f"{base.stem}_{timestamp}{base.suffix}")
print(f"Warning: Could not save to {output_path}: {e}")
print(f"Saving to fallback: {fallback_path}")
results_gdf.to_file(fallback_path, driver='GeoJSON', engine='pyogrio')
print(f"Results saved to: {fallback_path}")
def print_summary(self, removed: List[Dict], added: List[Dict], file1_name: str, file2_name: str):
"""Print a summary of the comparison results."""
print("\n" + "="*60)
print("COMPARISON SUMMARY")
print("="*60)
print(f"\nFile 1: {file1_name}")
print(f"File 2: {file2_name}")
print(f"Tolerance: {self.tolerance_feet} feet")
print(f"Minimum significant length: {self.min_gap_length_feet} feet")
if removed:
print(f"\nREMOVED ROADS ({len(removed)} segments):")
print("These road segments exist in File 1 but are missing or incomplete in File 2:")
# Calculate total length of removed segments
total_removed_length = 0
removed_by_road = {}
for segment in removed:
geom = segment['geometry']
length_feet = geom.length * 364000.0 # Convert to feet
total_removed_length += length_feet
# Get road name
road_name = "Unknown"
name_fields = ['name', 'NAME', 'road_name', 'street_name', 'FULLNAME']
for field in name_fields:
if field in segment and pd.notna(segment[field]):
road_name = str(segment[field])
break
if road_name not in removed_by_road:
removed_by_road[road_name] = []
removed_by_road[road_name].append(length_feet)
print(f"Total removed length: {total_removed_length:,.1f} feet ({total_removed_length/5280:.2f} miles)")
for road, lengths in sorted(removed_by_road.items()):
road_total = sum(lengths)
print(f"{road}: {len(lengths)} segment(s), {road_total:,.1f} feet")
if added:
print(f"\nADDED ROADS ({len(added)} roads):")
print("These roads exist in File 2 but are missing or incomplete in File 1:")
# Calculate total length of added roads
total_added_length = 0
added_by_road = {}
for road in added:
geom = road['geometry']
length_feet = geom.length * 364000.0 # Convert to feet
total_added_length += length_feet
# Get road name
road_name = "Unknown"
name_fields = ['name', 'NAME', 'road_name', 'street_name', 'FULLNAME']
for field in name_fields:
if field in road and pd.notna(road[field]):
road_name = str(road[field])
break
if road_name not in added_by_road:
added_by_road[road_name] = 0
added_by_road[road_name] += length_feet
print(f"Total added length: {total_added_length:,.1f} feet ({total_added_length/5280:.2f} miles)")
for road, length in sorted(added_by_road.items()):
print(f"{road}: {length:,.1f} feet")
if not removed and not added:
print("\nNo significant differences found!")
print("The road networks have good coverage overlap within the specified tolerance.")
def main():
parser = argparse.ArgumentParser(
description="Compare two GeoJSON files containing roads and find significant gaps or extras (Optimized)",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
python diff-highways.py roads1.geojson roads2.geojson
python diff-highways.py roads1.geojson roads2.geojson --tolerance 100 --min-length 200
python diff-highways.py roads1.geojson roads2.geojson --output differences.geojson
python diff-highways.py roads1.geojson roads2.geojson --jobs 8 --chunk-size 2000
"""
)
parser.add_argument('file1', help='First GeoJSON file')
parser.add_argument('file2', help='Second GeoJSON file')
parser.add_argument('--tolerance', '-t', type=float, default=50.0,
help='Distance tolerance in feet for considering roads as overlapping (default: 50)')
parser.add_argument('--min-length', '-m', type=float, default=100.0,
help='Minimum length in feet for gaps/extras to be considered significant (default: 100)')
parser.add_argument('--output', '-o', help='Output GeoJSON file for results (optional)')
parser.add_argument('--jobs', '-j', type=int, default=None,
help='Number of parallel processes (default: CPU count - 1)')
parser.add_argument('--chunk-size', '-c', type=int, default=1000,
help='Number of geometries to process per chunk (default: 1000)')
parser.add_argument('--exclude-unnamed', '-e', action='store_true',
help='Exclude features without name tags from coverage calculation (helps detect roads covered by unnamed geometry)')
args = parser.parse_args()
# Validate input files
if not Path(args.file1).exists():
print(f"Error: File {args.file1} does not exist")
return 1
if not Path(args.file2).exists():
print(f"Error: File {args.file2} does not exist")
return 1
try:
# Create comparator and run comparison
comparator = RoadComparator(
tolerance_feet=args.tolerance,
min_gap_length_feet=args.min_length,
n_jobs=args.jobs,
chunk_size=args.chunk_size,
exclude_unnamed=args.exclude_unnamed
)
removed, added = comparator.compare_roads(args.file1, args.file2)
# Print summary
comparator.print_summary(removed, added, args.file1, args.file2)
# Save results if output file specified
if args.output:
comparator.save_results(removed, added, args.output)
elif removed or added:
# Auto-generate output filename if differences found
output_file = f"road_differences_{Path(args.file1).stem}_vs_{Path(args.file2).stem}.geojson"
comparator.save_results(removed, added, output_file)
return 0
except Exception as e:
print(f"Error: {str(e)}")
return 1
if __name__ == "__main__":
exit(main())

14
docker-compose.yml Normal file
View File

@@ -0,0 +1,14 @@
version: '3.8'
services:
web:
build:
context: .
pull: true
ports:
- "5000:5000"
volumes:
- ./data:/data
environment:
- FLASK_ENV=development
restart: unless-stopped

183
download-overpass.py Normal file
View File

@@ -0,0 +1,183 @@
#!/usr/bin/env python3
"""
Download OSM data from Overpass API for a given county and save as GeoJSON.
Usage:
python download-overpass.py --type highways "Sumter County" "Florida" output/roads.geojson
python download-overpass.py --type addresses "Lake County" "Florida" output/addresses.geojson
python download-overpass.py --type multimodal "Sumter County" "Florida" output/paths.geojson
TODO:
- Don't just download roads. Probably also ignore relations.
"""
import argparse
import json
import sys
import time
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path
def get_county_area_id(county_name, state_name):
"""Get OSM area ID for a county using Nominatim."""
search_query = f"{county_name}, {state_name}, USA"
url = f"https://nominatim.openstreetmap.org/search?q={urllib.parse.quote(search_query)}&format=json&limit=1&featuretype=county"
# Nominatim requires User-Agent header
req = urllib.request.Request(url, headers={'User-Agent': 'TheVillagesImport/1.0'})
try:
with urllib.request.urlopen(req) as response:
results = json.loads(response.read().decode("utf-8"))
if results and results[0].get('osm_type') == 'relation':
relation_id = int(results[0]['osm_id'])
area_id = relation_id + 3600000000
print(f"Found {county_name}, {state_name}: relation {relation_id} -> area {area_id}")
return area_id
raise ValueError(f"Could not find relation for {county_name}, {state_name}")
except urllib.error.HTTPError as e:
print(f"Nominatim HTTP Error {e.code}: {e.reason}", file=sys.stderr)
sys.exit(1)
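# Worked example of the Overpass area-id convention used above (the relation
# id here is illustrative, not a verified OSM object): relation 1234567 maps
# to area(id:3601234567), since Overpass area ids for relations are the
# relation id plus 3600000000.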
def build_overpass_query(county_name, state_name, data_type="highways"):
"""Build Overpass API query for specified data type in a county."""
area_id = get_county_area_id(county_name, state_name)
base_query = f"""[out:json][timeout:60];
area(id:{area_id})->.searchArea;"""
if data_type == "highways":
selector = '('
selector += 'way["highway"="motorway"](area.searchArea);'
selector += 'way["highway"="trunk"](area.searchArea);'
selector += 'way["highway"="primary"](area.searchArea);'
selector += 'way["highway"="secondary"](area.searchArea);'
selector += 'way["highway"="tertiary"](area.searchArea);'
selector += 'way["highway"="unclassified"](area.searchArea);'
selector += 'way["highway"="residential"](area.searchArea);'
selector += 'way["highway"~"_link"](area.searchArea);'
selector += 'way["highway"="service"](area.searchArea);'
selector += 'way["highway"="track"](area.searchArea);'
selector += ');'
elif data_type == "addresses":
selector = 'nwr["addr:housenumber"](area.searchArea);'
elif data_type == "multimodal":
selector = '(way["highway"="path"](area.searchArea);way["highway"="cycleway"](area.searchArea););'
else:
raise ValueError(f"Unknown data type: {data_type}")
query = base_query + selector + "out geom;"
return query
def query_overpass(query):
"""Send query to Overpass API and return JSON response."""
url = "https://overpass-api.de/api/interpreter"
data = urllib.parse.urlencode({"data": query}).encode("utf-8")
try:
with urllib.request.urlopen(url, data=data) as response:
return json.loads(response.read().decode("utf-8"))
except urllib.error.HTTPError as e:
print(f"HTTP Error {e.code}: {e.reason}", file=sys.stderr)
try:
error_body = e.read().decode("utf-8")
print(f"Error response body: {error_body}", file=sys.stderr)
except Exception:
print("Could not read error response body", file=sys.stderr)
sys.exit(1)
except Exception as e:
print(f"Error querying Overpass API: {e}", file=sys.stderr)
sys.exit(1)
def convert_to_geojson(overpass_data):
"""Convert Overpass API response to GeoJSON format."""
features = []
for element in overpass_data.get("elements", []):
if element["type"] == "way" and "geometry" in element:
coordinates = [[coord["lon"], coord["lat"]] for coord in element["geometry"]]
feature = {
"type": "Feature",
"properties": element.get("tags", {}),
"geometry": {
"type": "LineString",
"coordinates": coordinates
}
}
features.append(feature)
elif element["type"] == "node":
feature = {
"type": "Feature",
"properties": element.get("tags", {}),
"geometry": {
"type": "Point",
"coordinates": [element["lon"], element["lat"]]
}
}
features.append(feature)
return {
"type": "FeatureCollection",
"features": features
}
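# Example of the conversion above (values illustrative): an Overpass way like
#   {"type": "way", "tags": {"highway": "residential", "name": "Main St"},
#    "geometry": [{"lat": 28.7, "lon": -81.7}, {"lat": 28.8, "lon": -81.6}]}
# becomes a GeoJSON Feature whose properties are the tags and whose geometry
# is a LineString of [lon, lat] pairs; bare nodes become Point features.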
def main():
parser = argparse.ArgumentParser(
description="Download OSM data from Overpass API for a county"
)
parser.add_argument(
"county",
help="County name (e.g., 'Lake County')"
)
parser.add_argument(
"state",
help="State name (e.g., 'Florida')"
)
parser.add_argument(
"output",
help="Output GeoJSON file path"
)
parser.add_argument(
"--type", "-t",
choices=["highways", "addresses", "multimodal"],
default="highways",
help="Type of data to download (default: highways)"
)
args = parser.parse_args()
# Create output directory if it doesn't exist
output_path = Path(args.output)
output_path.parent.mkdir(parents=True, exist_ok=True)
print(f"Downloading {args.type} for {args.county}, {args.state}...")
# Build and execute query
query = build_overpass_query(args.county, args.state, args.type)
print(f"Query: {query}")
overpass_data = query_overpass(query)
# Convert to GeoJSON
geojson = convert_to_geojson(overpass_data)
# Save to file
with open(output_path, "w", encoding="utf-8") as f:
json.dump(geojson, f, indent=2)
print(f"Saved {len(geojson['features'])} {args.type} features to {args.output}")
if __name__ == "__main__":
main()

268
qgis-functions.py Normal file
View File

@@ -0,0 +1,268 @@
#import qgis.core
#import qgis.gui
import re
#
# This keeps street names like SR 574A as SR 574A, while lowercasing other
# number+letter suffixes (fewer than 2 or more than 4 digits, or more than
# one suffix letter), such as 12th Street or 243rd Ave.
#
def title(s):
return re.sub(
r"[A-Za-z0-9]+('[A-Za-z0-9]+)?",
lambda word: word.group(0).capitalize(),
s)
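# Example: title("123 N MAIN ST") -> "123 N Main St" (each matched word,
# including apostrophe forms like "O'BRIEN" -> "O'brien", is passed
# through str.capitalize()).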
# @qgsfunction(args='auto', group='Custom', referenced_columns=[])
def getstreetfromaddress(value1, feature, parent):
parts = value1.split()
parts.pop(0) # Ignore the first bit (i.e. "123" in "123 N MAIN ST")
#parts = map(formatstreetname, parts)
#return " ".join(parts)
return formatstreet(" ".join(parts), None, None)
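# Example (as used in the address pipeline, where title() is applied to the
# result): getstreetfromaddress("123 N MAIN ST", None, None) returns
# "North MAIN Street"; title() then yields "North Main Street". Words that
# formatstreetname does not recognize pass through unchanged.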
# @qgsfunction(args='auto', group='Custom', referenced_columns=[])
def formatstreet(value1, feature, parent):
parts = value1.split()
# Handle the special case of a street name starting with "ST"
# which is almost always "Saint __" and not "Street __"
if parts[0].upper() == "ST":
parts[0] = "Saint"
if parts[0].upper() == "ROYAL" and parts[1].upper() == "ST":
parts[0] = "Royal"
parts[1] = "Saint"
# And "CR" as a first part (County Road) vs last part (Circle)
if parts[0].upper() == "C ":
parts[0] = "County Road "
if parts[0].upper() == "CR":
parts[0] = "County Road"
if parts[0].upper() == "SR":
parts[0] = "State Road"
parts = map(formatstreetname, parts)
return " ".join(parts)
# @qgsfunction(args='auto', group='Custom', referenced_columns=[])
def formatname(value1, feature, parent):
parts = value1.split()
parts = map(formatstreetname, parts)
return " ".join(parts)
# @qgsfunction(args='auto', group='Custom', referenced_columns=[])
def gethighwaytype(value1, feature, parent):
match value1:
case "ALLEY":
return "alley"
case "LOCAL":
return "residential"
case "MAJOR":
return "trunk"
case "MEDIAN CUT":
return "primary_link"
case "OTHER":
return "unclassified"
case "PRIMARY":
return "primary"
case "PRIVATE":
return "service"
case "RAMP":
return "trunk_link"
case "SECONDARY":
return "secondary"
case "TURN LANE":
return "primary_link"
case "VEHICULAR TRAIL":
return "track"
# Internal function
def formatstreetname(name):
nameUp = name.upper()
# Acronyms
if nameUp == "SR":
return "SR" # State Route
if nameUp == "NFS":
return "NFS" # National Forest Service?
if nameUp == "US":
return "US"
# Directions
if nameUp == "N":
return "North"
if nameUp == "NE":
return "Northeast"
if nameUp == "E":
return "East"
if nameUp == "SE":
return "Southeast"
if nameUp == "S":
return "South"
if nameUp == "SW":
return "Southwest"
if nameUp == "W":
return "West"
if nameUp == "NW":
return "Northwest"
# Names
if nameUp == "MACLEAY":
return "MacLeay"
if nameUp == "MCCLAINE":
return "McClaine"
if nameUp == "MCAHREN":
return "McAhren"
if nameUp == "MCCAMMON":
return "McCammon"
if nameUp == "MCCLELLAN":
return "McClellan"
if nameUp == "MCCOY":
return "McCoy"
if nameUp == "MCDONALD":
return "McDonald"
if nameUp == "MCGEE":
return "McGee"
if nameUp == "MCGILCHRIST":
return "McGilchrist"
if nameUp == "MCINTOSH":
return "McIntosh"
if nameUp == "MCKAY":
return "McKay"
if nameUp == "MCKEE":
return "McKee"
if nameUp == "MCKENZIE":
return "McKenzie"
if nameUp == "MCKILLOP":
return "McKillop"
if nameUp == "MCKINLEY":
return "McKinley"
if nameUp == "MCKNIGHT":
return "McKnight"
if nameUp == "MCLAUGHLIN":
return "McLaughlin"
if nameUp == "MCLEOD":
return "McLeod"
if nameUp == "MCMASTER":
return "McMaster"
if nameUp == "MCNARY":
return "McNary"
if nameUp == "MCNAUGHT":
return "McNaught"
if nameUp == "O'BRIEN":
return "O'Brien"
if nameUp == "O'CONNOR":
return "O'Connor"
if nameUp == "O'NEIL":
return "O'Neil"
if nameUp == "O'TOOLE":
return "O'Toole"
# Suffixes
if nameUp == "ALY":
return "Alley"
if nameUp == "AV":
return "Avenue"
if nameUp == "AVE":
return "Avenue"
if nameUp == "BAY":
return "Bay"
if nameUp == "BLF":
return "Bluff"
if nameUp == "BLVD":
return "Boulevard"
if nameUp == "BV":
return "Boulevard"
if nameUp == "BND":
return "Bend"
if nameUp == "CIR":
return "Circle"
if nameUp == "CR":
return "Circle"
if nameUp == "CRK":
return "Creek"
if nameUp == "CRST":
return "Crest"
if nameUp == "CT":
return "Court"
if nameUp == "CURV":
return "Curve"
if nameUp == "CV":
return "Curve"
if nameUp == "DR":
return "Drive"
if nameUp == "FLDS":
return "Fields"
if nameUp == "GLN":
return "Glenn"
if nameUp == "GRV":
return "Grove"
if nameUp == "HL":
return "Hill"
if nameUp == "HOLW":
return "Hollow"
if nameUp == "HTS":
return "Heights"
if nameUp == "HW":
return "Highway"
if nameUp == "HWY":
return "Highway"
if nameUp == "HY":
return "Highway"
if nameUp == "LN":
return "Lane"
if nameUp == "LNDG":
return "Landing"
if nameUp == "LOOP":
return "Loop"
if nameUp == "LP":
return "Loop"
if nameUp == "MNR":
return "Manor"
if nameUp == "MT":
return "Mount"
if nameUp == "MTN":
return "Mountain"
if nameUp == "PARK":
return "Park"
if nameUp == "PASS":
return "Pass"
if nameUp == "PATH":
return "Path"
if nameUp == "PKWY":
return "Parkway"
if nameUp == "PL":
return "Place"
if nameUp == "PLZ":
return "Plaza"
if nameUp == "PS":
return "Pass"
if nameUp == "PT":
return "Point"
if nameUp == "RD":
return "Road"
if nameUp == "RDG":
return "Ridge"
if nameUp == "RUN":
return "Run"
if nameUp == "SHRS":
return "Shores"
if nameUp == "SQ":
return "Square"
if nameUp == "ST":
return "Street"
if nameUp == "TER":
return "Terrace"
if nameUp == "TR":
return "Trail"
if nameUp == "TRL":
return "Trail"
if nameUp == "VW":
return "View"
if nameUp == "WALK":
return "Walk"
if nameUp == "WAY":
return "Way"
if nameUp == "WY":
return "Way"
if nameUp == "XING":
return "Crossing"
if re.match(r'^[0-9]{2,4}[A-Za-z]$', name) is not None:
    return name
return name  # .capitalize()

8
requirements.txt Normal file
View File

@@ -0,0 +1,8 @@
Flask==3.0.0
numpy<2.0.0
geopandas>=0.14.0
pandas>=2.1.0
shapely>=2.0.0
pyproj>=3.6.0
pyogrio>=0.7.0
rtree>=1.1.0

41
run-lake-addresses.py Normal file
View File

@@ -0,0 +1,41 @@
#!/usr/bin/env python3
"""
Simple wrapper script for comparing Lake County addresses
"""
import subprocess
import sys
from pathlib import Path
def main():
# Change to script directory
script_dir = Path(__file__).parent
# Define the command
cmd = [
sys.executable,
"compare-addresses.py",
"Lake",
"Florida",
"--local-zip", "original data/Lake/Addresspoints 2025-06.zip",
"--tolerance", "50",
"--output-dir", "processed data/Lake"
]
print("Running Lake County address comparison...")
print("Command:", " ".join(cmd))
print()
# Run the command
result = subprocess.run(cmd, cwd=script_dir)
if result.returncode == 0:
print("\nAddress comparison completed successfully!")
print("Results saved in: processed data/Lake/")
else:
print(f"\nError: Script failed with return code {result.returncode}")
return result.returncode
if __name__ == "__main__":
sys.exit(main())

75
shp-to-geojson.py Normal file
View File

@@ -0,0 +1,75 @@
import geopandas
import sys
import os
from pathlib import Path
def convert_shapefile_to_geojson(
input_shapefile,
output_geojson,
target_crs=4326 # Convert to WGS 84
):
"""
Main conversion function
Args:
input_shapefile: Path to input shapefile
output_geojson: Path to output GeoJSON file
target_crs: Target coordinate reference system
"""
try:
# Read shapefile
print(f"Reading shapefile: {input_shapefile}")
df = geopandas.read_file(input_shapefile)
print(f"Converting to CRS {target_crs}")
df = df.to_crs(target_crs)
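# Explode multi-part geometries (e.g. MultiLineString) into one
# single-part feature per row before writing GeoJSON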
exploded = df.explode()
exploded.to_file(output_geojson, driver='GeoJSON')
except Exception as e:
print(f"Error during conversion: {str(e)}")
sys.exit(1)
def main():
"""
Main function to handle command line arguments
"""
import argparse
parser = argparse.ArgumentParser(
description='Convert shapefile to GeoJSON'
)
parser.add_argument(
'input_shapefile',
help='Path to input shapefile'
)
parser.add_argument(
'output_geojson',
help='Path to output GeoJSON file'
)
parser.add_argument(
'--target-crs',
default='4326',
help='Target coordinate reference system (default: 4326)'
)
args = parser.parse_args()
# Validate input file
if not os.path.exists(args.input_shapefile):
print(f"Error: Input shapefile '{args.input_shapefile}' not found")
sys.exit(1)
# Create output directory if it doesn't exist
output_dir = Path(args.output_geojson).parent
output_dir.mkdir(parents=True, exist_ok=True)
# Run conversion
convert_shapefile_to_geojson(
args.input_shapefile,
args.output_geojson,
args.target_crs
)
if __name__ == "__main__":
import pandas as pd
main()

258
sumter-address-convert.py Normal file
View File

@@ -0,0 +1,258 @@
#!/usr/bin/env python3
"""
Shapefile to GeoJSON Converter for Address Data
Converts ESRI:102659 CRS shapefile to EPSG:4326 GeoJSON with OSM-style address tags
"""
import geopandas as gpd
import pandas as pd
import json
import sys
import os
from pathlib import Path
import importlib
qgis_functions = importlib.import_module("qgis-functions")
title = qgis_functions.title
getstreetfromaddress = qgis_functions.getstreetfromaddress
def convert_crs(gdf, source_crs='ESRI:102659', target_crs='EPSG:4326'):
"""
Convert coordinate reference system from source to target CRS
Args:
gdf: GeoDataFrame to convert
source_crs: Source coordinate reference system (default: ESRI:102659)
target_crs: Target coordinate reference system (default: EPSG:4326)
Returns:
GeoDataFrame with converted CRS
"""
if gdf.crs is None:
print(f"Warning: No CRS detected, assuming {source_crs}")
gdf.crs = source_crs
if gdf.crs != target_crs:
print(f"Converting from {gdf.crs} to {target_crs}")
gdf = gdf.to_crs(target_crs)
return gdf
def process_address_fields(gdf):
"""
Process and map address fields according to OSM address schema
Args:
gdf: GeoDataFrame with address data
Returns:
GeoDataFrame with processed address fields
"""
processed_gdf = gdf.copy()
# Create new columns for OSM address tags
address_mapping = {}
# ADD_NUM -> addr:housenumber (as integer)
if 'ADD_NUM' in processed_gdf.columns:
# Handle NaN values and convert to nullable integer
add_num_series = processed_gdf['ADD_NUM'].copy()
# Convert to numeric, coercing errors to NaN
add_num_series = pd.to_numeric(add_num_series, errors='coerce')
# Round to remove decimal places, then convert to nullable integer
address_mapping['addr:housenumber'] = add_num_series.round().astype('Int64')
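# e.g. 123.0 -> 123, non-numeric values -> <NA> (pandas nullable Int64)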
# UNIT -> addr:unit (as string)
if 'UNIT' in processed_gdf.columns:
unit_series = processed_gdf['UNIT'].copy()
# Replace NaN, empty strings, and 'None' string with actual None
unit_series = unit_series.replace(['nan', 'None', '', None], None)
# Only keep non-null values as strings
unit_series = unit_series.where(unit_series.notna(), None)
address_mapping['addr:unit'] = unit_series
# SADD -> addr:street via title(getstreetfromaddress("SADD"))
if 'SADD' in processed_gdf.columns:
street_names = []
for sadd_value in processed_gdf['SADD']:
if pd.notna(sadd_value):
street_from_addr = getstreetfromaddress(str(sadd_value), None, None)
street_titled = title(street_from_addr)
street_names.append(street_titled)
else:
street_names.append(None)
address_mapping['addr:street'] = street_names
# POST_COMM -> addr:city via title("POST_COMM")
if 'POST_COMM' in processed_gdf.columns:
city_names = []
for post_comm in processed_gdf['POST_COMM']:
if pd.notna(post_comm):
city_titled = title(str(post_comm))
city_names.append(city_titled)
else:
city_names.append(None)
address_mapping['addr:city'] = city_names
# POST_CODE -> addr:postcode (as integer)
if 'POST_CODE' in processed_gdf.columns:
# Handle NaN values and convert to nullable integer
post_code_series = processed_gdf['POST_CODE'].copy()
# Convert to numeric, coercing errors to NaN
post_code_series = pd.to_numeric(post_code_series, errors='coerce')
# Round to remove decimal places, then convert to nullable integer
address_mapping['addr:postcode'] = post_code_series.round().astype('Int64')
# Manually add addr:state = 'FL'
address_mapping['addr:state'] = 'FL'
# Add the new address columns to the GeoDataFrame
for key, value in address_mapping.items():
processed_gdf[key] = value
return processed_gdf
def clean_output_data(gdf, keep_original_fields=False):
"""
Clean the output data, optionally keeping original fields
Args:
gdf: GeoDataFrame to clean
keep_original_fields: Whether to keep original shapefile fields
Returns:
Cleaned GeoDataFrame
"""
# Define the OSM address fields we want to keep
osm_fields = [
'addr:housenumber', 'addr:unit', 'addr:street',
'addr:city', 'addr:postcode', 'addr:state'
]
if keep_original_fields:
# Keep both original and OSM fields
original_fields = ['ADD_NUM', 'UNIT', 'SADD', 'POST_COMM', 'POST_CODE']
fields_to_keep = list(set(osm_fields + original_fields + ['geometry']))
else:
# Keep only OSM fields and geometry
fields_to_keep = osm_fields + ['geometry']
# Filter to only existing columns
existing_fields = [field for field in fields_to_keep if field in gdf.columns]
return gdf[existing_fields]
def convert_shapefile_to_geojson(
input_shapefile,
output_geojson,
keep_original_fields=False,
source_crs='ESRI:102659',
target_crs='EPSG:4326'
):
"""
Main conversion function
Args:
input_shapefile: Path to input shapefile
output_geojson: Path to output GeoJSON file
keep_original_fields: Whether to keep original shapefile fields
source_crs: Source coordinate reference system
target_crs: Target coordinate reference system
"""
try:
# Read shapefile
print(f"Reading shapefile: {input_shapefile}")
gdf = gpd.read_file(input_shapefile)
print(f"Loaded {len(gdf)} features")
# Display original columns
print(f"Original columns: {list(gdf.columns)}")
# Convert CRS if needed
gdf = convert_crs(gdf, source_crs, target_crs)
# Process address fields
print("Processing address fields...")
gdf = process_address_fields(gdf)
# Clean output data
gdf = clean_output_data(gdf, keep_original_fields)
# Remove rows with no valid geometry
gdf = gdf[gdf.geometry.notna()]
print(f"Final columns: {list(gdf.columns)}")
print(f"Final feature count: {len(gdf)}")
# Write to GeoJSON
print(f"Writing GeoJSON: {output_geojson}")
gdf.to_file(output_geojson, driver='GeoJSON')
print(f"Conversion completed successfully!")
# Display sample of processed data
if len(gdf) > 0:
print("\nSample of processed data:")
sample_cols = [col for col in gdf.columns if col.startswith('addr:')]
if sample_cols:
print(gdf[sample_cols].head())
except Exception as e:
print(f"Error during conversion: {str(e)}")
sys.exit(1)
def main():
"""
Main function to handle command line arguments
"""
import argparse
parser = argparse.ArgumentParser(
description='Convert shapefile to GeoJSON with OSM address tags'
)
parser.add_argument(
'input_shapefile',
help='Path to input shapefile'
)
parser.add_argument(
'output_geojson',
help='Path to output GeoJSON file'
)
parser.add_argument(
'--keep-original',
action='store_true',
help='Keep original shapefile fields in addition to OSM fields'
)
parser.add_argument(
'--source-crs',
default='ESRI:102659',
help='Source coordinate reference system (default: ESRI:102659)'
)
parser.add_argument(
'--target-crs',
default='EPSG:4326',
help='Target coordinate reference system (default: EPSG:4326)'
)
args = parser.parse_args()
# Validate input file
if not os.path.exists(args.input_shapefile):
print(f"Error: Input shapefile '{args.input_shapefile}' not found")
sys.exit(1)
# Create output directory if it doesn't exist
output_dir = Path(args.output_geojson).parent
output_dir.mkdir(parents=True, exist_ok=True)
# Run conversion
convert_shapefile_to_geojson(
args.input_shapefile,
args.output_geojson,
args.keep_original,
args.source_crs,
args.target_crs
)
if __name__ == "__main__":
import pandas as pd
main()

540
sumter-multi-modal-convert.py Normal file
View File

@@ -0,0 +1,540 @@
#!/usr/bin/env python3
"""
GeoJSON Multi Modal Golf Cart Path Comparison Script
Compares two GeoJSON files containing road data and identifies:
1. Roads in file1 that don't have corresponding coverage in file2 (removed roads)
2. Roads in file2 that don't have corresponding coverage in file1 (added roads)
Only reports differences that are significant (above minimum length threshold).
Optimized for performance with parallel processing and spatial indexing.
TODO:
- put properties properly on removed roads, so they're visible in JOSM
- handle polygons properly (on previous geojson step?) for circular roads
"""
import json
import argparse
from pathlib import Path
from typing import List, Dict, Any, Tuple
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString, Point, Polygon
from shapely.ops import unary_union
from shapely.strtree import STRtree
import pandas as pd
import warnings
import multiprocessing as mp
from functools import partial
import numpy as np
from concurrent.futures import ProcessPoolExecutor, as_completed
import gc
# Suppress warnings for cleaner output
warnings.filterwarnings('ignore')
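# The comparison below boils down to a buffered-union / difference test.
# A minimal sketch of the core idea with shapely primitives (assumes
# shapely 2.x; coordinates are illustrative degrees):
#   from shapely.geometry import LineString
#   a = LineString([(0.0, 0.0), (0.001, 0.0)])     # road in file 1
#   b = LineString([(0.0, 0.0), (0.0005, 0.0)])    # partial match in file 2
#   uncovered = a.difference(b.buffer(0.000137))   # ~50 ft buffer in degrees
#   uncovered.length                               # vs. the min-gap threshold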
class RoadComparator:
def __init__(self, tolerance_feet: float = 50.0, min_gap_length_feet: float = 100.0,
n_jobs: int = None, chunk_size: int = 1000):
"""
Initialize the road comparator.
Args:
tolerance_feet: Distance tolerance for considering roads as overlapping (default: 50 feet)
min_gap_length_feet: Minimum length of gap/extra to be considered significant (default: 100 feet)
n_jobs: Number of parallel processes to use (default: CPU count - 1)
chunk_size: Number of geometries to process per chunk (default: 1000)
"""
self.tolerance_feet = tolerance_feet
self.min_gap_length_feet = min_gap_length_feet
self.n_jobs = n_jobs or max(1, mp.cpu_count() - 1)
self.chunk_size = chunk_size
# Convert feet to degrees (approximate conversion for continental US)
# 1 degree latitude ≈ 364,000 feet
# 1 degree longitude ≈ 288,000 feet (at 40° latitude)
self.tolerance_deg = tolerance_feet / 364000.0
self.min_gap_length_deg = min_gap_length_feet / 364000.0
print(f"Using {self.n_jobs} parallel processes with chunk size {self.chunk_size}")
def load_geojson(self, filepath: str) -> gpd.GeoDataFrame:
"""Load and validate GeoJSON file with optimizations."""
try:
# Use pyogrio engine for faster loading of large files
gdf = gpd.read_file(filepath, engine='pyogrio')
# Filter only LineString, MultiLineString, and Polygon geometries
line_types = ['LineString', 'MultiLineString', 'Polygon']
gdf = gdf[gdf.geometry.type.isin(line_types)].copy()
if len(gdf) == 0:
raise ValueError(f"No line geometries found in {filepath}")
# Reset index for efficient processing
gdf = gdf.reset_index(drop=True)
# Ensure geometry is valid and fix simple issues
invalid_mask = ~gdf.geometry.is_valid
if invalid_mask.any():
print(f"Fixing {invalid_mask.sum()} invalid geometries...")
gdf.loc[invalid_mask, 'geometry'] = gdf.loc[invalid_mask, 'geometry'].buffer(0)
print(f"Loaded {len(gdf)} road features from {filepath}")
return gdf
except Exception as e:
raise Exception(f"Error loading {filepath}: {str(e)}")
def create_buffered_union_optimized(self, gdf: gpd.GeoDataFrame) -> Any:
"""Create a buffered union using chunked processing for memory efficiency."""
print("Creating optimized buffered union...")
# Process in chunks to manage memory
chunks = [gdf.iloc[i:i+self.chunk_size] for i in range(0, len(gdf), self.chunk_size)]
chunk_unions = []
# Use partial function for multiprocessing
buffer_func = partial(self._buffer_chunk, tolerance=self.tolerance_deg)
with ProcessPoolExecutor(max_workers=self.n_jobs) as executor:
# Submit all chunk processing jobs
future_to_chunk = {executor.submit(buffer_func, chunk): i
for i, chunk in enumerate(chunks)}
# Collect results as they complete
for future in as_completed(future_to_chunk):
chunk_idx = future_to_chunk[future]
try:
chunk_union = future.result()
if chunk_union and not chunk_union.is_empty:
chunk_unions.append(chunk_union)
print(f"Processed chunk {chunk_idx + 1}/{len(chunks)}")
except Exception as e:
print(f"Error processing chunk {chunk_idx}: {str(e)}")
# Union all chunk results
print("Combining chunk unions...")
if chunk_unions:
final_union = unary_union(chunk_unions)
# Force garbage collection
del chunk_unions
gc.collect()
return final_union
else:
raise Exception("No valid geometries to create union")
@staticmethod
def _buffer_chunk(chunk_gdf: gpd.GeoDataFrame, tolerance: float) -> Any:
"""Buffer geometries in a chunk and return their union."""
try:
# Buffer all geometries in the chunk
buffered = chunk_gdf.geometry.buffer(tolerance)
# Create union of buffered geometries
if len(buffered) == 1:
return buffered.iloc[0]
else:
return unary_union(buffered.tolist())
except Exception as e:
print(f"Error in chunk processing: {str(e)}")
return None
def create_spatial_index(self, gdf: gpd.GeoDataFrame) -> STRtree:
"""Create spatial index for fast intersection queries."""
print("Creating spatial index...")
# Create STRtree for fast spatial queries
geometries = gdf.geometry.tolist()
return STRtree(geometries)
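# Example use of the index (a sketch; with shapely 2.x, STRtree.query
# returns integer positions into the geometry list):
#   tree = self.create_spatial_index(gdf)
#   hits = tree.query(some_geom.buffer(self.tolerance_deg))
#   candidates = gdf.iloc[hits]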
def find_removed_segments_optimized(self, source_gdf: gpd.GeoDataFrame,
target_union: Any) -> List[Dict[str, Any]]:
"""
Find segments in source_gdf that are not covered by target_union (removed roads).
Optimized with parallel processing.
"""
print("Finding removed segments...")
# Split into chunks for parallel processing
chunks = [source_gdf.iloc[i:i+self.chunk_size]
for i in range(0, len(source_gdf), self.chunk_size)]
all_removed = []
# Use partial function for multiprocessing
process_func = partial(self._process_removed_chunk,
target_union=target_union,
min_length_deg=self.min_gap_length_deg)
with ProcessPoolExecutor(max_workers=self.n_jobs) as executor:
# Submit all chunk processing jobs
future_to_chunk = {executor.submit(process_func, chunk): i
for i, chunk in enumerate(chunks)}
# Collect results as they complete
for future in as_completed(future_to_chunk):
chunk_idx = future_to_chunk[future]
try:
chunk_removed = future.result()
all_removed.extend(chunk_removed)
print(f"Processed removed chunk {chunk_idx + 1}/{len(chunks)}")
except Exception as e:
print(f"Error processing removed chunk {chunk_idx}: {str(e)}")
return all_removed
@staticmethod
def _process_removed_chunk(chunk_gdf: gpd.GeoDataFrame, target_union: Any,
min_length_deg: float) -> List[Dict[str, Any]]:
"""Process a chunk of geometries to find removed segments."""
removed_segments = []
for idx, row in chunk_gdf.iterrows():
geom = row.geometry
# Handle MultiLineString by processing each component
if isinstance(geom, MultiLineString):
lines = list(geom.geoms)
else:
lines = [geom]  # LineString and Polygon geometries are processed whole
for line in lines:
try:
# Find parts of the line that don't intersect with target_union
uncovered = line.difference(target_union)
if uncovered.is_empty:
continue
# Handle different geometry types returned by difference
uncovered_lines = []
if hasattr(uncovered, 'geoms'):
for geom_part in uncovered.geoms:
if isinstance(geom_part, LineString):
uncovered_lines.append(geom_part)
elif isinstance(uncovered, LineString):
uncovered_lines.append(uncovered)
# Check each uncovered line segment
for uncovered_line in uncovered_lines:
if uncovered_line.length >= min_length_deg:
# Create properties dict with original metadata plus 'removed: true'
properties = dict(row.drop('geometry'))
properties['removed'] = True
removed_segments.append({
'geometry': uncovered_line,
**properties
})
except Exception as e:
continue # Skip problematic geometries
return removed_segments
def find_added_roads_optimized(self, source_gdf: gpd.GeoDataFrame,
target_union: Any) -> List[Dict[str, Any]]:
"""
Find entire roads in source_gdf that don't significantly overlap with target_union.
Optimized with parallel processing.
"""
print("Finding added roads...")
# Split into chunks for parallel processing
chunks = [source_gdf.iloc[i:i+self.chunk_size]
for i in range(0, len(source_gdf), self.chunk_size)]
all_added = []
# Use partial function for multiprocessing
process_func = partial(self._process_added_chunk,
target_union=target_union,
min_length_deg=self.min_gap_length_deg)
with ProcessPoolExecutor(max_workers=self.n_jobs) as executor:
# Submit all chunk processing jobs
future_to_chunk = {executor.submit(process_func, chunk): i
for i, chunk in enumerate(chunks)}
# Collect results as they complete
for future in as_completed(future_to_chunk):
chunk_idx = future_to_chunk[future]
try:
chunk_added = future.result()
all_added.extend(chunk_added)
print(f"Processed added chunk {chunk_idx + 1}/{len(chunks)}")
except Exception as e:
print(f"Error processing added chunk {chunk_idx}: {str(e)}")
return all_added
@staticmethod
def _process_added_chunk(chunk_gdf: gpd.GeoDataFrame, target_union: Any,
min_length_deg: float) -> List[Dict[str, Any]]:
"""Process a chunk of geometries to find added roads."""
added_roads = []
for idx, row in chunk_gdf.iterrows():
geom = row.geometry
try:
# Check what portion of the road is not covered
uncovered = geom.difference(target_union)
if not uncovered.is_empty:
# Calculate what percentage of the original road is uncovered
uncovered_length = 0
if hasattr(uncovered, 'geoms'):
for geom_part in uncovered.geoms:
if isinstance(geom_part, LineString):
uncovered_length += geom_part.length
elif isinstance(uncovered, LineString):
uncovered_length = uncovered.length
original_length = geom.length
uncovered_ratio = uncovered_length / original_length if original_length > 0 else 0
# Include the entire road if more than 10% of it is uncovered
# (the minimum-length check is currently disabled:
#   uncovered_length >= min_length_deg)
if uncovered_ratio > 0.1:
    # Keep the original metadata for the skip check below
    original_properties = dict(row.drop('geometry'))
    # For Sumter County roads: skip cart paths that are parts of roads
    if original_properties.get('Part_of_Ro') == "Yes":
        continue
    # Tag the remainder as a golf-cart path
    properties = {
        'bicycle': 'yes',
        'foot': 'yes',
        'golf': 'cartpath',
        'golf_cart': 'yes',
        'highway': 'path',
        'motor_vehicle': 'no',
        'segregated': 'no',
        'surface': 'asphalt',
    }
    added_roads.append({
        'geometry': geom,
        **properties
    })
except Exception as e:
print(e)
continue # Skip problematic geometries
return added_roads
def compare_roads(self, file1_path: str, file2_path: str) -> Tuple[List[Dict], List[Dict]]:
"""
Compare two GeoJSON files and find significant differences.
Optimized version with parallel processing.
Returns:
Tuple of (removed_roads, added_roads)
"""
print(f"Comparing {file1_path} and {file2_path}")
print(f"Tolerance: {self.tolerance_feet} feet")
print(f"Minimum significant length: {self.min_gap_length_feet} feet")
print(f"Parallel processing: {self.n_jobs} workers")
print("-" * 50)
# Load both files
gdf1 = self.load_geojson(file1_path)
gdf2 = self.load_geojson(file2_path)
# Ensure both are in the same CRS
if gdf1.crs != gdf2.crs:
print(f"Warning: CRS mismatch. Converting {file2_path} to match {file1_path}")
gdf2 = gdf2.to_crs(gdf1.crs)
print("Creating optimized spatial unions...")
# Create buffered unions using optimized method
union1 = self.create_buffered_union_optimized(gdf1)
union2 = self.create_buffered_union_optimized(gdf2)
print("Finding removed and added roads with parallel processing...")
# Find roads using optimized parallel methods
removed_roads = self.find_removed_segments_optimized(gdf1, union2)
added_roads = self.find_added_roads_optimized(gdf2, union1)
# Clean up memory
del gdf1, gdf2, union1, union2
gc.collect()
return removed_roads, added_roads
def save_results(self, removed: List[Dict], added: List[Dict], output_path: str):
"""Save results to GeoJSON file."""
all_results = removed + added
if not all_results:
print("No significant differences found!")
return
# Create GeoDataFrame efficiently
print("Saving results...")
results_gdf = gpd.GeoDataFrame(all_results)
# Save to file with optimization
results_gdf.to_file(output_path, driver='GeoJSON', engine='pyogrio')
print(f"Results saved to: {output_path}")
def print_summary(self, removed: List[Dict], added: List[Dict], file1_name: str, file2_name: str):
"""Print a summary of the comparison results."""
print("\n" + "="*60)
print("COMPARISON SUMMARY")
print("="*60)
print(f"\nFile 1: {file1_name}")
print(f"File 2: {file2_name}")
print(f"Tolerance: {self.tolerance_feet} feet")
print(f"Minimum significant length: {self.min_gap_length_feet} feet")
if removed:
print(f"\n🔴 REMOVED ROADS ({len(removed)} segments):")
print("These road segments exist in File 1 but are missing or incomplete in File 2:")
# Calculate total length of removed segments
total_removed_length = 0
removed_by_road = {}
for segment in removed:
geom = segment['geometry']
length_feet = geom.length * 364000.0 # Convert to feet
total_removed_length += length_feet
# Get road name
road_name = "Unknown"
name_fields = ['name', 'NAME', 'road_name', 'street_name', 'FULLNAME']
for field in name_fields:
if field in segment and pd.notna(segment[field]):
road_name = str(segment[field])
break
if road_name not in removed_by_road:
removed_by_road[road_name] = []
removed_by_road[road_name].append(length_feet)
print(f"Total removed length: {total_removed_length:,.1f} feet ({total_removed_length/5280:.2f} miles)")
for road, lengths in sorted(removed_by_road.items()):
road_total = sum(lengths)
print(f"{road}: {len(lengths)} segment(s), {road_total:,.1f} feet")
if added:
print(f"\n🔵 ADDED ROADS ({len(added)} roads):")
print("These roads exist in File 2 but are missing or incomplete in File 1:")
# Calculate total length of added roads
total_added_length = 0
added_by_road = {}
for road in added:
geom = road['geometry']
length_feet = geom.length * 364000.0 # Convert to feet
total_added_length += length_feet
# Get road name
road_name = "Unknown"
name_fields = ['name', 'NAME', 'road_name', 'street_name', 'FULLNAME']
for field in name_fields:
if field in road and pd.notna(road[field]):
road_name = str(road[field])
break
if road_name not in added_by_road:
added_by_road[road_name] = 0
added_by_road[road_name] += length_feet
print(f"Total added length: {total_added_length:,.1f} feet ({total_added_length/5280:.2f} miles)")
for road, length in sorted(added_by_road.items()):
print(f"{road}: {length:,.1f} feet")
if not removed and not added:
print("\n✅ No significant differences found!")
print("The road networks have good coverage overlap within the specified tolerance.")
def main():
parser = argparse.ArgumentParser(
description="Compare two GeoJSON files containing roads and find significant gaps or extras (Optimized)",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
python sumter-multi-modal-convert.py osm-multi-modal.geojson county-multi-modal.geojson
python sumter-multi-modal-convert.py osm-multi-modal.geojson county-multi-modal.geojson --tolerance 100 --min-length 200
python sumter-multi-modal-convert.py osm-multi-modal.geojson county-multi-modal.geojson --output differences.geojson
python sumter-multi-modal-convert.py osm-multi-modal.geojson county-multi-modal.geojson --jobs 8 --chunk-size 2000
"""
)
parser.add_argument('file1', help='First GeoJSON file')
parser.add_argument('file2', help='Second GeoJSON file')
parser.add_argument('--tolerance', '-t', type=float, default=50.0,
help='Distance tolerance in feet for considering roads as overlapping (default: 50)')
parser.add_argument('--min-length', '-m', type=float, default=100.0,
help='Minimum length in feet for gaps/extras to be considered significant (default: 100)')
parser.add_argument('--output', '-o', help='Output GeoJSON file for results (optional)')
parser.add_argument('--jobs', '-j', type=int, default=None,
help='Number of parallel processes (default: CPU count - 1)')
parser.add_argument('--chunk-size', '-c', type=int, default=1000,
help='Number of geometries to process per chunk (default: 1000)')
args = parser.parse_args()
# Validate input files
if not Path(args.file1).exists():
print(f"Error: File {args.file1} does not exist")
return 1
if not Path(args.file2).exists():
print(f"Error: File {args.file2} does not exist")
return 1
try:
# Create comparator and run comparison
comparator = RoadComparator(
tolerance_feet=args.tolerance,
min_gap_length_feet=args.min_length,
n_jobs=args.jobs,
chunk_size=args.chunk_size
)
removed, added = comparator.compare_roads(args.file1, args.file2)
# Print summary
comparator.print_summary(removed, added, args.file1, args.file2)
# Save results if output file specified
if args.output:
comparator.save_results(removed, added, args.output)
elif removed or added:
# Auto-generate output filename if differences found
output_file = f"multi_modal_differences_{Path(args.file1).stem}_vs_{Path(args.file2).stem}.geojson"
comparator.save_results(removed, added, output_file)
return 0
except Exception as e:
print(f"Error: {str(e)}")
return 1
if __name__ == "__main__":
exit(main())

683
web/app.js Normal file
View File

@@ -0,0 +1,683 @@
// Global state
let map;
let osmLayer;
let diffLayer;
let countyLayer;
let osmData = null;
let diffData = null;
let countyData = null;
let selectedFeature = null;
let selectedLayer = null;
let acceptedFeatures = new Set();
let rejectedFeatures = new Set();
let featurePopup = null;
let layerOrder = ['diff', 'osm', 'county']; // Default layer order (top to bottom)
// Initialize map
function initMap() {
map = L.map('map').setView([28.7, -81.7], 12);
L.tileLayer('https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png', {
attribution: '© OpenStreetMap contributors © CARTO',
subdomains: 'abcd',
maxZoom: 20
}).addTo(map);
// Create custom panes for layer ordering
map.createPane('osmPane');
map.createPane('diffPane');
map.createPane('countyPane');
// Set initial z-indices for panes
map.getPane('osmPane').style.zIndex = 400;
map.getPane('diffPane').style.zIndex = 401;
map.getPane('countyPane').style.zIndex = 402;
}
// Calculate bounds for all loaded layers
function calculateBounds() {
const bounds = L.latLngBounds([]);
let hasData = false;
if (osmData && osmData.features.length > 0) {
L.geoJSON(osmData).eachLayer(layer => {
if (layer.getBounds) {
bounds.extend(layer.getBounds());
} else if (layer.getLatLng) {
bounds.extend(layer.getLatLng());
}
});
hasData = true;
}
if (diffData && diffData.features.length > 0) {
L.geoJSON(diffData).eachLayer(layer => {
if (layer.getBounds) {
bounds.extend(layer.getBounds());
} else if (layer.getLatLng) {
bounds.extend(layer.getLatLng());
}
});
hasData = true;
}
if (countyData && countyData.features.length > 0) {
L.geoJSON(countyData).eachLayer(layer => {
if (layer.getBounds) {
bounds.extend(layer.getBounds());
} else if (layer.getLatLng) {
bounds.extend(layer.getLatLng());
}
});
hasData = true;
}
if (hasData && bounds.isValid()) {
map.fitBounds(bounds, { padding: [50, 50] });
}
}
// Style functions
function osmStyle(feature) {
return {
color: '#4a4a4a',
weight: 3,
opacity: 0.7
};
}
function diffStyle(feature) {
// Check if feature is accepted or rejected
if (acceptedFeatures.has(feature)) {
return {
color: '#007bff',
weight: 3,
opacity: 0.8
};
}
if (rejectedFeatures.has(feature)) {
return {
color: '#ff8c00',
weight: 3,
opacity: 0.8
};
}
const isRemoved = feature.properties && (feature.properties.removed === true || feature.properties.removed === 'True');
return {
color: isRemoved ? '#ff0000' : '#00ff00',
weight: 3,
opacity: 0.8
};
}
function countyStyle(feature) {
return {
color: '#ff00ff',
weight: 3,
opacity: 0.8
};
}
// Filter function for OSM features
function shouldShowOsmFeature(feature) {
const props = feature.properties || {};
const isService = props.highway === 'service';
const hideService = document.getElementById('hideService').checked;
if (isService && hideService) return false;
return true;
}
// Create layer for OSM data
function createOsmLayer() {
if (osmLayer) {
map.removeLayer(osmLayer);
}
if (!osmData) return;
osmLayer = L.geoJSON(osmData, {
style: osmStyle,
filter: shouldShowOsmFeature,
pane: 'osmPane',
onEachFeature: function(feature, layer) {
layer.on('click', function(e) {
L.DomEvent.stopPropagation(e);
selectFeature(feature, layer, e, 'osm');
});
layer.on('mouseover', function(e) {
if (selectedLayer !== layer) {
layer.setStyle({
weight: 5,
opacity: 1
});
}
});
layer.on('mouseout', function(e) {
if (selectedLayer !== layer) {
layer.setStyle(osmStyle(feature));
}
});
}
}).addTo(map);
updateLayerZIndex();
}
// Filter function for diff features
function shouldShowFeature(feature) {
const props = feature.properties || {};
const isRemoved = props.removed === true || props.removed === 'True';
const isService = props.highway === 'service';
const showAdded = document.getElementById('showAdded').checked;
const showRemoved = document.getElementById('showRemoved').checked;
const hideService = document.getElementById('hideService').checked;
// Check removed/added filter
if (isRemoved && !showRemoved) return false;
if (!isRemoved && !showAdded) return false;
// Check service filter
if (isService && hideService) return false;
return true;
}
// Create layer for diff data with click handlers
function createDiffLayer() {
if (diffLayer) {
map.removeLayer(diffLayer);
}
if (!diffData) return;
diffLayer = L.geoJSON(diffData, {
style: diffStyle,
filter: shouldShowFeature,
pane: 'diffPane',
onEachFeature: function(feature, layer) {
layer.on('click', function(e) {
L.DomEvent.stopPropagation(e);
selectFeature(feature, layer, e, 'diff');
});
layer.on('mouseover', function(e) {
if (selectedLayer !== layer) {
layer.setStyle({
weight: 5,
opacity: 1
});
}
});
layer.on('mouseout', function(e) {
if (selectedLayer !== layer) {
layer.setStyle(diffStyle(feature));
}
});
}
}).addTo(map);
updateLayerZIndex();
}
// Filter function for county features
function shouldShowCountyFeature(feature) {
const props = feature.properties || {};
const isService = props.highway === 'service';
const hideService = document.getElementById('hideService').checked;
if (isService && hideService) return false;
return true;
}
// Create layer for county data
function createCountyLayer() {
if (countyLayer) {
map.removeLayer(countyLayer);
}
if (!countyData) return;
countyLayer = L.geoJSON(countyData, {
style: countyStyle,
filter: shouldShowCountyFeature,
pane: 'countyPane',
onEachFeature: function(feature, layer) {
layer.on('click', function(e) {
L.DomEvent.stopPropagation(e);
selectFeature(feature, layer, e, 'county');
});
layer.on('mouseover', function(e) {
if (selectedLayer !== layer) {
layer.setStyle({
weight: 5,
opacity: 1
});
}
});
layer.on('mouseout', function(e) {
if (selectedLayer !== layer) {
layer.setStyle(countyStyle(feature));
}
});
}
});
// County layer is hidden by default
if (document.getElementById('countyToggle').checked) {
countyLayer.addTo(map);
}
updateLayerZIndex();
}
// Select a feature from any layer
function selectFeature(feature, layer, e, layerType = 'diff') {
// Deselect previous feature
if (selectedLayer) {
// Get the appropriate style function based on previous layer type
const styleFunc = selectedLayer._layerType === 'diff' ? diffStyle :
selectedLayer._layerType === 'osm' ? osmStyle : countyStyle;
selectedLayer.setStyle(styleFunc(selectedLayer.feature));
}
selectedFeature = feature;
selectedLayer = layer;
selectedLayer._layerType = layerType; // Store layer type for later
layer.setStyle({
weight: 6,
opacity: 1,
color: '#ffc107'
});
// Create popup near the clicked location
const props = feature.properties || {};
const isRemoved = props.removed === true || props.removed === 'True';
const isAccepted = acceptedFeatures.has(feature);
const isRejected = rejectedFeatures.has(feature);
let html = '<div style="font-size: 12px; max-height: 400px; overflow-y: auto;">';
// Show layer type
html += `<div style="margin-bottom: 8px;"><strong>Layer:</strong> ${layerType.toUpperCase()}</div>`;
// Only show status for diff layer
if (layerType === 'diff') {
html += `<div style="margin-bottom: 8px;"><strong>Status:</strong> ${isRemoved ? 'Removed' : 'Added/Modified'}</div>`;
}
// Display all non-null properties with custom ordering
const displayProps = Object.entries(props)
.filter(([key, value]) => value !== null && value !== undefined && key !== 'removed')
.sort(([a], [b]) => {
// Priority order: name, highway, then alphabetical
const priorityOrder = { 'name': 0, 'highway': 1 };
const aPriority = priorityOrder[a] ?? 999;
const bPriority = priorityOrder[b] ?? 999;
if (aPriority !== bPriority) {
return aPriority - bPriority;
}
return a.localeCompare(b);
});
if (displayProps.length > 0) {
html += '<div style="font-size: 11px;">';
for (const [key, value] of displayProps) {
html += `<div style="margin: 2px 0;"><strong>${key}:</strong> ${value}</div>`;
}
html += '</div>';
}
// Only show accept/reject for diff layer
if (layerType === 'diff') {
if (isAccepted) {
html += '<div style="margin-top: 8px; color: #007bff; font-weight: bold;">✓ Accepted</div>';
} else if (isRejected) {
html += '<div style="margin-top: 8px; color: #4a4a4a; font-weight: bold;">✗ Rejected</div>';
}
html += '<div style="margin-top: 10px; display: flex; gap: 5px;">';
html += '<button onclick="acceptFeature()" style="flex: 1; padding: 5px; background: #007bff; color: white; border: none; border-radius: 3px; cursor: pointer;">Accept</button>';
html += '<button onclick="rejectFeature()" style="flex: 1; padding: 5px; background: #6c757d; color: white; border: none; border-radius: 3px; cursor: pointer;">Reject</button>';
html += '</div>';
}
html += '</div>';
// Remove old popup if exists
if (featurePopup) {
map.closePopup(featurePopup);
}
// Create popup at click location
featurePopup = L.popup({
maxWidth: 300,
closeButton: true,
autoClose: false,
closeOnClick: false
})
.setLatLng(e.latlng)
.setContent(html)
.openOn(map);
// Handle popup close
featurePopup.on('remove', function() {
if (selectedLayer) {
    // Restore the style matching the layer the feature came from
    const styleFunc = selectedLayer._layerType === 'osm' ? osmStyle :
                      selectedLayer._layerType === 'county' ? countyStyle : diffStyle;
    selectedLayer.setStyle(styleFunc(selectedLayer.feature));
    selectedLayer = null;
    selectedFeature = null;
}
});
}
// Accept a feature
function acceptFeature() {
if (!selectedFeature) return;
// Remove from rejected if present
rejectedFeatures.delete(selectedFeature);
// Add to accepted
acceptedFeatures.add(selectedFeature);
// Update layer style
if (selectedLayer) {
selectedLayer.setStyle(diffStyle(selectedFeature));
}
// Close popup
if (featurePopup) {
map.closePopup(featurePopup);
}
// Enable save button
updateSaveButton();
showStatus(`${acceptedFeatures.size} accepted, ${rejectedFeatures.size} rejected`, 'success');
}
// Reject a feature
function rejectFeature() {
if (!selectedFeature) return;
// Remove from accepted if present
acceptedFeatures.delete(selectedFeature);
// Add to rejected
rejectedFeatures.add(selectedFeature);
// Update layer style
if (selectedLayer) {
selectedLayer.setStyle(diffStyle(selectedFeature));
}
// Close popup
if (featurePopup) {
map.closePopup(featurePopup);
}
// Enable save button
updateSaveButton();
showStatus(`${acceptedFeatures.size} accepted, ${rejectedFeatures.size} rejected`, 'success');
}
// Expose functions globally for onclick handlers
window.acceptFeature = acceptFeature;
window.rejectFeature = rejectFeature;
// Update save button state
function updateSaveButton() {
document.getElementById('saveButton').disabled =
acceptedFeatures.size === 0 && rejectedFeatures.size === 0;
}
// Load file from input
function loadFile(input) {
return new Promise((resolve, reject) => {
const file = input.files[0];
if (!file) {
resolve(null);
return;
}
const reader = new FileReader();
reader.onload = (e) => {
try {
const data = JSON.parse(e.target.result);
resolve(data);
} catch (error) {
reject(new Error(`Failed to parse ${file.name}: ${error.message}`));
}
};
reader.onerror = () => reject(new Error(`Failed to read ${file.name}`));
reader.readAsText(file);
});
}
// Load all files
async function loadFiles() {
try {
showStatus('Loading files...', 'success');
const osmInput = document.getElementById('osmFile');
const diffInput = document.getElementById('diffFile');
const countyInput = document.getElementById('countyFile');
// Load files
osmData = await loadFile(osmInput);
diffData = await loadFile(diffInput);
countyData = await loadFile(countyInput);
if (!osmData && !diffData && !countyData) {
showStatus('Please select at least one file', 'error');
return;
}
// Create layers
createOsmLayer();
createDiffLayer();
createCountyLayer();
// Fit bounds to all loaded layers
calculateBounds();
showStatus('Files loaded successfully!', 'success');
// Enable save button if we have diff data
document.getElementById('saveButton').disabled = !diffData;
} catch (error) {
showStatus(error.message, 'error');
console.error(error);
}
}
// Save accepted and rejected items to original diff file
async function saveAcceptedItems() {
if (!diffData || (acceptedFeatures.size === 0 && rejectedFeatures.size === 0)) {
showStatus('No features to save', 'error');
return;
}
try {
// Add accepted=true or accepted=false property to features
diffData.features.forEach(feature => {
if (acceptedFeatures.has(feature)) {
feature.properties.accepted = true;
} else if (rejectedFeatures.has(feature)) {
feature.properties.accepted = false;
}
});
// Create download
const dataStr = JSON.stringify(diffData, null, 2);
const dataBlob = new Blob([dataStr], { type: 'application/json' });
const url = URL.createObjectURL(dataBlob);
const link = document.createElement('a');
link.href = url;
link.download = 'diff-updated.geojson';
document.body.appendChild(link);
link.click();
document.body.removeChild(link);
URL.revokeObjectURL(url);
showStatus(`Saved ${acceptedFeatures.size} accepted, ${rejectedFeatures.size} rejected`, 'success');
} catch (error) {
showStatus(`Save failed: ${error.message}`, 'error');
console.error(error);
}
}
// Show status message
function showStatus(message, type) {
const status = document.getElementById('status');
status.textContent = message;
status.className = `status ${type}`;
setTimeout(() => {
status.classList.add('hidden');
}, 3000);
}
// Update pane z-index based on order
function updateLayerZIndex() {
const panes = {
'osm': 'osmPane',
'diff': 'diffPane',
'county': 'countyPane'
};
// Reverse index so first item in list is on top
layerOrder.forEach((layerName, index) => {
const paneName = panes[layerName];
const pane = map.getPane(paneName);
if (pane) {
pane.style.zIndex = 400 + (layerOrder.length - 1 - index);
}
});
}
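// Example: with layerOrder = ['diff', 'osm', 'county'] the panes get
// z-indices 402, 401 and 400 respectively, so the diff layer renders on top.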
// Toggle layer visibility
function toggleLayer(layerId, layer) {
const checkbox = document.getElementById(layerId);
if (checkbox.checked && layer) {
if (!map.hasLayer(layer)) {
map.addLayer(layer);
updateLayerZIndex();
}
} else if (layer) {
if (map.hasLayer(layer)) {
map.removeLayer(layer);
}
}
}
// Event listeners
document.addEventListener('DOMContentLoaded', function() {
initMap();
// Layer toggles
document.getElementById('osmToggle').addEventListener('change', function() {
toggleLayer('osmToggle', osmLayer);
});
document.getElementById('diffToggle').addEventListener('change', function() {
toggleLayer('diffToggle', diffLayer);
});
document.getElementById('countyToggle').addEventListener('change', function() {
toggleLayer('countyToggle', countyLayer);
});
// Diff filter toggles
document.getElementById('showAdded').addEventListener('change', function() {
createDiffLayer();
});
document.getElementById('showRemoved').addEventListener('change', function() {
createDiffLayer();
});
document.getElementById('hideService').addEventListener('change', function() {
createDiffLayer();
createOsmLayer();
createCountyLayer();
});
// Load button
document.getElementById('loadButton').addEventListener('click', loadFiles);
// Save button
document.getElementById('saveButton').addEventListener('click', saveAcceptedItems);
// Drag and drop for layer reordering
const layerList = document.getElementById('layerList');
const layerItems = layerList.querySelectorAll('.layer-item');
let draggedElement = null;
layerItems.forEach(item => {
item.addEventListener('dragstart', function(e) {
draggedElement = this;
this.classList.add('dragging');
e.dataTransfer.effectAllowed = 'move';
});
item.addEventListener('dragend', function(e) {
this.classList.remove('dragging');
draggedElement = null;
});
item.addEventListener('dragover', function(e) {
e.preventDefault();
e.dataTransfer.dropEffect = 'move';
if (this === draggedElement) return;
const afterElement = getDragAfterElement(layerList, e.clientY);
if (afterElement == null) {
layerList.appendChild(draggedElement);
} else {
layerList.insertBefore(draggedElement, afterElement);
}
});
item.addEventListener('drop', function(e) {
e.preventDefault();
// Update layer order based on new DOM order
layerOrder = Array.from(layerList.querySelectorAll('.layer-item'))
.map(item => item.dataset.layer);
updateLayerZIndex();
});
});
function getDragAfterElement(container, y) {
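// Return the first non-dragged item whose vertical midpoint is below the
// cursor; the dragged element is inserted before it (null means append).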
const draggableElements = [...container.querySelectorAll('.layer-item:not(.dragging)')];
return draggableElements.reduce((closest, child) => {
const box = child.getBoundingClientRect();
const offset = y - box.top - box.height / 2;
if (offset < 0 && offset > closest.offset) {
return { offset: offset, element: child };
} else {
return closest;
}
}, { offset: Number.NEGATIVE_INFINITY }).element;
}
});

207
web/index.html Normal file
View File

@@ -0,0 +1,207 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>GeoJSON Map Viewer</title>
<link rel="stylesheet" href="https://unpkg.com/leaflet@1.9.4/dist/leaflet.css" />
<style>
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
height: 100vh;
display: flex;
flex-direction: column;
}
#map {
flex: 1;
width: 100%;
}
.controls {
position: absolute;
top: 10px;
right: 10px;
z-index: 1000;
background: white;
padding: 15px;
border-radius: 8px;
box-shadow: 0 2px 10px rgba(0,0,0,0.2);
min-width: 200px;
}
.controls h3 {
margin: 0 0 10px 0;
font-size: 14px;
color: #333;
}
.controls label {
display: flex;
align-items: center;
margin: 8px 0;
cursor: pointer;
font-size: 13px;
}
.layer-item {
display: flex;
align-items: center;
margin: 8px 0;
padding: 5px;
background: #f8f9fa;
border-radius: 4px;
cursor: move;
font-size: 13px;
}
.layer-item.dragging {
opacity: 0.5;
}
.layer-item input[type="checkbox"] {
margin-right: 8px;
}
.controls input[type="checkbox"] {
margin-right: 8px;
}
.controls button {
width: 100%;
padding: 10px;
margin-top: 10px;
background: #007bff;
color: white;
border: none;
border-radius: 4px;
cursor: pointer;
font-size: 13px;
font-weight: 500;
}
.controls button:hover {
background: #0056b3;
}
.controls button:disabled {
background: #ccc;
cursor: not-allowed;
}
.status {
margin-top: 10px;
padding: 8px;
border-radius: 4px;
font-size: 12px;
text-align: center;
}
.status.success {
background: #d4edda;
color: #155724;
}
.status.error {
background: #f8d7da;
color: #721c24;
}
.status.hidden {
display: none;
}
.file-input-group {
margin-bottom: 15px;
padding-bottom: 15px;
border-bottom: 1px solid #eee;
}
.file-input-group:last-of-type {
border-bottom: none;
}
.file-input-group label {
display: block;
margin-bottom: 5px;
font-weight: 500;
}
.file-input-group input[type="file"] {
width: 100%;
font-size: 11px;
}
.load-button {
background: #28a745 !important;
}
.load-button:hover {
background: #218838 !important;
}
</style>
</head>
<body>
<div id="map"></div>
<div class="controls">
<h3>Layer Controls (top to bottom)</h3>
<div id="layerList">
<div class="layer-item" draggable="true" data-layer="diff">
<input type="checkbox" id="diffToggle" checked>
<span>Diff Layer</span>
</div>
<div class="layer-item" draggable="true" data-layer="osm">
<input type="checkbox" id="osmToggle" checked>
<span>OSM Roads (Gray)</span>
</div>
<div class="layer-item" draggable="true" data-layer="county">
<input type="checkbox" id="countyToggle">
<span>County Layer (Purple)</span>
</div>
</div>
<h3 style="margin-top: 15px;">Diff Filters</h3>
<label>
<input type="checkbox" id="showAdded" checked>
Show Added (Green)
</label>
<label>
<input type="checkbox" id="showRemoved" checked>
Show Removed (Red)
</label>
<label>
<input type="checkbox" id="hideService">
Hide highway=service
</label>
<h3 style="margin-top: 15px;">Load Files</h3>
<div class="file-input-group">
<label for="diffFile">Diff File:</label>
<input type="file" id="diffFile" accept=".geojson,.json">
</div>
<div class="file-input-group">
<label for="osmFile">OSM File:</label>
<input type="file" id="osmFile" accept=".geojson,.json">
</div>
<div class="file-input-group">
<label for="countyFile">County File:</label>
<input type="file" id="countyFile" accept=".geojson,.json">
</div>
<button id="loadButton" class="load-button">Load Files</button>
<button id="saveButton" disabled>Save Accepted Items</button>
<div id="status" class="status hidden"></div>
</div>
<script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
<script src="app.js"></script>
</body>
</html>

215
web/server.py Normal file
View File

@@ -0,0 +1,215 @@
#!/usr/bin/env python3
"""
Flask web server for The Villages Import Tools
"""
from flask import Flask, render_template, jsonify, request, send_from_directory
import subprocess
import os
import threading
import json
from datetime import datetime
app = Flask(__name__, static_folder='static', template_folder='templates')
# Store running processes
running_processes = {}
process_logs = {}
@app.route('/')
def index():
"""Main index page with script execution buttons"""
# Get available scripts and organize them
script_map = get_script_map()
# Organize scripts by category
scripts_by_category = {
'Download County Data': ['download-county-addresses', 'download-county-roads', 'download-county-paths'],
'Download OSM Data': ['download-osm-roads', 'download-osm-paths'],
'Convert Data': ['convert-roads', 'convert-paths'],
'Diff Data': ['diff-roads', 'diff-paths', 'diff-addresses'],
'Utilities': ['ls', 'make-new-latest']
}
return render_template('index.html',
script_map=script_map,
scripts_by_category=scripts_by_category)
@app.route('/map')
def map_viewer():
"""Map viewer page"""
return render_template('map.html')
def get_script_map():
"""Get the map of available scripts and their commands"""
return {
'ls': 'ls -alr /data',
'make-new-latest': 'cd /data && NEWDIR=$(date +%y%m%d) && mkdir -p $NEWDIR/lake $NEWDIR/sumter && ln -sfn $NEWDIR latest',
# todo: make a clean-old-data script
'download-county-addresses': {
# deliver files with standardized names
'sumter': 'mkdir -p /data/latest/sumter && wget https://www.arcgis.com/sharing/rest/content/items/c75c5aac13a648968c5596b0665be28b/data -O /data/latest/sumter/addresses.shp.zip',
'lake': 'mkdir -p /data/latest/lake && wget [LAKE_URL_HERE] -O /data/latest/lake/addresses.shp.zip'
},
'download-county-roads': {
# deliver files with standardized names
'sumter': 'mkdir -p /data/latest/sumter && wget https://www.arcgis.com/sharing/rest/content/items/9177e17c72d3433aa79630c7eda84add/data -O /data/latest/sumter/roads.shp.zip',
'lake': 'mkdir -p /data/latest/lake && wget [LAKE_URL_HERE] -O /data/latest/lake/roads.shp.zip'
},
'download-county-paths': {
# deliver files with standardized names
#'sumter': ['/data/latest/sumter/paths.shp.zip'],
#'lake': ['/data/latest/lake/paths.shp.zip']
},
        # todo: integrate OSM downloading and shapefile conversion into diff-roads, as is already done for addresses
'download-osm-roads': {
'lake': ['python', 'download-overpass.py', '--type', 'highways', 'Lake County', 'Florida', '/data/latest/lake/osm-roads.geojson'],
'sumter': ['python', 'download-overpass.py', '--type', 'highways', 'Sumter County', 'Florida', '/data/latest/sumter/osm-roads.geojson']
},
'download-osm-paths': {
# todo: no lake county paths
#'lake': ['python', 'download-overpass.py', '--type', 'highways', 'Lake County', 'Florida', '/data/latest/lake/osm-roads.geojson'],
'sumter': ['python', 'download-overpass.py', '--type', 'paths', 'Sumter County', 'Florida', '/data/latest/sumter/osm-paths.geojson']
},
# todo
'convert-roads': {
'sumter': ['python', 'shp-to-geojson.py', '/data/latest/sumter/roads.shp.zip', '/data/latest/sumter/county-roads.geojson'],
'lake': ['python', 'shp-to-geojson.py', '/data/latest/lake/roads.shp.zip', '/data/latest/lake/county-roads.geojson']
},
'convert-paths': {
#todo: delete sumter-multi-modal-convert.py ?
'sumter': ['python', 'shp-to-geojson.py', '/data/latest/sumter/paths.shp.zip', '/data/latest/sumter/county-paths.geojson'],
},
'diff-roads': {
'lake': ['python', 'diff-highways.py', '/data/latest/lake/osm-roads.geojson', '/data/latest/lake/county-roads.geojson', '--output', '/data/latest/lake/diff-roads.geojson'],
'sumter': ['python', 'diff-highways.py', '/data/latest/sumter/osm-roads.geojson', '/data/latest/sumter/county-roads.geojson', '--output', '/data/latest/sumter/diff-roads.geojson']
},
'diff-paths': {
#todo: no lake county data for paths
#'lake': ['python', 'diff-highways.py', '/data/latest/lake/osm-paths.geojson', '/data/latest/lake/county-paths.geojson', '--output', '/data/latest/lake/diff-paths.geojson'],
'sumter': ['python', 'diff-highways.py', '/data/latest/sumter/osm-paths.geojson', '/data/latest/sumter/county-paths.geojson', '--output', '/data/latest/sumter/diff-paths.geojson'],
},
        # addresses need no OSM download or shapefile conversion, just the county download
'diff-addresses': {
#todo: delete sumter-address-convert.py ?
'lake': ['python', 'compare-addresses.py', 'Lake', 'Florida', '--local-zip', '/data/latest/lake/addresses.shp.zip', '--output-dir', '/data/latest/lake', '--cache-dir', '/data/osm_cache'],
'sumter': ['python', 'compare-addresses.py', 'Sumter', 'Florida', '--local-zip', '/data/latest/sumter/addresses.shp.zip', '--output-dir', '/data/latest/sumter', '--cache-dir', '/data/osm_cache']
},
}
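# Each entry above is either a single shell string (run via bash -c) or a dict
# mapping county -> command, where the command is itself a shell string or an
# argv list; run_script() below handles all three shapes.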
@app.route('/api/run-script', methods=['POST'])
def run_script():
"""Execute a script in the background"""
data = request.json
script_name = data.get('script')
county = data.get('county', '')
if not script_name:
return jsonify({'error': 'No script specified'}), 400
script_map = get_script_map()
if script_name not in script_map:
return jsonify({'error': 'Unknown script'}), 400
script_config = script_map[script_name]
# Handle both string commands and dict of county-specific commands
if isinstance(script_config, str):
# Simple string command (like 'ls')
cmd = ['bash', '-c', script_config]
elif isinstance(script_config, dict):
# County-specific commands
if not county:
return jsonify({'error': 'County required for this script'}), 400
if county not in script_config:
return jsonify({'error': f'County {county} not supported for {script_name}'}), 400
cmd_config = script_config[county]
if isinstance(cmd_config, str):
cmd = ['bash', '-c', cmd_config]
else:
cmd = cmd_config
else:
return jsonify({'error': 'Invalid script configuration'}), 400
# Generate a unique job ID
job_id = f"{script_name}_{county}_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
    # Initialize the log buffer before the thread starts so an immediate
    # status poll sees an empty log rather than a missing job
    process_logs[job_id] = []
    # Start process in background
    def run_command():
        try:
            process = subprocess.Popen(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                text=True,
                cwd=os.path.dirname(os.path.dirname(__file__))
            )
            running_processes[job_id] = process
# Stream output
for line in process.stdout:
process_logs[job_id].append(line)
process.wait()
process_logs[job_id].append(f"\n[Process completed with exit code {process.returncode}]")
except Exception as e:
process_logs[job_id].append(f"\n[ERROR: {str(e)}]")
finally:
if job_id in running_processes:
del running_processes[job_id]
thread = threading.Thread(target=run_command)
thread.daemon = True
thread.start()
return jsonify({'job_id': job_id, 'status': 'started'})
@app.route('/api/job-status/<job_id>')
def job_status(job_id):
"""Get status and logs for a job"""
is_running = job_id in running_processes
logs = process_logs.get(job_id, [])
return jsonify({
'job_id': job_id,
'running': is_running,
'logs': logs
})
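# Example client workflow (illustrative values; the real job id is generated
# from the script name, county, and timestamp at submission time):
#   curl -X POST localhost:5000/api/run-script \
#        -H 'Content-Type: application/json' \
#        -d '{"script": "diff-roads", "county": "sumter"}'
#   => {"job_id": "diff-roads_sumter_20250101_120000", "status": "started"}
#   curl localhost:5000/api/job-status/diff-roads_sumter_20250101_120000
#   => {"job_id": "...", "running": false, "logs": ["..."]}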
@app.route('/api/list-files')
def list_files():
"""List available GeoJSON files"""
data_dir = '/data'
files = {
'diff': [],
'osm': [],
'county': []
}
# Scan directories for geojson files
if os.path.exists(data_dir):
for root, dirs, filenames in os.walk(data_dir):
for filename in filenames:
if filename.endswith('.geojson'):
rel_path = os.path.relpath(os.path.join(root, filename), data_dir)
if 'diff' in filename.lower():
files['diff'].append(rel_path)
elif 'osm' in filename.lower():
files['osm'].append(rel_path)
                    # County exports (e.g. county-roads.geojson) are named for their
                    # layer, not their county, so match on the relative path instead
                    elif any(county in rel_path.lower() for county in ['lake', 'sumter']):
                        files['county'].append(rel_path)
return jsonify(files)
@app.route('/data/<path:filename>')
def serve_data(filename):
"""Serve GeoJSON files"""
return send_from_directory('/data', filename)
if __name__ == '__main__':
app.run(host='0.0.0.0', port=5000, debug=True)

693
web/static/map.js Normal file
View File

@@ -0,0 +1,693 @@
// Global state
let map;
let osmLayer;
let diffLayer;
let countyLayer;
let osmData = null;
let diffData = null;
let countyData = null;
let selectedFeature = null;
let selectedLayer = null;
let acceptedFeatures = new Set();
let rejectedFeatures = new Set();
let featurePopup = null;
let layerOrder = ['diff', 'osm', 'county']; // Default layer order (top to bottom)
// Initialize map
function initMap() {
map = L.map('map').setView([28.7, -81.7], 12);
L.tileLayer('https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png', {
attribution: '© OpenStreetMap contributors © CARTO',
subdomains: 'abcd',
maxZoom: 20
}).addTo(map);
// Create custom panes for layer ordering
map.createPane('osmPane');
map.createPane('diffPane');
map.createPane('countyPane');
// Set initial z-indices for panes
map.getPane('osmPane').style.zIndex = 400;
map.getPane('diffPane').style.zIndex = 401;
map.getPane('countyPane').style.zIndex = 402;
}
// Calculate bounds for all loaded layers
function calculateBounds() {
    const bounds = L.latLngBounds([]);
    let hasData = false;
    // Extend the bounds with every feature from each loaded dataset
    for (const data of [osmData, diffData, countyData]) {
        if (data && data.features.length > 0) {
            L.geoJSON(data).eachLayer(layer => {
                if (layer.getBounds) {
                    bounds.extend(layer.getBounds());
                } else if (layer.getLatLng) {
                    bounds.extend(layer.getLatLng());
                }
            });
            hasData = true;
        }
    }
    if (hasData && bounds.isValid()) {
        map.fitBounds(bounds, { padding: [50, 50] });
    }
}
// Style functions
function osmStyle(feature) {
return {
color: '#4a4a4a',
weight: 3,
opacity: 0.7
};
}
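// Diff colour legend: accepted = blue, rejected = orange, removed = red,
// everything else (added/modified) = green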
function diffStyle(feature) {
// Check if feature is accepted or rejected
if (acceptedFeatures.has(feature)) {
return {
color: '#007bff',
weight: 3,
opacity: 0.8
};
}
if (rejectedFeatures.has(feature)) {
return {
color: '#ff8c00',
weight: 3,
opacity: 0.8
};
}
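    // 'removed' can arrive as a real boolean or as the string 'True', depending
    // on how the diff file was serialized, so check both forms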
const isRemoved = feature.properties && (feature.properties.removed === true || feature.properties.removed === 'True');
return {
color: isRemoved ? '#ff0000' : '#00ff00',
weight: 3,
opacity: 0.8
};
}
function countyStyle(feature) {
return {
color: '#ff00ff',
weight: 3,
opacity: 0.8
};
}
// Filter function for OSM features
function shouldShowOsmFeature(feature) {
const props = feature.properties || {};
const isService = props.highway === 'service';
const hideService = document.getElementById('hideService').checked;
if (isService && hideService) return false;
return true;
}
// Create layer for OSM data
function createOsmLayer() {
if (osmLayer) {
map.removeLayer(osmLayer);
}
if (!osmData) return;
osmLayer = L.geoJSON(osmData, {
style: osmStyle,
filter: shouldShowOsmFeature,
pane: 'osmPane',
onEachFeature: function(feature, layer) {
layer.on('click', function(e) {
L.DomEvent.stopPropagation(e);
selectFeature(feature, layer, e, 'osm');
});
layer.on('mouseover', function(e) {
if (selectedLayer !== layer) {
layer.setStyle({
weight: 5,
opacity: 1
});
}
});
layer.on('mouseout', function(e) {
if (selectedLayer !== layer) {
layer.setStyle(osmStyle(feature));
}
});
}
}).addTo(map);
updateLayerZIndex();
}
// Filter function for diff features
function shouldShowFeature(feature) {
const props = feature.properties || {};
const isRemoved = props.removed === true || props.removed === 'True';
const isService = props.highway === 'service';
const showAdded = document.getElementById('showAdded').checked;
const showRemoved = document.getElementById('showRemoved').checked;
const hideService = document.getElementById('hideService').checked;
// Check removed/added filter
if (isRemoved && !showRemoved) return false;
if (!isRemoved && !showAdded) return false;
// Check service filter
if (isService && hideService) return false;
return true;
}
// Create layer for diff data with click handlers
function createDiffLayer() {
if (diffLayer) {
map.removeLayer(diffLayer);
}
if (!diffData) return;
diffLayer = L.geoJSON(diffData, {
style: diffStyle,
filter: shouldShowFeature,
pane: 'diffPane',
onEachFeature: function(feature, layer) {
layer.on('click', function(e) {
L.DomEvent.stopPropagation(e);
selectFeature(feature, layer, e, 'diff');
});
layer.on('mouseover', function(e) {
if (selectedLayer !== layer) {
layer.setStyle({
weight: 5,
opacity: 1
});
}
});
layer.on('mouseout', function(e) {
if (selectedLayer !== layer) {
layer.setStyle(diffStyle(feature));
}
});
}
}).addTo(map);
updateLayerZIndex();
}
// Filter function for county features
function shouldShowCountyFeature(feature) {
const props = feature.properties || {};
const isService = props.highway === 'service';
const hideService = document.getElementById('hideService').checked;
if (isService && hideService) return false;
return true;
}
// Create layer for county data
function createCountyLayer() {
if (countyLayer) {
map.removeLayer(countyLayer);
}
if (!countyData) return;
countyLayer = L.geoJSON(countyData, {
style: countyStyle,
filter: shouldShowCountyFeature,
pane: 'countyPane',
onEachFeature: function(feature, layer) {
layer.on('click', function(e) {
L.DomEvent.stopPropagation(e);
selectFeature(feature, layer, e, 'county');
});
layer.on('mouseover', function(e) {
if (selectedLayer !== layer) {
layer.setStyle({
weight: 5,
opacity: 1
});
}
});
layer.on('mouseout', function(e) {
if (selectedLayer !== layer) {
layer.setStyle(countyStyle(feature));
}
});
}
});
// County layer is hidden by default
if (document.getElementById('countyToggle').checked) {
countyLayer.addTo(map);
}
updateLayerZIndex();
}
// Select a feature from any layer
function selectFeature(feature, layer, e, layerType = 'diff') {
// Deselect previous feature
if (selectedLayer) {
// Get the appropriate style function based on previous layer type
const styleFunc = selectedLayer._layerType === 'diff' ? diffStyle :
selectedLayer._layerType === 'osm' ? osmStyle : countyStyle;
selectedLayer.setStyle(styleFunc(selectedLayer.feature));
}
selectedFeature = feature;
selectedLayer = layer;
selectedLayer._layerType = layerType; // Store layer type for later
layer.setStyle({
weight: 6,
opacity: 1,
color: '#ffc107'
});
// Create popup near the clicked location
const props = feature.properties || {};
const isRemoved = props.removed === true || props.removed === 'True';
const isAccepted = acceptedFeatures.has(feature);
const isRejected = rejectedFeatures.has(feature);
let html = '<div style="font-size: 12px; max-height: 400px; overflow-y: auto;">';
// Show layer type
html += `<div style="margin-bottom: 8px;"><strong>Layer:</strong> ${layerType.toUpperCase()}</div>`;
// Only show status for diff layer
if (layerType === 'diff') {
html += `<div style="margin-bottom: 8px;"><strong>Status:</strong> ${isRemoved ? 'Removed' : 'Added/Modified'}</div>`;
}
// Display all non-null properties with custom ordering
const displayProps = Object.entries(props)
.filter(([key, value]) => value !== null && value !== undefined && key !== 'removed')
.sort(([a], [b]) => {
// Priority order: name, highway, then alphabetical
const priorityOrder = { 'name': 0, 'highway': 1 };
const aPriority = priorityOrder[a] ?? 999;
const bPriority = priorityOrder[b] ?? 999;
if (aPriority !== bPriority) {
return aPriority - bPriority;
}
return a.localeCompare(b);
});
if (displayProps.length > 0) {
html += '<div style="font-size: 11px;">';
for (const [key, value] of displayProps) {
html += `<div style="margin: 2px 0;"><strong>${key}:</strong> ${value}</div>`;
}
html += '</div>';
}
// Only show accept/reject for diff layer
if (layerType === 'diff') {
if (isAccepted) {
html += '<div style="margin-top: 8px; color: #007bff; font-weight: bold;">✓ Accepted</div>';
} else if (isRejected) {
html += '<div style="margin-top: 8px; color: #4a4a4a; font-weight: bold;">✗ Rejected</div>';
}
html += '<div style="margin-top: 10px; display: flex; gap: 5px;">';
html += '<button onclick="acceptFeature()" style="flex: 1; padding: 5px; background: #007bff; color: white; border: none; border-radius: 3px; cursor: pointer;">Accept</button>';
html += '<button onclick="rejectFeature()" style="flex: 1; padding: 5px; background: #6c757d; color: white; border: none; border-radius: 3px; cursor: pointer;">Reject</button>';
html += '</div>';
}
html += '</div>';
// Remove old popup if exists
if (featurePopup) {
map.closePopup(featurePopup);
}
// Create popup at click location
featurePopup = L.popup({
maxWidth: 300,
closeButton: true,
autoClose: false,
closeOnClick: false
})
.setLatLng(e.latlng)
.setContent(html)
.openOn(map);
// Handle popup close
    featurePopup.on('remove', function() {
        if (selectedLayer) {
            // Restore the style that matches the layer the feature came from,
            // not just the diff style
            const styleFunc = selectedLayer._layerType === 'osm' ? osmStyle :
                              selectedLayer._layerType === 'county' ? countyStyle : diffStyle;
            selectedLayer.setStyle(styleFunc(selectedLayer.feature));
            selectedLayer = null;
            selectedFeature = null;
        }
    });
}
// Accept a feature
function acceptFeature() {
if (!selectedFeature) return;
// Remove from rejected if present
rejectedFeatures.delete(selectedFeature);
// Add to accepted
acceptedFeatures.add(selectedFeature);
// Update layer style
if (selectedLayer) {
selectedLayer.setStyle(diffStyle(selectedFeature));
}
// Close popup
if (featurePopup) {
map.closePopup(featurePopup);
}
// Enable save button
updateSaveButton();
showStatus(`${acceptedFeatures.size} accepted, ${rejectedFeatures.size} rejected`, 'success');
}
// Reject a feature
function rejectFeature() {
if (!selectedFeature) return;
// Remove from accepted if present
acceptedFeatures.delete(selectedFeature);
// Add to rejected
rejectedFeatures.add(selectedFeature);
// Update layer style
if (selectedLayer) {
selectedLayer.setStyle(diffStyle(selectedFeature));
}
// Close popup
if (featurePopup) {
map.closePopup(featurePopup);
}
// Enable save button
updateSaveButton();
showStatus(`${acceptedFeatures.size} accepted, ${rejectedFeatures.size} rejected`, 'success');
}
// Expose functions globally for onclick handlers
window.acceptFeature = acceptFeature;
window.rejectFeature = rejectFeature;
// Update save button state
function updateSaveButton() {
document.getElementById('saveButton').disabled =
acceptedFeatures.size === 0 && rejectedFeatures.size === 0;
}
// Load GeoJSON from server
async function loadFromServer(url) {
const response = await fetch(url);
if (!response.ok) {
if (response.status === 404) {
return null; // File doesn't exist
}
throw new Error(`Failed to load ${url}: ${response.statusText}`);
}
return await response.json();
}
// Load all files from server
async function loadFiles() {
try {
showStatus('Loading data from server...', 'success');
const county = document.getElementById('countySelect').value;
const dataType = document.getElementById('dataTypeSelect').value;
// Build file paths based on county and data type
let osmFile, diffFile, countyFile;
if (dataType === 'roads') {
osmFile = `latest/${county}/osm-roads.geojson`;
diffFile = `latest/${county}/diff-roads.geojson`;
countyFile = `latest/${county}/county-roads.geojson`;
} else if (dataType === 'paths') {
osmFile = `latest/${county}/osm-paths.geojson`;
diffFile = `latest/${county}/diff-paths.geojson`;
countyFile = `latest/${county}/county-paths.geojson`;
} else if (dataType === 'addresses') {
osmFile = null; // No OSM addresses file
diffFile = `latest/${county}/addresses-to-add.geojson`;
countyFile = `latest/${county}/addresses-existing.geojson`;
}
// Load files from server
osmData = osmFile ? await loadFromServer(`/data/${osmFile}`) : null;
diffData = diffFile ? await loadFromServer(`/data/${diffFile}`) : null;
countyData = countyFile ? await loadFromServer(`/data/${countyFile}`) : null;
if (!osmData && !diffData && !countyData) {
showStatus(`No data files found for ${county} ${dataType}. Run the processing scripts first.`, 'error');
return;
}
// Create layers
createOsmLayer();
createDiffLayer();
createCountyLayer();
        // Fit the map to the combined bounds of all loaded layers
calculateBounds();
let loadedFiles = [];
if (osmData) loadedFiles.push('OSM');
if (diffData) loadedFiles.push('Diff');
if (countyData) loadedFiles.push('County');
showStatus(`Loaded ${loadedFiles.join(', ')} data for ${county} ${dataType}`, 'success');
// Enable save button if we have diff data
document.getElementById('saveButton').disabled = !diffData;
} catch (error) {
showStatus(error.message, 'error');
console.error(error);
}
}
// Tag features as accepted/rejected and download the updated diff as a new file
async function saveAcceptedItems() {
if (!diffData || (acceptedFeatures.size === 0 && rejectedFeatures.size === 0)) {
showStatus('No features to save', 'error');
return;
}
try {
// Add accepted=true or accepted=false property to features
diffData.features.forEach(feature => {
if (acceptedFeatures.has(feature)) {
feature.properties.accepted = true;
} else if (rejectedFeatures.has(feature)) {
feature.properties.accepted = false;
}
});
// Create download
const dataStr = JSON.stringify(diffData, null, 2);
const dataBlob = new Blob([dataStr], { type: 'application/json' });
const url = URL.createObjectURL(dataBlob);
const link = document.createElement('a');
link.href = url;
link.download = 'diff-updated.geojson';
document.body.appendChild(link);
link.click();
document.body.removeChild(link);
URL.revokeObjectURL(url);
showStatus(`Saved ${acceptedFeatures.size} accepted, ${rejectedFeatures.size} rejected`, 'success');
} catch (error) {
showStatus(`Save failed: ${error.message}`, 'error');
console.error(error);
}
}
// Show status message
function showStatus(message, type) {
const status = document.getElementById('status');
status.textContent = message;
status.className = `status ${type}`;
setTimeout(() => {
status.classList.add('hidden');
}, 3000);
}
// Update pane z-index based on order
function updateLayerZIndex() {
const panes = {
'osm': 'osmPane',
'diff': 'diffPane',
'county': 'countyPane'
};
// Reverse index so first item in list is on top
layerOrder.forEach((layerName, index) => {
const paneName = panes[layerName];
const pane = map.getPane(paneName);
if (pane) {
pane.style.zIndex = 400 + (layerOrder.length - 1 - index);
}
});
}
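// With the default order ['diff', 'osm', 'county'] this assigns
// diffPane: 402, osmPane: 401, countyPane: 400, so the diff layer renders on top.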
// Toggle layer visibility
function toggleLayer(layerId, layer) {
const checkbox = document.getElementById(layerId);
if (checkbox.checked && layer) {
if (!map.hasLayer(layer)) {
map.addLayer(layer);
updateLayerZIndex();
}
} else if (layer) {
if (map.hasLayer(layer)) {
map.removeLayer(layer);
}
}
}
// Event listeners
document.addEventListener('DOMContentLoaded', function() {
initMap();
// Layer toggles
document.getElementById('osmToggle').addEventListener('change', function() {
toggleLayer('osmToggle', osmLayer);
});
document.getElementById('diffToggle').addEventListener('change', function() {
toggleLayer('diffToggle', diffLayer);
});
document.getElementById('countyToggle').addEventListener('change', function() {
toggleLayer('countyToggle', countyLayer);
});
// Diff filter toggles
document.getElementById('showAdded').addEventListener('change', function() {
createDiffLayer();
});
document.getElementById('showRemoved').addEventListener('change', function() {
createDiffLayer();
});
document.getElementById('hideService').addEventListener('change', function() {
createDiffLayer();
createOsmLayer();
createCountyLayer();
});
// Load button
document.getElementById('loadButton').addEventListener('click', loadFiles);
// Save button
document.getElementById('saveButton').addEventListener('click', saveAcceptedItems);
// Drag and drop for layer reordering
const layerList = document.getElementById('layerList');
const layerItems = layerList.querySelectorAll('.layer-item');
let draggedElement = null;
layerItems.forEach(item => {
item.addEventListener('dragstart', function(e) {
draggedElement = this;
this.classList.add('dragging');
e.dataTransfer.effectAllowed = 'move';
});
item.addEventListener('dragend', function(e) {
this.classList.remove('dragging');
draggedElement = null;
});
item.addEventListener('dragover', function(e) {
e.preventDefault();
e.dataTransfer.dropEffect = 'move';
if (this === draggedElement) return;
const afterElement = getDragAfterElement(layerList, e.clientY);
if (afterElement == null) {
layerList.appendChild(draggedElement);
} else {
layerList.insertBefore(draggedElement, afterElement);
}
});
item.addEventListener('drop', function(e) {
e.preventDefault();
// Update layer order based on new DOM order
layerOrder = Array.from(layerList.querySelectorAll('.layer-item'))
.map(item => item.dataset.layer);
updateLayerZIndex();
});
});
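    // Returns the item the dragged element should be inserted before: the closest
    // element whose vertical midpoint lies below the cursor (null means append at the end)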
function getDragAfterElement(container, y) {
const draggableElements = [...container.querySelectorAll('.layer-item:not(.dragging)')];
return draggableElements.reduce((closest, child) => {
const box = child.getBoundingClientRect();
const offset = y - box.top - box.height / 2;
if (offset < 0 && offset > closest.offset) {
return { offset: offset, element: child };
} else {
return closest;
}
}, { offset: Number.NEGATIVE_INFINITY }).element;
}
});

311
web/templates/index.html Normal file
View File

@@ -0,0 +1,311 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>The Villages Import Tools</title>
<style>
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
background: #f5f5f5;
padding: 20px;
}
.container {
max-width: 1200px;
margin: 0 auto;
background: white;
padding: 30px;
border-radius: 8px;
box-shadow: 0 2px 10px rgba(0,0,0,0.1);
}
h1 {
color: #333;
margin-bottom: 10px;
}
.subtitle {
color: #666;
margin-bottom: 30px;
}
.section {
margin-bottom: 30px;
}
.section h2 {
color: #444;
margin-bottom: 15px;
font-size: 20px;
}
.button-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
gap: 15px;
margin-bottom: 20px;
}
.script-button {
padding: 15px 20px;
background: #007bff;
color: white;
border: none;
border-radius: 6px;
cursor: pointer;
font-size: 14px;
font-weight: 500;
transition: background 0.2s;
}
.script-button:hover {
background: #0056b3;
}
.script-button:disabled {
background: #ccc;
cursor: not-allowed;
}
.script-button.lake {
background: #28a745;
}
.script-button.lake:hover {
background: #218838;
}
.script-button.sumter {
background: #17a2b8;
}
.script-button.sumter:hover {
background: #138496;
}
.map-link {
display: inline-block;
padding: 15px 30px;
background: #6f42c1;
color: white;
text-decoration: none;
border-radius: 6px;
font-weight: 500;
transition: background 0.2s;
}
.map-link:hover {
background: #5a32a3;
}
.log-viewer {
margin-top: 30px;
border-top: 2px solid #eee;
padding-top: 20px;
}
.log-box {
background: #1e1e1e;
color: #d4d4d4;
padding: 15px;
border-radius: 6px;
font-family: 'Courier New', monospace;
font-size: 12px;
max-height: 400px;
overflow-y: auto;
white-space: pre-wrap;
word-wrap: break-word;
display: none;
}
.log-box.active {
display: block;
}
.status-message {
padding: 10px 15px;
border-radius: 6px;
margin-bottom: 15px;
display: none;
}
.status-message.success {
background: #d4edda;
color: #155724;
border: 1px solid #c3e6cb;
}
.status-message.error {
background: #f8d7da;
color: #721c24;
border: 1px solid #f5c6cb;
}
.status-message.info {
background: #d1ecf1;
color: #0c5460;
border: 1px solid #bee5eb;
}
.county-group {
margin-bottom: 25px;
}
.county-group h3 {
color: #555;
margin-bottom: 10px;
font-size: 16px;
}
</style>
</head>
<body>
<div class="container">
<h1>The Villages Import Tools</h1>
<p class="subtitle">Run data processing scripts and view results</p>
<div class="section">
<h2>Map Viewer</h2>
<a href="/map" class="map-link">Open GeoJSON Map Viewer</a>
</div>
<div class="section">
<h2>Data Processing Scripts</h2>
{% for category, script_names in scripts_by_category.items() %}
<div class="county-group">
<h3>{{ category }}</h3>
{% for script_name in script_names %}
{% if script_name in script_map %}
{% set script_config = script_map[script_name] %}
{% if script_config is string %}
{# Simple command with no county selection #}
<div class="button-grid">
<button class="script-button" onclick="runScript('{{ script_name }}', '')">
{{ script_name|replace('-', ' ')|title }}
</button>
</div>
{% elif script_config is mapping %}
{# County-specific commands #}
<div class="button-grid">
{% if 'lake' in script_config %}
<button class="script-button lake" onclick="runScript('{{ script_name }}', 'lake')">
{{ script_name|replace('-', ' ')|title }} (Lake)
</button>
{% endif %}
{% if 'sumter' in script_config %}
<button class="script-button sumter" onclick="runScript('{{ script_name }}', 'sumter')">
{{ script_name|replace('-', ' ')|title }} (Sumter)
</button>
{% endif %}
</div>
{% endif %}
{% endif %}
{% endfor %}
</div>
{% endfor %}
</div>
<div class="log-viewer">
<h2>Script Output</h2>
<div id="status" class="status-message"></div>
<div id="logs" class="log-box"></div>
</div>
</div>
<script>
let currentJobId = null;
let logCheckInterval = null;
function showStatus(message, type) {
const statusEl = document.getElementById('status');
statusEl.textContent = message;
statusEl.className = `status-message ${type}`;
statusEl.style.display = 'block';
}
function runScript(scriptName, county) {
const logsEl = document.getElementById('logs');
logsEl.textContent = 'Starting script...\n';
logsEl.classList.add('active');
            showStatus(county ? `Running ${scriptName} (${county})...` : `Running ${scriptName}...`, 'info');
// Disable all buttons
document.querySelectorAll('.script-button').forEach(btn => {
btn.disabled = true;
});
fetch('/api/run-script', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
script: scriptName,
county: county
})
})
.then(response => response.json())
.then(data => {
if (data.error) {
showStatus(`Error: ${data.error}`, 'error');
enableButtons();
return;
}
currentJobId = data.job_id;
showStatus(`Script started (Job ID: ${data.job_id})`, 'success');
// Start polling for logs
if (logCheckInterval) {
clearInterval(logCheckInterval);
}
logCheckInterval = setInterval(checkJobStatus, 1000);
})
.catch(error => {
showStatus(`Error: ${error.message}`, 'error');
enableButtons();
});
}
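        // Poll the job once per second; the server returns the full log buffer
        // every time, so the log box is simply re-rendered from scratch.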
function checkJobStatus() {
if (!currentJobId) return;
fetch(`/api/job-status/${currentJobId}`)
.then(response => response.json())
.then(data => {
const logsEl = document.getElementById('logs');
logsEl.textContent = data.logs.join('');
// Auto-scroll to bottom
logsEl.scrollTop = logsEl.scrollHeight;
if (!data.running) {
clearInterval(logCheckInterval);
showStatus('Script completed', 'success');
enableButtons();
currentJobId = null;
}
})
.catch(error => {
console.error('Error checking job status:', error);
});
}
function enableButtons() {
document.querySelectorAll('.script-button').forEach(btn => {
btn.disabled = false;
});
}
</script>
</body>
</html>

214
web/templates/map.html Normal file
View File

@@ -0,0 +1,214 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>GeoJSON Map Viewer</title>
<link rel="stylesheet" href="https://unpkg.com/leaflet@1.9.4/dist/leaflet.css" />
<style>
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
height: 100vh;
display: flex;
flex-direction: column;
}
#map {
flex: 1;
width: 100%;
}
.controls {
position: absolute;
top: 10px;
right: 10px;
z-index: 1000;
background: white;
padding: 15px;
border-radius: 8px;
box-shadow: 0 2px 10px rgba(0,0,0,0.2);
min-width: 200px;
}
.controls h3 {
margin: 0 0 10px 0;
font-size: 14px;
color: #333;
}
.controls label {
display: flex;
align-items: center;
margin: 8px 0;
cursor: pointer;
font-size: 13px;
}
.layer-item {
display: flex;
align-items: center;
margin: 8px 0;
padding: 5px;
background: #f8f9fa;
border-radius: 4px;
cursor: move;
font-size: 13px;
}
.layer-item.dragging {
opacity: 0.5;
}
.layer-item input[type="checkbox"] {
margin-right: 8px;
}
.controls input[type="checkbox"] {
margin-right: 8px;
}
.controls button {
width: 100%;
padding: 10px;
margin-top: 10px;
background: #007bff;
color: white;
border: none;
border-radius: 4px;
cursor: pointer;
font-size: 13px;
font-weight: 500;
}
.controls button:hover {
background: #0056b3;
}
.controls button:disabled {
background: #ccc;
cursor: not-allowed;
}
.status {
margin-top: 10px;
padding: 8px;
border-radius: 4px;
font-size: 12px;
text-align: center;
}
.status.success {
background: #d4edda;
color: #155724;
}
.status.error {
background: #f8d7da;
color: #721c24;
}
.status.hidden {
display: none;
}
.file-input-group {
margin-bottom: 15px;
padding-bottom: 15px;
border-bottom: 1px solid #eee;
}
.file-input-group:last-of-type {
border-bottom: none;
}
.file-input-group label {
display: block;
margin-bottom: 5px;
font-weight: 500;
}
.file-input-group input[type="file"],
.file-input-group select {
width: 100%;
font-size: 11px;
padding: 5px;
border: 1px solid #ccc;
border-radius: 4px;
}
.load-button {
background: #28a745 !important;
}
.load-button:hover {
background: #218838 !important;
}
</style>
</head>
<body>
<div id="map"></div>
<div class="controls">
<h3>Layer Controls (top to bottom)</h3>
<div id="layerList">
<div class="layer-item" draggable="true" data-layer="diff">
<input type="checkbox" id="diffToggle" checked>
<span>Diff Layer</span>
</div>
<div class="layer-item" draggable="true" data-layer="osm">
<input type="checkbox" id="osmToggle" checked>
<span>OSM Roads (Gray)</span>
</div>
<div class="layer-item" draggable="true" data-layer="county">
<input type="checkbox" id="countyToggle">
<span>County Layer (Purple)</span>
</div>
</div>
<h3 style="margin-top: 15px;">Diff Filters</h3>
<label>
<input type="checkbox" id="showAdded" checked>
Show Added (Green)
</label>
<label>
<input type="checkbox" id="showRemoved" checked>
Show Removed (Red)
</label>
<label>
<input type="checkbox" id="hideService">
Hide highway=service
</label>
<h3 style="margin-top: 15px;">Load Data</h3>
<div class="file-input-group">
<label for="countySelect">County:</label>
<select id="countySelect">
<option value="lake">Lake</option>
<option value="sumter" selected>Sumter</option>
</select>
</div>
<div class="file-input-group">
<label for="dataTypeSelect">Data Type:</label>
<select id="dataTypeSelect">
<option value="roads" selected>Roads</option>
<option value="paths">Multi-Use Paths</option>
<option value="addresses">Addresses</option>
</select>
</div>
<button id="loadButton" class="load-button">Load from Server</button>
<button id="saveButton" disabled>Save Accepted Items</button>
<div id="status" class="status hidden"></div>
</div>
<script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
<script src="{{ url_for('static', filename='map.js') }}"></script>
</body>
</html>