contra-costa-import/README.md
2024-01-26 10:34:22 -08:00

# Contra Costa County (California) Address Import
See [https://wiki.openstreetmap.org/wiki/Contra_Costa_County_Address_Import](https://wiki.openstreetmap.org/wiki/Contra_Costa_County_Address_Import)
## Data Source
https://www.contracosta.ca.gov/1818/GIS (https://gis.cccounty.us/Downloads/General%20County%20Data/CCC_Adddress_Points.zip)
Already emailed County about licensing because https://gis.cccounty.us/Downloads/General%20County%20Data/CCC_GIS_Disclaimer.pdf says "THIS DATA CONTAINS COPYRIGHTED INFORMATION OF THE COUNTY OF CONTRA COSTA"
*TODO:* Check whether updated data is available; this data was modified 8/27/2019.
## Field Mapping
* `street_num` -> `addr:housenumber`
* `trim(array_to_string(array(prefix_typ,prefix_dir,street_nam,street_typ,suffix_dir),' '))` -> `addr:street`
* `unit_numbe` -> `addr:unit`
* `city` -> `addr:city`
* `zip_code` -> `addr:postcode`
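
As a rough illustration of the mapping above outside QGIS, here is a minimal Python sketch. The function name `map_address_fields` and the sample record are made up for this example; only the field names come from the mapping. It assumes every source field is text (or missing) and, unlike the QGIS expression, it skips empty street parts entirely so that no double spaces appear.

```python
def map_address_fields(row):
    """Translate one source record (a dict of text fields) into OSM address tags."""
    # Street name parts, in the same order as the QGIS expression.
    street_parts = [
        row.get(key)
        for key in ("prefix_typ", "prefix_dir", "street_nam", "street_typ", "suffix_dir")
    ]
    # Assumption: empty or missing parts are skipped rather than joined as blanks.
    street = " ".join(part.strip() for part in street_parts if part and part.strip())
    tags = {
        "addr:housenumber": row.get("street_num"),
        "addr:street": street,
        "addr:unit": row.get("unit_numbe"),
        "addr:city": row.get("city"),
        "addr:postcode": row.get("zip_code"),
    }
    # Drop empty values so blank source fields do not become empty tags.
    return {key: value.strip() for key, value in tags.items() if value and value.strip()}


# Hypothetical sample record, not taken from the County data.
sample = {
    "street_num": "123",
    "prefix_dir": "North",
    "street_nam": "Main",
    "street_typ": "Street",
    "unit_numbe": "",
    "city": "Concord",
    "zip_code": "94520",
}
print(map_address_fields(sample))
```

The empty `unit_numbe` in the sample produces no `addr:unit` tag at all, which is usually preferable to uploading an empty value.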
## Instructions
* Download QGIS and JOSM. Install the MapWithAI plugin in JOSM, then set your JOSM validator preferences to run every test both on upload and on demand, especially Addresses and, at the bottom of the list, Mismatched street/street address (MapWithAI).
* Open the original data in QGIS
* Format OSM fields with QGIS functions to have proper capitalization and full spellings without extraneous whitespace, based on original fields. For example, OSM uses names like North Main Street, not N MAIN ST. All fields are of the QGIS type "text" even if they're numbers.
* You can use the Attribute Table's Field Calculator for this: copy-paste the QGIS Script below into the Function Editor, then use the Expression tab to create new, formatted virtual fields. Don't worry if the field name length limit is too short; the names can be fixed in JOSM.
* Use QGIS's "select by value" function to search for "West Street" and "East Street" in addr:house, then delete the matches, as they're incorrect duplicates of West/East nth Street. Repeat this for "East Place".
* The Addresses shapefile is recorded in a California-specific CRS; make sure your project is set to WGS84 and reproject when opening the Shapefile.
* Export to GeoJSON **selecting only the OSM-formatted fields we want**.
* Here you can rename any temporary or misnamed columns like `addr:house` to `addr:housenumber` etc.
* Ensure the export file is in the `EPSG:4326 - WGS84` CRS.
* Open the GeoJSON in JOSM. It's suggested to work on roads first and addresses second, so the addresses can be placed in context.
* Select a small region to work on: one neighborhood or smaller. For this import, we are assuming that only newly-constructed small residential areas will be imported, not main roads or commercial areas or areas with significant existing map data.
* Download the area you're working on from OSM into a new Data Layer (not your GeoJSON layer).
* Run the validation routine on the GeoJSON data layer to see if there are any duplicate addresses within the data. Address them first.
* Ensure the tags are correct and good. (Shapefiles have a character limit and sometimes don't like colons, so double check that `addr:house` is `addr:housenumber`, `addr:postc` is `addr:postcode`, `addr:stree` is `addr:street`, etc.)
* Mass-add any new tags if desired (like `state=CA`)
* Remove any spurious tags that may have been brought over in the import (if it's not in the OSM Wiki, we don't want it.)
* Select all features to be imported at this time and leave them selected until the merge step below.
* Make sure that your selection window only includes nodes, no relations.
* Press ctrl-shift-M to merge into the OSM data layer. There will be a warning, but click OK; we will be extra careful about validating the merge in the next steps.
* For addresses, remove any address nodes that seem to not reflect reality or be placed far from the street bearing their name: it's better to not have 123 Adams Street mapped at all, than to claim that 123 Adams Street is hovering over someone's newly-built house at 321 Franklin Avenue, 200 feet away from Adams Street. (Cities often won't remove old addresses, leading to confusion when new streets are built.) Future OSMers can always add more/better data, but it's super confusing for everyone to try and fix wrong data.
* Check the edges of the imported areas to ensure that there aren't any undownloaded areas along the edges of where you're about to import.
* Use the JOSM validator to ensure no errors in imported data. Warnings about existing data separate from the import can be ignored. The easiest way to do this is to click the upload button and then cancel the upload (Upload validations only validate changes.)
* If there are duplicate house numbers in the data, investigate and remove the more-unlikely node or both nodes.
* Click upload
* Make sure there are no erroneous Relations or other unwanted objects about to be uploaded.
* Use a descriptive changeset message like "Addresses in Contra Costa County #cccimport"
* Set the Source to be "Contra Costa County GIS"
* You can easily copy-paste the below into the Settings tab:
```
comment=Addresses in Contra Costa County #cccimport
import=yes
website=https://wiki.openstreetmap.org/wiki/Contra_Costa_County_Address_Import
source=Contra Costa County GIS
source:url=https://git.zyphon.com/public/contra-costa-import
```
* Review imported data in Achavi or OSMCha to ensure it looks proper.
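
Two of the checks in the steps above, renaming truncated tag keys and looking for duplicate addresses, can also be sketched in plain Python as a sanity check alongside the JOSM validator. This is only an illustration under stated assumptions: the helper names `fix_truncated_keys` and `find_duplicate_addresses` are hypothetical, the sample features are made up, and the truncated keys are the ones named in the steps above.

```python
from collections import Counter

# Truncated shapefile-style keys named in the steps above, mapped to full OSM keys.
TRUNCATED_KEYS = {
    "addr:house": "addr:housenumber",
    "addr:postc": "addr:postcode",
    "addr:stree": "addr:street",
}


def fix_truncated_keys(properties):
    """Return a copy of a feature's properties with truncated keys renamed."""
    return {TRUNCATED_KEYS.get(key, key): value for key, value in properties.items()}


def find_duplicate_addresses(features):
    """Return the (housenumber, street, unit, city) combinations seen more than once."""
    counts = Counter(
        (
            feature["properties"].get("addr:housenumber") or "",
            feature["properties"].get("addr:street") or "",
            feature["properties"].get("addr:unit") or "",
            feature["properties"].get("addr:city") or "",
        )
        for feature in features
    )
    return [address for address, count in counts.items() if count > 1]


# Hypothetical features shaped like the "features" list of a GeoJSON export.
features = [
    {"type": "Feature", "properties": {"addr:house": "123", "addr:stree": "Adams Street"}},
    {"type": "Feature", "properties": {"addr:house": "123", "addr:stree": "Adams Street"}},
    {"type": "Feature", "properties": {"addr:house": "321", "addr:stree": "Franklin Avenue"}},
]

for feature in features:
    feature["properties"] = fix_truncated_keys(feature["properties"])

print(find_duplicate_addresses(features))
```

For a real export, the `features` list would come from reading the GeoJSON file with `json.load` and taking its `"features"` member. Duplicates found this way still need the manual investigation described above before anything is removed.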
## QGIS Script
```
from qgis.core import *  # provides the qgsfunction decorator
from qgis.gui import *
import re

#
# This keeps street names like SR 574A as SR 574A. Other number-letter
# tokens (fewer than 2 or more than 4 digits, or more than one suffix
# letter, like 12th Street or 243rd Ave) would be lowercased by the
# .capitalize() call at the end of formatstreetname, which is currently
# commented out.
#

@qgsfunction(args='auto', group='Custom', referenced_columns=[])
def getstreetfromaddress(value1, feature, parent):
    if not value1:
        return value1  # Nothing to format for NULL or empty values
    parts = value1.split()
    if not parts:
        return value1  # Whitespace-only value
    parts.pop(0)  # Ignore the first bit (i.e. "123" in "123 N MAIN ST")
    parts = map(formatstreetname, parts)
    return " ".join(parts)

@qgsfunction(args='auto', group='Custom', referenced_columns=[])
def formatstreet(value1, feature, parent):
    if not value1:
        return value1  # Nothing to format for NULL or empty values
    parts = value1.split()
    if not parts:
        return value1  # Whitespace-only value
    # Handle the special case of a street name starting with "ST"
    # which is almost always "Saint __" and not "Street __"
    if parts[0].upper() == "ST":
        parts[0] = "Saint"
    parts = map(formatstreetname, parts)
    return " ".join(parts)

# Internal function: expand or re-case a single word of a street name
def formatstreetname(name):
    nameUp = name.upper()
    # Acronyms
    if nameUp == "CR":
        return "County Road"
    if nameUp == "SR":
        return "SR"  # State Route
    if nameUp == "NFS":
        return "NFS"  # National Forest Service?
    if nameUp == "US":
        return "US"
    # Directions
    if nameUp == "N":
        return "North"
    if nameUp == "NE":
        return "Northeast"
    if nameUp == "E":
        return "East"
    if nameUp == "SE":
        return "Southeast"
    if nameUp == "S":
        return "South"
    if nameUp == "SW":
        return "Southwest"
    if nameUp == "W":
        return "West"
    if nameUp == "NW":
        return "Northwest"
    # Names
    if nameUp == "MACLEAY":
        return "MacLeay"
    if nameUp == "MCCLAINE":
        return "McClaine"
    if nameUp == "MCAHREN":
        return "McAhren"
    if nameUp == "MCCAMMON":
        return "McCammon"
    if nameUp == "MCCLELLAN":
        return "McClellan"
    if nameUp == "MCCOY":
        return "McCoy"
    if nameUp == "MCDONALD":
        return "McDonald"
    if nameUp == "MCGEE":
        return "McGee"
    if nameUp == "MCGILCHRIST":
        return "McGilchrist"
    if nameUp == "MCINTOSH":
        return "McIntosh"
    if nameUp == "MCKAY":
        return "McKay"
    if nameUp == "MCKEE":
        return "McKee"
    if nameUp == "MCKENZIE":
        return "McKenzie"
    if nameUp == "MCKILLOP":
        return "McKillop"
    if nameUp == "MCKINLEY":
        return "McKinley"
    if nameUp == "MCKNIGHT":
        return "McKnight"
    if nameUp == "MCLAUGHLIN":
        return "McLaughlin"
    if nameUp == "MCLEOD":
        return "McLeod"
    if nameUp == "MCMASTER":
        return "McMaster"
    if nameUp == "MCNARY":
        return "McNary"
    if nameUp == "MCNAUGHT":
        return "McNaught"
    if nameUp == "O'BRIEN":
        return "O'Brien"
    if nameUp == "O'CONNOR":
        return "O'Connor"
    if nameUp == "O'NEIL":
        return "O'Neil"
    if nameUp == "O'TOOLE":
        return "O'Toole"
    if nameUp == "MCARTHUR":
        return "McArthur"
    # Suffixes
    if nameUp == "AV":
        return "Avenue"
    if nameUp == "AVE":
        return "Avenue"
    if nameUp == "BLVD":
        return "Boulevard"
    if nameUp == "BL":
        return "Boulevard"
    if nameUp == "BV":
        return "Boulevard"
    if nameUp == "BND":
        return "Bend"
    if nameUp == "CIR":
        return "Circle"
    if nameUp == "CR":
        # Never reached: "CR" already returns "County Road" under Acronyms
        return "Circle"
    if nameUp == "CT":
        return "Court"
    if nameUp == "DR":
        return "Drive"
    if nameUp == "FLDS":
        return "Fields"
    if nameUp == "GRV":
        return "Grove"
    if nameUp == "HL":
        return "Hill"
    if nameUp == "HOLW":
        return "Hollow"
    if nameUp == "HW":
        return "Highway"
    if nameUp == "HWY":
        return "Highway"
    if nameUp == "HY":
        return "Highway"
    if nameUp == "LN":
        return "Lane"
    if nameUp == "LOOP":
        return "Loop"
    if nameUp == "LP":
        return "Loop"
    if nameUp == "MT":
        return "Mount"
    if nameUp == "MTN":
        return "Mountain"
    if nameUp == "PATH":
        return "Path"
    if nameUp == "PL":
        return "Place"
    if nameUp == "RD":
        return "Road"
    if nameUp == "RUN":
        return "Run"
    if nameUp == "SQ":
        return "Square"
    if nameUp == "ST":
        return "Street"
    if nameUp == "TER":
        return "Terrace"
    if nameUp == "TR":
        return "Trail"
    if nameUp == "TRL":
        return "Trail"
    if nameUp == "VW":
        return "View"
    if nameUp == "WAY":
        return "Way"
    if nameUp == "WY":
        return "Way"
    if nameUp == "XING":
        return "Crossing"
    if nameUp == "CTR":
        return "Center"
    if nameUp == "PKWY":
        return "Parkway"
    if nameUp == "RDG":  # Compared in upper case, since nameUp is upper-cased
        return "Ridge"
    if nameUp == "SKWY":
        return "Skyway"
    if nameUp == "VLY":  # USPS abbreviation for Valley
        return "Valley"
    if nameUp == "PLZ":
        return "Plaza"
    if nameUp == "HTS":
        return "Heights"
    if nameUp == "CV":
        return "Cove"
    # Route-style tokens (2-4 digits plus one letter, like 574A) are kept as-is
    if re.match('^[0-9]{2,4}[A-Za-z]$', name) is not None:
        return name
    return name  # .capitalize() is disabled, so other words keep their original case
```
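
To make the route-number rule at the end of the script easier to check, the small sketch below runs the same regular expression outside QGIS. The helper name `is_route_token` and the sample tokens are only for illustration and are not part of the script.

```python
import re

# Same pattern the script uses: 2 to 4 digits followed by exactly one letter.
ROUTE_TOKEN = re.compile(r'^[0-9]{2,4}[A-Za-z]$')


def is_route_token(token):
    """True if the token looks like a route number such as "574A"."""
    return ROUTE_TOKEN.match(token) is not None


# Illustrative tokens: route-style numbers, ordinals, and out-of-range digit counts.
for token in ["574A", "12B", "12th", "243rd", "5A", "12345A"]:
    print(token, is_route_token(token))
```

Only `574A` and `12B` match here. Ordinals such as `12th` and `243rd` have more than one trailing letter, `5A` has too few digits, and `12345A` has too many.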