The Villages Road and Address Import

See https://wiki.openstreetmap.org/wiki/The_Villages_Road_and_Address_Import

Data

Lake County: https://c.lakecountyfl.gov/ftp/GIS/GisDownloads/Shapefiles/

Sumter County: GIS data is available by emailing the county's GIS team and accessing their Dropbox.

Instructions

  • Always do roads first, addresses second, so new subdivisions don't throw address validation errors.
  • Open the original data in QGIS
  • Format OSM fields with QGIS functions to have proper capitalization and full spellings without extraneous whitespace, based on the original fields. For example, OSM uses names like North Main Street, not N MAIN ST. All fields are of the QGIS type "text" even if they're numbers.
    • You can use the Attribute Table's Field Calculator for this: copy-paste the qgis-functions.py file into the Function Editor, then use the Expression tab to create new, formatted virtual fields. Don't worry if the field name length limit is too short; the names can be fixed in JOSM.
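As a rough illustration of the kind of formatting described above, here is a minimal plain-Python sketch that turns N MAIN ST into North Main Street. It is not the contents of qgis-functions.py: the function name format_street and both abbreviation tables are assumptions made for this example, and the tables are deliberately incomplete.

```python
import re

# Hypothetical, incomplete abbreviation tables; extend them to match the source data.
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West",
              "NE": "Northeast", "NW": "Northwest",
              "SE": "Southeast", "SW": "Southwest"}
SUFFIXES = {"ST": "Street", "AVE": "Avenue", "RD": "Road", "DR": "Drive",
            "LN": "Lane", "TER": "Terrace", "RDG": "Ridge", "CT": "Court"}


def format_street(raw):
    """Expand a leading direction and a trailing street type, and fix capitalization."""
    # Collapse runs of whitespace and trim the ends before splitting into words.
    cleaned = re.sub(r"\s+", " ", raw or "").strip().upper()
    if not cleaned:
        return ""
    words = cleaned.split(" ")
    last = len(words) - 1
    out = []
    for i, word in enumerate(words):
        if i == 0 and last > 0 and word in DIRECTIONS:
            out.append(DIRECTIONS[word])      # leading direction, e.g. N -> North
        elif i == last and i > 0 and word in SUFFIXES:
            out.append(SUFFIXES[word])        # trailing street type, e.g. ST -> Street
        else:
            out.append(word.capitalize())     # MAIN -> Main, 4TH -> 4th
    return " ".join(out)


print(format_street("N MAIN ST"))
```

This sketch only looks at the first and last word, so names such as US 441 or C 472 would still need special handling.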

For Sumter County:

  • For roads:
    • NAME becomes the virtual name via the title(formatstreet("NAME")) expression
    • SpeedLimit becomes the virtual maxspeed via concat("SpeedLimit",' mph')
    • highway=residential or similar added manually in JOSM
    • surface=asphalt added manually in JOSM
  • For addresses:
    • The Addresses shapefile is recorded in the ESRI:102659 CRS; you may need to convert or reproject to/from the default EPSG:4326 - WGS 84 CRS that OSM uses.
    • ADD_NUM becomes the virtual addr:housenumber (or addr:house temporarily, avoiding addr:house which is a real tag) as an integer
    • UNIT becomes the virtual addr:unit (sometimes the LOT key is used for multiple units in a range, but mostly it's unrelated lot IDs and not useful) as a string
    • SADD becomes the virtual addr:street (or addr:stree temporarily) via the getformattedstreetnamefromaddress("SADD") custom expression as a string
    • POST_COMM becomes the virtual addr:city via the title("POST_COMM") expression (we care about postal community addresses, not what municipality a place might be governed by) as a string
    • POST_CODE becomes addr:postcode (or addr:postc temporarily) as an integer
    • Manually add addr:state = 'FL'
  • For multi-modal trails (golf cart paths):
    • bicycle=yes
    • foot=yes
    • golf=cartpath
    • golf_cart=yes
    • highway=path
    • motor_vehicle=no
    • segregated=no
    • surface=asphalt
  • Use the Filter with Form function to Select all entries with "LIFECYCLE"='Current'
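To make the Sumter address mapping above concrete, here is a small sketch of the same field-to-tag translation in plain Python, including the LIFECYCLE filter. The function name sumter_address_tags is hypothetical, and the format_street parameter is only a stand-in for the getformattedstreetnamefromaddress custom expression, whose implementation is not reproduced here; the default of str.title is an assumption for illustration.

```python
def sumter_address_tags(record, format_street=str.title):
    """Return OSM address tags for one Sumter record, or None if it is not current."""
    # Mirror the "LIFECYCLE"='Current' selection from the instructions.
    if record.get("LIFECYCLE") != "Current":
        return None
    tags = {
        "addr:housenumber": str(int(record["ADD_NUM"])),
        # Stand-in formatter (assumption); the real workflow uses a custom QGIS expression.
        "addr:street": format_street(record["SADD"]),
        "addr:city": record["POST_COMM"].title(),
        "addr:postcode": str(int(record["POST_CODE"])),
        "addr:state": "FL",  # added manually in the instructions
    }
    unit = (record.get("UNIT") or "").strip()
    if unit:
        tags["addr:unit"] = unit
    return tags


# Made-up sample record, for illustration only.
sample = {"ADD_NUM": 123, "UNIT": None, "SADD": "MAIN ST",
          "POST_COMM": "WILDWOOD", "POST_CODE": 34785, "LIFECYCLE": "Current"}
print(sumter_address_tags(sample))
```

Returning None for non-current records keeps the filtering and the tag mapping in one place, so a caller can simply skip empty results.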

For Lake County:

  • For roads:
    • FullStreet becomes the virtual name via the title(formatstreet("FullStreet")) expression
    • SpeedLimit becomes the virtual maxspeed via concat("SpeedLimit",' mph')
    • NumberOfLa becomes the virtual lanes
    • surface=asphalt added manually
    • StreetClas becomes the virtual highway via the gethighwaytype("StreetClas") expression
    • Could use MaxWeight (1.0 - 20.0)
  • For addresses:
    • The Addresses shapefile is recorded in the NAD83(HARN) / Florida East (ftUS) CRS; you may need to convert or reproject to/from the default EPSG:4326 - WGS 84 CRS that OSM uses.
    • AddressNum becomes the virtual addr:housenumber (or addr:house temporarily, avoiding addr:house which is a real tag) as an integer
    • UnitType becomes the virtual addr:unit via regexp_replace("UnitType",'U ','') (UnitNumber is blank) as a string
    • The virtual addr:street (or addr:stree temporarily) is created via the regexp_replace(trim(concat(formatname("PrefixDire"),' ',title(formatstreet("PrefixType")),' ',title(formatstreet("BaseStreet")),' ',formatname("SuffixType"))),'\\s+',' ') custom expression as a string
    • PostalCity becomes the virtual addr:city via the title("PostalCity") expression (we care about postal community addresses, not what municipality a place might be governed by) as a string
    • ZipCode becomes addr:postcode (or addr:postc temporarily) as an integer
    • Manually add addr:state = 'FL'
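The Lake County addr:street expression above joins four fields with spaces, trims the result, and collapses repeated whitespace so that empty fields leave no gaps. A plain-Python sketch of that joining step, and of the UnitType cleanup, might look like the following. The function names are hypothetical, and the inputs are assumed to be already formatted (the formatname and formatstreet custom functions are not reproduced here).

```python
import re


def join_street_parts(*parts):
    """Join already-formatted name parts the way the QGIS expression does."""
    # concat(..., ' ', ...): missing (None) parts are treated as empty strings.
    joined = " ".join(part or "" for part in parts)
    # trim(...), then regexp_replace(..., '\\s+', ' ') to collapse whitespace runs.
    return re.sub(r"\s+", " ", joined.strip())


def unit_from_unit_type(unit_type):
    """Equivalent of regexp_replace("UnitType", 'U ', '')."""
    return re.sub("U ", "", unit_type or "")


print(join_street_parts("North", "", "Main", "Street"))
print(unit_from_unit_type("U 12"))
```

Collapsing whitespace after joining is what lets the same expression work whether or not a street has a prefix direction or prefix type.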

Continuing instructions for both:

  • Export to GeoJSON, exporting only the selected entries and only the OSM-formatted fields we want.
  • Here you can rename temporary columns like addr:house to addr:housenumber.
  • Ensure the export file is in the EPSG:4326 - WGS 84 CRS.
  • Open in JOSM. It's suggested to begin with roads first, addresses second, so the addresses can be placed in context.
  • In the Roads dataset, select and remove all relations from the geojson/shapefile layer: the data often has one relation per road and this is improper for OSM import.
  • Select a small region to work on: one neighborhood or smaller. For this import, we are assuming that only newly-constructed small residential areas will be imported, not main roads or commercial areas or areas with significant existing map data.
  • Download the area you're working on from OSM, into a new Data Layer (not your geojson layer.)
  • Select all features to be imported at this time and leave them selected until the merge step below.
  • Select all ways for roads, or all nodes for addresses. Make sure you aren't about to mass-edit the nodes of a road: deselect the nodes if this happens.
  • Ensure the tags are correct and good. (QGIS has a character limit and sometimes doesn't like colons, so double check that addr:house is addr:housenumber, addr:postc is addr:postcode, addr:stree is addr:street, etc.)
  • Mass-add new tags like highway=residential, surface=asphalt, etc, as indicated.
  • Remove any spurious tags that may have been brought over in the import (if it's not in the OSM Wiki, we don't want it.)
  • Press ctrl-shift-M to merge into the OSM data layer. There will be a warning, but click OK; we will be extra careful about validating the merge in the next steps.
  • For addresses, remove any address nodes that seem to not reflect reality or be placed far from the street bearing their name: it's better to not have 123 Adams Street mapped at all, than to claim that 123 Adams Street is hovering over someone's newly-built house at 321 Franklin Avenue, 200 feet away from Adams Street. (Cities often won't remove old addresses, leading to confusion when new streets are built.)
  • For roads, highlight multiple street segments which have the same name and press C to combine them: the county data has one way per road segment and that's excessive for OSM.
  • Check the edges of the imported areas to ensure new roads are merged with any preexisting roads
  • Check the import area to ensure no incorrect overlaps
  • Use the JOSM validator to ensure no errors in imported data. Warnings about existing data separate from the import can be ignored.
    • If there are duplicate house numbers in the data, investigate and remove the less likely node, or both nodes. For example, 4650 Ramsell Road is duplicated in the source data, but the easternmost copy is on the "odd" side of the street and between 4653 and 4663, so it's more likely to actually be 4651, 4655, 4657, 4659, or 4661. We have no way of knowing, so we can either delete it entirely or simply delete the housenumber tag and leave it as an address without a number for a future editor to review. (We may submit incomplete data, just not wrong data.) We then leave the westernmost copy alone, since 4650 fits neatly in between 4640/4644 and 4654/4660.
    • All known duplicates:
      • 4886 C 472 (one is a daycare the other is a church)
      • 5626 C THOMAS RD has unit numbers in the original data's Notes field
      • 301 CURRY ST
      • 401 HALL ST
      • 340 HEALD WAY has many buildings and many units per building, in the notes
      • 1908 LAUREL MANOR DR (one is a CELL TOWER)
      • 1950 LAUREL MANOR DR (each one has multiple units in a range, in the notes)
      • 6217 MEGGISON RD (one's note is "restroom/storage")
      • 6221 MEGGISON RD (one's note is "pavilion")
      • 6227 MEGGISON RD (one's note is "recreation center")
      • 102 NE 4TH AVE (one is a cell tower, the other a water tower)
      • 11750 NE 62ND TER (one's note is Pebble Springs retirement community building)
      • 13813 NE 136TH LOOP UNIT 1306
      • 8550 NE 138TH LN (each is a different building number)
      • 4650 RAMSELL RD
      • 400 RUTLAND ST (both say "church owned")
      • 308 SHAWN AVE (one says Wildwood Acres, the other Progress Energy Pump 29)
      • 413 S PINE ST
      • 2605 TRILLIUM RDG (one says bldg 1, the other says meter)
      • 2680 TRILLIUM RDG (bldg 3, meter, meter)
      • 13940 US 441 (different building names in the notes)
      • 702 WEBSTER ST (one is city of ww, the other retention pond)
  • Click upload
    • Make sure there are no erroneous Relations or other unwanted objects about to be uploaded.
    • Use a descriptive changeset message like "Roads/Addresses in The Villages #villagesimport"
    • Set the Source to be "Sumter County GIS"
    • You can easily copy-paste the below into the Settings tab:
      comment=Roads/Addresses in The Villages #villagesimport
      import=yes
      website=https://wiki.openstreetmap.org/wiki/The_Villages_Road_and_Address_Import
      source=Sumter County GIS
      source:url=https://gitlab.com/zyphlar/the-villages-import
  • Review imported data in Achavi or OSMCha to ensure it looks proper.
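Two of the checks in the steps above, renaming the truncated field names and spotting duplicate house numbers, could also be scripted against the exported GeoJSON before it is opened in JOSM. The following is a minimal sketch that works on an already-parsed GeoJSON dictionary; the function names are hypothetical, and the rename table only covers the three truncated keys mentioned in this document.

```python
from collections import Counter

# Truncated QGIS field names and the OSM keys they stand for.
RENAMES = {
    "addr:house": "addr:housenumber",
    "addr:stree": "addr:street",
    "addr:postc": "addr:postcode",
}


def rename_truncated_keys(geojson):
    """Rename truncated property keys in place and return the same object."""
    for feature in geojson.get("features", []):
        props = feature.get("properties") or {}
        feature["properties"] = {RENAMES.get(k, k): v for k, v in props.items()}
    return geojson


def duplicate_addresses(geojson):
    """Return sorted (housenumber, street, unit) keys that occur more than once."""
    counts = Counter()
    for feature in geojson.get("features", []):
        props = feature.get("properties") or {}
        number, street = props.get("addr:housenumber"), props.get("addr:street")
        if number and street:
            # A missing unit becomes "" so that keys stay comparable when sorted.
            counts[(str(number), street, str(props.get("addr:unit") or ""))] += 1
    return sorted(key for key, n in counts.items() if n > 1)


data = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "geometry": None,
     "properties": {"addr:house": "4650", "addr:stree": "Ramsell Road"}},
    {"type": "Feature", "geometry": None,
     "properties": {"addr:house": "4650", "addr:stree": "Ramsell Road"}},
]}
print(duplicate_addresses(rename_truncated_keys(data)))
```

A script like this only flags candidates; deciding which copy to keep still needs the manual review described above.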