Data Preparation
Goals
- Provide an overview of data needed by UrbanFootprint
- Describe the required fields for base data
- Walk through the steps used for SACOG’s base data preparation
- Work through an example of preparing data for a county
Data Needs
UrbanFootprint is a data-intensive application. The effort that goes into data collection, preparation, and review should not be underestimated.
Sample Data Set
A sample dataset for Elk Grove, CA is available and is covered in the README-developers.md and README-developers-windows.md instructions in the UrbanFootprint repo.
If you would like to use your own data, see the sections below.
Creating a New Data Set
The base data, also called the base canvas or existing conditions dataset, requires extensive data collection, processing, and review before use. UrbanFootprint requires that base conditions data adhere strictly to the data schema and reflect a detailed understanding of existing conditions at the parcel level. If you are working in a geographic area that has not had a prior installation of UrbanFootprint, it is unlikely that an existing dataset can be used without substantial effort in creating a base conditions dataset.
Scenario development has looser data requirements, but will require that you have an understanding of regional and local plans for the future, and have planned out goals for the scenario that can be translated onto a map at a parcel scale.
Environmental constraint layers influence the intensity of development that is possible in locations where these constraints are present.
Reference layers provide a visual reference to UrbanFootprint users while editing or visualizing scenarios.
The transportation module (and any other modules that build on its results) will require that you have substantial additional data derived from both regional transportation infrastructure GIS as well as a travel demand model.
Some of the other analytical modules also require climate data to run.
Data Types and Sources
Data for the Base Canvas
Data Type | Potential Source | Notes |
---|---|---|
Cadastral/tax parcel geographies | Tax Assessor, Local Jurisdiction, Regional COG/MPO | |
Existing land use (parcel land use codes) | Tax Assessor, Local Jurisdiction, Regional COG/MPO | |
Existing employment (2 or 4 digit NAICS codes) | Census LEHD, State or Regional COG/MPO | |
Dwelling Units by type (parcel scale) | Tax Assessor, Local Jurisdiction, Regional COG/MPO | Parcel-scale dwelling unit counts are acceptable; ideally, data is provided as Single Family/Multifamily counts |
Population/households | Census ACS 5-Year / Other (population synthesizer etc.) | |
Population Age Characteristics/Educational Attainment/Household Income ranges/Race-Ethnicity/Vehicle Ownership | Census ACS 5-Year / Other (population synthesizer etc.) | |
Parcel level Building Square Footage Data (even limited sample) | Tax Assessor, Local Jurisdiction, Regional COG/MPO | Used to calibrate building square footage estimates |
Any aggregate geography to be used as the unit of analysis (other than parcel) | Regional COG/MPO | |
Data for Scenario Development
Data Type | Potential Source | Notes |
---|---|---|
Priority Development Areas | Regional COG/MPO or Local Jurisdiction | |
Redevelopment Analysis Geographies | Regional COG/MPO or Local Jurisdiction | |
Planned Development (in the pipeline) locations and attributes | Regional COG/MPO or Local Jurisdiction | |
Transit Priority Zones | Regional COG/MPO or Local Jurisdiction | |
General Plan Parcels and attributes | Regional COG/MPO or Local Jurisdiction | |
Any existing RTP/SCS scenarios - geographies and attributes | Regional COG/MPO or Local Jurisdiction | |
Any other existing scenarios - geographies and attributes | Regional COG/MPO or Local Jurisdiction | |
Any other geography used to inform policy on distribution of growth | Regional COG/MPO or Local Jurisdiction |
Environmental Constraints
Data Type | Potential Source | Notes |
---|---|---|
Streams/Rivers | Regional COG/MPO | |
Wetlands | National Wetlands Inventory | |
Vernal Pools | National Wetlands Inventory | |
Priority Conservation Lands | Regional COG/MPO or Local Jurisdiction | |
Slope/Digital Elevation Model (DEM) | USGS or Regional COG/MPO | |
Flood Zones | FEMA or Regional COG/MPO | |
Sea Level Rise Zones | | |
Parks/Protected lands | CPAD Holdings |
Base Reference Layers
Data Type | Potential Source | Notes |
---|---|---|
County boundary(s) | Census TIGER | |
Jurisdiction Boundaries | Local Jurisdiction | |
Jurisdiction Sphere of Influence Boundaries | Regional COG/MPO or Local Jurisdiction | |
Regional Sub-Area Geographies | Regional COG/MPO | |
School/College/Universities | Regional COG/MPO, Local Jurisdiction, or Census TIGER | |
Health Care Facilities | Regional COG/MPO, Local Jurisdiction, or Census TIGER | |
Public/Institutional parcels | Tax Assessor, Local Jurisdiction, Regional COG/MPO | |
Indian Reservations/Tribal Areas | Census TIGER or Regional COG/MPO | |
Military Areas | Census TIGER or Regional COG/MPO | |
Open Space/Conservation Lands | Regional COG/MPO or Local Jurisdiction |
Transportation Data
Data Type | Potential Source | Notes |
---|---|---|
Transit stops (fixed guideway and others as available) | Regional COG/MPO or Transport District | |
Skim Matrices for each horizon year/policy scenario | Regional MPO or Transportation Modeling Consultants | |
Transportation network | Regional COG/MPO or Transport District | |
Road Network (Freeways, major roads, secondary roads) | Regional COG/MPO or Transport District | |
Street Intersection Points | Regional COG/MPO or Transport District | |
Railroads: light rail, commuter rail, and selected freight rail | Regional COG/MPO or Transport District | |
TAZ geography(s) | Regional COG/MPO or Transport District | |
Bike network (if available) | Regional COG/MPO or Transport District | |
Sidewalks (if available) | Regional COG/MPO or Transport District |
Analysis Reference Data
Data Type | Potential Source | Notes |
---|---|---|
Climate Zones (for energy modeling) | | |
Evapotranspiration Zones (for outdoor water modeling) | | |
Base Data Schema: SACOG
- The structure and field names are critical.
- There is a single table, which will be uploaded to PostGIS.
- For convenience, the discussion of fields is divided into groups:
  - Metadata and Geography
  - Paint Configuration
  - Parcel Areas/Types
  - Residential/Housing
  - Employment
  - Building Square Footage
  - Outdoor Irrigated Area
Metadata and Geography
Field Name | Description |
---|---|
id | UF unique id |
geography_id | original geometry id (from SACOG) |
wkb_geometry | PostgreSQL geometry field (will not be visible in ArcGIS) |
region_lu_code | SACOG land use code |
built_form_key | identifier (name) of the building type or place type |
created | date of data creation/import into UF system |
updated | date of data change or modification within UF system |
Paint Configuration
These fields are not used in the base features dataset, but are included to maintain an identical structure to the End State data.
Field Name | Description |
---|---|
dev_pct | development percent - proportion of geography receiving building type or place type application |
density_pct | density percent - proportional intensity of building type or place type application |
gross_net_pct | gross-to-net percent - proportion of developed acreage that receives building type or place type application |
clear_base_flag | boolean field to indicate clearance of development program (removal of all base year dwelling units and employees) |
redevelopment_flag | boolean field to indicate/track redevelopment on geography |
dirty_flag | used internally by UF during paint application |
Parcel Area/Type
Field Name | Description |
---|---|
intersection_density_sqmi | density of walkable street intersections in the geography (calculated as a weighted square mile density) |
acres_gross | gross/total acreage of the geography |
acres_parcel | gross/total acreage of the parcel(s) |
sqft_parcel | parcel square footage |
acres_parcel_res | residential acreage of the parcel |
acres_parcel_res_detsf_sl | acreage of parcel with single family small lot detached homes (<5500sf) |
acres_parcel_res_detsf_ll | acreage of parcel with single family large lot detached homes (>5500sf) |
acres_parcel_res_attsf | acreage of parcel with single family attached homes/townhomes |
acres_parcel_res_mf | acreage of parcel with multifamily housing |
acres_parcel_emp | acreage of parcel with employment |
acres_parcel_emp_off | acreage of parcel with office employment |
acres_parcel_emp_ret | acreage of parcel with retail employment |
acres_parcel_emp_ind | acreage of parcel with industrial employment |
acres_parcel_emp_ag | acreage of parcel with agricultural employment |
acres_parcel_emp_mixed | acreage of parcel with mixed employment uses |
acres_parcel_emp_military | acreage of parcel with military employment |
acres_parcel_mixed | acreage of parcel with mixed use (residential and employment) |
acres_parcel_mixed_w_off | acreage of mixed use parcels with residential and employment (includes office employment) |
acres_parcel_mixed_no_off | acreage of mixed use parcels with residential and employment (no office employment) |
acres_parcel_no_use | acreage of parcel with no use |
Residential and Housing
Field Name | Description |
---|---|
hh | households |
du | dwelling units |
du_detsf | detached single family dwelling units |
du_detsf_sl | detached single family small lot dwelling units |
du_detsf_ll | detached single family large lot dwelling units |
du_attsf | attached single family dwelling units |
du_mf | multifamily dwelling units |
du_mf2to4 | units in multifamily buildings with 2 to 4 dwelling units |
du_mf5p | units in multifamily buildings with 5 or more dwelling units |
Employment
Field Name | Description |
---|---|
emp | number of employees |
emp_ret | number of retail employees |
emp_retail_services | number of retail services employees |
emp_restaurant | number of restaurant employees |
emp_accommodation | number of accommodation employees |
emp_arts_entertainment | number of arts and entertainment employees |
emp_other_services | number of other services employees |
emp_off | number of office employees |
emp_office_services | number of office services employees |
emp_public_admin | number of public administration employees |
emp_education | number of education employees |
emp_medical_services | number of medical services employees |
emp_ind | number of industrial employees |
emp_manufacturing | number of manufacturing employees |
emp_wholesale | number of wholesale employees |
emp_transport_warehousing | number of transportation and warehousing employees |
emp_utilities | number of utilities employees |
emp_construction | number of construction employees |
emp_ag | number of agricultural/extraction employees |
emp_agriculture | number of agricultural employees |
emp_extraction | number of extraction employees |
emp_military | number of military employees |
Building Square Footage
Field Name | Description |
---|---|
bldg_sqft_detsf_sl | building square footage of detached single family small lot homes |
bldg_sqft_detsf_ll | building square footage of detached single family large lot homes |
bldg_sqft_attsf | building square footage of attached single family homes/townhomes |
bldg_sqft_mf | building square footage of multifamily units |
bldg_sqft_retail_services | building square footage of retail services |
bldg_sqft_restaurant | building square footage of restaurants |
bldg_sqft_accommodation | building square footage of accommodation |
bldg_sqft_arts_entertainment | building square footage of arts and entertainment |
bldg_sqft_other_services | building square footage of other services |
bldg_sqft_office_services | building square footage of office services |
bldg_sqft_public_admin | building square footage of public administration |
bldg_sqft_education | building square footage of education |
bldg_sqft_medical_services | building square footage of medical services |
bldg_sqft_transport_warehousing | building square footage of transportation and warehousing |
bldg_sqft_wholesale | building square footage of wholesale |
Outdoor Irrigated Area
Field Name | Description |
---|---|
residential_irrigated_sqft | residential outdoor irrigated square feet |
commercial_irrigated_sqft | commercial outdoor irrigated square feet |
Base Data Preparation: SACOG
Input Data
- SACOG parcel data
- SACOG Land Use
- Dwelling Units
- SACOG TAZ
- Census 2010 Blockgroups
- Census 2010 Tracts
Data Preparation: Topology
- Parcels must not overlap
- Clip the dataset to the county border
- Remove roads and waterbodies
Dwelling Units
SACOG Use Code | Dwelling Unit Type |
---|---|
Rural Residential | Single Family |
Farm Home | Single Family |
Very Low Density Res. | Single Family |
Low Density Res. | Single Family |
Large Lot Not Farm Home | Single Family |
Medium Density Res. | Multifamily |
Medium-High Density Res. | Multifamily |
High Density Res | Multifamily |
Urban Residential | Multifamily |
- Total DU = SACOG Parcel DU
- Controlled to TAZ totals
- Assign the DU type using the crosswalk above, and assign DU totals to du_detsf
- du_detsf_sl and du_detsf_ll are based on the square feet per dwelling unit (sf/du) calculation
- ACS rates for Attached SF, MF 2-4, and MF 5 plus are applied to all parcels with MF units
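The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not SACOG's actual procedure: the function names are hypothetical, the 5,500 sf threshold is taken from the Parcel Area/Type field descriptions, a lot of exactly 5,500 sf per unit is treated as large lot, single family units are assumed to go to du_detsf, and the ACS rates are passed in as a parameter because the source does not give their values.

```python
# Crosswalk from SACOG use code to dwelling unit type (from the table above).
DU_TYPE_BY_USE_CODE = {
    "Rural Residential": "Single Family",
    "Farm Home": "Single Family",
    "Very Low Density Res.": "Single Family",
    "Low Density Res.": "Single Family",
    "Large Lot Not Farm Home": "Single Family",
    "Medium Density Res.": "Multifamily",
    "Medium-High Density Res.": "Multifamily",
    "High Density Res": "Multifamily",
    "Urban Residential": "Multifamily",
}

# Assumption: small lot means fewer than 5,500 parcel square feet per unit.
SMALL_LOT_MAX_SQFT = 5500.0

DU_FIELDS = ("du_detsf", "du_detsf_sl", "du_detsf_ll",
             "du_attsf", "du_mf2to4", "du_mf5p")


def control_to_taz(parcel_du, taz_total):
    """Scale parcel DU counts so that they sum to the TAZ control total."""
    current = sum(parcel_du)
    if current == 0:
        return [0.0] * len(parcel_du)
    return [du * taz_total / current for du in parcel_du]


def split_by_rates(total, rates):
    """Split a total across fields in proportion to the given rates."""
    rate_sum = sum(rates.values())
    return {field: total * rate / rate_sum for field, rate in rates.items()}


def classify_parcel(use_code, du, sqft_parcel, mf_rates):
    """Return the dwelling unit fields for one parcel."""
    fields = {name: 0.0 for name in DU_FIELDS}
    fields["du"] = float(du)
    du_type = DU_TYPE_BY_USE_CODE.get(use_code)
    if du <= 0 or du_type is None:
        return fields
    if du_type == "Single Family":
        fields["du_detsf"] = float(du)
        small = sqft_parcel / du < SMALL_LOT_MAX_SQFT
        fields["du_detsf_sl" if small else "du_detsf_ll"] = float(du)
    else:
        # Multifamily parcels: apply the (hypothetical) ACS rates.
        fields.update(split_by_rates(du, mf_rates))
    return fields


# Hypothetical ACS rates, for illustration only.
example_rates = {"du_attsf": 0.2, "du_mf2to4": 0.3, "du_mf5p": 0.5}
print(classify_parcel("Low Density Res.", 2, 8000, example_rates))
print(control_to_taz([10, 30], 20))
```

Scaling every parcel by the same factor is only one way to control to TAZ totals; other allocation rules may suit your region better.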
Households
- HH from SACOG 2008
- DU from Parcel Data
- Occupancy rate = HH/DU
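As a rough sketch of the occupancy calculation, the snippet below computes the rate from household and dwelling unit totals. Applying that rate back to a parcel's dwelling units to estimate parcel households is an assumption on our part (the source gives only the ratio), and the function names are hypothetical.

```python
def occupancy_rate(hh_total, du_total):
    """Occupancy rate = HH / DU; returns 0.0 when there are no dwelling units."""
    if du_total <= 0:
        return 0.0
    return hh_total / du_total


def parcel_households(parcel_du, rate):
    """Assumed use of the rate: estimate parcel households from parcel DU."""
    return parcel_du * rate


# Illustrative numbers only.
rate = occupancy_rate(hh_total=950, du_total=1000)
print(rate, parcel_households(20, rate))
```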
Population
- Calculate the average household size by block group from census data
- Average HH size = pop/hh
- Then multiply the HH count in each parcel by the average HH size of its block group.
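The population steps can be illustrated with a short Python sketch. The data layout (dictionaries keyed by a block group id, parcels carrying a `block_group` key) is hypothetical and chosen only to keep the example self-contained.

```python
def average_hh_size_by_block_group(block_groups):
    """Map block group id -> pop / hh, using 0.0 where hh is zero."""
    sizes = {}
    for bg_id, counts in block_groups.items():
        hh = counts["hh"]
        sizes[bg_id] = counts["pop"] / hh if hh > 0 else 0.0
    return sizes


def assign_parcel_population(parcels, hh_sizes):
    """Return a copy of each parcel with pop = parcel hh * block group average HH size."""
    result = []
    for parcel in parcels:
        size = hh_sizes.get(parcel["block_group"], 0.0)
        result.append({**parcel, "pop": parcel["hh"] * size})
    return result


# Illustrative numbers only.
block_groups = {"bg1": {"pop": 2500, "hh": 1000}}
parcels = [{"id": 1, "block_group": "bg1", "hh": 4}]
print(assign_parcel_population(parcels, average_hh_size_by_block_group(block_groups)))
```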
Employment
- Parcel employment from SACOG 2008
- Crosswalk using the Employment Land Use Crosswalk table
- Use LEHD to disaggregate where needed (see Disaggregation below).
- Accommodation extracted using SACOG Employment Inventory
Employment Land Use Crosswalk
SACOG Category | UrbanFootprint Employment Category |
---|---|
EmpGov | Emp_Public_admin |
EmpOfc | Emp_Office_services |
EmpMed | Emp_Medical_services |
EmpEdu | Emp_Education |
EmpRet | Emp_Retail_services |
EmpFood | Emp_Restaurant |
EmpSvc | Emp_Entrec, Emp_Other_services, Emp_Accommodation |
EmpInd | Emp_Utilities, Emp_Transware, Emp_Warehouse, Emp_Wholesale, Emp_Construction, Emp_Manufacturing, Emp_Agriculture, Emp_Extract |
EmpOth | Emp_military |
Employment Processing and Source
UF Employment Sub Category | Method for Spatially Deriving Field at Parcel | SACSIM Category |
---|---|---|
Emp_Public | Direct Crosswalk from SACSIM Category | EmpGov |
Emp_Office | Direct Crosswalk from SACSIM Category | EmpOfc |
Emp_Medss | Direct Crosswalk from SACSIM Category | EmpMed |
Emp_Educ | Direct Crosswalk from SACSIM Category | EmpEdu |
Emp_Retail | Direct Crosswalk from SACSIM Category | EmpRet |
Emp_Restaurant | Direct Crosswalk from SACSIM Category | EmpFood |
Emp_Entrec | LEHD 2010 Near Imputed Rate (Block) | EmpSvc |
Emp_Other | LEHD 2010 Near Imputed Rate (Block) | EmpSvc |
Emp_Accommodation | SACOG Employment Inventory | EmpSvc |
Emp_Transware | LEHD 2010 Near Imputed Rate (Block) | EmpInd |
Emp_Warehouse | LEHD 2010 Near Imputed Rate (Block) | EmpInd |
Emp_Whole | LEHD 2010 Near Imputed Rate (Block) | EmpInd |
Emp_Constr | LEHD 2010 Near Imputed Rate (Block) | EmpInd |
Emp_Manuf | LEHD 2010 Near Imputed Rate (Block) | EmpInd |
Emp_Agriculture | LEHD 2010 Near Imputed Rate (Block) | EmpInd |
Emp_Extract | LEHD 2010 Near Imputed Rate (Block) | EmpInd |
Emp_AF | Direct Crosswalk from SACSIM Category | EmpOth |
Disaggregation
- This technique is used several times during data preparation.
- Calculate the proportion of each SACOG category that goes into each UF Employment Category.
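- Use the LEHD 2010 near imputed rate dataset as the basis for the disaggregation.
For example: %emp_entrec = 100 * emp_entrec / (emp_entrec + emp_other_services + emp_accommodation)
In the worked example below, the category shares from the less accurate dataset are applied to the total from the more accurate one.
Dataset 1 (higher accuracy): 95 employees
Dataset 2: 50 retail, 30 service, and 20 industrial employees.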
Total Emp | Ret. % | Ser. % | Ind. % | # Ret. | # Ser. | # Ind |
---|---|---|---|---|---|---|
95 | 50 | 30 | 20 | 47.5 | 28.5 | 19 |
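The same arithmetic can be written as a small Python function. This is a minimal sketch of proportional disaggregation; the function name and dictionary keys are illustrative, not part of UrbanFootprint.

```python
def disaggregate(total, reference_counts):
    """Split `total` across categories in proportion to `reference_counts`."""
    reference_total = sum(reference_counts.values())
    if reference_total == 0:
        raise ValueError("reference counts must not sum to zero")
    return {
        category: total * count / reference_total
        for category, count in reference_counts.items()
    }


# Dataset 1 supplies the trusted total; Dataset 2 supplies the shares.
shares = {"retail": 50, "service": 30, "industrial": 20}
print(disaggregate(95, shares))
# {'retail': 47.5, 'service': 28.5, 'industrial': 19.0}
```

The results are fractional, which is consistent with the numeric (rather than integer) employment columns in the base canvas table.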
Base Canvas Schema
Create a PostgreSQL database with a base_canvas table that has the following columns (it may be easiest to start with the sample base database provided in the UrbanFootprint repo).
psql -U postgres -c "CREATE DATABASE urbanfootprint_source;"
psql -U postgres urbanfootprint_source
urbanfootprint_source=# \d+ base_canvas
Table "public.base_canvas"
Column | Type | Modifiers | Storage | Stats target | Description
---------------------------------+--------------------------+-----------+----------+--------------+-------------
geography_id | integer | | plain | |
source_id | character varying | | extended | |
wkb_geometry | geometry | | main | |
region_lu_code | character varying(200) | | extended | |
built_form_key | character varying(200) | | extended | |
land_development_category | character varying(50) | | extended | |
acres_developable | numeric(14,4) | | main | |
developable_proportion | numeric(8,4) | | main | |
sqft_parcel | numeric(14,4) | | main | |
acres_gross | numeric(14,4) | | main | |
acres_parcel | numeric(14,4) | | main | |
acres_parcel_res | numeric(14,4) | | main | |
acres_parcel_emp | numeric(14,4) | | main | |
acres_parcel_mixed_use | numeric(14,4) | | main | |
acres_parcel_no_use | numeric(14,4) | | main | |
intersection_density_sqmi | numeric(14,1) | | main | |
pop | numeric(14,1) | | main | |
hh | numeric(14,1) | | main | |
du | numeric(14,1) | | main | |
du_detsf | numeric(14,1) | | main | |
du_attsf | numeric(14,1) | | main | |
du_mf | numeric(14,1) | | main | |
emp | numeric(14,1) | | main | |
emp_ret | numeric(14,1) | | main | |
emp_off | numeric(14,1) | | main | |
emp_pub | numeric(14,1) | | main | |
emp_ind | numeric(14,1) | | main | |
emp_ag | numeric(14,1) | | main | |
emp_military | numeric(14,1) | | main | |
du_detsf_ll | numeric(14,1) | | main | |
du_detsf_sl | numeric(14,1) | | main | |
du_mf2to4 | numeric(14,1) | | main | |
du_mf5p | numeric(14,1) | | main | |
emp_retail_services | numeric(14,1) | | main | |
emp_restaurant | numeric(14,1) | | main | |
emp_accommodation | numeric(14,1) | | main | |
emp_arts_entertainment | numeric(14,1) | | main | |
emp_other_services | numeric(14,1) | | main | |
emp_office_services | numeric(14,1) | | main | |
emp_public_admin | numeric(14,1) | | main | |
emp_education | numeric(14,1) | | main | |
emp_medical_services | numeric(14,1) | | main | |
emp_manufacturing | numeric(14,1) | | main | |
emp_wholesale | numeric(14,1) | | main | |
emp_transport_warehousing | numeric(14,1) | | main | |
emp_utilities | numeric(14,1) | | main | |
emp_construction | numeric(14,1) | | main | |
emp_agriculture | numeric(14,1) | | main | |
emp_extraction | numeric(14,1) | | main | |
bldg_sqft_detsf_sl | numeric(14,1) | | main | |
bldg_sqft_detsf_ll | numeric(14,1) | | main | |
bldg_sqft_attsf | numeric(14,1) | | main | |
bldg_sqft_mf | numeric(14,1) | | main | |
bldg_sqft_retail_services | numeric(14,1) | | main | |
bldg_sqft_restaurant | numeric(14,1) | | main | |
bldg_sqft_accommodation | numeric(14,1) | | main | |
bldg_sqft_arts_entertainment | numeric(14,1) | | main | |
bldg_sqft_other_services | numeric(14,1) | | main | |
bldg_sqft_office_services | numeric(14,1) | | main | |
bldg_sqft_public_admin | numeric(14,1) | | main | |
bldg_sqft_education | numeric(14,1) | | main | |
bldg_sqft_medical_services | numeric(14,1) | | main | |
bldg_sqft_transport_warehousing | numeric(14,1) | | main | |
bldg_sqft_wholesale | numeric(14,1) | | main | |
acres_parcel_res_detsf | numeric(14,1) | | main | |
acres_parcel_res_detsf_sl | numeric(14,1) | | main | |
acres_parcel_res_detsf_ll | numeric(14,1) | | main | |
acres_parcel_res_attsf | numeric(14,1) | | main | |
acres_parcel_res_mf | numeric(14,1) | | main | |
acres_parcel_emp_ret | numeric(14,1) | | main | |
acres_parcel_emp_off | numeric(14,1) | | main | |
acres_parcel_emp_pub | numeric(14,1) | | main | |
acres_parcel_emp_ind | numeric(14,1) | | main | |
acres_parcel_emp_ag | numeric(14,1) | | main | |
acres_parcel_emp_military | numeric(14,1) | | main | |
residential_irrigated_sqft | numeric(14,1) | | main | |
commercial_irrigated_sqft | numeric(14,1) | | main | |
created | timestamp with time zone | | plain | |
updated | timestamp with time zone | | plain | |
dev_pct | numeric(8,4) | | main | |
density_pct | numeric(8,4) | | main | |
gross_net_pct | numeric(8,4) | | main | |
dirty_flag | boolean | | plain | |
clear_flag | boolean | | plain | |
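Because the column names matter, it can help to check a prepared table's columns against the expected list before loading. The sketch below is one hedged way to do that in plain Python; `REQUIRED_COLUMNS` is a deliberately partial, illustrative subset of the listing above, and `missing_columns` is a hypothetical helper, not an UrbanFootprint function.

```python
# Illustrative subset of the base_canvas columns listed above.
REQUIRED_COLUMNS = (
    "geography_id",
    "wkb_geometry",
    "region_lu_code",
    "built_form_key",
    "acres_gross",
    "acres_parcel",
    "pop",
    "hh",
    "du",
    "emp",
)


def missing_columns(actual_columns, required=REQUIRED_COLUMNS):
    """Return the required column names absent from `actual_columns`, sorted."""
    present = {name.lower() for name in actual_columns}
    return sorted(name for name in required if name not in present)


# Example: a table that lacks the pop and emp columns.
print(missing_columns(["geography_id", "wkb_geometry", "region_lu_code",
                       "built_form_key", "acres_gross", "acres_parcel",
                       "hh", "du"]))
```

In practice the list of actual columns would come from your own table, for example from the `\d+ base_canvas` output or a query against `information_schema.columns`.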
Keep the Goal in Mind
- Data for your region will be unique
- This process should serve as a starting point for developing your data, not a fixed recipe.