Take-home_Ex03

Author

Yashica

Published

March 18, 2023

Modified

March 26, 2023

1.0 Overview

1.1 Background

Housing is an essential component of household wealth worldwide. Buying a home has always been a major investment for most people. The price of housing is affected by many factors. Some of them are global in nature, such as the general economy of a country or the inflation rate. Others are more specific to the properties themselves. These factors can be further divided into structural and locational factors. Structural factors are variables related to the properties themselves, such as the size, fittings, and tenure of the property. Locational factors are variables related to the neighbourhood of the properties, such as proximity to childcare centres, public transport services and shopping centres.

Conventionally, predictive models of housing resale prices were built using the Ordinary Least Squares (OLS) method. However, this method fails to take into consideration that spatial autocorrelation and spatial heterogeneity exist in geographic data sets such as housing transactions. In the presence of spatial autocorrelation, the OLS estimation of predictive housing resale pricing models could lead to biased, inconsistent, or inefficient results (Anselin 1998). In view of this limitation, Geographically Weighted Models were introduced for calibrating predictive models of housing resale prices.
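The contrast between the two model families can be written down compactly. The notation below is an assumption introduced for illustration (it does not appear in the exercise brief): $y_i$ is the resale price of transaction $i$, $x_{ik}$ is its $k$-th predictor, and $(u_i, v_i)$ are its coordinates. In the global OLS model every transaction shares one set of coefficients, while in a geographically weighted regression the coefficients are allowed to vary with location.

```latex
% Global OLS model: one coefficient vector for the whole study area
y_i = \beta_0 + \sum_{k=1}^{p} \beta_k \, x_{ik} + \varepsilon_i

% Geographically weighted regression: coefficients depend on location (u_i, v_i)
y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i) \, x_{ik} + \varepsilon_i
```

Letting the coefficients vary over space is what allows the second form to capture spatial heterogeneity that a single global fit averages away.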

1.2 Objectives

In this take-home exercise, you are tasked to predict HDB resale prices at the sub-market level (i.e. HDB 3-room, HDB 4-room and HDB 5-room) for the months of January and February 2023 in Singapore. The predictive models must be built by using the conventional OLS method and GWR methods. You are also required to compare the performance of the conventional OLS method versus the geographically weighted methods.

1.3 The Data

For the purpose of this take-home exercise, HDB Resale Flat Prices provided by Data.gov.sg should be used as the core data set. The study should focus on either three-room, four-room or five-room flats, and the transaction period should be from 1st January 2021 to 31st December 2022. The test data should be the January and February 2023 resale prices.
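As a rough sketch of how this split could be expressed, the snippet below separates a toy table into training and test periods. The column names (month, flat_type, resale_price), the "YYYY-MM" month format, the "4 ROOM" label, and all prices are assumptions made up for illustration; check them against the actual download before reusing the code.

```r
# Toy stand-in for the resale data; values are invented for illustration
resale <- data.frame(
  month = c("2020-12", "2021-01", "2022-12", "2023-01", "2023-02", "2023-03"),
  flat_type = c("4 ROOM", "4 ROOM", "4 ROOM", "4 ROOM", "3 ROOM", "4 ROOM"),
  resale_price = c(400000, 410000, 520000, 530000, 390000, 540000),
  stringsAsFactors = FALSE
)

# Keep one sub-market (assumed label)
four_room <- resale[resale$flat_type == "4 ROOM", ]

# "YYYY-MM" strings sort chronologically, so plain comparison works here
train_data <- four_room[four_room$month >= "2021-01" & four_room$month <= "2022-12", ]

# Test period: January and February 2023 only
test_data <- four_room[four_room$month %in% c("2023-01", "2023-02"), ]
```

Splitting by period rather than at random keeps the test set strictly later than the training set, which matches the forecasting task described above.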

Below is a list of recommended predictors to consider. However, students are free to include other appropriate independent variables.

Structural Factors:

  • Area of the unit
  • Floor level
  • Remaining lease
  • Age of the unit
  • Main Upgrading Program (MUP) completed (optional)

Locational Factors:

  • Proximity to CBD
  • Proximity to eldercare
  • Proximity to foodcourt/hawker centres
  • Proximity to MRT
  • Proximity to park
  • Proximity to good primary school
  • Proximity to shopping mall
  • Proximity to supermarket
  • Number of kindergartens within 350m
  • Number of childcare centres within 350m
  • Number of bus stops within 350m
  • Number of primary schools within 1km
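The two kinds of locational predictor listed above (distance to the nearest facility, and a count of facilities within a radius) could be derived along the following lines. This is only a sketch on invented points: the objects flats and amenities and the column names prox_amenity and within_350m are hypothetical, and the coordinates are placed on a straight line so the distances are easy to verify by hand.

```r
library(sf)

# Two hypothetical flats and three hypothetical amenities, in SVY21 metres
flats <- st_as_sf(data.frame(id = 1:2, x = c(0, 1000), y = c(0, 0)),
                  coords = c("x", "y"), crs = 3414)
amenities <- st_as_sf(data.frame(x = c(100, 300, 1200), y = c(0, 0, 0)),
                      coords = c("x", "y"), crs = 3414)

# Flat-by-amenity distance matrix, with the units attribute dropped
dist_m <- units::drop_units(st_distance(flats, amenities))

# Proximity: distance to the nearest amenity
flats$prox_amenity <- apply(dist_m, 1, min)

# Count: number of amenities within 350 m
flats$within_350m <- rowSums(dist_m <= 350)
```

Working in a projected CRS such as SVY21 means the distances come out in metres, so the 350 m and 1 km thresholds can be applied directly.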

2.0 Installing R Packages & Importing Data

2.1 Packages Used

  • sf
  • spdep
  • GWmodel
  • SpatialML: provides grf.bw to find the optimal bandwidth for Geographically Weighted Random Forest (https://search.r-project.org/CRAN/refmans/SpatialML/html/grf.bw.html)
  • tmap
  • rsample
  • Metrics
  • tidyverse
  • olsrr
  • ggpubr
  • gtsummary
  • lwgeom

2.2 Installing Packages

pacman::p_load(sf, spdep, GWmodel, SpatialML, 
               tmap, rsample, Metrics, tidyverse, olsrr, ggpubr, gtsummary, lwgeom, httr, jsonlite, dplyr, geojsonsf, rvest, corrplot, stats)

3.0 Importing Data

3.1 Importing Geospatial Data

sg_area = st_read(
  dsn = "data/geospatial", 
  layer = "MPSZ-2019")
Reading layer `MPSZ-2019' from data source 
  `/Users/yashica/Desktop/xtc0/IS415-GAA/Take-home_Ex/Take_home_Ex03/data/geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS:  WGS 84
glimpse(sg_area)
Rows: 332
Columns: 7
$ SUBZONE_N  <chr> "MARINA EAST", "INSTITUTION HILL", "ROBERTSON QUAY", "JURON…
$ SUBZONE_C  <chr> "MESZ01", "RVSZ05", "SRSZ01", "WISZ01", "MUSZ02", "MPSZ05",…
$ PLN_AREA_N <chr> "MARINA EAST", "RIVER VALLEY", "SINGAPORE RIVER", "WESTERN …
$ PLN_AREA_C <chr> "ME", "RV", "SR", "WI", "MU", "MP", "WI", "WI", "SI", "SI",…
$ REGION_N   <chr> "CENTRAL REGION", "CENTRAL REGION", "CENTRAL REGION", "WEST…
$ REGION_C   <chr> "CR", "CR", "CR", "WR", "CR", "CR", "WR", "WR", "CR", "CR",…
$ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((103.8802 1...., MULTIPOLYGON (…

From the output message, we can tell that:

  • The geometry type is MULTIPOLYGON.
  • There are 332 features and 6 fields.
  • The data uses the WGS 84 geodetic CRS, so we have to transform it to a projected coordinate system (SVY21, EPSG:3414).
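A minimal sketch of such a reprojection is shown below on a single point rather than on sg_area itself, so that it runs without the data files. The object names pt_wgs84 and pt_svy21 are placeholders; the same st_transform() call would be applied to the full layer.

```r
library(sf)

# A single longitude/latitude point in WGS 84 (EPSG:4326)
pt_wgs84 <- st_sfc(st_point(c(103.851784, 1.287953)), crs = 4326)

# Reproject to SVY21 / Singapore TM (EPSG:3414), whose units are metres
pt_svy21 <- st_transform(pt_wgs84, crs = 3414)

# Confirm that the CRS changed from geographic to projected
st_is_longlat(pt_wgs84)
st_is_longlat(pt_svy21)
st_crs(pt_svy21)$epsg
```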

busstop_sf <- st_read(dsn = "data/geospatial/data_extracted/BusStop", layer="BusStop")
Reading layer `BusStop' from data source 
  `/Users/yashica/Desktop/xtc0/IS415-GAA/Take-home_Ex/Take_home_Ex03/data/geospatial/data_extracted/BusStop' 
  using driver `ESRI Shapefile'
Simple feature collection with 5159 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
glimpse(busstop_sf)
Rows: 5,159
Columns: 4
$ BUS_STOP_N <chr> "22069", "32071", "44331", "96081", "11561", "66191", "2338…
$ BUS_ROOF_N <chr> "B06", "B23", "B01", "B05", "B05", "B03", "B02A", "B02", "B…
$ LOC_DESC   <chr> "OPP CEVA LOGISTICS", "AFT TRACK 13", "BLK 239", "GRACE IND…
$ geometry   <POINT [m]> POINT (13576.31 32883.65), POINT (13228.59 44206.38),…
childcare_sf <- st_read(dsn = "data/geospatial/data_extracted/childcare", layer="ChildcareServices")
Reading layer `ChildcareServices' from data source 
  `/Users/yashica/Desktop/xtc0/IS415-GAA/Take-home_Ex/Take_home_Ex03/data/geospatial/data_extracted/childcare' 
  using driver `ESRI Shapefile'
Simple feature collection with 1545 features and 2 fields
Geometry type: POINT
Dimension:     XYZ
Bounding box:  xmin: 11203.01 ymin: 25667.6 xmax: 45404.24 ymax: 49300.88
z_range:       zmin: 0 zmax: 0
Projected CRS: SVY21 / Singapore TM
glimpse(childcare_sf)
Rows: 1,545
Columns: 3
$ Name       <chr> "kml_1", "kml_2", "kml_3", "kml_4", "kml_5", "kml_6", "kml_…
$ Descriptio <chr> "<center><table><tr><th colspan='2' align='center'><em>Attr…
$ geometry   <POINT [m]> POINT Z (27976.73 45716.7 0), POINT Z (25824 29900.09…
eldercare_sf <- st_read(dsn = "data/geospatial/data_extracted/ELDERCARE", layer="ELDERCARE")
Reading layer `ELDERCARE' from data source 
  `/Users/yashica/Desktop/xtc0/IS415-GAA/Take-home_Ex/Take_home_Ex03/data/geospatial/data_extracted/eldercare' 
  using driver `ESRI Shapefile'
Simple feature collection with 133 features and 18 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 14481.92 ymin: 28218.43 xmax: 41665.14 ymax: 46804.9
Projected CRS: SVY21
glimpse(eldercare_sf)
Rows: 133
Columns: 19
$ OBJECTID   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, …
$ ADDRESSBLO <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSBUI <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSPOS <chr> "601318", "462509", "640190", "190005", "160044", "160117",…
$ ADDRESSSTR <chr> "318A Jurong East Avenue 1 #02-308", "Blk 509B Bedok North …
$ ADDRESSTYP <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ DESCRIPTIO <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ HYPERLINK  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ LANDXADDRE <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ LANDYADDRE <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ NAME       <chr> "Yuhua Senior Activity Centre", "THK SAC @ Kaki Bukit", "TH…
$ PHOTOURL   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSFLO <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ INC_CRC    <chr> "2B0DB92FDD914FFC", "82728FA30612F3FD", "DE7A8D4EA0BD1D9B",…
$ FMEL_UPD_D <date> 2016-07-28, 2016-07-28, 2016-07-28, 2016-07-28, 2016-07-28…
$ ADDRESSUNI <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ X_ADDR     <dbl> 16614.08, 38803.81, 14481.92, 31505.35, 27218.35, 27278.94,…
$ Y_ADDR     <dbl> 36639.12, 35098.78, 36357.61, 31853.52, 30135.49, 29350.17,…
$ geometry   <POINT [m]> POINT (16614.08 36639.12), POINT (38803.81 35098.78),…
primaryschools <- read_csv( "data/geospatial/data_extracted/primaryschools/general-information-of-schools.csv")
glimpse(primaryschools)
Rows: 346
Columns: 31
$ school_name        <chr> "ADMIRALTY PRIMARY SCHOOL", "ADMIRALTY SECONDARY SC…
$ url_address        <chr> "https://admiraltypri.moe.edu.sg/", "http://www.adm…
$ address            <chr> "11   WOODLANDS CIRCLE", "31   WOODLANDS CRESCENT",…
$ postal_code        <chr> "738907", "737916", "768643", "768928", "579646", "…
$ telephone_no       <chr> "63620598", "63651733", "67592906", "67585384", "64…
$ telephone_no_2     <chr> "na", "63654596", "na", "na", "na", "na", "na", "na…
$ fax_no             <chr> "63627512", "63652774", "67592927", "67557778", "64…
$ fax_no_2           <chr> "na", "na", "na", "na", "na", "na", "na", "na", "na…
$ email_address      <chr> "ADMIRALTY_PS@MOE.EDU.SG", "Admiralty_SS@moe.edu.sg…
$ mrt_desc           <chr> "Admiralty Station", "ADMIRALTY MRT", "Yishun", "CA…
$ bus_desc           <chr> "TIBS 965, 964, 913", "904", "Yishun Ring Road - 81…
$ principal_name     <chr> "MR PEK WEE HAUR", "MR LAM YUI- P'NG", "MISS ONG LE…
$ first_vp_name      <chr> "MDM CHUA MUI LING", "MR NG SONG LIM STEVEN", "MADA…
$ second_vp_name     <chr> "MDM NUR SABARIAH BTE MOHD IBRAHIM", "MR SHEIK ALAU…
$ third_vp_name      <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NU…
$ fourth_vp_name     <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NU…
$ fifth_vp_name      <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NU…
$ sixth_vp_name      <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NU…
$ dgp_code           <chr> "WOODLANDS", "WOODLANDS", "YISHUN", "YISHUN", "BISH…
$ zone_code          <chr> "NORTH", "NORTH", "NORTH", "NORTH", "SOUTH", "SOUTH…
$ type_code          <chr> "GOVERNMENT SCHOOL", "GOVERNMENT SCHOOL", "GOVERNME…
$ nature_code        <chr> "CO-ED SCHOOL", "CO-ED SCHOOL", "CO-ED SCHOOL", "CO…
$ session_code       <chr> "FULL DAY", "SINGLE SESSION", "SINGLE SESSION", "SI…
$ mainlevel_code     <chr> "PRIMARY", "SECONDARY", "PRIMARY", "SECONDARY", "PR…
$ sap_ind            <chr> "No", "No", "No", "No", "Yes", "No", "No", "No", "N…
$ autonomous_ind     <chr> "No", "No", "No", "No", "No", "No", "No", "No", "Ye…
$ gifted_ind         <chr> "No", "No", "No", "No", "No", "No", "No", "No", "No…
$ ip_ind             <chr> "No", "No", "No", "No", "No", "No", "No", "No", "No…
$ mothertongue1_code <chr> "Chinese", "Chinese", "Chinese", "Chinese", "Chines…
$ mothertongue2_code <chr> "Malay", "Malay", "Malay", "Malay", "na", "Malay", …
$ mothertongue3_code <chr> "Tamil", "Tamil", "Tamil", "Tamil", "na", "Tamil", …

We see that this dataset contains schools at many different levels, not just primary schools, so we keep only the rows where mainlevel_code is "PRIMARY".

primaryschools <- primaryschools %>%
  filter(mainlevel_code == "PRIMARY") %>%
  select(school_name, address, postal_code, mainlevel_code)

Let’s create a list storing unique postal codes of primary schools.

prisch_list <- sort(unique(primaryschools$postal_code))

Now, let’s define a function that retrieves coordinates by querying the OneMap search API once for each postal code or place name in a list.

get_coords <- function(add_list){
  
  # Create a data frame to store all retrieved coordinates
  postal_coords <- data.frame()
    
  for (i in add_list){
    #print(i)

    r <- GET('https://developers.onemap.sg/commonapi/search?',
           query=list(searchVal=i,
                     returnGeom='Y',
                     getAddrDetails='Y'))
    data <- fromJSON(rawToChar(r$content))
    found <- data$found
    res <- data$results
    
    # Create a new data frame for each address
    new_row <- data.frame()
    
    # If single result, append 
    if (found == 1){
      postal <- res$POSTAL 
      lat <- res$LATITUDE
      lng <- res$LONGITUDE
      new_row <- data.frame(address= i, postal = postal, latitude = lat, longitude = lng)
    }
    
    # If multiple results, drop NIL and append top 1
    else if (found > 1){
      # Remove those with NIL as postal
      res_sub <- res[res$POSTAL != "NIL", ]
      
      # Set as NA first if no Postal
      if (nrow(res_sub) == 0) {
          new_row <- data.frame(address= i, postal = NA, latitude = NA, longitude = NA)
      }
      
      else{
        top1 <- head(res_sub, n = 1)
        postal <- top1$POSTAL 
        lat <- top1$LATITUDE
        lng <- top1$LONGITUDE
        new_row <- data.frame(address= i, postal = postal, latitude = lat, longitude = lng)
      }
    }

    else {
      new_row <- data.frame(address= i, postal = NA, latitude = NA, longitude = NA)
    }
    
    # Add the row
    postal_coords <- rbind(postal_coords, new_row)
  }
  return(postal_coords)
}
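The branching inside get_coords() can be exercised without any network access. The sketch below uses a hypothetical helper, pick_result(), that mirrors the same three cases (one hit, several hits, no hit) on a hand-made response; the helper name and the mock latitude/longitude values are invented for illustration and are not part of the OneMap API.

```r
# Hypothetical helper mirroring the selection logic of get_coords()
pick_result <- function(query, found, res) {
  no_match <- data.frame(address = query, postal = NA, latitude = NA, longitude = NA)
  if (found == 1) {
    hit <- res                                      # single result: use it as-is
  } else if (found > 1) {
    hit <- head(res[res$POSTAL != "NIL", ], n = 1)  # drop NIL postals, keep top 1
    if (nrow(hit) == 0) return(no_match)
  } else {
    return(no_match)                                # nothing found
  }
  data.frame(address = query, postal = hit$POSTAL,
             latitude = hit$LATITUDE, longitude = hit$LONGITUDE)
}

# Mock parsed response (coordinate values are made up)
mock_res <- data.frame(
  POSTAL = c("NIL", "738907"),
  LATITUDE = c("1.44", "1.4426"),
  LONGITUDE = c("103.80", "103.8003"),
  stringsAsFactors = FALSE
)

one  <- pick_result("738907", found = 2, res = mock_res)
none <- pick_result("000000", found = 0, res = mock_res[0, ])
```

Separating the selection rule from the HTTP call in this way makes it easier to check edge cases, such as responses where every candidate has a "NIL" postal code.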
prisch_coords <- get_coords(prisch_list)

Here, we check whether the relevant columns contain any NA values with the is.na() function of base R.

prisch_coords[(is.na(prisch_coords$postal) | is.na(prisch_coords$latitude) | is.na(prisch_coords$longitude)), ]
[1] address   postal    latitude  longitude
<0 rows> (or 0-length row.names)
prisch_coords = prisch_coords[c("postal","latitude", "longitude")]
pri_sch <- left_join(primaryschools, prisch_coords, by = c('postal_code' = 'postal'))
primaryschools_sf <- st_as_sf(pri_sch,
                    coords = c("longitude", 
                               "latitude"),
                    crs=4326) %>%
  st_transform(crs = 3414)
primaryschools_sf
Simple feature collection with 183 features and 4 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 11801.94 ymin: 28603.93 xmax: 42410.51 ymax: 48703.49
Projected CRS: SVY21 / Singapore TM
# A tibble: 183 × 5
   school_name                 address posta…¹ mainl…²            geometry
 * <chr>                       <chr>   <chr>   <chr>           <POINT [m]>
 1 ADMIRALTY PRIMARY SCHOOL    11   W… 738907  PRIMARY (24296.63 47144.77)
 2 AHMAD IBRAHIM PRIMARY SCHO… 10   Y… 768643  PRIMARY (27936.78 46125.16)
 3 AI TONG SCHOOL              100  B… 579646  PRIMARY (27966.81 38071.92)
 4 ALEXANDRA PRIMARY SCHOOL    2A   P… 159016  PRIMARY  (26964.86 30396.5)
 5 ANCHOR GREEN PRIMARY SCHOOL 31   A… 544969  PRIMARY (34022.25 41380.93)
 6 ANDERSON PRIMARY SCHOOL     19   A… 569785  PRIMARY (28898.48 40690.43)
 7 ANG MO KIO PRIMARY SCHOOL   20   A… 569920  PRIMARY (28710.77 38969.81)
 8 ANGLO-CHINESE SCHOOL (JUNI… 16   W… 227988  PRIMARY (28916.32 32403.75)
 9 ANGLO-CHINESE SCHOOL (PRIM… 50   B… 309918  PRIMARY (28225.54 33442.72)
10 ANGSANA PRIMARY SCHOOL      3    T… 529366  PRIMARY (41160.86 36732.32)
# … with 173 more rows, and abbreviated variable names ¹​postal_code,
#   ²​mainlevel_code

Make sure to click “Tools” -> “Install Packages” to install the “geojsonsf” package.

mrt_sf <- st_read(dsn = "data/geospatial/data_extracted/TrainStation/lta-mrt-station-exit-geojson.geojson")
Reading layer `lta-mrt-station-exit-geojson' from data source 
  `/Users/yashica/Desktop/xtc0/IS415-GAA/Take-home_Ex/Take_home_Ex03/data/geospatial/data_extracted/TrainStation/lta-mrt-station-exit-geojson.geojson' 
  using driver `GeoJSON'
Simple feature collection with 474 features and 2 fields
Geometry type: POINT
Dimension:     XYZ
Bounding box:  xmin: 103.6368 ymin: 1.264972 xmax: 103.9893 ymax: 1.449157
z_range:       zmin: 0 zmax: 0
Geodetic CRS:  WGS 84
glimpse(mrt_sf)
Rows: 474
Columns: 3
$ Name        <chr> "kml_1", "kml_2", "kml_3", "kml_4", "kml_5", "kml_6", "kml…
$ Description <chr> "<center><table><tr><th colspan='2' align='center'><em>Att…
$ geometry    <POINT [°]> POINT Z (103.8709 1.338511 0), POINT Z (103.8705 1.3…

mrt_sf has its dimensions listed as ‘XYZ’: it has a z-dimension, though as we can see from the z_range, both zmin and zmax are 0. As it is irrelevant to our analysis, we drop it with st_zm(), a function that drops the Z (or M) dimension from feature geometries and resets the classes appropriately.
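As a small illustration of what st_zm() does, the sketch below builds a single three-dimensional point and drops its Z value. The object names pt_z and pt_xy are placeholders; the coordinates are taken from the first MRT exit in the output above.

```r
library(sf)

# One POINT Z geometry with a zero Z value, as in mrt_sf
pt_z <- st_sfc(st_point(c(103.8709, 1.338511, 0)), crs = 4326)

# Drop the Z dimension; the geometry class changes from XYZ to XY
pt_xy <- st_zm(pt_z)

class(pt_z[[1]])[1]
class(pt_xy[[1]])[1]
```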

# drops the Z-dimension from our dataframes
mrt_sf <- st_zm(mrt_sf)
mrt_sf
Simple feature collection with 474 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 103.6368 ymin: 1.264972 xmax: 103.9893 ymax: 1.449157
Geodetic CRS:  WGS 84
First 10 features:
     Name
1   kml_1
2   kml_2
3   kml_3
4   kml_4
5   kml_5
6   kml_6
7   kml_7
8   kml_8
9   kml_9
10 kml_10
                                                                                                                                                                                                                                                                                                                                                  Description
1  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit A</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>ACFD572863DE422D</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
2  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit B</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>328763A3290E3CC8</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
3  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit B</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>DD4FAEF984D96A47</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
4  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit C</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>D92B99AC0FD16F8B</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
5  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit B</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>D2E20481ED62E439</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
6  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit A</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>7E887806CF052F4E</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
7  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit C</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>C54A67D01293867F</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
8  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit B</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>C87156CBBF363974</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
9  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit A</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>C8C0EAA729F06B05</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
10 <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit C</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>8A21E3735E9C4992</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
                    geometry
1  POINT (103.8709 1.338511)
2  POINT (103.8705 1.338583)
3  POINT (103.8619 1.319235)
4  POINT (103.8687 1.331067)
5  POINT (103.8693 1.331148)
6  POINT (103.8384 1.300028)
7  POINT (103.7876 1.299715)
8  POINT (103.8505 1.296046)
9  POINT (103.8513 1.296859)
10 POINT (103.8502 1.297014)
kindergarten_sf <- st_read(dsn = "data/geospatial/data_extracted/kindergartens", layer="KINDERGARTENS")
Reading layer `KINDERGARTENS' from data source 
  `/Users/yashica/Desktop/xtc0/IS415-GAA/Take-home_Ex/Take_home_Ex03/data/geospatial/data_extracted/kindergartens' 
  using driver `ESRI Shapefile'
Simple feature collection with 448 features and 15 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 11909.7 ymin: 25596.33 xmax: 43395.47 ymax: 48562.06
Projected CRS: SVY21
glimpse(kindergarten_sf)
Rows: 448
Columns: 16
$ ADDRESSBLO <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSBUI <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSFLO <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSPOS <chr> "560644", "600251", "600317", "671455", "670528", "670620",…
$ ADDRESSSTR <chr> "644 Ang Mo Kio Ave 4  #01-850 S(560644)", "251 Jurong East…
$ ADDRESSTYP <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ DESCRIPTIO <chr> "Kindergartens", "Kindergartens", "Kindergartens", "Kinderg…
$ HYPERLINK  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ LANDXADDRE <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ LANDYADDRE <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ NAME       <chr> "PCF Sparkletots Preschool @ Yio Chu Kang Blk 644 (KN)", "P…
$ PHOTOURL   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ INC_CRC    <chr> "904D106E26156265", "F735342764BD6BCC", "564523E27D221C4D",…
$ FMEL_UPD_D <date> 2020-08-13, 2020-08-13, 2020-08-13, 2020-08-13, 2020-08-13…
$ ADDRESSUNI <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ geometry   <POINT [m]> POINT (28847.97 40124.9), POINT (17578.65 36146.17), …
hawkercentre_sf <- st_read(dsn = "data/geospatial/data_extracted/hawkercentre", layer="HAWKERCENTRE")
Reading layer `HAWKERCENTRE' from data source 
  `/Users/yashica/Desktop/xtc0/IS415-GAA/Take-home_Ex/Take_home_Ex03/data/geospatial/data_extracted/hawkercentre' 
  using driver `ESRI Shapefile'
Simple feature collection with 125 features and 21 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12874.19 ymin: 28355.97 xmax: 45241.4 ymax: 47872.53
Projected CRS: SVY21
glimpse(hawkercentre_sf)
Rows: 125
Columns: 22
$ ADDRESSBLO <chr> "630", "16", "29", "38A", "166", "221A/B", "665", "163", "1…
$ STATUS     <chr> "Existing", "Existing", "Existing", "Existing", "Existing",…
$ CLEANINGST <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSUNI <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSFLO <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ HYPERLINK  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ INFO_ON_CO <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ AWARDED_DA <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ LANDYADDRE <dbl> 35039.64, 33645.70, 33497.85, 30136.92, 32184.16, 36373.79,…
$ CLEANINGEN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ PHOTOURL   <chr> "http://www.nea.gov.sg/images/default-source/Hawker-Centres…
$ DESCRIPTIO <chr> "HUP Rebuilding", "HUP Standard Upgrading", "HUP Standard U…
$ NAME       <chr> "Bedok Reservoir Road Blk 630", "Bedok South Road Blk 16", …
$ ADDRESSTYP <chr> "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", NA, "I", …
$ ADDRESSBUI <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ LANDXADDRE <dbl> 36985.00, 39376.14, 31305.63, 27336.64, 30619.27, 14587.57,…
$ ADDRESSSTR <chr> "Bedok Reservoir Road", "Bedok South Road", "Bendemeer Road…
$ ADDRESSPOS <chr> "470630", "460016", "330029", "169982", "208877", "641221",…
$ IMPLEMENTA <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ INC_CRC    <chr> "BBA7FF2BCA329EE8", "483F3877B4EE039F", "A7DB5A3EB9F35DE3",…
$ FMEL_UPD_D <date> 2021-10-25, 2021-10-25, 2021-10-25, 2021-10-25, 2021-10-25…
$ geometry   <POINT [m]> POINT (36985 35039.64), POINT (39376.14 33645.7), POI…
nationalparks_sf <- st_read(dsn = "data/geospatial/data_extracted/nationalparks", layer="NATIONALPARKS")
Reading layer `NATIONALPARKS' from data source 
  `/Users/yashica/Desktop/xtc0/IS415-GAA/Take-home_Ex/Take_home_Ex03/data/geospatial/data_extracted/nationalparks' 
  using driver `ESRI Shapefile'
Simple feature collection with 352 features and 15 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12373.11 ymin: 21869.93 xmax: 46735.95 ymax: 49231.09
Projected CRS: SVY21
glimpse(nationalparks_sf)
Rows: 352
Columns: 16
$ ADDRESSBLO <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSBUI <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSTYP <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ HYPERLINK  <chr> NA, NA, NA, NA, NA, NA, "www.nparks.gov.sg/gardens-parks-an…
$ LANDXADDRE <dbl> 29594.30, 28695.60, 30676.61, 39994.09, 40813.11, 37385.95,…
$ LANDYADDRE <dbl> 29323.41, 39413.70, 41137.35, 39355.59, 33764.61, 32814.41,…
$ NAME       <chr> "Telok Ayer Green", "Mayflower Crescent Playground", "Sunri…
$ PHOTOURL   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSPOS <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ DESCRIPTIO <chr> "Bounded by Amoy Street, Boon Tat Street and Telok Ayer Str…
$ ADDRESSSTR <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADDRESSFLO <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ INC_CRC    <chr> "07CFE6567539200A", "B01AE2FF8B58F5CA", "66086C14E8DACE2D",…
$ FMEL_UPD_D <date> 2020-02-18, 2020-02-18, 2020-02-18, 2020-02-18, 2020-02-18…
$ ADDRESSUNI <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ geometry   <POINT [m]> POINT (29594.3 29323.41), POINT (28695.6 39413.7), PO…
supermarkets_sf <- st_read(dsn = "data/geospatial/data_extracted/supermarkets", layer="SUPERMARKETS")
Reading layer `SUPERMARKETS' from data source 
  `/Users/yashica/Desktop/xtc0/IS415-GAA/Take-home_Ex/Take_home_Ex03/data/geospatial/data_extracted/supermarkets' 
  using driver `ESRI Shapefile'
Simple feature collection with 526 features and 8 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 4901.188 ymin: 25529.08 xmax: 46948.22 ymax: 49233.6
Projected CRS: SVY21
glimpse(supermarkets_sf)
Rows: 526
Columns: 9
$ LIC_NAME   <chr> "LI LI CHENG SUPERMARKET (PUNGGOL) PTE. LTD.", "SHENG SIONG…
$ BLK_HOUSE  <chr> "273C", "11", "683", "631", "201B", "201D", "421", "233", "…
$ STR_NAME   <chr> "PUNGGOL PLACE", "UPPER BOON KENG ROAD", "HOUGANG AVENUE 8"…
$ UNIT_NO    <chr> "884", "901", "903", "954", "1091", "1161", "1161", "1168",…
$ POSTCODE   <chr> "823273", "380011", "530683", "470631", "522201", "524201",…
$ LIC_NO     <chr> "NE12I65N000", "E73010V000", "NE11909C000", "S02210X000", "…
$ INC_CRC    <chr> "3DE8AF6E76F9D3D4", "F361759A8261CD6E", "1DC69902E02077CE",…
$ FMEL_UPD_D <date> 2017-11-29, 2017-11-29, 2017-11-29, 2017-11-29, 2017-11-29…
$ geometry   <POINT [m]> POINT (35561.22 42685.17), POINT (32184.01 32947.46),…
url <- "https://www.salary.sg/2021/best-primary-schools-2021-by-popularity/"

good_pri <- data.frame()

schools <- read_html(url) %>%
  html_nodes(xpath = paste('//*[@id="post-3068"]/div[3]/div/div/ol/li') ) %>%
  html_text() 

for (i in (schools)){
  sch_name <- toupper(gsub(" – .*","",i))
  sch_name <- gsub("\\(PRIMARY SECTION)","",sch_name)
  sch_name <- trimws(sch_name)
  new_row <- data.frame(pri_sch_name=sch_name)
  # Add the row
  good_pri <- rbind(good_pri, new_row)
}

top_good_pri <- head(good_pri, 10)
top_good_pri$pri_sch_name[!top_good_pri$pri_sch_name %in% primaryschools_sf$school_name]
[1] "CHIJ ST. NICHOLAS GIRLS’ SCHOOL" "CATHOLIC HIGH SCHOOL"           
[3] "ST. HILDA’S PRIMARY SCHOOL"     
good_pri_list <- unique(top_good_pri$pri_sch_name)
goodprisch_coords <- get_coords(good_pri_list)
goodprisch_coords[(is.na(goodprisch_coords$postal) | is.na(goodprisch_coords$latitude) | is.na(goodprisch_coords$longitude)), ]
                      address postal latitude longitude
10 ST. HILDA’S PRIMARY SCHOOL   <NA>     <NA>      <NA>

There are 2 primary schools whose coordinates we are unable to retrieve:

  • CHIJ ST. NICHOLAS GIRLS’ SCHOOL
  • ST. HILDA’S PRIMARY SCHOOL

With further research and testing, we found that we have to change not only "ST." to "SAINT" but also the curly apostrophe (’) to a straight apostrophe (').
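The same two substitutions could also be applied with a small reusable function instead of editing each name by hand. The helper name normalise_school_name() is hypothetical, and the pattern only handles the "ST. " abbreviation and the curly apostrophe described above.

```r
# Hypothetical helper: make scraped school names searchable
normalise_school_name <- function(x) {
  x <- gsub("\u2019", "'", x, fixed = TRUE)        # curly apostrophe to straight
  x <- gsub("\\bST\\. ", "SAINT ", x, perl = TRUE) # expand the "ST." abbreviation
  x
}

scraped <- c("CHIJ ST. NICHOLAS GIRLS\u2019 SCHOOL", "ST. HILDA\u2019S PRIMARY SCHOOL")
cleaned <- normalise_school_name(scraped)
```

A function like this keeps the fix in one place if more schools with the same naming pattern are added to the list later.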

top_good_pri$pri_sch_name[top_good_pri$pri_sch_name == "CHIJ ST. NICHOLAS GIRLS’ SCHOOL"] <- "CHIJ SAINT NICHOLAS GIRLS' SCHOOL"
top_good_pri$pri_sch_name[top_good_pri$pri_sch_name == "ST. HILDA’S PRIMARY SCHOOL"] <- "SAINT HILDA'S PRIMARY SCHOOL"
good_pri_list <- unique(top_good_pri$pri_sch_name)
goodprisch_coords <- get_coords(good_pri_list)
goodprisch_coords[(is.na(goodprisch_coords$postal) | is.na(goodprisch_coords$latitude) | is.na(goodprisch_coords$longitude)), ]
[1] address   postal    latitude  longitude
<0 rows> (or 0-length row.names)

There are no NA values!

goodpri_sf <- st_as_sf(goodprisch_coords,
                    coords = c("longitude", 
                               "latitude"),
                    crs=4326) %>%
  st_transform(crs = 3414)
topprimary_sf <- goodpri_sf
url <- "https://en.wikipedia.org/wiki/List_of_shopping_malls_in_Singapore"
# Parse the page once, then collect the mall names from each regional list
malls_page <- read_html(url)
malls_list <- list()

for (i in 2:7){
  malls <- malls_page %>%
    html_nodes(xpath = paste('//*[@id="mw-content-text"]/div[1]/div[',as.character(i),']/ul/li',sep="") ) %>%
    html_text()
  malls_list <- append(malls_list, malls)
}
malls_list_coords <- get_coords(malls_list) %>% 
  rename("mall_name" = "address")
malls_list_coords <- subset(malls_list_coords, mall_name!= "Yew Tee Shopping Centre")
invalid_malls<- subset(malls_list_coords, is.na(malls_list_coords$postal))
invalid_malls_list <- unique(invalid_malls$mall_name)
corrected_malls <- c("Clarke Quay", "City Gate", "Raffles Holland V", "Knightsbridge", "Mustafa Centre", "GR.ID", "Shaw House",
                     "The Poiz Centre", "Velocity @ Novena Square", "Singapore Post Centre", "PLQ Mall", "KINEX", "The Grandstand")

for (i in 1:length(invalid_malls_list)) {
  malls_list_coords <- malls_list_coords %>% 
    mutate(mall_name = ifelse(as.character(mall_name) == invalid_malls_list[i], corrected_malls[i], as.character(mall_name)))
}
malls_list <- sort(unique(malls_list_coords$mall_name))
malls_coords <- get_coords(malls_list)
malls_coords[(is.na(malls_coords$postal) | is.na(malls_coords$latitude) | is.na(malls_coords$longitude)), ]
[1] address   postal    latitude  longitude
<0 rows> (or 0-length row.names)
malls_sf <- st_as_sf(malls_coords,
                    coords = c("longitude", 
                               "latitude"),
                    crs=4326) %>%
  st_transform(crs = 3414)
mall_coordinates_sf <- malls_sf
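The mall-name correction above pairs invalid names with their replacements by position, which silently breaks if the scraped page changes order. One alternative, sketched here with made-up names, is a named lookup vector in which each old name is written next to its replacement; name_fixes and the example mall names are hypothetical.

```r
# Hypothetical mapping from scraped names to searchable names
name_fixes <- c(
  "Example Mall (Old Name)" = "Example Mall",
  "Another Centre[1]"       = "Another Centre"
)

mall_name <- c("Example Mall (Old Name)", "Good Mall", "Another Centre[1]")

# Replace only the names that appear in the lookup; leave the rest untouched
needs_fix <- mall_name %in% names(name_fixes)
mall_name[needs_fix] <- unname(name_fixes[mall_name[needs_fix]])
```

Because each correction is keyed by the original name, adding or removing a mall does not shift the other corrections.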

A quick Google search gives the latitude and longitude of the Downtown Core, also known as the CBD, as 1.287953 and 103.851784 respectively.

We first create a data frame holding the latitude and longitude of the CBD, then convert it to an sf object and transform it to SVY21 (EPSG:3414).

name <- c('CBD Area')
latitude <- c(1.287953)
longitude <- c(103.851784)
cbd_coords <- data.frame(name, latitude, longitude)
cbd_coords_sf <- st_as_sf(cbd_coords,
                    coords = c("longitude", 
                               "latitude"),
                    crs=4326) %>%
  st_transform(crs = 3414)
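
As a quick sanity check on the transformation above, the sketch below rebuilds the same CBD point and prints its projected coordinates. The object names `cbd_check`, `cbd_check_3414` and `cbd_xy` are introduced here for illustration only and are not part of the original workflow.

```r
library(sf)

# Same CBD coordinates as above, declared as WGS 84 (EPSG:4326)
cbd_check <- st_as_sf(
  data.frame(name = "CBD Area", latitude = 1.287953, longitude = 103.851784),
  coords = c("longitude", "latitude"),
  crs = 4326
)

# Re-project to SVY21 / Singapore TM (EPSG:3414)
cbd_check_3414 <- st_transform(cbd_check, crs = 3414)

# Projected coordinates are in metres (X = easting, Y = northing)
cbd_xy <- st_coordinates(cbd_check_3414)
print(cbd_xy)

# Singapore spans only a few tens of kilometres in SVY21,
# so both values should be positive and well below 60,000 m
stopifnot(all(cbd_xy > 0), all(cbd_xy < 60000))
```

If the values printed were still around 103.85 and 1.29, the point would not have been re-projected.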
chas_sf <- st_read(dsn="data/geospatial/data_extracted/clinics/moh-chas-clinics.geojson")
Reading layer `moh-chas-clinics' from data source 
  `/Users/yashica/Desktop/xtc0/IS415-GAA/Take-home_Ex/Take_home_Ex03/data/geospatial/data_extracted/clinics/moh-chas-clinics.geojson' 
  using driver `GeoJSON'
Simple feature collection with 1064 features and 2 fields
Geometry type: POINT
Dimension:     XYZ
Bounding box:  xmin: 103.5818 ymin: 1.016264 xmax: 103.9903 ymax: 1.456037
z_range:       zmin: 0 zmax: 0
Geodetic CRS:  WGS 84

chas_sf has its dimensions listed as ‘XYZ’: it has a z-dimension, though as we can see from the z_range, both zmin and zmax are 0. As the z-dimension is irrelevant to our analysis, we’ll drop it in our pre-processing with st_zm(), a function that drops the Z (or M) dimension from feature geometries and resets their classes accordingly.

# drop the Z-dimension from chas_sf
chas_sf <- st_zm(chas_sf)
chas_sf
Simple feature collection with 1064 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 103.5818 ymin: 1.016264 xmax: 103.9903 ymax: 1.456037
Geodetic CRS:  WGS 84
First 10 features:
     Name
1   kml_1
2   kml_2
3   kml_3
4   kml_4
5   kml_5
6   kml_6
7   kml_7
8   kml_8
9   kml_9
10 kml_10
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         Description
1  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>13C0197</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>BVH Community and Continuing Care Clinic</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62485749</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>547530</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>B</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>5</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td></td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>LORONG NAPIRI</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>BRIGHT VISION HOSPITAL (LEVEL 2)</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>32975.96333266</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>39338.44369977</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>78FBD9F763FA1748</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
2        <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9404225</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Balestier Clinic and Health Screening Centre</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62588798</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>329928</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>B</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>221</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>03</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>04</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>Balestier Road</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>Rocca Balestier</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>30137.60782798</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>33727.68326718</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>FE3FEA6E72F6A9CD</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
3                           <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9402787</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Banyan Clinic Pte Ltd.</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>63652001</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>730768</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>A</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>768</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>02</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>04</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>Woodlands Avenue 6</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>Woodlands Mart</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>24085.28700111</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>47475.18426852</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>BDFADA141BC3C5C0</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
4                                         <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>12C0244</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Beo Crescent Clinic & Surgery</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62750269</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>160040</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>A</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>40</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>01</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>02</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>Beo Crescent</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td></td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>27353.89036223</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>30162.99234494</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>ACDDB5D5BB3AF37F</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
5                          <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9405101</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Bethesda Medical Centre</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>63378933</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>038983</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>B</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>3</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>B1</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>124</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>TEMASEK BOULEVARD</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>SUNTEC CITY MALL</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>30859.71977217</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>30923.09790327</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>0F18873C06CB85A2</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
6         <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>14M0324</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Bewell Clinic @ Dorm Pte Ltd</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MC</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>65656338</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>608596</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>B</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>28</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>01</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>01</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>TOH GUAN ROAD EAST</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>WESTLITE TOH GUAN DORMITORY</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>18935.85979557</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>35408.97072583</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>38F5B4509423BD1C</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
7              <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>12C0252</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Bless Medical Centre Pte Ltd</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62664119</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>640221</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>A</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>221</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>01</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>108</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>BOON LAY PLACE</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>BOON LAY SHOPPING CENTRE</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>14546.34900375</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>36498.19284422</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>0A525408785D9806</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
8                                     <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9402741</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>C & K FAMILY CLINIC PTE LTD</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62429588</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>455297</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>B</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>108</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td></td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>UPPER EAST COAST ROAD</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td></td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>38682.39910624</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>32829.31658602</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>CAA157915ADB2310</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
9                      <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9400151</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>C H TAN MEDICAL CLINIC & DENTAL SURGERY</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>65618712</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>650177</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>A</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>177</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>01</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>259</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>BT BATOK WEST AVE 8</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td></td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>17808.54269556</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>36497.59486993</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>1678EB41FB069F2B</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
10                          <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9400154</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>C K TAN FAMILY CLINIC & SURGERY PTE LTD</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62514438</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>310125</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>A</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>125</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>01</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>537</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>TOA PAYOH LOR 1</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td></td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>29382.05076063</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>35611.3990661</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>C04D108D1F0B247A</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
                    geometry
1   POINT (103.878 1.372037)
2  POINT (103.8525 1.321296)
3  POINT (103.7981 1.445623)
4  POINT (103.8275 1.289058)
5   POINT (103.859 1.295932)
6  POINT (103.7519 1.336499)
7  POINT (103.7124 1.346348)
8  POINT (103.9293 1.313169)
9  POINT (103.7417 1.346344)
10 POINT (103.8457 1.338331)
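
To see the effect of st_zm() in isolation, here is a minimal sketch on a single toy point. `toy_xyz` and `toy_xy` are made-up objects for illustration and are not taken from the CHAS data.

```r
library(sf)

# A toy XYZ point with z = 0, similar in shape to the CHAS clinic geometries
toy_xyz <- st_sfc(st_point(c(103.878, 1.372037, 0)), crs = 4326)

# Drop the Z dimension; X, Y and the CRS are kept
toy_xy <- st_zm(toy_xyz)

# The geometry class changes from XYZ to XY
print(class(toy_xyz[[1]])[1])
print(class(toy_xy[[1]])[1])

# The remaining coordinates are unchanged
print(st_coordinates(toy_xy))
```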

3.1.1 Verifying Coordinate System

Even if a projected coordinate system is already in use, let’s explicitly transform the data to Singapore’s CRS (SVY21 / Singapore TM, EPSG:3414) for our use case.

sg_area <- sg_area  %>% st_transform(crs = 3414)
sg_area

From the output, we can see that the Projected CRS is SVY21 / Singapore TM, which is correct: sg_area is now in Singapore’s projected coordinate system, with the CRS value of 3414 properly assigned.

Let’s also check how the geospatial data we’ve just imported looks geometrically. It resembles the map of Singapore quite closely!

st_crs(childcare_sf)
Coordinate Reference System:
  User input: SVY21 / Singapore TM 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]

childcare_sf is already in the correct projected coordinate system for Singapore (SVY21 / Singapore TM, EPSG:3414), so no transformation is needed.
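
Reading the full WKT for every layer gets tedious. A shorter check, sketched below, compares the EPSG code of each layer against 3414. The `layers` list uses hypothetical stand-in geometries; in this analysis it would hold objects such as childcare_sf or mrt_sf.

```r
library(sf)

# Hypothetical stand-in layers, one projected and one geographic
layers <- list(
  projected  = st_sfc(st_point(c(30000, 30000)), crs = 3414),
  geographic = st_sfc(st_point(c(103.85, 1.29)), crs = 4326)
)

# st_crs(x)$epsg gives the EPSG code when the CRS carries one (NA otherwise)
epsg_codes <- sapply(layers, function(x) st_crs(x)$epsg)

# A layer needs attention if it has no EPSG code or a code other than 3414
needs_transform <- is.na(epsg_codes) | epsg_codes != 3414

print(data.frame(epsg = epsg_codes, needs_transform = needs_transform))
```

A layer flagged here would then be passed through st_transform(crs = 3414), as done for the layers that follow.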

st_crs(eldercare_sf)
Coordinate Reference System:
  User input: SVY21 
  wkt:
PROJCRS["SVY21",
    BASEGEOGCRS["SVY21[WGS84]",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ID["EPSG",6326]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["Degree",0.0174532925199433]]],
    CONVERSION["unnamed",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]]

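eldercare_sf uses the SVY21 projection parameters, but its CRS is labelled only as "SVY21" and does not carry the EPSG:3414 identifier. Let’s transform it to EPSG:3414 so that it matches the other layers.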

eldercare_sf <- eldercare_sf  %>% st_transform(crs = 3414)
eldercare_sf
Simple feature collection with 133 features and 18 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 14481.92 ymin: 28218.43 xmax: 41665.14 ymax: 46804.9
Projected CRS: SVY21 / Singapore TM
First 10 features:
   OBJECTID ADDRESSBLO ADDRESSBUI ADDRESSPOS
1         1       <NA>       <NA>     601318
2         2       <NA>       <NA>     462509
3         3       <NA>       <NA>     640190
4         4       <NA>       <NA>     190005
5         5       <NA>       <NA>     160044
6         6       <NA>       <NA>     160117
7         7       <NA>       <NA>     523499
8         8       <NA>       <NA>     731569
9         9       <NA>       <NA>     651210
10       10       <NA>       <NA>     540182
                              ADDRESSSTR ADDRESSTYP DESCRIPTIO HYPERLINK
1      318A Jurong East Avenue 1 #02-308       <NA>       <NA>      <NA>
2  Blk 509B Bedok North Street 3 #02-157       <NA>       <NA>      <NA>
3         Blk 190 Boon Lay Drive #01-242       <NA>       <NA>      <NA>
4                    5 Beach Rd #02-4943       <NA>       <NA>      <NA>
5             Blk 44 Beo Crescent #01-67       <NA>       <NA>      <NA>
6     Blk 117 Jalan Bukit Merah #01-1683       <NA>       <NA>      <NA>
7            499C Tampines Ave 9 #01-256       <NA>       <NA>      <NA>
8              569A Champion Way #01-346       <NA>       <NA>      <NA>
9         210A Bukit Batok St 21 #01-294       <NA>       <NA>      <NA>
10   Blk 182 Rivervale Crescent\n#01-311       <NA>       <NA>      <NA>
   LANDXADDRE LANDYADDRE
1           0          0
2           0          0
3           0          0
4           0          0
5           0          0
6           0          0
7           0          0
8           0          0
9           0          0
10          0          0
                                                           NAME PHOTOURL
1                                  Yuhua Senior Activity Centre     <NA>
2                                          THK SAC @ Kaki Bukit     <NA>
3                                            THK SAC @ Boon Lay     <NA>
4                        PEACE-Connect Senior Activity Centre@5     <NA>
5                                        THK SAC @ Beo Crescent     <NA>
6                                      Silver ACE @ Bukit Merah     <NA>
7  Lions Befrienders Senior Activity Centre @ Tampines Blk 499C     <NA>
8                    Care Corner Senior Activity Centre (WL569)     <NA>
9           Fei Yue Senior Activity Centre (Bukit Batok Branch)     <NA>
10       COMNET Senior Activity Centre @ 182 Rivervale Crescent     <NA>
   ADDRESSFLO          INC_CRC FMEL_UPD_D ADDRESSUNI   X_ADDR   Y_ADDR
1        <NA> 2B0DB92FDD914FFC 2016-07-28       <NA> 16614.08 36639.12
2        <NA> 82728FA30612F3FD 2016-07-28       <NA> 38803.81 35098.78
3        <NA> DE7A8D4EA0BD1D9B 2016-07-28       <NA> 14481.92 36357.61
4        <NA> A2C058FC5785F7FE 2016-07-28       <NA> 31505.35 31853.52
5        <NA> 9DBFD51E056AEE70 2016-07-28       <NA> 27218.35 30135.49
6        <NA> 169DABA5B6ECEA87 2016-07-28       <NA> 27278.94 29350.17
7        <NA> 5DB6B9F0BF276F6D 2016-07-28       <NA> 41665.14 37956.92
8        <NA> 4DC6800E9BB385BE 2016-07-28       <NA> 23147.94 45761.17
9        <NA> EFBD712DA5DD6FEC 2016-07-28       <NA> 18820.58 36396.32
10       <NA> 6BB0D7698D7B4C7D 2016-07-28       <NA> 36446.37 41376.90
                    geometry
1  POINT (16614.08 36639.12)
2  POINT (38803.81 35098.78)
3  POINT (14481.92 36357.61)
4  POINT (31505.35 31853.52)
5  POINT (27218.35 30135.49)
6  POINT (27278.94 29350.17)
7  POINT (41665.14 37956.92)
8  POINT (23147.94 45761.17)
9  POINT (18820.58 36396.32)
10  POINT (36446.37 41376.9)
st_crs(eldercare_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
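
The steps above use st_transform() rather than st_set_crs(). The sketch below, on a made-up WGS 84 point, illustrates the difference: st_set_crs() only replaces the CRS label, while st_transform() re-projects the coordinates. The object names are illustrative assumptions.

```r
library(sf)

# A point in WGS 84 (degrees)
pt_wgs84 <- st_sfc(st_point(c(103.851784, 1.287953)), crs = 4326)

# st_set_crs() relabels the CRS without touching the coordinates
# (sf warns about this, so the warning is suppressed here)
relabelled <- suppressWarnings(st_set_crs(pt_wgs84, 3414))

# st_transform() converts the coordinates into the target CRS
reprojected <- st_transform(pt_wgs84, 3414)

print(st_coordinates(relabelled))   # still the original degree values
print(st_coordinates(reprojected))  # eastings and northings in metres
```

Relabelling is only appropriate when the existing label is wrong or missing and the coordinates are already in the target system; otherwise distances computed later would be meaningless.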
st_crs(busstop_sf)
Coordinate Reference System:
  User input: SVY21 
  wkt:
PROJCRS["SVY21",
    BASEGEOGCRS["WGS 84",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ID["EPSG",6326]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["Degree",0.0174532925199433]]],
    CONVERSION["unnamed",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]]
busstop_sf <- busstop_sf  %>% st_transform(crs = 3414)
busstop_sf
Simple feature collection with 5159 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21 / Singapore TM
First 10 features:
   BUS_STOP_N BUS_ROOF_N             LOC_DESC                  geometry
1       22069        B06   OPP CEVA LOGISTICS POINT (13576.31 32883.65)
2       32071        B23         AFT TRACK 13 POINT (13228.59 44206.38)
3       44331        B01              BLK 239  POINT (21045.1 40242.08)
4       96081        B05 GRACE INDEPENDENT CH POINT (41603.76 35413.11)
5       11561        B05              BLK 166 POINT (24568.74 30391.85)
6       66191        B03         AFT CORFE PL POINT (30951.58 38079.61)
7       23389       B02A              PEC LTD   POINT (12476.9 32211.6)
8       54411        B02              BLK 527 POINT (30329.45 39373.92)
9       28531        B09              BLK 536 POINT (14993.31 36905.61)
10      96139        B01              BLK 148  POINT (41642.81 36513.9)
st_crs(primaryschools_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
primaryschools_sf <- primaryschools_sf  %>% st_transform(crs = 3414)
primaryschools_sf
Simple feature collection with 183 features and 4 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 11801.94 ymin: 28603.93 xmax: 42410.51 ymax: 48703.49
Projected CRS: SVY21 / Singapore TM
# A tibble: 183 × 5
   school_name                 address posta…¹ mainl…²            geometry
 * <chr>                       <chr>   <chr>   <chr>           <POINT [m]>
 1 ADMIRALTY PRIMARY SCHOOL    11   W… 738907  PRIMARY (24296.63 47144.77)
 2 AHMAD IBRAHIM PRIMARY SCHO… 10   Y… 768643  PRIMARY (27936.78 46125.16)
 3 AI TONG SCHOOL              100  B… 579646  PRIMARY (27966.81 38071.92)
 4 ALEXANDRA PRIMARY SCHOOL    2A   P… 159016  PRIMARY  (26964.86 30396.5)
 5 ANCHOR GREEN PRIMARY SCHOOL 31   A… 544969  PRIMARY (34022.25 41380.93)
 6 ANDERSON PRIMARY SCHOOL     19   A… 569785  PRIMARY (28898.48 40690.43)
 7 ANG MO KIO PRIMARY SCHOOL   20   A… 569920  PRIMARY (28710.77 38969.81)
 8 ANGLO-CHINESE SCHOOL (JUNI… 16   W… 227988  PRIMARY (28916.32 32403.75)
 9 ANGLO-CHINESE SCHOOL (PRIM… 50   B… 309918  PRIMARY (28225.54 33442.72)
10 ANGSANA PRIMARY SCHOOL      3    T… 529366  PRIMARY (41160.86 36732.32)
# … with 173 more rows, and abbreviated variable names ¹​postal_code,
#   ²​mainlevel_code
st_crs(primaryschools_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
st_crs(mrt_sf)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]

mrt_sf is in a geographic coordinate system (WGS 84, EPSG:4326), but our analysis needs a projected one. Let’s convert it now.

mrt_sf <- mrt_sf  %>% st_transform(crs = 3414)
mrt_sf
Simple feature collection with 474 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 6134.085 ymin: 27499.7 xmax: 45356.36 ymax: 47865.92
Projected CRS: SVY21 / Singapore TM
First 10 features:
     Name
1   kml_1
2   kml_2
3   kml_3
4   kml_4
5   kml_5
6   kml_6
7   kml_7
8   kml_8
9   kml_9
10 kml_10
                                                                                                                                                                                                                                                                                                                                                  Description
1  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit A</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>ACFD572863DE422D</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
2  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit B</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>328763A3290E3CC8</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
3  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit B</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>DD4FAEF984D96A47</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
4  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit C</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>D92B99AC0FD16F8B</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
5  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit B</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>D2E20481ED62E439</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
6  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit A</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>7E887806CF052F4E</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
7  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit C</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>C54A67D01293867F</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
8  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit B</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>C87156CBBF363974</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
9  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit A</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>C8C0EAA729F06B05</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
10 <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>STATION_NA</th> <td></td> </tr><tr bgcolor=""> <th>EXIT_CODE</th> <td>Exit C</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>8A21E3735E9C4992</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190708195912</td> </tr></table></center>
                    geometry
1  POINT (32186.93 35631.26)
2  POINT (32138.84 35639.27)
3  POINT (31181.24 33499.86)
4  POINT (31938.02 34808.14)
5  POINT (32008.06 34817.14)
6  POINT (28565.44 31376.06)
7  POINT (22908.09 31341.41)
8  POINT (29909.28 30935.73)
9   POINT (30004.21 31025.6)
10  POINT (29884.3 31042.79)
st_crs(mrt_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
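
The same check can be done programmatically instead of reading the WKT by eye. The following is a minimal sketch, assuming the sf package is installed. `pts_wgs84` and its two coordinates are made-up stand-ins for mrt_sf, not data from this exercise. `st_is_longlat()` reports whether a layer is still in geographic (degree) coordinates, and `st_crs()$epsg` returns the EPSG code when one is assigned.

```r
library(sf)

# Hypothetical stand-in for mrt_sf: two illustrative points in WGS 84 (EPSG:4326)
pts_wgs84 <- st_as_sf(
  data.frame(name = c("a", "b"),
             lon = c(103.85, 103.80),
             lat = c(1.30, 1.35)),
  coords = c("lon", "lat"),
  crs = 4326
)

# TRUE means the layer is in geographic coordinates and still needs projecting
is_geographic_before <- st_is_longlat(pts_wgs84)

# Reproject to SVY21 / Singapore TM
pts_svy21 <- st_transform(pts_wgs84, crs = 3414)

# After the transform the layer should be projected and carry EPSG 3414
is_geographic_after <- st_is_longlat(pts_svy21)
epsg_after <- st_crs(pts_svy21)$epsg

print(c(before = is_geographic_before, after = is_geographic_after))
print(epsg_after)
```

A check like this is easier to scan than the full WKT printout when several layers have to be verified.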
st_crs(kindergarten_sf)
Coordinate Reference System:
  User input: SVY21 
  wkt:
PROJCRS["SVY21",
    BASEGEOGCRS["SVY21",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ID["EPSG",6326]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["Degree",0.0174532925199433]]],
    CONVERSION["unnamed",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]]

No EPSG code is assigned to kindergarten_sf: the CRS is a projected SVY21 definition, but it lacks the EPSG:3414 identifier. Let’s transform it to EPSG:3414.

kindergarten_sf <- kindergarten_sf  %>% st_transform(crs = 3414)
kindergarten_sf
Simple feature collection with 448 features and 15 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 11909.7 ymin: 25596.33 xmax: 43395.47 ymax: 48562.06
Projected CRS: SVY21 / Singapore TM
First 10 features:
   ADDRESSBLO ADDRESSBUI ADDRESSFLO ADDRESSPOS
1        <NA>       <NA>       <NA>     560644
2        <NA>       <NA>       <NA>     600251
3        <NA>       <NA>       <NA>     600317
4        <NA>       <NA>       <NA>     671455
5        <NA>       <NA>       <NA>     670528
6        <NA>       <NA>       <NA>     670620
7        <NA>       <NA>       <NA>     380008
8        <NA>       <NA>       <NA>     118643
9        <NA>       <NA>       <NA>     519420
10       <NA>       <NA>       <NA>     299574
                                                                   ADDRESSSTR
1                                     644 Ang Mo Kio Ave 4  #01-850 S(560644)
2                         251 Jurong East Street 24 Blk 247 #01-110 S(600251)
3                                 317 Jurong East Street 31  #01-14 S(600317)
4                                           455A Segar Road  #01-01 S(671455)
5                             528 Jelapang Road Blk 526, 532 #01-79 S(670528)
6                  620 Bukit Panjang Ring Road Blk 615, 619 #01-828 S(670620)
7  8, UPPER BOON KENG ROAD, #01 - 03, MULTI STOREY CAR PARK, SINGAPORE 380008
8                                          302 Pasir Panjang Road   S(118643)
9                                             4 Pasir Ris Drive 6   S(519420)
10                                                1 Dunearn Close   S(299574)
   ADDRESSTYP    DESCRIPTIO HYPERLINK LANDXADDRE LANDYADDRE
1        <NA> Kindergartens      <NA>          0          0
2        <NA> Kindergartens      <NA>          0          0
3        <NA> Kindergartens      <NA>          0          0
4        <NA> Kindergartens      <NA>          0          0
5        <NA> Kindergartens      <NA>          0          0
6        <NA> Kindergartens      <NA>          0          0
7        <NA> Kindergartens      <NA>          0          0
8        <NA> Kindergartens      <NA>          0          0
9        <NA> Kindergartens      <NA>          0          0
10       <NA> Kindergartens      <NA>          0          0
                                                    NAME PHOTOURL
1  PCF Sparkletots Preschool @ Yio Chu Kang Blk 644 (KN)     <NA>
2         PCF Sparkletots Preschool @ Yuhua Blk 251 (KN)     <NA>
3         PCF Sparkletots Preschool @ Yuhua Blk 317 (KN)     <NA>
4     PCF Sparkletots Preschool @ Zhenghua Blk 455A (KN)     <NA>
5      PCF Sparkletots Preschool @ Zhenghua Blk 528 (KN)     <NA>
6      PCF Sparkletots Preschool @ Zhenghua Blk 620 (KN)     <NA>
7        PCF SPARKLETOTS PRESCHOOL@KOLAM AYER BLK 8 (DS)     <NA>
8                      Pearlbank Montessori Kindergarten     <NA>
9                Pentecost Methodist Church Kindergarten     <NA>
10                                Pibos Garden Preschool     <NA>
            INC_CRC FMEL_UPD_D ADDRESSUNI                  geometry
1  904D106E26156265 2020-08-13       <NA>  POINT (28847.97 40124.9)
2  F735342764BD6BCC 2020-08-13       <NA> POINT (17578.65 36146.17)
3  564523E27D221C4D 2020-08-13       <NA> POINT (16609.18 36520.38)
4  7EED470791C85B92 2020-08-13       <NA> POINT (21088.36 41055.75)
5  DEB389E855725B16 2020-08-13       <NA>  POINT (20457.5 40777.31)
6  AB8E78A4C780367B 2020-08-13       <NA>  POINT (20033.94 40866.5)
7  FCAA7F53E78614D6 2020-08-13       <NA> POINT (32159.15 32778.05)
8  52E6CED97C628B57 2020-08-13       <NA> POINT (22042.91 29991.61)
9  8C80D6087330C37A 2020-08-13       <NA> POINT (41780.92 39252.46)
10 179B665553AAD262 2020-08-13       <NA> POINT (25916.23 33989.13)
st_crs(kindergarten_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
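
Since the same check-and-transform step is repeated for each locational layer, it could be wrapped in a small helper. This is only a sketch under assumptions: `to_svy21()` is a hypothetical function name, and the two toy layers below stand in for the real data sets. The second toy layer is built from a PROJ string that uses the SVY21 projection parameters printed above but has no EPSG identifier, mimicking the kindergarten_sf case.

```r
library(sf)

# Hypothetical helper: transform a layer to EPSG:3414 unless it already carries that code
to_svy21 <- function(x, target = 3414) {
  epsg <- st_crs(x)$epsg
  if (is.na(epsg) || epsg != target) {
    x <- st_transform(x, crs = target)
  }
  x
}

# Toy layer 1: geographic WGS 84 coordinates (like mrt_sf before conversion)
layer_wgs84 <- st_as_sf(
  data.frame(lon = 103.85, lat = 1.30),
  coords = c("lon", "lat"), crs = 4326
)

# Toy layer 2: projected coordinates with SVY21 parameters but no EPSG code
svy21_no_epsg <- paste(
  "+proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333",
  "+k=1 +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=m +no_defs"
)
layer_no_epsg <- st_as_sf(
  data.frame(x = 28847.97, y = 40124.9),
  coords = c("x", "y"), crs = svy21_no_epsg
)

# Apply the helper to every layer in one pass
layers <- list(wgs84 = layer_wgs84, no_epsg = layer_no_epsg)
layers <- lapply(layers, to_svy21)

# Collect the EPSG code of each layer for a quick visual check
epsg_codes <- sapply(layers, function(x) st_crs(x)$epsg)
print(epsg_codes)
```

The helper uses `st_transform()` rather than `st_set_crs()` because the former recomputes the coordinates, whereas the latter only relabels them.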
st_crs(hawkercentre_sf)
Coordinate Reference System:
  User input: SVY21 
  wkt:
PROJCRS["SVY21",
    BASEGEOGCRS["SVY21[WGS84]",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ID["EPSG",6326]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["Degree",0.0174532925199433]]],
    CONVERSION["unnamed",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]]
hawkercentre_sf <- hawkercentre_sf  %>% st_transform(crs = 3414)
hawkercentre_sf
Simple feature collection with 125 features and 21 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12874.19 ymin: 28355.97 xmax: 45241.4 ymax: 47872.53
Projected CRS: SVY21 / Singapore TM
First 10 features:
   ADDRESSBLO   STATUS CLEANINGST ADDRESSUNI ADDRESSFLO HYPERLINK INFO_ON_CO
1         630 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
2          16 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
3          29 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
4         38A Existing       <NA>       <NA>       <NA>      <NA>       <NA>
5         166 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
6      221A/B Existing       <NA>       <NA>       <NA>      <NA>       <NA>
7         665 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
8         163 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
9         120 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
10        115 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
   AWARDED_DA LANDYADDRE CLEANINGEN
1        <NA>   35039.64       <NA>
2        <NA>   33645.70       <NA>
3        <NA>   33497.85       <NA>
4        <NA>   30136.92       <NA>
5        <NA>   32184.16       <NA>
6        <NA>   36373.79       <NA>
7        <NA>   32059.13       <NA>
8        <NA>   29569.73       <NA>
9        <NA>   29857.81       <NA>
10       <NA>   29770.11       <NA>
                                                                                       PHOTOURL
1  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1262155176127.jpg
2  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1267853021378.jpg
3  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1262226446535.jpg
4  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1267807628554.jpg
5  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1265176120532.jpg
6  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1262154766447.jpg
7  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1262156313186.jpg
8  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1267851995788.jpg
9  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1267852344747.jpg
10 http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1267851251303.jpg
               DESCRIPTIO
1          HUP Rebuilding
2  HUP Standard Upgrading
3  HUP Standard Upgrading
4  HUP Standard Upgrading
5  HUP Standard Upgrading
6     HUP Reconfiguration
7  HUP Standard Upgrading
8        Opted out of HUP
9  HUP Standard Upgrading
10 HUP Standard Upgrading
                                                                         NAME
1                                                Bedok Reservoir Road Blk 630
2                                                     Bedok South Road Blk 16
3                    Bendemeer Road Blk 29 (Bendemeer Market and Food Centre)
4                                                         Beo Crescent Market
5                                                          Berseh Food Centre
6          Boon Lay Place Blk 221A/B (Boon Lay Place Market and Food Village)
7                         Buffalo Road Blk 665 (Tekka Centre/Zhu Jiao Market)
8               Bukit Merah Central Blk 163 (Bukit Merah Central Food Centre)
9                  Bukit Merah Lane 1 Blk 120 (Alexandra Village Food Centre)
10 Bukit Merah View Blk 115 (Blk 115 Bukit Merah View Market and Food Centre)
   ADDRESSTYP ADDRESSBUI LANDXADDRE           ADDRESSSTR ADDRESSPOS IMPLEMENTA
1           I       <NA>   36985.00 Bedok Reservoir Road     470630       <NA>
2           I       <NA>   39376.14     Bedok South Road     460016       <NA>
3           I       <NA>   31305.63       Bendemeer Road     330029       <NA>
4           I       <NA>   27336.64         Beo Crescent     169982       <NA>
5           I       <NA>   30619.27          Jalan Besar     208877       <NA>
6           I       <NA>   14587.57       Boon Lay Place     641221       <NA>
7           I       <NA>   29915.58         Buffalo Road     210665       <NA>
8           I       <NA>   26183.03  Bukit Merah Central     150163       <NA>
9           I       <NA>   24791.54   Bukit Merah Lane 1     150120       <NA>
10          I       <NA>   26745.83     Bukit Merah View     151115       <NA>
            INC_CRC FMEL_UPD_D                  geometry
1  BBA7FF2BCA329EE8 2021-10-25    POINT (36985 35039.64)
2  483F3877B4EE039F 2021-10-25  POINT (39376.14 33645.7)
3  A7DB5A3EB9F35DE3 2021-10-25 POINT (31305.63 33497.85)
4  D232402B3632CAB2 2021-10-25 POINT (27336.64 30136.92)
5  A7CCEB2CC0B77C6D 2021-10-25 POINT (30619.27 32184.16)
6  84F8C2B4D627C21E 2021-10-25 POINT (14587.57 36373.79)
7  D09ADF340C3B739E 2021-10-25 POINT (29915.58 32059.13)
8  16F0047A1B5AC0D0 2021-10-25 POINT (26183.03 29569.73)
9  98A3C26D3632E680 2021-10-25 POINT (24791.54 29857.81)
10 7A2BDBF509B4F7FF 2021-10-25 POINT (26745.83 29770.11)
st_crs(hawkercentre_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
st_crs(nationalparks_sf)
Coordinate Reference System:
  User input: SVY21 
  wkt:
PROJCRS["SVY21",
    BASEGEOGCRS["SVY21[WGS84]",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ID["EPSG",6326]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["Degree",0.0174532925199433]]],
    CONVERSION["unnamed",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]]
nationalparks_sf <- nationalparks_sf  %>% st_transform(crs = 3414)
nationalparks_sf
Simple feature collection with 352 features and 15 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12373.11 ymin: 21869.93 xmax: 46735.95 ymax: 49231.09
Projected CRS: SVY21 / Singapore TM
First 10 features:
   ADDRESSBLO ADDRESSBUI ADDRESSTYP
1        <NA>       <NA>       <NA>
2        <NA>       <NA>       <NA>
3        <NA>       <NA>       <NA>
4        <NA>       <NA>       <NA>
5        <NA>       <NA>       <NA>
6        <NA>       <NA>       <NA>
7        <NA>       <NA>       <NA>
8        <NA>       <NA>       <NA>
9        <NA>       <NA>       <NA>
10       <NA>       <NA>       <NA>
                                                                             HYPERLINK
1                                                                                 <NA>
2                                                                                 <NA>
3                                                                                 <NA>
4                                                                                 <NA>
5                                                                                 <NA>
6                                                                                 <NA>
7  www.nparks.gov.sg/gardens-parks-and-nature/parks-and-nature-reserves/sun-plaza-park
8                                                                                 <NA>
9                                                                                 <NA>
10                                                                                <NA>
   LANDXADDRE LANDYADDRE                          NAME PHOTOURL ADDRESSPOS
1    29594.30   29323.41              Telok Ayer Green     <NA>       <NA>
2    28695.60   39413.70 Mayflower Crescent Playground     <NA>       <NA>
3    30676.61   41137.35    Sunrise Drive Playground 1     <NA>       <NA>
4    39994.09   39355.59      Elias Terrace Playground     <NA>       <NA>
5    40813.11   33764.61         Kew Avenue Playground     <NA>       <NA>
6    37385.95   32814.41   Greenfield Drive Playground     <NA>       <NA>
7    40371.20   37926.31                Sun Plaza Park     <NA>       <NA>
8    45135.69   41460.78   Changi Point Ferry Terminal     <NA>       <NA>
9    27896.88   39226.05         Shangri-La Playground     <NA>       <NA>
10   38142.70   33595.77   Opera Estate Football Field     <NA>       <NA>
                                                      DESCRIPTIO ADDRESSSTR
1  Bounded by Amoy Street, Boon Tat Street and Telok Ayer Street       <NA>
2     At the junction of Mayflower Crescent and Mayflower Avenue       <NA>
3                                    Located along Sunrise Drive       <NA>
4                      Junction of Elias Terrace and Elias Green       <NA>
5                           Junction of Kew Avenue and Kew Drive       <NA>
6                                               Greenfield Drive       <NA>
7                     Along Tampines Avenue 7, Tampines Avenue 9       <NA>
8                                              Near Lor Bekukong       <NA>
9                  Junction of Ang Mo Kio Ave 2 and Jalan Lanjut       <NA>
10                                              Swan Lake Avenue       <NA>
   ADDRESSFLO          INC_CRC FMEL_UPD_D ADDRESSUNI                  geometry
1        <NA> 07CFE6567539200A 2020-02-18       <NA>  POINT (29594.3 29323.41)
2        <NA> B01AE2FF8B58F5CA 2020-02-18       <NA>   POINT (28695.6 39413.7)
3        <NA> 66086C14E8DACE2D 2020-02-18       <NA> POINT (30676.61 41137.35)
4        <NA> 8B06ED4574E90FC9 2020-02-18       <NA> POINT (39994.09 39355.59)
5        <NA> E3FD62E109D01A9C 2020-02-18       <NA> POINT (40813.11 33764.61)
6        <NA> 8B896BD1155428FB 2020-02-18       <NA> POINT (37385.95 32814.41)
7        <NA> 5BBA8A8EB630BA01 2020-02-18       <NA>  POINT (40371.2 37926.31)
8        <NA> AA9DDE6381971B22 2020-02-18       <NA> POINT (45135.69 41460.78)
9        <NA> FF16ED3C3767BD2C 2020-02-18       <NA> POINT (27896.88 39226.05)
10       <NA> AD2BA4AF1306E38A 2020-02-18       <NA>  POINT (38142.7 33595.77)
st_crs(nationalparks_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
st_crs(supermarkets_sf)
Coordinate Reference System:
  User input: SVY21 
  wkt:
PROJCRS["SVY21",
    BASEGEOGCRS["WGS 84",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ID["EPSG",6326]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["Degree",0.0174532925199433]]],
    CONVERSION["unnamed",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]]
supermarkets_sf <- supermarkets_sf  %>% st_transform(crs = 3414)
supermarkets_sf
Simple feature collection with 526 features and 8 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 4901.188 ymin: 25529.08 xmax: 46948.22 ymax: 49233.6
Projected CRS: SVY21 / Singapore TM
First 10 features:
                                      LIC_NAME BLK_HOUSE             STR_NAME
1  LI LI CHENG SUPERMARKET (PUNGGOL) PTE. LTD.      273C        PUNGGOL PLACE
2              SHENG SIONG SUPERMARKET PTE LTD        11 UPPER BOON KENG ROAD
3        COLD STORAGE SINGAPORE (1983) PTE LTD       683     HOUGANG AVENUE 8
4        COLD STORAGE SINGAPORE (1983) PTE LTD       631 BEDOK RESERVOIR ROAD
5                      YES SUPERMARKET PTE LTD      201B   TAMPINES STREET 21
6                   SUZYAMEER FROZEN PTE. LTD.      201D   TAMPINES STREET 21
7                            G8 MART PTE. LTD.       421 ANG MO KIO AVENUE 10
8              SHENG SIONG SUPERMARKET PTE LTD       233  ANG MO KIO AVENUE 3
9             PRIME SUPERMARKET (1996) PTE LTD       106     HOUGANG AVENUE 1
10                                TAN KWEE ENG       327     YISHUN RING ROAD
   UNIT_NO POSTCODE      LIC_NO          INC_CRC FMEL_UPD_D
1      884   823273 NE12I65N000 3DE8AF6E76F9D3D4 2017-11-29
2      901   380011  E73010V000 F361759A8261CD6E 2017-11-29
3      903   530683 NE11909C000 1DC69902E02077CE 2017-11-29
4      954   470631  S02210X000 4E2560154B58BA38 2017-11-29
5     1091   522201  S02037J000 559A9A00D9FF8A55 2017-11-29
6     1161   524201 NE08357A000 1D32060098628881 2017-11-29
7     1161   560421 CE13401C000 E83AE5A9842F67BC 2017-11-29
8     1168   560233 CE04334P000 08D1E417EB224327 2017-11-29
9     1213   530106  S02059X000 3DA5C840D472C779 2017-11-29
10    1320   760327  B02041C000 FBB8A845FD8ADDC4 2017-11-29
                    geometry
1  POINT (35561.22 42685.17)
2  POINT (32184.01 32947.46)
3  POINT (33903.48 39480.46)
4  POINT (37083.82 35017.47)
5   POINT (41320.3 37283.82)
6  POINT (41384.47 37152.14)
7  POINT (30186.63 38602.77)
8  POINT (28380.83 38842.16)
9  POINT (34383.76 37311.19)
10 POINT (29010.23 45755.51)
st_crs(supermarkets_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
st_crs(topprimary_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
st_crs(mall_coordinates_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
plot(st_geometry(sg_area))

st_crs(cbd_coords_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
cbd_coords_sf
Simple feature collection with 1 feature and 1 field
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 30055.05 ymin: 30040.83 xmax: 30055.05 ymax: 30040.83
Projected CRS: SVY21 / Singapore TM
      name                  geometry
1 CBD Area POINT (30055.05 30040.83)
st_crs(chas_sf)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
chas_sf <- chas_sf %>% st_transform(crs = 3414)
chas_sf
Simple feature collection with 1064 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 2.127126e-08 ymin: -5.585825e-07 xmax: 45475.65 ymax: 48626.7
Projected CRS: SVY21 / Singapore TM
First 10 features:
     Name
1   kml_1
2   kml_2
3   kml_3
4   kml_4
5   kml_5
6   kml_6
7   kml_7
8   kml_8
9   kml_9
10 kml_10
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         Description
1  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>13C0197</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>BVH Community and Continuing Care Clinic</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62485749</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>547530</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>B</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>5</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td></td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>LORONG NAPIRI</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>BRIGHT VISION HOSPITAL (LEVEL 2)</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>32975.96333266</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>39338.44369977</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>78FBD9F763FA1748</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
2        <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9404225</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Balestier Clinic and Health Screening Centre</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62588798</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>329928</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>B</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>221</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>03</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>04</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>Balestier Road</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>Rocca Balestier</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>30137.60782798</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>33727.68326718</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>FE3FEA6E72F6A9CD</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
3                           <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9402787</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Banyan Clinic Pte Ltd.</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>63652001</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>730768</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>A</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>768</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>02</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>04</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>Woodlands Avenue 6</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>Woodlands Mart</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>24085.28700111</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>47475.18426852</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>BDFADA141BC3C5C0</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
4                                         <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>12C0244</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Beo Crescent Clinic & Surgery</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62750269</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>160040</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>A</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>40</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>01</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>02</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>Beo Crescent</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td></td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>27353.89036223</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>30162.99234494</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>ACDDB5D5BB3AF37F</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
5                          <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9405101</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Bethesda Medical Centre</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>63378933</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>038983</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>B</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>3</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>B1</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>124</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>TEMASEK BOULEVARD</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>SUNTEC CITY MALL</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>30859.71977217</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>30923.09790327</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>0F18873C06CB85A2</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
6         <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>14M0324</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Bewell Clinic @ Dorm Pte Ltd</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MC</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>65656338</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>608596</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>B</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>28</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>01</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>01</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>TOH GUAN ROAD EAST</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>WESTLITE TOH GUAN DORMITORY</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>18935.85979557</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>35408.97072583</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>38F5B4509423BD1C</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
7              <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>12C0252</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>Bless Medical Centre Pte Ltd</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62664119</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>640221</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>A</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>221</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>01</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>108</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>BOON LAY PLACE</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td>BOON LAY SHOPPING CENTRE</td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>14546.34900375</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>36498.19284422</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>0A525408785D9806</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
8                                     <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9402741</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>C & K FAMILY CLINIC PTE LTD</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62429588</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>455297</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>B</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>108</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td></td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>UPPER EAST COAST ROAD</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td></td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>38682.39910624</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>32829.31658602</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>CAA157915ADB2310</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
9                      <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9400151</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>C H TAN MEDICAL CLINIC & DENTAL SURGERY</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>65618712</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>650177</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>A</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>177</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>01</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>259</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>BT BATOK WEST AVE 8</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td></td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>17808.54269556</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>36497.59486993</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>1678EB41FB069F2B</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
10                          <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>HCI_CODE</th> <td>9400154</td> </tr><tr bgcolor=""> <th>HCI_NAME</th> <td>C K TAN FAMILY CLINIC & SURGERY PTE LTD</td> </tr><tr bgcolor="#E3E3F3"> <th>LICENCE_TYPE</th> <td>MD</td> </tr><tr bgcolor=""> <th>HCI_TEL</th> <td>62514438</td> </tr><tr bgcolor="#E3E3F3"> <th>POSTAL_CD</th> <td>310125</td> </tr><tr bgcolor=""> <th>ADDR_TYPE</th> <td>A</td> </tr><tr bgcolor="#E3E3F3"> <th>BLK_HSE_NO</th> <td>125</td> </tr><tr bgcolor=""> <th>FLOOR_NO</th> <td>01</td> </tr><tr bgcolor="#E3E3F3"> <th>UNIT_NO</th> <td>537</td> </tr><tr bgcolor=""> <th>STREET_NAME</th> <td>TOA PAYOH LOR 1</td> </tr><tr bgcolor="#E3E3F3"> <th>BUILDING_NAME</th> <td></td> </tr><tr bgcolor=""> <th>CLINIC_PROGRAMME_CODE</th> <td>CDMP,CHAS,ISP</td> </tr><tr bgcolor="#E3E3F3"> <th>X_COORDINATE</th> <td>29382.05076063</td> </tr><tr bgcolor=""> <th>Y_COORDINATE</th> <td>35611.3990661</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>C04D108D1F0B247A</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20190213130534</td> </tr></table></center>
                    geometry
1  POINT (32975.96 39338.44)
2  POINT (30137.61 33727.68)
3  POINT (24085.29 47475.18)
4  POINT (27353.89 30162.99)
5   POINT (30859.72 30923.1)
6  POINT (18935.86 35408.97)
7  POINT (14546.35 36498.19)
8   POINT (38682.4 32829.32)
9  POINT (17808.54 36497.59)
10  POINT (29382.05 35611.4)
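
Calling st_crs() on each layer in turn prints the full WKT every time. As a minimal sketch, assuming the layers can be collected in a named list, the EPSG codes can be compared in one pass and any layer not yet in EPSG:3414 reprojected with st_transform(). The two toy layers, the WGS 84 coordinates and the `layers` list below are hypothetical stand-ins, not the datasets used above.

```r
library(sf)

# Hypothetical toy layers: one already in SVY21 (EPSG:3414), one still in WGS 84.
pt_svy21 <- st_sf(name = "a",
                  geometry = st_sfc(st_point(c(30055.05, 30040.83)), crs = 3414))
pt_wgs84 <- st_sf(name = "b",
                  geometry = st_sfc(st_point(c(103.85, 1.29)), crs = 4326))
layers <- list(cbd = pt_svy21, chas = pt_wgs84)

# EPSG code of every layer, without printing the WKT.
epsg_before <- sapply(layers, function(x) st_crs(x)$epsg)
epsg_before

# Reproject only the layers that are not already in EPSG:3414.
layers <- lapply(layers, function(x) {
  if (isTRUE(st_crs(x)$epsg == 3414)) x else st_transform(x, crs = 3414)
})
epsg_after <- sapply(layers, function(x) st_crs(x)$epsg)
epsg_after
```

This keeps the check compact as more locational layers are added, at the cost of hiding the full CRS description that st_crs() prints.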

3.1.2 Checking For Invalid Geometries

length(which(st_is_valid(sg_area) == FALSE))
[1] 6

Oh no! It looks like 6 records in sg_area have invalid geometries. Before proceeding, let’s remove these records from sg_area.

sg_area <-  sg_area %>% filter(st_is_valid(sg_area) == TRUE)
sg_area
Simple feature collection with 326 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 103.6057 ymin: 1.158699 xmax: 104.0844 ymax: 1.470775
Geodetic CRS:  WGS 84
First 10 features:
                 SUBZONE_N SUBZONE_C      PLN_AREA_N PLN_AREA_C       REGION_N
1              MARINA EAST    MESZ01     MARINA EAST         ME CENTRAL REGION
2         INSTITUTION HILL    RVSZ05    RIVER VALLEY         RV CENTRAL REGION
3           ROBERTSON QUAY    SRSZ01 SINGAPORE RIVER         SR CENTRAL REGION
4  JURONG ISLAND AND BUKOM    WISZ01 WESTERN ISLANDS         WI    WEST REGION
5             FORT CANNING    MUSZ02          MUSEUM         MU CENTRAL REGION
6         MARINA EAST (MP)    MPSZ05   MARINE PARADE         MP CENTRAL REGION
7                   SUDONG    WISZ03 WESTERN ISLANDS         WI    WEST REGION
8                  SEMAKAU    WISZ02 WESTERN ISLANDS         WI    WEST REGION
9           CITY TERMINALS    BMSZ17     BUKIT MERAH         BM CENTRAL REGION
10                   ANSON    DTSZ10   DOWNTOWN CORE         DT CENTRAL REGION
   REGION_C                       geometry
1        CR MULTIPOLYGON (((103.8802 1....
2        CR MULTIPOLYGON (((103.8376 1....
3        CR MULTIPOLYGON (((103.8341 1....
4        WR MULTIPOLYGON (((103.7125 1....
5        CR MULTIPOLYGON (((103.8472 1....
6        CR MULTIPOLYGON (((103.8987 1....
7        WR MULTIPOLYGON (((103.7235 1....
8        WR MULTIPOLYGON (((103.76 1.21...
9        CR MULTIPOLYGON (((103.8323 1....
10       CR MULTIPOLYGON (((103.8441 1....
length(which(st_is_valid(sg_area) == FALSE))
[1] 0

Now, all 6 invalid rows have been successfully dropped, leaving 326 features in sg_area.
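
Dropping invalid records also drops the subzones they represent. A possible alternative, shown here only as a hedged sketch on a hypothetical toy layer (`toy`, with a self-intersecting "bowtie" polygon), is to repair the geometries with st_make_valid() so that no rows are lost. This is not what the exercise above does.

```r
library(sf)

# Hypothetical toy data: a self-intersecting "bowtie" and an ordinary square.
bowtie <- st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0))))
square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
toy <- st_sf(id = c("bowtie", "square"), geometry = st_sfc(bowtie, square))

# The bowtie fails the validity check, the square passes.
valid_before <- st_is_valid(toy)
valid_before

# Repair instead of filter: every row is kept.
repaired <- st_make_valid(toy)
valid_after <- st_is_valid(repaired)
valid_after
nrow(repaired)
```

Whether to repair or to drop depends on the analysis; a repaired polygon can change shape, so it is worth plotting the result before relying on it.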

length(which(st_is_valid(childcare_sf) == FALSE))
[1] 0
length(which(st_is_valid(eldercare_sf) == FALSE))
[1] 0
length(which(st_is_valid(busstop_sf) == FALSE))
[1] 0
length(which(st_is_valid(primaryschools_sf) == FALSE))
[1] 0
length(which(st_is_valid(mrt_sf) == FALSE))
[1] 0
length(which(st_is_valid(kindergarten_sf) == FALSE))
[1] 0
length(which(st_is_valid(hawkercentre_sf) == FALSE))
[1] 0
length(which(st_is_valid(nationalparks_sf) == FALSE))
[1] 0
length(which(st_is_valid(supermarkets_sf) == FALSE))
[1] 0
length(which(st_is_valid(topprimary_sf) == FALSE))
[1] 0
length(which(st_is_valid(mall_coordinates_sf) == FALSE))
[1] 0
length(which(st_is_valid(cbd_coords_sf) == FALSE))
[1] 0
length(which(st_is_valid(chas_sf) == FALSE))
[1] 0
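
The same validity check is repeated for every layer. A minimal sketch of running it once over a named list is given below; the list `layers` and its two toy layers are hypothetical and only illustrate the pattern.

```r
library(sf)

# Hypothetical toy layers: two valid points and one invalid (self-intersecting) polygon.
pts <- st_sfc(st_point(c(0, 0)), st_point(c(1, 1)))
bowtie <- st_sfc(st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0)))))
layers <- list(
  points_layer  = st_sf(geometry = pts),
  polygon_layer = st_sf(geometry = bowtie)
)

# Number of invalid geometries per layer; NA results are ignored,
# as they are with length(which(... == FALSE)).
invalid_counts <- sapply(layers, function(x) sum(!st_is_valid(x), na.rm = TRUE))
invalid_counts
```

Any layer with a non-zero count can then be inspected individually.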

3.1.3 Checking For Missing Values

Let’s check for rows that contain missing values.

sg_area[rowSums(is.na(sg_area))!=0,]
Simple feature collection with 0 features and 6 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Geodetic CRS:  WGS 84
[1] SUBZONE_N  SUBZONE_C  PLN_AREA_N PLN_AREA_C REGION_N   REGION_C   geometry  
<0 rows> (or 0-length row.names)

Looks like there are no rows with missing values!
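
A row-level check shows whether NAs exist but not which columns hold them. As a hedged sketch on hypothetical toy data (the object `toy` and the names "Centre A" and "Centre B" are made up; the column names merely mimic those found in the downloaded layers), NAs can be counted per column, and the columns to keep can then be selected by name rather than by position:

```r
library(sf)
library(dplyr)

# Hypothetical toy layer with one all-NA attribute column.
toy <- st_sf(
  ADDRESSPOS = c("601318", "462509"),
  ADDRESSBLO = c(NA_character_, NA_character_),
  NAME       = c("Centre A", "Centre B"),
  geometry   = st_sfc(st_point(c(16614.08, 36639.12)),
                      st_point(c(38803.81, 35098.78)), crs = 3414)
)

# NA count per attribute column (geometry dropped for the count only).
na_per_column <- colSums(is.na(st_drop_geometry(toy)))
na_per_column

# Keep the needed columns by name; the geometry column stays attached.
toy_clean <- toy %>% select(ADDRESSPOS, NAME)
names(toy_clean)

# Re-run the row-level check on the cleaned layer.
rows_with_na <- nrow(toy_clean[rowSums(is.na(st_drop_geometry(toy_clean))) != 0, ])
rows_with_na
```

Selecting by name keeps the code readable and avoids picking the wrong column if a later download changes the column order.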

childcare_sf[rowSums(is.na(childcare_sf))!=0,]
Simple feature collection with 0 features and 2 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] Name       Descriptio geometry  
<0 rows> (or 0-length row.names)
eldercare_sf[rowSums(is.na(eldercare_sf))!=0,]
Simple feature collection with 133 features and 18 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 14481.92 ymin: 28218.43 xmax: 41665.14 ymax: 46804.9
Projected CRS: SVY21 / Singapore TM
First 10 features:
   OBJECTID ADDRESSBLO ADDRESSBUI ADDRESSPOS
1         1       <NA>       <NA>     601318
2         2       <NA>       <NA>     462509
3         3       <NA>       <NA>     640190
4         4       <NA>       <NA>     190005
5         5       <NA>       <NA>     160044
6         6       <NA>       <NA>     160117
7         7       <NA>       <NA>     523499
8         8       <NA>       <NA>     731569
9         9       <NA>       <NA>     651210
10       10       <NA>       <NA>     540182
                              ADDRESSSTR ADDRESSTYP DESCRIPTIO HYPERLINK
1      318A Jurong East Avenue 1 #02-308       <NA>       <NA>      <NA>
2  Blk 509B Bedok North Street 3 #02-157       <NA>       <NA>      <NA>
3         Blk 190 Boon Lay Drive #01-242       <NA>       <NA>      <NA>
4                    5 Beach Rd #02-4943       <NA>       <NA>      <NA>
5             Blk 44 Beo Crescent #01-67       <NA>       <NA>      <NA>
6     Blk 117 Jalan Bukit Merah #01-1683       <NA>       <NA>      <NA>
7            499C Tampines Ave 9 #01-256       <NA>       <NA>      <NA>
8              569A Champion Way #01-346       <NA>       <NA>      <NA>
9         210A Bukit Batok St 21 #01-294       <NA>       <NA>      <NA>
10   Blk 182 Rivervale Crescent\n#01-311       <NA>       <NA>      <NA>
   LANDXADDRE LANDYADDRE
1           0          0
2           0          0
3           0          0
4           0          0
5           0          0
6           0          0
7           0          0
8           0          0
9           0          0
10          0          0
                                                           NAME PHOTOURL
1                                  Yuhua Senior Activity Centre     <NA>
2                                          THK SAC @ Kaki Bukit     <NA>
3                                            THK SAC @ Boon Lay     <NA>
4                        PEACE-Connect Senior Activity Centre@5     <NA>
5                                        THK SAC @ Beo Crescent     <NA>
6                                      Silver ACE @ Bukit Merah     <NA>
7  Lions Befrienders Senior Activity Centre @ Tampines Blk 499C     <NA>
8                    Care Corner Senior Activity Centre (WL569)     <NA>
9           Fei Yue Senior Activity Centre (Bukit Batok Branch)     <NA>
10       COMNET Senior Activity Centre @ 182 Rivervale Crescent     <NA>
   ADDRESSFLO          INC_CRC FMEL_UPD_D ADDRESSUNI   X_ADDR   Y_ADDR
1        <NA> 2B0DB92FDD914FFC 2016-07-28       <NA> 16614.08 36639.12
2        <NA> 82728FA30612F3FD 2016-07-28       <NA> 38803.81 35098.78
3        <NA> DE7A8D4EA0BD1D9B 2016-07-28       <NA> 14481.92 36357.61
4        <NA> A2C058FC5785F7FE 2016-07-28       <NA> 31505.35 31853.52
5        <NA> 9DBFD51E056AEE70 2016-07-28       <NA> 27218.35 30135.49
6        <NA> 169DABA5B6ECEA87 2016-07-28       <NA> 27278.94 29350.17
7        <NA> 5DB6B9F0BF276F6D 2016-07-28       <NA> 41665.14 37956.92
8        <NA> 4DC6800E9BB385BE 2016-07-28       <NA> 23147.94 45761.17
9        <NA> EFBD712DA5DD6FEC 2016-07-28       <NA> 18820.58 36396.32
10       <NA> 6BB0D7698D7B4C7D 2016-07-28       <NA> 36446.37 41376.90
                    geometry
1  POINT (16614.08 36639.12)
2  POINT (38803.81 35098.78)
3  POINT (14481.92 36357.61)
4  POINT (31505.35 31853.52)
5  POINT (27218.35 30135.49)
6  POINT (27278.94 29350.17)
7  POINT (41665.14 37956.92)
8  POINT (23147.94 45761.17)
9  POINT (18820.58 36396.32)
10  POINT (36446.37 41376.9)

There are NA values, but on closer inspection they occur only in columns that we do not need. Let’s drop these irrelevant columns from eldercare_sf and keep only the postal code, street address, name, coordinates and geometry.

eldercare_sf <- eldercare_sf %>% select(4,5,11,17,18,19)
eldercare_sf[rowSums(is.na(eldercare_sf))!=0,]
Simple feature collection with 0 features and 5 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] ADDRESSPOS ADDRESSSTR NAME       X_ADDR     Y_ADDR     geometry  
<0 rows> (or 0-length row.names)
primaryschools_sf[rowSums(is.na(primaryschools_sf))!=0,]
Simple feature collection with 0 features and 4 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
# A tibble: 0 × 5
# … with 5 variables: school_name <chr>, address <chr>, postal_code <chr>,
#   mainlevel_code <chr>, geometry <GEOMETRY [m]>
mrt_sf[rowSums(is.na(mrt_sf))!=0,]
Simple feature collection with 0 features and 2 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] Name        Description geometry   
<0 rows> (or 0-length row.names)
kindergarten_sf[rowSums(is.na(kindergarten_sf))!=0,]
Simple feature collection with 448 features and 15 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 11909.7 ymin: 25596.33 xmax: 43395.47 ymax: 48562.06
Projected CRS: SVY21 / Singapore TM
First 10 features:
   ADDRESSBLO ADDRESSBUI ADDRESSFLO ADDRESSPOS
1        <NA>       <NA>       <NA>     560644
2        <NA>       <NA>       <NA>     600251
3        <NA>       <NA>       <NA>     600317
4        <NA>       <NA>       <NA>     671455
5        <NA>       <NA>       <NA>     670528
6        <NA>       <NA>       <NA>     670620
7        <NA>       <NA>       <NA>     380008
8        <NA>       <NA>       <NA>     118643
9        <NA>       <NA>       <NA>     519420
10       <NA>       <NA>       <NA>     299574
                                                                   ADDRESSSTR
1                                     644 Ang Mo Kio Ave 4  #01-850 S(560644)
2                         251 Jurong East Street 24 Blk 247 #01-110 S(600251)
3                                 317 Jurong East Street 31  #01-14 S(600317)
4                                           455A Segar Road  #01-01 S(671455)
5                             528 Jelapang Road Blk 526, 532 #01-79 S(670528)
6                  620 Bukit Panjang Ring Road Blk 615, 619 #01-828 S(670620)
7  8, UPPER BOON KENG ROAD, #01 - 03, MULTI STOREY CAR PARK, SINGAPORE 380008
8                                          302 Pasir Panjang Road   S(118643)
9                                             4 Pasir Ris Drive 6   S(519420)
10                                                1 Dunearn Close   S(299574)
   ADDRESSTYP    DESCRIPTIO HYPERLINK LANDXADDRE LANDYADDRE
1        <NA> Kindergartens      <NA>          0          0
2        <NA> Kindergartens      <NA>          0          0
3        <NA> Kindergartens      <NA>          0          0
4        <NA> Kindergartens      <NA>          0          0
5        <NA> Kindergartens      <NA>          0          0
6        <NA> Kindergartens      <NA>          0          0
7        <NA> Kindergartens      <NA>          0          0
8        <NA> Kindergartens      <NA>          0          0
9        <NA> Kindergartens      <NA>          0          0
10       <NA> Kindergartens      <NA>          0          0
                                                    NAME PHOTOURL
1  PCF Sparkletots Preschool @ Yio Chu Kang Blk 644 (KN)     <NA>
2         PCF Sparkletots Preschool @ Yuhua Blk 251 (KN)     <NA>
3         PCF Sparkletots Preschool @ Yuhua Blk 317 (KN)     <NA>
4     PCF Sparkletots Preschool @ Zhenghua Blk 455A (KN)     <NA>
5      PCF Sparkletots Preschool @ Zhenghua Blk 528 (KN)     <NA>
6      PCF Sparkletots Preschool @ Zhenghua Blk 620 (KN)     <NA>
7        PCF SPARKLETOTS PRESCHOOL@KOLAM AYER BLK 8 (DS)     <NA>
8                      Pearlbank Montessori Kindergarten     <NA>
9                Pentecost Methodist Church Kindergarten     <NA>
10                                Pibos Garden Preschool     <NA>
            INC_CRC FMEL_UPD_D ADDRESSUNI                  geometry
1  904D106E26156265 2020-08-13       <NA>  POINT (28847.97 40124.9)
2  F735342764BD6BCC 2020-08-13       <NA> POINT (17578.65 36146.17)
3  564523E27D221C4D 2020-08-13       <NA> POINT (16609.18 36520.38)
4  7EED470791C85B92 2020-08-13       <NA> POINT (21088.36 41055.75)
5  DEB389E855725B16 2020-08-13       <NA>  POINT (20457.5 40777.31)
6  AB8E78A4C780367B 2020-08-13       <NA>  POINT (20033.94 40866.5)
7  FCAA7F53E78614D6 2020-08-13       <NA> POINT (32159.15 32778.05)
8  52E6CED97C628B57 2020-08-13       <NA> POINT (22042.91 29991.61)
9  8C80D6087330C37A 2020-08-13       <NA> POINT (41780.92 39252.46)
10 179B665553AAD262 2020-08-13       <NA> POINT (25916.23 33989.13)
kindergarten_sf <- kindergarten_sf %>% select(4,5,7,11,13,16)
kindergarten_sf
Simple feature collection with 448 features and 5 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 11909.7 ymin: 25596.33 xmax: 43395.47 ymax: 48562.06
Projected CRS: SVY21 / Singapore TM
First 10 features:
   ADDRESSPOS
1      560644
2      600251
3      600317
4      671455
5      670528
6      670620
7      380008
8      118643
9      519420
10     299574
                                                                   ADDRESSSTR
1                                     644 Ang Mo Kio Ave 4  #01-850 S(560644)
2                         251 Jurong East Street 24 Blk 247 #01-110 S(600251)
3                                 317 Jurong East Street 31  #01-14 S(600317)
4                                           455A Segar Road  #01-01 S(671455)
5                             528 Jelapang Road Blk 526, 532 #01-79 S(670528)
6                  620 Bukit Panjang Ring Road Blk 615, 619 #01-828 S(670620)
7  8, UPPER BOON KENG ROAD, #01 - 03, MULTI STOREY CAR PARK, SINGAPORE 380008
8                                          302 Pasir Panjang Road   S(118643)
9                                             4 Pasir Ris Drive 6   S(519420)
10                                                1 Dunearn Close   S(299574)
      DESCRIPTIO                                                  NAME
1  Kindergartens PCF Sparkletots Preschool @ Yio Chu Kang Blk 644 (KN)
2  Kindergartens        PCF Sparkletots Preschool @ Yuhua Blk 251 (KN)
3  Kindergartens        PCF Sparkletots Preschool @ Yuhua Blk 317 (KN)
4  Kindergartens    PCF Sparkletots Preschool @ Zhenghua Blk 455A (KN)
5  Kindergartens     PCF Sparkletots Preschool @ Zhenghua Blk 528 (KN)
6  Kindergartens     PCF Sparkletots Preschool @ Zhenghua Blk 620 (KN)
7  Kindergartens       PCF SPARKLETOTS PRESCHOOL@KOLAM AYER BLK 8 (DS)
8  Kindergartens                     Pearlbank Montessori Kindergarten
9  Kindergartens               Pentecost Methodist Church Kindergarten
10 Kindergartens                                Pibos Garden Preschool
            INC_CRC                  geometry
1  904D106E26156265  POINT (28847.97 40124.9)
2  F735342764BD6BCC POINT (17578.65 36146.17)
3  564523E27D221C4D POINT (16609.18 36520.38)
4  7EED470791C85B92 POINT (21088.36 41055.75)
5  DEB389E855725B16  POINT (20457.5 40777.31)
6  AB8E78A4C780367B  POINT (20033.94 40866.5)
7  FCAA7F53E78614D6 POINT (32159.15 32778.05)
8  52E6CED97C628B57 POINT (22042.91 29991.61)
9  8C80D6087330C37A POINT (41780.92 39252.46)
10 179B665553AAD262 POINT (25916.23 33989.13)
kindergarten_sf[rowSums(is.na(kindergarten_sf))!=0,]
Simple feature collection with 0 features and 5 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] ADDRESSPOS ADDRESSSTR DESCRIPTIO NAME       INC_CRC    geometry  
<0 rows> (or 0-length row.names)
hawkercentre_sf[rowSums(is.na(hawkercentre_sf))!=0,]
Simple feature collection with 125 features and 21 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12874.19 ymin: 28355.97 xmax: 45241.4 ymax: 47872.53
Projected CRS: SVY21 / Singapore TM
First 10 features:
   ADDRESSBLO   STATUS CLEANINGST ADDRESSUNI ADDRESSFLO HYPERLINK INFO_ON_CO
1         630 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
2          16 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
3          29 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
4         38A Existing       <NA>       <NA>       <NA>      <NA>       <NA>
5         166 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
6      221A/B Existing       <NA>       <NA>       <NA>      <NA>       <NA>
7         665 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
8         163 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
9         120 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
10        115 Existing       <NA>       <NA>       <NA>      <NA>       <NA>
   AWARDED_DA LANDYADDRE CLEANINGEN
1        <NA>   35039.64       <NA>
2        <NA>   33645.70       <NA>
3        <NA>   33497.85       <NA>
4        <NA>   30136.92       <NA>
5        <NA>   32184.16       <NA>
6        <NA>   36373.79       <NA>
7        <NA>   32059.13       <NA>
8        <NA>   29569.73       <NA>
9        <NA>   29857.81       <NA>
10       <NA>   29770.11       <NA>
                                                                                       PHOTOURL
1  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1262155176127.jpg
2  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1267853021378.jpg
3  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1262226446535.jpg
4  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1267807628554.jpg
5  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1265176120532.jpg
6  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1262154766447.jpg
7  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1262156313186.jpg
8  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1267851995788.jpg
9  http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1267852344747.jpg
10 http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/resize_1267851251303.jpg
               DESCRIPTIO
1          HUP Rebuilding
2  HUP Standard Upgrading
3  HUP Standard Upgrading
4  HUP Standard Upgrading
5  HUP Standard Upgrading
6     HUP Reconfiguration
7  HUP Standard Upgrading
8        Opted out of HUP
9  HUP Standard Upgrading
10 HUP Standard Upgrading
                                                                         NAME
1                                                Bedok Reservoir Road Blk 630
2                                                     Bedok South Road Blk 16
3                    Bendemeer Road Blk 29 (Bendemeer Market and Food Centre)
4                                                         Beo Crescent Market
5                                                          Berseh Food Centre
6          Boon Lay Place Blk 221A/B (Boon Lay Place Market and Food Village)
7                         Buffalo Road Blk 665 (Tekka Centre/Zhu Jiao Market)
8               Bukit Merah Central Blk 163 (Bukit Merah Central Food Centre)
9                  Bukit Merah Lane 1 Blk 120 (Alexandra Village Food Centre)
10 Bukit Merah View Blk 115 (Blk 115 Bukit Merah View Market and Food Centre)
   ADDRESSTYP ADDRESSBUI LANDXADDRE           ADDRESSSTR ADDRESSPOS IMPLEMENTA
1           I       <NA>   36985.00 Bedok Reservoir Road     470630       <NA>
2           I       <NA>   39376.14     Bedok South Road     460016       <NA>
3           I       <NA>   31305.63       Bendemeer Road     330029       <NA>
4           I       <NA>   27336.64         Beo Crescent     169982       <NA>
5           I       <NA>   30619.27          Jalan Besar     208877       <NA>
6           I       <NA>   14587.57       Boon Lay Place     641221       <NA>
7           I       <NA>   29915.58         Buffalo Road     210665       <NA>
8           I       <NA>   26183.03  Bukit Merah Central     150163       <NA>
9           I       <NA>   24791.54   Bukit Merah Lane 1     150120       <NA>
10          I       <NA>   26745.83     Bukit Merah View     151115       <NA>
            INC_CRC FMEL_UPD_D                  geometry
1  BBA7FF2BCA329EE8 2021-10-25    POINT (36985 35039.64)
2  483F3877B4EE039F 2021-10-25  POINT (39376.14 33645.7)
3  A7DB5A3EB9F35DE3 2021-10-25 POINT (31305.63 33497.85)
4  D232402B3632CAB2 2021-10-25 POINT (27336.64 30136.92)
5  A7CCEB2CC0B77C6D 2021-10-25 POINT (30619.27 32184.16)
6  84F8C2B4D627C21E 2021-10-25 POINT (14587.57 36373.79)
7  D09ADF340C3B739E 2021-10-25 POINT (29915.58 32059.13)
8  16F0047A1B5AC0D0 2021-10-25 POINT (26183.03 29569.73)
9  98A3C26D3632E680 2021-10-25 POINT (24791.54 29857.81)
10 7A2BDBF509B4F7FF 2021-10-25 POINT (26745.83 29770.11)
hawkercentre_sf <- hawkercentre_sf %>% select(1,2,9,12,13,16,17,18,20,21,22)
hawkercentre_sf
Simple feature collection with 125 features and 10 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12874.19 ymin: 28355.97 xmax: 45241.4 ymax: 47872.53
Projected CRS: SVY21 / Singapore TM
First 10 features:
   ADDRESSBLO   STATUS LANDYADDRE             DESCRIPTIO
1         630 Existing   35039.64         HUP Rebuilding
2          16 Existing   33645.70 HUP Standard Upgrading
3          29 Existing   33497.85 HUP Standard Upgrading
4         38A Existing   30136.92 HUP Standard Upgrading
5         166 Existing   32184.16 HUP Standard Upgrading
6      221A/B Existing   36373.79    HUP Reconfiguration
7         665 Existing   32059.13 HUP Standard Upgrading
8         163 Existing   29569.73       Opted out of HUP
9         120 Existing   29857.81 HUP Standard Upgrading
10        115 Existing   29770.11 HUP Standard Upgrading
                                                                         NAME
1                                                Bedok Reservoir Road Blk 630
2                                                     Bedok South Road Blk 16
3                    Bendemeer Road Blk 29 (Bendemeer Market and Food Centre)
4                                                         Beo Crescent Market
5                                                          Berseh Food Centre
6          Boon Lay Place Blk 221A/B (Boon Lay Place Market and Food Village)
7                         Buffalo Road Blk 665 (Tekka Centre/Zhu Jiao Market)
8               Bukit Merah Central Blk 163 (Bukit Merah Central Food Centre)
9                  Bukit Merah Lane 1 Blk 120 (Alexandra Village Food Centre)
10 Bukit Merah View Blk 115 (Blk 115 Bukit Merah View Market and Food Centre)
   LANDXADDRE           ADDRESSSTR ADDRESSPOS          INC_CRC FMEL_UPD_D
1    36985.00 Bedok Reservoir Road     470630 BBA7FF2BCA329EE8 2021-10-25
2    39376.14     Bedok South Road     460016 483F3877B4EE039F 2021-10-25
3    31305.63       Bendemeer Road     330029 A7DB5A3EB9F35DE3 2021-10-25
4    27336.64         Beo Crescent     169982 D232402B3632CAB2 2021-10-25
5    30619.27          Jalan Besar     208877 A7CCEB2CC0B77C6D 2021-10-25
6    14587.57       Boon Lay Place     641221 84F8C2B4D627C21E 2021-10-25
7    29915.58         Buffalo Road     210665 D09ADF340C3B739E 2021-10-25
8    26183.03  Bukit Merah Central     150163 16F0047A1B5AC0D0 2021-10-25
9    24791.54   Bukit Merah Lane 1     150120 98A3C26D3632E680 2021-10-25
10   26745.83     Bukit Merah View     151115 7A2BDBF509B4F7FF 2021-10-25
                    geometry
1     POINT (36985 35039.64)
2   POINT (39376.14 33645.7)
3  POINT (31305.63 33497.85)
4  POINT (27336.64 30136.92)
5  POINT (30619.27 32184.16)
6  POINT (14587.57 36373.79)
7  POINT (29915.58 32059.13)
8  POINT (26183.03 29569.73)
9  POINT (24791.54 29857.81)
10 POINT (26745.83 29770.11)
hawkercentre_sf <- na.omit(hawkercentre_sf)
hawkercentre_sf
Simple feature collection with 116 features and 10 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12874.19 ymin: 28355.97 xmax: 45241.4 ymax: 47282.71
Projected CRS: SVY21 / Singapore TM
First 10 features:
   ADDRESSBLO   STATUS LANDYADDRE             DESCRIPTIO
1         630 Existing   35039.64         HUP Rebuilding
2          16 Existing   33645.70 HUP Standard Upgrading
3          29 Existing   33497.85 HUP Standard Upgrading
4         38A Existing   30136.92 HUP Standard Upgrading
5         166 Existing   32184.16 HUP Standard Upgrading
6      221A/B Existing   36373.79    HUP Reconfiguration
7         665 Existing   32059.13 HUP Standard Upgrading
8         163 Existing   29569.73       Opted out of HUP
9         120 Existing   29857.81 HUP Standard Upgrading
10        115 Existing   29770.11 HUP Standard Upgrading
                                                                         NAME
1                                                Bedok Reservoir Road Blk 630
2                                                     Bedok South Road Blk 16
3                    Bendemeer Road Blk 29 (Bendemeer Market and Food Centre)
4                                                         Beo Crescent Market
5                                                          Berseh Food Centre
6          Boon Lay Place Blk 221A/B (Boon Lay Place Market and Food Village)
7                         Buffalo Road Blk 665 (Tekka Centre/Zhu Jiao Market)
8               Bukit Merah Central Blk 163 (Bukit Merah Central Food Centre)
9                  Bukit Merah Lane 1 Blk 120 (Alexandra Village Food Centre)
10 Bukit Merah View Blk 115 (Blk 115 Bukit Merah View Market and Food Centre)
   LANDXADDRE           ADDRESSSTR ADDRESSPOS          INC_CRC FMEL_UPD_D
1    36985.00 Bedok Reservoir Road     470630 BBA7FF2BCA329EE8 2021-10-25
2    39376.14     Bedok South Road     460016 483F3877B4EE039F 2021-10-25
3    31305.63       Bendemeer Road     330029 A7DB5A3EB9F35DE3 2021-10-25
4    27336.64         Beo Crescent     169982 D232402B3632CAB2 2021-10-25
5    30619.27          Jalan Besar     208877 A7CCEB2CC0B77C6D 2021-10-25
6    14587.57       Boon Lay Place     641221 84F8C2B4D627C21E 2021-10-25
7    29915.58         Buffalo Road     210665 D09ADF340C3B739E 2021-10-25
8    26183.03  Bukit Merah Central     150163 16F0047A1B5AC0D0 2021-10-25
9    24791.54   Bukit Merah Lane 1     150120 98A3C26D3632E680 2021-10-25
10   26745.83     Bukit Merah View     151115 7A2BDBF509B4F7FF 2021-10-25
                    geometry
1     POINT (36985 35039.64)
2   POINT (39376.14 33645.7)
3  POINT (31305.63 33497.85)
4  POINT (27336.64 30136.92)
5  POINT (30619.27 32184.16)
6  POINT (14587.57 36373.79)
7  POINT (29915.58 32059.13)
8  POINT (26183.03 29569.73)
9  POINT (24791.54 29857.81)
10 POINT (26745.83 29770.11)
hawkercentre_sf[rowSums(is.na(hawkercentre_sf))!=0,]
Simple feature collection with 0 features and 10 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
 [1] ADDRESSBLO STATUS     LANDYADDRE DESCRIPTIO NAME       LANDXADDRE
 [7] ADDRESSSTR ADDRESSPOS INC_CRC    FMEL_UPD_D geometry  
<0 rows> (or 0-length row.names)
nationalparks_sf[rowSums(is.na(nationalparks_sf)) != 0, ]
Simple feature collection with 352 features and 15 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12373.11 ymin: 21869.93 xmax: 46735.95 ymax: 49231.09
Projected CRS: SVY21 / Singapore TM
First 10 features:
   ADDRESSBLO ADDRESSBUI ADDRESSTYP
1        <NA>       <NA>       <NA>
2        <NA>       <NA>       <NA>
3        <NA>       <NA>       <NA>
4        <NA>       <NA>       <NA>
5        <NA>       <NA>       <NA>
6        <NA>       <NA>       <NA>
7        <NA>       <NA>       <NA>
8        <NA>       <NA>       <NA>
9        <NA>       <NA>       <NA>
10       <NA>       <NA>       <NA>
                                                                             HYPERLINK
1                                                                                 <NA>
2                                                                                 <NA>
3                                                                                 <NA>
4                                                                                 <NA>
5                                                                                 <NA>
6                                                                                 <NA>
7  www.nparks.gov.sg/gardens-parks-and-nature/parks-and-nature-reserves/sun-plaza-park
8                                                                                 <NA>
9                                                                                 <NA>
10                                                                                <NA>
   LANDXADDRE LANDYADDRE                          NAME PHOTOURL ADDRESSPOS
1    29594.30   29323.41              Telok Ayer Green     <NA>       <NA>
2    28695.60   39413.70 Mayflower Crescent Playground     <NA>       <NA>
3    30676.61   41137.35    Sunrise Drive Playground 1     <NA>       <NA>
4    39994.09   39355.59      Elias Terrace Playground     <NA>       <NA>
5    40813.11   33764.61         Kew Avenue Playground     <NA>       <NA>
6    37385.95   32814.41   Greenfield Drive Playground     <NA>       <NA>
7    40371.20   37926.31                Sun Plaza Park     <NA>       <NA>
8    45135.69   41460.78   Changi Point Ferry Terminal     <NA>       <NA>
9    27896.88   39226.05         Shangri-La Playground     <NA>       <NA>
10   38142.70   33595.77   Opera Estate Football Field     <NA>       <NA>
                                                      DESCRIPTIO ADDRESSSTR
1  Bounded by Amoy Street, Boon Tat Street and Telok Ayer Street       <NA>
2     At the junction of Mayflower Crescent and Mayflower Avenue       <NA>
3                                    Located along Sunrise Drive       <NA>
4                      Junction of Elias Terrace and Elias Green       <NA>
5                           Junction of Kew Avenue and Kew Drive       <NA>
6                                               Greenfield Drive       <NA>
7                     Along Tampines Avenue 7, Tampines Avenue 9       <NA>
8                                              Near Lor Bekukong       <NA>
9                  Junction of Ang Mo Kio Ave 2 and Jalan Lanjut       <NA>
10                                              Swan Lake Avenue       <NA>
   ADDRESSFLO          INC_CRC FMEL_UPD_D ADDRESSUNI                  geometry
1        <NA> 07CFE6567539200A 2020-02-18       <NA>  POINT (29594.3 29323.41)
2        <NA> B01AE2FF8B58F5CA 2020-02-18       <NA>   POINT (28695.6 39413.7)
3        <NA> 66086C14E8DACE2D 2020-02-18       <NA> POINT (30676.61 41137.35)
4        <NA> 8B06ED4574E90FC9 2020-02-18       <NA> POINT (39994.09 39355.59)
5        <NA> E3FD62E109D01A9C 2020-02-18       <NA> POINT (40813.11 33764.61)
6        <NA> 8B896BD1155428FB 2020-02-18       <NA> POINT (37385.95 32814.41)
7        <NA> 5BBA8A8EB630BA01 2020-02-18       <NA>  POINT (40371.2 37926.31)
8        <NA> AA9DDE6381971B22 2020-02-18       <NA> POINT (45135.69 41460.78)
9        <NA> FF16ED3C3767BD2C 2020-02-18       <NA> POINT (27896.88 39226.05)
10       <NA> AD2BA4AF1306E38A 2020-02-18       <NA>  POINT (38142.7 33595.77)
nationalparks_sf <- nationalparks_sf %>% select(5,6,7,13,14,16)
nationalparks_sf
Simple feature collection with 352 features and 5 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12373.11 ymin: 21869.93 xmax: 46735.95 ymax: 49231.09
Projected CRS: SVY21 / Singapore TM
First 10 features:
   LANDXADDRE LANDYADDRE                          NAME          INC_CRC
1    29594.30   29323.41              Telok Ayer Green 07CFE6567539200A
2    28695.60   39413.70 Mayflower Crescent Playground B01AE2FF8B58F5CA
3    30676.61   41137.35    Sunrise Drive Playground 1 66086C14E8DACE2D
4    39994.09   39355.59      Elias Terrace Playground 8B06ED4574E90FC9
5    40813.11   33764.61         Kew Avenue Playground E3FD62E109D01A9C
6    37385.95   32814.41   Greenfield Drive Playground 8B896BD1155428FB
7    40371.20   37926.31                Sun Plaza Park 5BBA8A8EB630BA01
8    45135.69   41460.78   Changi Point Ferry Terminal AA9DDE6381971B22
9    27896.88   39226.05         Shangri-La Playground FF16ED3C3767BD2C
10   38142.70   33595.77   Opera Estate Football Field AD2BA4AF1306E38A
   FMEL_UPD_D                  geometry
1  2020-02-18  POINT (29594.3 29323.41)
2  2020-02-18   POINT (28695.6 39413.7)
3  2020-02-18 POINT (30676.61 41137.35)
4  2020-02-18 POINT (39994.09 39355.59)
5  2020-02-18 POINT (40813.11 33764.61)
6  2020-02-18 POINT (37385.95 32814.41)
7  2020-02-18  POINT (40371.2 37926.31)
8  2020-02-18 POINT (45135.69 41460.78)
9  2020-02-18 POINT (27896.88 39226.05)
10 2020-02-18  POINT (38142.7 33595.77)
nationalparks_sf[rowSums(is.na(nationalparks_sf)) != 0, ]
Simple feature collection with 0 features and 5 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] LANDXADDRE LANDYADDRE NAME       INC_CRC    FMEL_UPD_D geometry  
<0 rows> (or 0-length row.names)
supermarkets_sf[rowSums(is.na(supermarkets_sf)) != 0, ]
Simple feature collection with 75 features and 8 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 5096.172 ymin: 25529.08 xmax: 42504.96 ymax: 49218.44
Projected CRS: SVY21 / Singapore TM
First 10 features:
                                LIC_NAME BLK_HOUSE             STR_NAME UNIT_NO
26             B & S MINI MART PTE. LTD.       310         GEYLANG ROAD    <NA>
27                  CHEE MIN KUAN, SIMON       342  TANJONG KATONG ROAD    <NA>
32               MURUGAN TRADERS PTE LTD         9         TRACTOR ROAD    <NA>
45 SWISS BUTCHERY (2013) PRIVATE LIMITED        30     GREENWOOD AVENUE    <NA>
46            ROSWELL ENTERPRISE PTE LTD        34     GREENWOOD AVENUE    <NA>
51 COLD STORAGE SINGAPORE (1983) PTE LTD       154      WEST COAST ROAD    <NA>
52 COLD STORAGE SINGAPORE (1983) PTE LTD         1 VISTA EXCHANGE GREEN    <NA>
53         FOODIE MARKET PLACE PTE. LTD.       225          OUTRAM ROAD    <NA>
54              MUSTAFA HOLDINGS PTE LTD       147       SYED ALWI ROAD    <NA>
55                   SUPERNATURE PTE LTD       583         ORCHARD ROAD    <NA>
   POSTCODE      LIC_NO          INC_CRC FMEL_UPD_D                  geometry
26   389349 CE14Q85A000 4DFE6BEAE93CDC87 2018-10-03 POINT (33153.17 32749.73)
27   437112 SE14965K000 7827732A1824519B 2018-10-03 POINT (35031.32 31785.86)
32   627970 SW18B79P000 F1EBC8EAFDAA9EF8 2018-10-03 POINT (13730.52 34128.79)
45   289230 CE07705A000 C869661C01CC7122 2017-11-29 POINT (25025.04 34884.91)
46   289236  B03078N000 1AC0DC1360B91602 2017-11-29 POINT (25014.53 34892.79)
51   127371 SW08807K000 50EDCA33BB1577BC 2017-11-29 POINT (20428.43 31769.17)
52   138617 SW12J97L000 A0FE827922FA7CEA 2017-11-29 POINT (22980.34 32130.27)
53   169038 CE10I13B000 F07E9FCD9F10FF17 2017-11-29 POINT (28228.45 29542.99)
54   207706  W03142N000 774A8FF4716A0468 2017-11-29 POINT (30491.02 32451.33)
55   238884 CE13539N000 1CBCEAA88F882DA8 2017-11-29 POINT (27464.74 32058.01)

It looks like the UNIT_NO column in supermarkets_sf contains NA values. Since this column is not needed for the analysis, let's just drop it.

supermarkets_sf <- supermarkets_sf %>% select(1,2,3,5,6,7,8,9)
supermarkets_sf
Simple feature collection with 526 features and 7 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 4901.188 ymin: 25529.08 xmax: 46948.22 ymax: 49233.6
Projected CRS: SVY21 / Singapore TM
First 10 features:
                                      LIC_NAME BLK_HOUSE             STR_NAME
1  LI LI CHENG SUPERMARKET (PUNGGOL) PTE. LTD.      273C        PUNGGOL PLACE
2              SHENG SIONG SUPERMARKET PTE LTD        11 UPPER BOON KENG ROAD
3        COLD STORAGE SINGAPORE (1983) PTE LTD       683     HOUGANG AVENUE 8
4        COLD STORAGE SINGAPORE (1983) PTE LTD       631 BEDOK RESERVOIR ROAD
5                      YES SUPERMARKET PTE LTD      201B   TAMPINES STREET 21
6                   SUZYAMEER FROZEN PTE. LTD.      201D   TAMPINES STREET 21
7                            G8 MART PTE. LTD.       421 ANG MO KIO AVENUE 10
8              SHENG SIONG SUPERMARKET PTE LTD       233  ANG MO KIO AVENUE 3
9             PRIME SUPERMARKET (1996) PTE LTD       106     HOUGANG AVENUE 1
10                                TAN KWEE ENG       327     YISHUN RING ROAD
   POSTCODE      LIC_NO          INC_CRC FMEL_UPD_D                  geometry
1    823273 NE12I65N000 3DE8AF6E76F9D3D4 2017-11-29 POINT (35561.22 42685.17)
2    380011  E73010V000 F361759A8261CD6E 2017-11-29 POINT (32184.01 32947.46)
3    530683 NE11909C000 1DC69902E02077CE 2017-11-29 POINT (33903.48 39480.46)
4    470631  S02210X000 4E2560154B58BA38 2017-11-29 POINT (37083.82 35017.47)
5    522201  S02037J000 559A9A00D9FF8A55 2017-11-29  POINT (41320.3 37283.82)
6    524201 NE08357A000 1D32060098628881 2017-11-29 POINT (41384.47 37152.14)
7    560421 CE13401C000 E83AE5A9842F67BC 2017-11-29 POINT (30186.63 38602.77)
8    560233 CE04334P000 08D1E417EB224327 2017-11-29 POINT (28380.83 38842.16)
9    530106  S02059X000 3DA5C840D472C779 2017-11-29 POINT (34383.76 37311.19)
10   760327  B02041C000 FBB8A845FD8ADDC4 2017-11-29 POINT (29010.23 45755.51)
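
Selecting columns by position, as above, can silently pick the wrong columns if the column order of the source layer ever changes. A minimal sketch of the same drop done by name follows. The toy data frame and its values are made up for illustration; only the column name UNIT_NO comes from the output above.

```r
library(dplyr)

# Toy stand-in for supermarkets_sf (values are illustrative only)
toy <- data.frame(
  LIC_NAME = c("SHOP A", "SHOP B"),
  UNIT_NO  = c(NA, "01-23"),
  POSTCODE = c("123456", "654321")
)

# Drop the column by name instead of listing the positions to keep
trimmed <- toy %>% select(-UNIT_NO)
names(trimmed)
```

On an sf object, select() keeps the geometry column automatically, so the same call should work on supermarkets_sf.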
supermarkets_sf[rowSums(is.na(supermarkets_sf)) != 0, ]
Simple feature collection with 0 features and 7 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] LIC_NAME   BLK_HOUSE  STR_NAME   POSTCODE   LIC_NO     INC_CRC    FMEL_UPD_D
[8] geometry  
<0 rows> (or 0-length row.names)
topprimary_sf[rowSums(is.na(topprimary_sf)) != 0, ]
Simple feature collection with 0 features and 2 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] address  postal   geometry
<0 rows> (or 0-length row.names)
mall_coordinates_sf[rowSums(is.na(mall_coordinates_sf))!=0,]
Simple feature collection with 0 features and 2 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] address  postal   geometry
<0 rows> (or 0-length row.names)
cbd_coords_sf[rowSums(is.na(cbd_coords_sf))!=0,]
Simple feature collection with 0 features and 1 field
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] name     geometry
<0 rows> (or 0-length row.names)
chas_sf[rowSums(is.na(chas_sf))!=0,]
Simple feature collection with 0 features and 2 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] Name        Description geometry   
<0 rows> (or 0-length row.names)

4.0 Importing Aspatial Data

hdb_resale = read_csv("data/geospatial/data_extracted/resale/resale-flat-prices-based-on-registration-date-from-jan-2017-onwards.csv")
hdb_resale
# A tibble: 148,000 × 11
   month   town    flat_…¹ block stree…² store…³ floor…⁴ flat_…⁵ lease…⁶ remai…⁷
   <chr>   <chr>   <chr>   <chr> <chr>   <chr>     <dbl> <chr>     <dbl> <chr>  
 1 2017-01 ANG MO… 2 ROOM  406   ANG MO… 10 TO …      44 Improv…    1979 61 yea…
 2 2017-01 ANG MO… 3 ROOM  108   ANG MO… 01 TO …      67 New Ge…    1978 60 yea…
 3 2017-01 ANG MO… 3 ROOM  602   ANG MO… 01 TO …      67 New Ge…    1980 62 yea…
 4 2017-01 ANG MO… 3 ROOM  465   ANG MO… 04 TO …      68 New Ge…    1980 62 yea…
 5 2017-01 ANG MO… 3 ROOM  601   ANG MO… 01 TO …      67 New Ge…    1980 62 yea…
 6 2017-01 ANG MO… 3 ROOM  150   ANG MO… 01 TO …      68 New Ge…    1981 63 yea…
 7 2017-01 ANG MO… 3 ROOM  447   ANG MO… 04 TO …      68 New Ge…    1979 61 yea…
 8 2017-01 ANG MO… 3 ROOM  218   ANG MO… 04 TO …      67 New Ge…    1976 58 yea…
 9 2017-01 ANG MO… 3 ROOM  447   ANG MO… 04 TO …      68 New Ge…    1979 61 yea…
10 2017-01 ANG MO… 3 ROOM  571   ANG MO… 01 TO …      67 New Ge…    1979 61 yea…
# … with 147,990 more rows, 1 more variable: resale_price <dbl>, and
#   abbreviated variable names ¹​flat_type, ²​street_name, ³​storey_range,
#   ⁴​floor_area_sqm, ⁵​flat_model, ⁶​lease_commence_date, ⁷​remaining_lease

Let’s check for NA values.

hdb_resale[rowSums(is.na(hdb_resale)) != 0, ]
# A tibble: 0 × 11
# … with 11 variables: month <chr>, town <chr>, flat_type <chr>, block <chr>,
#   street_name <chr>, storey_range <chr>, floor_area_sqm <dbl>,
#   flat_model <chr>, lease_commence_date <dbl>, remaining_lease <chr>,
#   resale_price <dbl>

Looks like there’s none!
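
The NA check used for every data set so far follows one pattern: keep the rows where at least one column is NA. A minimal sketch of that pattern as a reusable helper is shown here. The helper name na_rows() and the toy data frame are assumptions made for illustration and are not part of the original workflow.

```r
# Hypothetical helper: return the rows that contain at least one NA
na_rows <- function(df) {
  df[rowSums(is.na(df)) != 0, ]
}

# Toy data frame standing in for one of the layers (values are made up)
toy <- data.frame(
  name = c("A", "B", "C"),
  unit = c(NA, "01-02", "03-04")
)

na_rows(toy)          # rows with any NA
colSums(is.na(toy))   # NA count per column, useful for deciding what to drop
```

The per-column counts from colSums() make it easier to tell whether NAs sit in a column that can be dropped or in rows that need na.omit().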

Looking at hdb_resale, we realise that there are no longitude and latitude columns, so the addresses will have to be geocoded later. Moreover, we should be focusing on three-room, four-room or five-room flats; here we keep only the four-room flats.

hdb_resale <- hdb_resale  %>% 
  filter(flat_type == "4 ROOM")
hdb_resale
# A tibble: 61,830 × 11
   month   town    flat_…¹ block stree…² store…³ floor…⁴ flat_…⁵ lease…⁶ remai…⁷
   <chr>   <chr>   <chr>   <chr> <chr>   <chr>     <dbl> <chr>     <dbl> <chr>  
 1 2017-01 ANG MO… 4 ROOM  472   ANG MO… 10 TO …      92 New Ge…    1979 61 yea…
 2 2017-01 ANG MO… 4 ROOM  475   ANG MO… 07 TO …      91 New Ge…    1979 61 yea…
 3 2017-01 ANG MO… 4 ROOM  629   ANG MO… 01 TO …      94 New Ge…    1981 63 yea…
 4 2017-01 ANG MO… 4 ROOM  546   ANG MO… 01 TO …      92 New Ge…    1981 63 yea…
 5 2017-01 ANG MO… 4 ROOM  131   ANG MO… 01 TO …      98 New Ge…    1979 61 yea…
 6 2017-01 ANG MO… 4 ROOM  254   ANG MO… 04 TO …      97 New Ge…    1977 59 yea…
 7 2017-01 ANG MO… 4 ROOM  470   ANG MO… 04 TO …      92 New Ge…    1979 61 yea…
 8 2017-01 ANG MO… 4 ROOM  601   ANG MO… 04 TO …      91 New Ge…    1980 62 yea…
 9 2017-01 ANG MO… 4 ROOM  463   ANG MO… 04 TO …      92 New Ge…    1980 62 yea…
10 2017-01 ANG MO… 4 ROOM  207   ANG MO… 04 TO …      97 New Ge…    1976 58 yea…
# … with 61,820 more rows, 1 more variable: resale_price <dbl>, and abbreviated
#   variable names ¹​flat_type, ²​street_name, ³​storey_range, ⁴​floor_area_sqm,
#   ⁵​flat_model, ⁶​lease_commence_date, ⁷​remaining_lease
hdb_resale <- hdb_resale %>% filter(month >= "2021-01" & month <= "2023-02")
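
The month filter above compares strings directly. This works because month is stored as zero-padded "YYYY-MM" text, so alphabetical order matches chronological order. The sketch below illustrates this on a made-up vector of months and also shows how the same comparison could later separate the training window (January 2021 to December 2022) from the test window (January and February 2023). The variable names are assumptions for illustration only.

```r
# Made-up months in the same "YYYY-MM" format as the month column
months <- c("2020-12", "2021-01", "2022-12", "2023-01", "2023-02", "2023-03")

# Same condition as the filter above
in_window <- months >= "2021-01" & months <= "2023-02"
kept <- months[in_window]

# Split the kept months into training and test periods
train_months <- kept[kept <= "2022-12"]
test_months  <- kept[kept >= "2023-01"]
```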

4.1 Transform Resale Data

4.1.1 Create new columns

Here, we use the mutate() function of the dplyr package to create the following columns:

  • address: the block and street_name columns concatenated with the paste() function of base R

  • remaining_lease_yr & remaining_lease_mth: the year and month parts of remaining_lease, extracted with the str_sub() function of the stringr package and converted from character to integer with the as.integer() function of base R

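hdb_resale_transformed <- hdb_resale %>%
  mutate(
    # Full address, e.g. block "547" and its street name joined by a space
    address = paste(block, street_name),
    # Characters 1-2 hold the years part of remaining_lease
    remaining_lease_yr = as.integer(str_sub(remaining_lease, 1, 2)),
    # Characters 9-11 hold the months part; NA when no months are stated
    remaining_lease_mth = as.integer(str_sub(remaining_lease, 9, 11))
  )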
hdb_resale_transformed
# A tibble: 25,505 × 14
   month   town    flat_…¹ block stree…² store…³ floor…⁴ flat_…⁵ lease…⁶ remai…⁷
   <chr>   <chr>   <chr>   <chr> <chr>   <chr>     <dbl> <chr>     <dbl> <chr>  
 1 2021-01 ANG MO… 4 ROOM  547   ANG MO… 04 TO …      92 New Ge…    1981 59 yea…
 2 2021-01 ANG MO… 4 ROOM  414   ANG MO… 01 TO …      92 New Ge…    1979 57 yea…
 3 2021-01 ANG MO… 4 ROOM  509   ANG MO… 01 TO …      91 New Ge…    1980 58 yea…
 4 2021-01 ANG MO… 4 ROOM  467   ANG MO… 07 TO …      92 New Ge…    1979 57 yea…
 5 2021-01 ANG MO… 4 ROOM  571   ANG MO… 07 TO …      92 New Ge…    1979 57 yea…
 6 2021-01 ANG MO… 4 ROOM  134   ANG MO… 07 TO …      98 New Ge…    1978 56 yea…
 7 2021-01 ANG MO… 4 ROOM  204   ANG MO… 07 TO …      92 New Ge…    1977 55 yea…
 8 2021-01 ANG MO… 4 ROOM  429   ANG MO… 04 TO …      92 New Ge…    1978 56 yea…
 9 2021-01 ANG MO… 4 ROOM  413   ANG MO… 10 TO …      92 New Ge…    1979 57 yea…
10 2021-01 ANG MO… 4 ROOM  415   ANG MO… 07 TO …      92 New Ge…    1979 57 yea…
# … with 25,495 more rows, 4 more variables: resale_price <dbl>, address <chr>,
#   remaining_lease_yr <int>, remaining_lease_mth <int>, and abbreviated
#   variable names ¹​flat_type, ²​street_name, ³​storey_range, ⁴​floor_area_sqm,
#   ⁵​flat_model, ⁶​lease_commence_date, ⁷​remaining_lease
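
To see what the two positional extractions return, here is a minimal sketch on two made-up lease strings written in the same format as the remaining_lease column. It uses base R's substr(), which behaves like str_sub() for these positive positions. This is a swapped-in function chosen to keep the example free of package dependencies.

```r
# Made-up examples in the format of the remaining_lease column
lease <- c("61 years 04 months", "60 years")

yr_txt  <- substr(lease, 1, 2)    # years part of each string
mth_txt <- substr(lease, 9, 11)   # months part; empty when no months are stated

yr  <- as.integer(yr_txt)
mth <- as.integer(mth_txt)        # the empty string becomes NA

data.frame(lease, yr, mth)
```

Leases without a months part end up as NA, which is why the next step replaces NA with 0.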

4.1.2 Sum up remaining lease in months

  • Replace NA values in remaining_lease_mth with 0, using the is.na() function of base R

  • Multiply remaining_lease_yr by 12 to convert it to months

  • Create the remaining_lease_mths column using the mutate() function of the dplyr package; it contains the sum of remaining_lease_yr and remaining_lease_mth, computed with the rowSums() function of base R

  • Select the columns required for analysis using the select() function of the dplyr package

# Leases with no months part were parsed as NA; treat them as 0 months
hdb_resale_transformed$remaining_lease_mth[is.na(hdb_resale_transformed$remaining_lease_mth)] <- 0
# Convert the years part to months
hdb_resale_transformed$remaining_lease_yr <- hdb_resale_transformed$remaining_lease_yr * 12
# Total remaining lease in months, then keep only the columns needed later
hdb_resale_transformed <- hdb_resale_transformed %>% 
  mutate(remaining_lease_mths = rowSums(hdb_resale_transformed[, c("remaining_lease_yr", "remaining_lease_mth")])) %>%
  select(month, town, address, block, street_name, flat_type, storey_range, floor_area_sqm, flat_model, 
         lease_commence_date, remaining_lease_mths, resale_price)
head(hdb_resale_transformed)
# A tibble: 6 × 12
  month   town     address block stree…¹ flat_…² store…³ floor…⁴ flat_…⁵ lease…⁶
  <chr>   <chr>    <chr>   <chr> <chr>   <chr>   <chr>     <dbl> <chr>     <dbl>
1 2021-01 ANG MO … 547 AN… 547   ANG MO… 4 ROOM  04 TO …      92 New Ge…    1981
2 2021-01 ANG MO … 414 AN… 414   ANG MO… 4 ROOM  01 TO …      92 New Ge…    1979
3 2021-01 ANG MO … 509 AN… 509   ANG MO… 4 ROOM  01 TO …      91 New Ge…    1980
4 2021-01 ANG MO … 467 AN… 467   ANG MO… 4 ROOM  07 TO …      92 New Ge…    1979
5 2021-01 ANG MO … 571 AN… 571   ANG MO… 4 ROOM  07 TO …      92 New Ge…    1979
6 2021-01 ANG MO … 134 AN… 134   ANG MO… 4 ROOM  07 TO …      98 New Ge…    1978
# … with 2 more variables: remaining_lease_mths <dbl>, resale_price <dbl>, and
#   abbreviated variable names ¹​street_name, ²​flat_type, ³​storey_range,
#   ⁴​floor_area_sqm, ⁵​flat_model, ⁶​lease_commence_date
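
As a cross-check on the arithmetic, the whole conversion can also be written as a single function. The sketch below is an alternative that uses regular expressions instead of fixed character positions. The function name lease_to_months() is hypothetical, and the input strings are made-up examples in the format of the remaining_lease column.

```r
# Hypothetical helper: convert "NN years MM months" text to total months
lease_to_months <- function(x) {
  # Years: the leading number before "year(s)"
  yrs <- as.integer(sub("^([0-9]+) years?.*$", "\\1", x))
  # Months: the number before "month(s)" if present, otherwise 0
  mth_txt <- ifelse(grepl("month", x),
                    sub("^[0-9]+ years? ([0-9]+) months?$", "\\1", x),
                    "0")
  yrs * 12L + as.integer(mth_txt)
}

lease_to_months(c("61 years 04 months", "60 years"))
```

For example, 61 years 04 months is 61 × 12 + 4 = 736 months, and 60 years is 720 months.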

4.2 Retrieve Postal Codes and Coordinates of Addresses

This section focuses on retrieving the postal codes and coordinates of the addresses, which are required to compute the proximity to locational factors later on.

4.2.1 Create a list storing unique addresses

  • We create a list to store unique addresses to ensure that we do not run the GET request more than what is necessary

  • We can also sort it to make it easier for us to see at which address the GET request will fail.

  • Here, we use unique() function of base R package to extract the unique addresses then use sort() function of base R package to sort the unique vector.

add_list <- sort(unique(hdb_resale_transformed$address))
head(add_list)
[1] "1 CHAI CHEE RD"    "1 PINE CL"         "1 ST. GEORGE'S RD"
[4] "1 TECK WHYE AVE"   "1 TOH YI DR"       "10 CHAI CHEE RD"  

4.2.2 Create function to retrieve coordinates from OneMap.Sg API

get_coords <- function(add_list){
  
  # Create a data frame to store all retrieved coordinates
  postal_coords <- data.frame()
    
  for (i in add_list){
    #print(i)

    r <- GET('https://developers.onemap.sg/commonapi/search?',
           query=list(searchVal=i,
                     returnGeom='Y',
                     getAddrDetails='Y'))
    data <- fromJSON(rawToChar(r$content))
    found <- data$found
    res <- data$results
    
    # Create a new data frame for each address
    new_row <- data.frame()
    
    # If single result, append 
    if (found == 1){
      postal <- res$POSTAL 
      lat <- res$LATITUDE
      lng <- res$LONGITUDE
      new_row <- data.frame(address= i, postal = postal, latitude = lat, longitude = lng)
    }
    
    # If multiple results, drop NIL and append top 1
    else if (found > 1){
      # Remove those with NIL as postal
      res_sub <- res[res$POSTAL != "NIL", ]
      
      # Set as NA first if no Postal
      if (nrow(res_sub) == 0) {
          new_row <- data.frame(address= i, postal = NA, latitude = NA, longitude = NA)
      }
      
      else{
        top1 <- head(res_sub, n = 1)
        postal <- top1$POSTAL 
        lat <- top1$LATITUDE
        lng <- top1$LONGITUDE
        new_row <- data.frame(address= i, postal = postal, latitude = lat, longitude = lng)
      }
    }

    else {
      new_row <- data.frame(address= i, postal = NA, latitude = NA, longitude = NA)
    }
    
    # Add the row
    postal_coords <- rbind(postal_coords, new_row)
  }
  return(postal_coords)
}
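The selection rule inside get_coords (use a single hit as is; with several hits, take the first one whose postal code is not "NIL"; otherwise return NA) can be pulled out into a small function and tried offline. This is a sketch only: `pick_result` is a hypothetical helper, and `mock` is made-up data that merely mimics the POSTAL, LATITUDE and LONGITUDE columns read above, not real API output.

```r
# Hypothetical helper: choose one row from a search result, mirroring get_coords()
pick_result <- function(address, found, res) {
  empty <- data.frame(address = address, postal = NA, latitude = NA, longitude = NA)
  if (found < 1) return(empty)

  # With several hits, ignore those without a postal code
  if (found > 1) res <- res[res$POSTAL != "NIL", , drop = FALSE]
  if (nrow(res) == 0) return(empty)

  # Keep the first remaining hit
  top <- res[1, , drop = FALSE]
  data.frame(address = address, postal = top$POSTAL,
             latitude = top$LATITUDE, longitude = top$LONGITUDE)
}

# Mock search result with made-up values
mock <- data.frame(POSTAL = c("NIL", "123456"),
                   LATITUDE = c("1.30", "1.31"),
                   LONGITUDE = c("103.80", "103.81"),
                   stringsAsFactors = FALSE)

hit  <- pick_result("1 EXAMPLE RD", found = 2, res = mock)
miss <- pick_result("NO SUCH RD", found = 0, res = NULL)
print(hit)
print(miss)
```

Separating the rule from the GET request makes it possible to test the edge cases without calling the API.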

4.2.3 Call get_coords function to retrieve resale coordinates

coords <- get_coords(add_list)
coords

4.2.4 Inspect results

Here, we check whether the relevant columns contain any NA values, using is.na() function of base R package, or the string “NIL”.

coords[(is.na(coords$postal) | is.na(coords$latitude) | is.na(coords$longitude) | coords$postal=="NIL"), ]

There are 2 addresses that do not have a postal code but do have geographical coordinates: 215 CHOA CHU KANG CTRL and 216 CHOA CHU KANG CTRL. As the OneMap API returned the same set of coordinates for both addresses, we keep them, since the coordinates are what matter for our analysis later on.

4.2.5 Combine resale and coordinates data

After retrieving the coordinates, we should combine the successful ones with our transformed resale dataset.

We can do this using left_join() function of dplyr package and our data will be stored in rs_coords.

rs_coords <- left_join(hdb_resale_transformed, coords, by = c('address' = 'address'))
rs_coords

4.3 Write file to rds

As our subset resale dataset is now complete with the coordinates, we can now save it as an rds file.

This also helps us to prevent running the GET request more than what is needed.

rs_coords_rds <- write_rds(rs_coords, "data/model/rds/rs_coords.rds")

4.4 Read rs_coords RDS file

rs_coords <- read_rds("data/model/rds/rs_coords.rds")
head(rs_coords)
# A tibble: 6 × 15
  month   town     address block stree…¹ flat_…² store…³ floor…⁴ flat_…⁵ lease…⁶
  <chr>   <chr>    <chr>   <chr> <chr>   <chr>   <chr>     <dbl> <chr>     <dbl>
1 2021-01 ANG MO … 547 AN… 547   ANG MO… 4 ROOM  04 TO …      92 New Ge…    1981
2 2021-01 ANG MO … 414 AN… 414   ANG MO… 4 ROOM  01 TO …      92 New Ge…    1979
3 2021-01 ANG MO … 509 AN… 509   ANG MO… 4 ROOM  01 TO …      91 New Ge…    1980
4 2021-01 ANG MO … 467 AN… 467   ANG MO… 4 ROOM  07 TO …      92 New Ge…    1979
5 2021-01 ANG MO … 571 AN… 571   ANG MO… 4 ROOM  07 TO …      92 New Ge…    1979
6 2021-01 ANG MO … 134 AN… 134   ANG MO… 4 ROOM  07 TO …      98 New Ge…    1978
# … with 5 more variables: remaining_lease_mths <dbl>, resale_price <dbl>,
#   postal <chr>, latitude <chr>, longitude <chr>, and abbreviated variable
#   names ¹​street_name, ²​flat_type, ³​storey_range, ⁴​floor_area_sqm,
#   ⁵​flat_model, ⁶​lease_commence_date

4.5 Assign and Transform CRS and Check

Since the coordinate columns are latitude and longitude in decimal degrees, the source CRS is WGS84, which is a geographic (not projected) coordinate system.

We first assign them EPSG code 4326 (WGS84) and then transform them to EPSG code 3414, the SVY21 projected coordinate system.

Here we use:

  • st_as_sf() function of sf package to convert the data frame into an sf object

  • st_transform() function of sf package to transform the coordinates of the sf object
rs_coords_sf <- st_as_sf(rs_coords,
                    coords = c("longitude", 
                               "latitude"),
                    crs=4326) %>%
  st_transform(crs = 3414)
st_crs(rs_coords_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]

4.6 Check for invalid geometries

length(which(st_is_valid(rs_coords_sf) == FALSE))
[1] 0

Looks like there’s no invalid geometry.

4.7 Plot hdb resale points

tmap_mode("view")
tm_shape(rs_coords_sf)+
  tm_dots(col="blue", size = 0.02)
tmap_mode("plot")

6.0 Calculating Proximity Between Locational Factors & HDB flats

In this section, we’re going to calculate the proximity between each locational factor (e.g. kindergartens) and the HDB flats.

6.1 Create get_prox function to calculate proximity

The following code chunk performs 3 steps:

  • creates a matrix of distances between the HDB flats and the locational factor using st_distance() of sf package.

  • finds the distance to the nearest point of the locational factor by taking the minimum of each row with min() function of base R package, converts it from metres to kilometres, and adds it to the HDB resale data as a new column using mutate() function of dplyr package.

  • renames the new column according to the input given by the user so that each proximity column has an appropriate and distinct name.

get_prox <- function(origin_df, dest_df, col_name){
  
  # creates a matrix of distances
  dist_matrix <- st_distance(origin_df, dest_df)           
  
  # find the nearest location_factor and create new data frame
  near <- origin_df %>% 
    mutate(PROX = apply(dist_matrix, 1, function(x) min(x)) / 1000) 
  
  # rename column name according to input parameter
  names(near)[names(near) == 'PROX'] <- col_name

  # Return df
  return(near)
}
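To see what the row-wise minimum does, the same logic can be run on a small hand-made example. This is a sketch with made-up projected coordinates in metres; it uses plain Euclidean distance in base R as a stand-in for st_distance(), which is only a reasonable assumption for a projected CRS such as SVY21.

```r
# Made-up projected coordinates in metres (x, y)
flats <- rbind(c(0, 0), c(1000, 0))
amenities <- rbind(c(300, 400), c(1000, 2000), c(4000, 0))

# Distance matrix: one row per flat, one column per amenity
dist_matrix <- apply(amenities, 1, function(a) {
  sqrt((flats[, 1] - a[1])^2 + (flats[, 2] - a[2])^2)
})

# Nearest amenity per flat, converted from metres to kilometres
prox_km <- apply(dist_matrix, 1, min) / 1000
print(prox_km)
```

Each row of the matrix belongs to one flat, so taking the minimum across a row gives the distance to that flat's nearest amenity.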

6.2 Call get_prox function

Here, we call the get_prox function created earlier to get the proximity between the resale HDB flats and locational factors such as:

  • childcare centres
  • eldercare centres
  • bus stops
  • primary schools
  • MRT stations
  • kindergartens
  • hawker centres
  • national parks
  • supermarkets
  • top primary schools
  • shopping malls
  • the CBD
  • CHAS clinics

The proximity will then be created as a new column under the rs_coords_sf dataframe.

rs_coords_sf <- get_prox(rs_coords_sf, childcare_sf, "PROX_CHILDCARE") 

rs_coords_sf <- get_prox(rs_coords_sf, eldercare_sf, "PROX_ELDERLYCARE") 

rs_coords_sf <- get_prox(rs_coords_sf, busstop_sf, "PROX_BUSSTOP") 

rs_coords_sf <- get_prox(rs_coords_sf, primaryschools_sf, "PROX_PRISCH_ALL") 

rs_coords_sf <- get_prox(rs_coords_sf, mrt_sf, "PROX_MRT") 

rs_coords_sf <- get_prox(rs_coords_sf, kindergarten_sf, "PROX_KINDERGARTEN") 

rs_coords_sf <- get_prox(rs_coords_sf, hawkercentre_sf, "PROX_HAWKERCENTRE") 

rs_coords_sf <- get_prox(rs_coords_sf, nationalparks_sf, "PROX_NATIONALPARKS") 

rs_coords_sf <- get_prox(rs_coords_sf, supermarkets_sf, "PROX_SUPERMARKET") 

rs_coords_sf <- get_prox(rs_coords_sf, topprimary_sf, "PROX_GOOD_PRISCH")

rs_coords_sf <- get_prox(rs_coords_sf, mall_coordinates_sf, "PROX_MALL") 

rs_coords_sf <- get_prox(rs_coords_sf, cbd_coords_sf, "PROX_CBD")

rs_coords_sf <- get_prox(rs_coords_sf, chas_sf, "PROX_CHAS")
rs_coords_sf

6.3 Create function to calculate number of factors within distance

The code chunk will perform 3 steps:

  • create a matrix of distances between the HDB flats and the locational factor using st_distance() of sf package.

  • count the points of the locational factor that are within the threshold distance using sum() function of base R package, then add the count to the HDB resale data as a new column using mutate() function of dplyr package.

  • lastly, rename the new column according to the input given by the user so that each count column has an appropriate and distinct name.

get_within <- function(origin_df, dest_df, threshold_dist, col_name){
  
  # creates a matrix of distances
  dist_matrix <- st_distance(origin_df, dest_df)   
  
  # count the number of location_factors within threshold_dist and create new data frame
  wdist <- origin_df %>% 
    mutate(WITHIN_DT = apply(dist_matrix, 1, function(x) sum(x <= threshold_dist)))
  
  # rename column name according to input parameter
  names(wdist)[names(wdist) == 'WITHIN_DT'] <- col_name

  # Return df
  return(wdist)
}
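The counting step can be illustrated on a small numeric matrix, together with a possible vectorised alternative using rowSums(). This is a sketch on made-up distances in metres; it assumes the distances are already plain numbers, whereas st_distance() output carries units.

```r
# Made-up distances in metres: rows are flats, columns are amenities
dist_matrix <- matrix(c(120, 360, 349,
                        900, 350, 351),
                      nrow = 2, byrow = TRUE)
threshold_dist <- 350

# Same counting rule as get_within(): distances <= threshold are counted
count_apply <- apply(dist_matrix, 1, function(x) sum(x <= threshold_dist))

# Equivalent count using rowSums() on the logical matrix
count_rowsums <- rowSums(dist_matrix <= threshold_dist)

print(count_apply)
print(count_rowsums)
```

Note that a distance exactly equal to the threshold (350 in the second row) is counted, because the comparison is `<=`.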

6.4 Call get_within function

  • call the get_within function created earlier to get the number of locational factors that are within a certain threshold distance.

  • In this case, the threshold is 350m for kindergartens, childcare centres and bus stops, and 1km for primary schools.

rs_coords_sf <- get_within(rs_coords_sf, kindergarten_sf, 350, "WITHIN_350M_KINDERGARTEN")

rs_coords_sf <- get_within(rs_coords_sf, childcare_sf, 350, "WITHIN_350M_CHILDCARE")

rs_coords_sf <- get_within(rs_coords_sf, busstop_sf, 350, "WITHIN_350M_BUSSTOP")

rs_coords_sf <- get_within(rs_coords_sf, primaryschools_sf, 1000, "WITHIN_1KM_PRIMARYSCHOOLS")

rs_coords_sf <- get_within(rs_coords_sf, primaryschools_sf, 1000, "WITHIN_1KM_PRISCH")
rs_coords_sf

6.5 Write factors to RDS file

rs_factors_rds <- write_rds(rs_coords_sf, "data/model/rds/rs_factors.rds")

7.0 Resale with locational factors

rs_sf <- read_rds("data/model/rds/rs_factors.rds")
rs_sf

7.1 Extract unique storey_range and sort

storeys <- sort(unique(rs_sf$storey_range))

7.2 Assign an order to each storey_range

storey_order <- 1:length(storeys)
storey_range_order <- data.frame(storeys, storey_order)
head(storey_range_order)
   storeys storey_order
1 01 TO 03            1
2 04 TO 06            2
3 07 TO 09            3
4 10 TO 12            4
5 13 TO 15            5
6 16 TO 18            6

Hence, the storey ranges are in the correct order and now have a numeric type that can be used in our regression model later on.
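The mapping relies on the zero-padded labels sorting alphabetically in floor order. A sketch with a few made-up rows shows the idea, using match() of base R as an alternative to joining a lookup table.

```r
# A few storey_range labels in arbitrary order (made-up sample)
storey_range <- c("10 TO 12", "01 TO 03", "07 TO 09", "01 TO 03", "04 TO 06")

# Zero-padded labels sort alphabetically in floor order
storeys <- sort(unique(storey_range))

# Position of each label in the sorted vector gives its ordinal value
storey_order <- match(storey_range, storeys)

print(data.frame(storey_range, storey_order))
```

If the labels were not zero-padded (for example "1 TO 3"), alphabetical sorting would no longer match floor order and an explicit ordering would be needed.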

7.3 Combine storey_order with resale dataframe

rs_sf <- left_join(rs_sf, storey_range_order, by= c("storey_range" = "storeys"))
rs_sf
Simple feature collection with 25505 features and 32 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 11519.79 ymin: 28217.39 xmax: 42645.18 ymax: 48741.06
Projected CRS: SVY21 / Singapore TM
# A tibble: 25,505 × 33
   month   town    address block stree…¹ flat_…² store…³ floor…⁴ flat_…⁵ lease…⁶
   <chr>   <chr>   <chr>   <chr> <chr>   <chr>   <chr>     <dbl> <chr>     <dbl>
 1 2021-01 ANG MO… 547 AN… 547   ANG MO… 4 ROOM  04 TO …      92 New Ge…    1981
 2 2021-01 ANG MO… 414 AN… 414   ANG MO… 4 ROOM  01 TO …      92 New Ge…    1979
 3 2021-01 ANG MO… 509 AN… 509   ANG MO… 4 ROOM  01 TO …      91 New Ge…    1980
 4 2021-01 ANG MO… 467 AN… 467   ANG MO… 4 ROOM  07 TO …      92 New Ge…    1979
 5 2021-01 ANG MO… 571 AN… 571   ANG MO… 4 ROOM  07 TO …      92 New Ge…    1979
 6 2021-01 ANG MO… 134 AN… 134   ANG MO… 4 ROOM  07 TO …      98 New Ge…    1978
 7 2021-01 ANG MO… 204 AN… 204   ANG MO… 4 ROOM  07 TO …      92 New Ge…    1977
 8 2021-01 ANG MO… 429 AN… 429   ANG MO… 4 ROOM  04 TO …      92 New Ge…    1978
 9 2021-01 ANG MO… 413 AN… 413   ANG MO… 4 ROOM  10 TO …      92 New Ge…    1979
10 2021-01 ANG MO… 415 AN… 415   ANG MO… 4 ROOM  07 TO …      92 New Ge…    1979
# … with 25,495 more rows, 23 more variables: remaining_lease_mths <dbl>,
#   resale_price <dbl>, postal <chr>, geometry <POINT [m]>,
#   PROX_CHILDCARE <dbl>, PROX_ELDERLYCARE <dbl>, PROX_BUSSTOP <dbl>,
#   PROX_PRISCH_ALL <dbl>, PROX_MRT <dbl>, PROX_KINDERGARTEN <dbl>,
#   PROX_HAWKERCENTRE <dbl>, PROX_NATIONALPARKS <dbl>, PROX_SUPERMARKET <dbl>,
#   PROX_GOOD_PRISCH <dbl>, PROX_MALL <dbl>, PROX_CBD <dbl>, PROX_CHAS <dbl>,
#   WITHIN_350M_KINDERGARTEN <int>, WITHIN_350M_CHILDCARE <int>, …

7.4 Select required columns

rs_train <- rs_sf %>% filter(month >= "2021-01" & month <= "2022-12") %>% 
  select(resale_price, floor_area_sqm, storey_order, remaining_lease_mths,
         PROX_ELDERLYCARE, PROX_HAWKERCENTRE, PROX_MRT, PROX_NATIONALPARKS, PROX_MALL, PROX_SUPERMARKET, PROX_GOOD_PRISCH, PROX_CBD, PROX_CHAS, WITHIN_350M_KINDERGARTEN, WITHIN_350M_CHILDCARE, WITHIN_350M_BUSSTOP, WITHIN_1KM_PRISCH)
rs_test <- rs_sf %>% filter(month >= "2023-01" & month <= "2023-02") %>% 
  select(resale_price, floor_area_sqm, storey_order, remaining_lease_mths,
         PROX_ELDERLYCARE, PROX_HAWKERCENTRE, PROX_MRT, PROX_NATIONALPARKS, PROX_MALL, PROX_SUPERMARKET, PROX_GOOD_PRISCH, PROX_CBD, PROX_CHAS, WITHIN_350M_KINDERGARTEN, WITHIN_350M_CHILDCARE, WITHIN_350M_BUSSTOP, WITHIN_1KM_PRISCH)
summary(rs_train)
  resale_price     floor_area_sqm    storey_order    remaining_lease_mths
 Min.   : 250000   Min.   : 70.00   Min.   : 1.000   Min.   : 534.0      
 1st Qu.: 440000   1st Qu.: 91.00   1st Qu.: 2.000   1st Qu.: 786.0      
 Median : 490000   Median : 93.00   Median : 3.000   Median : 949.0      
 Mean   : 526125   Mean   : 94.67   Mean   : 3.475   Mean   : 945.5      
 3rd Qu.: 568000   3rd Qu.:100.00   3rd Qu.: 4.000   3rd Qu.:1115.0      
 Max.   :1370000   Max.   :145.00   Max.   :17.000   Max.   :1168.0      
 PROX_ELDERLYCARE PROX_HAWKERCENTRE    PROX_MRT       PROX_NATIONALPARKS
 Min.   :0.0000   Min.   :0.0306    Min.   :0.02179   Min.   :0.04416   
 1st Qu.:0.3184   1st Qu.:0.4443    1st Qu.:0.27161   1st Qu.:0.50493   
 Median :0.6001   Median :0.8329    Median :0.48501   Median :0.71934   
 Mean   :0.7800   Mean   :1.2991    Mean   :0.56828   Mean   :0.81492   
 3rd Qu.:1.0839   3rd Qu.:1.9122    3rd Qu.:0.77665   3rd Qu.:1.02960   
 Max.   :3.3016   Max.   :4.7664    Max.   :2.12909   Max.   :2.44018   
   PROX_MALL      PROX_SUPERMARKET    PROX_GOOD_PRISCH      PROX_CBD      
 Min.   :0.0000   Min.   :0.0000001   Min.   : 0.06525   Min.   : 0.9994  
 1st Qu.:0.3946   1st Qu.:0.1620901   1st Qu.: 2.29285   1st Qu.: 9.6995  
 Median :0.6042   Median :0.2509684   Median : 3.63214   Median :13.1064  
 Mean   :0.7099   Mean   :0.2745337   Mean   : 3.99358   Mean   :12.1310  
 3rd Qu.:0.9176   3rd Qu.:0.3623083   3rd Qu.: 5.41144   3rd Qu.:14.8838  
 Max.   :2.6686   Max.   :1.5713170   Max.   :10.62237   Max.   :19.6501  
   PROX_CHAS      WITHIN_350M_KINDERGARTEN WITHIN_350M_CHILDCARE
 Min.   :0.0000   Min.   :0.000            Min.   : 0.000       
 1st Qu.:0.1149   1st Qu.:0.000            1st Qu.: 3.000       
 Median :0.1779   Median :1.000            Median : 4.000       
 Mean   :0.1955   Mean   :1.002            Mean   : 3.895       
 3rd Qu.:0.2566   3rd Qu.:1.000            3rd Qu.: 5.000       
 Max.   :0.8699   Max.   :7.000            Max.   :20.000       
 WITHIN_350M_BUSSTOP WITHIN_1KM_PRISCH          geometry    
 Min.   : 0.000      Min.   :0.0       POINT        :23657  
 1st Qu.: 6.000      1st Qu.:2.0       epsg:3414    :    0  
 Median : 8.000      Median :3.0       +proj=tmer...:    0  
 Mean   : 7.954      Mean   :3.2                            
 3rd Qu.:10.000      3rd Qu.:4.0                            
 Max.   :19.000      Max.   :9.0                            
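The train and test filters work because month strings in "YYYY-MM" format compare in chronological order when compared as text. A short base R sketch on made-up month values:

```r
# Made-up month values in the same "YYYY-MM" format as the resale data
month <- c("2020-12", "2021-01", "2022-06", "2022-12", "2023-01", "2023-02", "2023-03")

# String comparison is lexicographic, which matches chronological order here
is_train <- month >= "2021-01" & month <= "2022-12"
is_test  <- month >= "2023-01" & month <= "2023-02"

print(month[is_train])
print(month[is_test])
```

The two ranges do not overlap, so no transaction can appear in both the training and the test data.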

8.0 EDA

8.1 Plot Histogram of resale_price

ggplot(data=rs_train, aes(x=`resale_price`)) +
  geom_histogram(bins=20, color="black", fill="light coral")

  • The distribution of resale_price is right-skewed, so we take its log for the next plot.

rs_train_visualise <- rs_train %>%
  mutate(`LOG_SELLING_PRICE` = log(resale_price))

8.2 Plot Histogram of log(resale_price)

ggplot(data=rs_train_visualise, aes(x=`LOG_SELLING_PRICE`)) +
  geom_histogram(bins=20, color="black", fill="light green")
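The effect of the log transformation can also be quantified with a skewness statistic. The sketch below uses simulated log-normal prices rather than the HDB data, with arbitrary parameters, and a hand-written moment-based `skewness` helper because base R does not provide one.

```r
# Moment-based sample skewness (hypothetical helper; base R has no built-in)
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Simulated right-skewed prices; the parameters are arbitrary
set.seed(1)
price <- rlnorm(5000, meanlog = 13, sdlog = 0.4)

skew_raw <- skewness(price)
skew_log <- skewness(log(price))

print(c(raw = skew_raw, log = skew_log))
```

A value near 0 indicates a roughly symmetric distribution, while a clearly positive value indicates a long right tail.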

9.0 Hedonic Pricing Modelling in R

9.1 Visualise relationships of independent variables

rs_train_nogeom <- st_set_geometry(rs_train, NULL) 

9.1.1 Plot a correlation matrix

  • Here we use corrplot() function of corrplot package to visualise the correlations between the independent variables.

  • tl.cex is set to 0.4 so that the variable labels fit in the plot.

corrplot(cor(rs_train_nogeom[, 2:16]), diag = FALSE, order = "AOE",
          tl.pos = "td", tl.cex = 0.4, method = "number", type = "upper")

Looks like there’s a moderately high positive correlation between log selling price and storey order. However, it’s not high enough to exclude either log selling price or storey order to reduce collinearity.

9.2 Multiple Linear Regression (MLR)

rs_mlr1 <- lm(formula = resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths+
         PROX_ELDERLYCARE+ PROX_HAWKERCENTRE+ PROX_MRT+  PROX_NATIONALPARKS+ PROX_MALL+ PROX_SUPERMARKET+ PROX_GOOD_PRISCH+ PROX_CBD+ PROX_CHAS+ WITHIN_350M_KINDERGARTEN+ WITHIN_350M_CHILDCARE+ WITHIN_350M_BUSSTOP+ WITHIN_1KM_PRISCH, data=rs_train)
summary(rs_mlr1)

Call:
lm(formula = resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths + 
    PROX_ELDERLYCARE + PROX_HAWKERCENTRE + PROX_MRT + PROX_NATIONALPARKS + 
    PROX_MALL + PROX_SUPERMARKET + PROX_GOOD_PRISCH + PROX_CBD + 
    PROX_CHAS + WITHIN_350M_KINDERGARTEN + WITHIN_350M_CHILDCARE + 
    WITHIN_350M_BUSSTOP + WITHIN_1KM_PRISCH, data = rs_train)

Residuals:
    Min      1Q  Median      3Q     Max 
-370782  -44031    -712   43009  399344 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)               38167.548   7881.171   4.843 1.29e-06 ***
floor_area_sqm             3300.305     68.785  47.980  < 2e-16 ***
storey_order              15939.593    224.779  70.912  < 2e-16 ***
remaining_lease_mths        383.481      3.289 116.597  < 2e-16 ***
PROX_ELDERLYCARE          -8741.974    784.798 -11.139  < 2e-16 ***
PROX_HAWKERCENTRE        -14862.885    492.209 -30.196  < 2e-16 ***
PROX_MRT                 -31738.498   1344.897 -23.599  < 2e-16 ***
PROX_NATIONALPARKS        -6159.773   1115.371  -5.523 3.37e-08 ***
PROX_MALL                  3984.884   1270.254   3.137  0.00171 ** 
PROX_SUPERMARKET         -17339.014   3290.618  -5.269 1.38e-07 ***
PROX_GOOD_PRISCH           5356.830    270.792  19.782  < 2e-16 ***
PROX_CBD                 -17697.975    184.469 -95.940  < 2e-16 ***
PROX_CHAS                 41338.480   4425.817   9.340  < 2e-16 ***
WITHIN_350M_KINDERGARTEN   6376.277    479.171  13.307  < 2e-16 ***
WITHIN_350M_CHILDCARE     -2221.721    256.608  -8.658  < 2e-16 ***
WITHIN_350M_BUSSTOP        1714.228    163.712  10.471  < 2e-16 ***
WITHIN_1KM_PRISCH         -5466.512    340.042 -16.076  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 68000 on 23640 degrees of freedom
Multiple R-squared:  0.7244,    Adjusted R-squared:  0.7242 
F-statistic:  3884 on 16 and 23640 DF,  p-value: < 2.2e-16
  • Adjusted R-squared is 0.7242. This means that the model explains about 72.4% of the variation in HDB resale prices.
  • Since PROX_MALL is slightly less statistically significant than the other variables, let’s try removing it and see the difference. Note that the formula below also leaves out WITHIN_350M_BUSSTOP.
rs_mlr2 <- lm(formula = resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths+
         PROX_ELDERLYCARE+ PROX_HAWKERCENTRE+ PROX_MRT+ PROX_NATIONALPARKS+ PROX_SUPERMARKET+ PROX_GOOD_PRISCH+ PROX_CBD+ PROX_CHAS+ WITHIN_350M_KINDERGARTEN+ WITHIN_350M_CHILDCARE+ WITHIN_1KM_PRISCH, data=rs_train)
summary(rs_mlr2)

Call:
lm(formula = resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths + 
    PROX_ELDERLYCARE + PROX_HAWKERCENTRE + PROX_MRT + PROX_NATIONALPARKS + 
    PROX_SUPERMARKET + PROX_GOOD_PRISCH + PROX_CBD + PROX_CHAS + 
    WITHIN_350M_KINDERGARTEN + WITHIN_350M_CHILDCARE + WITHIN_1KM_PRISCH, 
    data = rs_train)

Residuals:
    Min      1Q  Median      3Q     Max 
-370510  -43788    -459   43029  402994 

Coefficients:
                           Estimate Std. Error  t value Pr(>|t|)    
(Intercept)               50866.463   7663.049    6.638 3.25e-11 ***
floor_area_sqm             3312.094     68.936   48.046  < 2e-16 ***
storey_order              15952.686    225.227   70.829  < 2e-16 ***
remaining_lease_mths        383.377      3.285  116.702  < 2e-16 ***
PROX_ELDERLYCARE          -8706.766    785.096  -11.090  < 2e-16 ***
PROX_HAWKERCENTRE        -15407.180    489.937  -31.447  < 2e-16 ***
PROX_MRT                 -31400.540   1331.533  -23.582  < 2e-16 ***
PROX_NATIONALPARKS        -4713.708   1108.420   -4.253 2.12e-05 ***
PROX_SUPERMARKET         -16560.386   3280.144   -5.049 4.48e-07 ***
PROX_GOOD_PRISCH           5253.989    270.157   19.448  < 2e-16 ***
PROX_CBD                 -17625.129    171.734 -102.631  < 2e-16 ***
PROX_CHAS                 38988.406   4430.766    8.799  < 2e-16 ***
WITHIN_350M_KINDERGARTEN   6761.754    478.914   14.119  < 2e-16 ***
WITHIN_350M_CHILDCARE     -1913.132    251.144   -7.618 2.68e-14 ***
WITHIN_1KM_PRISCH         -5405.453    337.315  -16.025  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 68170 on 23642 degrees of freedom
Multiple R-squared:  0.7231,    Adjusted R-squared:  0.7229 
F-statistic:  4410 on 14 and 23642 DF,  p-value: < 2.2e-16
  • After removing the PROX_MALL variable (together with WITHIN_350M_BUSSTOP, which the formula above also leaves out), the adjusted R-squared value falls from 0.7242 to 0.7229. We should keep PROX_MALL since it is still statistically significant.
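Adjusted R-squared penalises R-squared for the number of predictors, which is why it is the figure compared between the two models. As a check, the reported values can be reproduced from the two summary outputs with the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1), where n = 23657 is the number of observations and p the number of predictors. The helper name `adj_r2` is made up for this sketch.

```r
# Adjusted R-squared from R-squared, sample size n and number of predictors p
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

n <- 23657  # residual degrees of freedom + predictors + intercept

# R-squared values taken from the two summary() outputs above
full    <- adj_r2(0.7244, n, p = 16)
reduced <- adj_r2(0.7231, n, p = 14)

print(round(c(full = full, reduced = reduced), 4))
```

With more than 23,000 observations the penalty is tiny, so adjusted R-squared stays very close to R-squared in both models.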

9.2.1 Calibrate the revised MLR model

rs_mlr2 <- lm(formula = resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths+
         PROX_ELDERLYCARE+ PROX_HAWKERCENTRE+ PROX_MRT+ PROX_NATIONALPARKS+ PROX_MALL+ PROX_SUPERMARKET+ PROX_GOOD_PRISCH+ PROX_CBD+ PROX_CHAS+ WITHIN_350M_KINDERGARTEN+ WITHIN_350M_CHILDCARE+ WITHIN_1KM_PRISCH, data=rs_train)
ols_regress(rs_mlr2)
                            Model Summary                              
----------------------------------------------------------------------
R                       0.850       RMSE                    68160.322 
R-Squared               0.723       Coef. Var                  12.955 
Adj. R-Squared          0.723       MSE                4645829542.019 
Pred R-Squared          0.723       MAE                     53048.112 
----------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                     ANOVA                                       
--------------------------------------------------------------------------------
                    Sum of                                                      
                   Squares           DF       Mean Square       F          Sig. 
--------------------------------------------------------------------------------
Regression    2.868958e+14           15      1.912638e+13    4116.893    0.0000 
Residual      1.098321e+14        23641    4645829542.019                       
Total         3.967278e+14        23656                                         
--------------------------------------------------------------------------------

                                               Parameter Estimates                                                 
------------------------------------------------------------------------------------------------------------------
                   model          Beta    Std. Error    Std. Beta       t        Sig          lower         upper 
------------------------------------------------------------------------------------------------------------------
             (Intercept)     47074.815      7853.110                   5.994    0.000     31682.214     62467.416 
          floor_area_sqm      3310.066        68.937        0.176     48.016    0.000      3174.945      3445.186 
            storey_order     15938.921       225.295        0.271     70.747    0.000     15497.328     16380.514 
    remaining_lease_mths       383.979         3.296        0.488    116.493    0.000       377.519       390.440 
        PROX_ELDERLYCARE     -8815.071       786.568       -0.042    -11.207    0.000    -10356.795     -7273.347 
       PROX_HAWKERCENTRE    -15444.419       490.189       -0.136    -31.507    0.000    -16405.220    -14483.618 
                PROX_MRT    -31864.117      1347.930       -0.094    -23.639    0.000    -34506.147    -29222.087 
      PROX_NATIONALPARKS     -4894.129      1111.347       -0.017     -4.404    0.000     -7072.442     -2715.817 
               PROX_MALL      2795.289      1268.067        0.009      2.204    0.028       309.796      5280.781 
        PROX_SUPERMARKET    -17325.064      3298.170       -0.020     -5.253    0.000    -23789.689    -10860.438 
        PROX_GOOD_PRISCH      5205.552       271.027        0.092     19.207    0.000      4674.322      5736.782 
                PROX_CBD    -17481.136       183.724       -0.567    -95.149    0.000    -17841.247    -17121.025 
               PROX_CHAS     39154.491      4431.045        0.034      8.836    0.000     30469.357     47839.625 
WITHIN_350M_KINDERGARTEN      6732.483       479.059        0.051     14.054    0.000      5793.497      7671.468 
   WITHIN_350M_CHILDCARE     -1823.928       254.363       -0.028     -7.171    0.000     -2322.495     -1325.361 
       WITHIN_1KM_PRISCH     -5303.173       340.464       -0.065    -15.576    0.000     -5970.504     -4635.843 
------------------------------------------------------------------------------------------------------------------

9.2.2 Check for multicollinearity using VIF

ols_vif_tol(rs_mlr2)
                  Variables Tolerance      VIF
1            floor_area_sqm 0.8762921 1.141172
2              storey_order 0.7953276 1.257343
3      remaining_lease_mths 0.6684373 1.496027
4          PROX_ELDERLYCARE 0.8412456 1.188714
5         PROX_HAWKERCENTRE 0.6297687 1.587884
6                  PROX_MRT 0.7441233 1.343863
7        PROX_NATIONALPARKS 0.8287816 1.206591
8                 PROX_MALL 0.6546109 1.527625
9          PROX_SUPERMARKET 0.7774174 1.286310
10         PROX_GOOD_PRISCH 0.5089235 1.964932
11                 PROX_CBD 0.3302257 3.028232
12                PROX_CHAS 0.7915278 1.263380
13 WITHIN_350M_KINDERGARTEN 0.9024450 1.108101
14    WITHIN_350M_CHILDCARE 0.7682702 1.301625
15        WITHIN_1KM_PRISCH 0.6700048 1.492527

There’s little evidence of multicollinearity among the variables, as VIF is below 10 for all of them (the highest is 3.03 for PROX_CBD).
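For reference, the VIF of a variable is 1 / (1 - R²), where R² comes from regressing that variable on the other predictors, and tolerance is the reciprocal of VIF. The sketch below computes it by hand on R's built-in mtcars data (not the HDB data); `vif_manual` is a hypothetical helper, not a function of olsrr.

```r
# Hypothetical helper: VIF of one predictor from an auxiliary regression
vif_manual <- function(data, var) {
  others <- setdiff(names(data), var)
  aux <- lm(reformulate(others, response = var), data = data)
  1 / (1 - summary(aux)$r.squared)
}

# Three predictors from the built-in mtcars data
predictors <- mtcars[, c("disp", "hp", "wt")]
vifs <- sapply(names(predictors), function(v) vif_manual(predictors, v))
print(round(vifs, 3))

# Tolerance is the reciprocal of VIF, e.g. PROX_CBD in the table above
print(1 / 0.3302257)
```

The last line reproduces the VIF of 3.028232 reported for PROX_CBD from its tolerance.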

Before moving on to the next section, let’s save the MLR model into an rds file.

write_rds(rs_mlr2, "data/model/rds/price_mlr.rds" ) 
price_mlr <- read_rds("data/model/rds/price_mlr.rds" ) 
price_mlr

Call:
lm(formula = resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths + 
    PROX_ELDERLYCARE + PROX_HAWKERCENTRE + PROX_MRT + PROX_NATIONALPARKS + 
    PROX_MALL + PROX_SUPERMARKET + PROX_GOOD_PRISCH + PROX_CBD + 
    PROX_CHAS + WITHIN_350M_KINDERGARTEN + WITHIN_350M_CHILDCARE + 
    WITHIN_1KM_PRISCH, data = rs_train)

Coefficients:
             (Intercept)            floor_area_sqm              storey_order  
                   47075                      3310                     15939  
    remaining_lease_mths          PROX_ELDERLYCARE         PROX_HAWKERCENTRE  
                     384                     -8815                    -15444  
                PROX_MRT        PROX_NATIONALPARKS                 PROX_MALL  
                  -31864                     -4894                      2795  
        PROX_SUPERMARKET          PROX_GOOD_PRISCH                  PROX_CBD  
                  -17325                      5206                    -17481  
               PROX_CHAS  WITHIN_350M_KINDERGARTEN     WITHIN_350M_CHILDCARE  
                   39154                      6732                     -1824  
       WITHIN_1KM_PRISCH  
                   -5303  

Let’s analyse the factors and see how they affect resale prices in Singapore.

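Because the PROX_ variables measure distance in kilometres, a negative coefficient means that the price falls as the distance increases. In other words, flats nearer to that facility tend to be priced higher, holding the other variables constant.

Being nearer to the following places is associated with a higher resale price (negative PROX_ coefficients):

  • Elderly care centres
  • Hawker centres
  • MRT stations
  • National parks
  • Supermarkets
  • CBD

Among the negative coefficients, MRT proximity is the largest in magnitude: each additional kilometre from the nearest MRT station is associated with a resale price that is about 31,864 lower.

The following factors are also associated with a higher resale price (positive coefficients):

  • Higher storey order
  • Larger floor area
  • Longer remaining lease
  • More kindergartens within 350m

In contrast, a greater distance from the nearest mall, top primary school or CHAS clinic is associated with a higher price (positive PROX_ coefficients), and more childcare centres within 350m or primary schools within 1km are associated with a lower price (negative WITHIN_ coefficients).

The coefficient of CHAS clinic proximity is positive and the largest in magnitude (about 39,154 per kilometre), although its standardised coefficient (0.034) is small.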

9.2.3 Compile train dataset (write and read rds)

write_rds(rs_train, "data/model/rds/train_data.rds")
train_data <- read_rds("data/model/rds/train_data.rds")

9.2.4 Compile test dataset (write and read rds)

write_rds(rs_test, "data/model/rds/test_data.rds")
mlr_test_data <- read_rds("data/model/rds/test_data.rds")

9.2.5 Getting OLS predictions

ols_predictions <- predict.lm(price_mlr, mlr_test_data)

Now, let’s store the predictions.

write_rds(ols_predictions, "data/model/rds/ols_predictions.rds")

9.2.6 Converting the predicting output into a data frame

The output of the predict.lm() is a vector of predicted values. It is wiser to convert it into a data frame for further visualisation and analysis.

ols_pred <- read_rds("data/model/rds/ols_predictions.rds")
ols_pred_df <- as.data.frame(ols_pred)
test_data_mlr_binded <- cbind(mlr_test_data, ols_pred_df)
write_rds(test_data_mlr_binded, "data/model/rds/test_data_mlr_binded.rds")
test_data_mlr_binded <- read_rds("data/model/rds/test_data_mlr_binded.rds")

9.2.7 Calculating Root Mean Square Error (OLS)

The root mean square error (RMSE) allows us to measure how far predicted values are from observed values in a regression analysis. In the code chunk below, rmse() of Metrics package is used to compute the RMSE.

rmse(test_data_mlr_binded$resale_price, 
     test_data_mlr_binded$ols_pred)
[1] 82104.39
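
To make the metric concrete, the sketch below computes RMSE by hand in base R: the square root of the mean squared difference between observed and predicted values. The function name rmse_manual and the three price pairs are made up for illustration; they are not taken from the HDB data.

```r
# Minimal sketch: RMSE computed by hand on made-up numbers.
rmse_manual <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))   # root of the mean squared error
}

actual    <- c(400000, 500000, 600000)   # hypothetical observed prices
predicted <- c(410000, 480000, 630000)   # hypothetical predicted prices

rmse_value <- rmse_manual(actual, predicted)
print(rmse_value)
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.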

The RMSE of the OLS model is 82,104.39. Given that resale prices in the test set range from 320,000 to 1,290,000, this is a large typical error, so the OLS predictions are not very accurate.

9.2.8 Visualising the predicted values (OLS)

ggplot(data = test_data_mlr_binded,
       aes(x = ols_pred,
           y = resale_price)) +
  geom_point()

10.0 GWR (Geographically Weighted Regression) Method

In this section, we calibrate a model to predict HDB resale prices by using the geographically weighted regression method of the GWmodel package.

We reuse the train and test datasets compiled in sections 9.2.3 and 9.2.4.

10.0.3 Computing adaptive bandwidth

10.0.3.1 Converting the sf data.frame to SpatialPointsDataFrame

train_data_sp <- as_Spatial(train_data)
train_data_sp
class       : SpatialPointsDataFrame 
features    : 23657 
extent      : 11519.79, 42645.18, 28217.39, 48741.06  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 17
names       : resale_price, floor_area_sqm, storey_order, remaining_lease_mths,     PROX_ELDERLYCARE, PROX_HAWKERCENTRE,         PROX_MRT, PROX_NATIONALPARKS,        PROX_MALL,    PROX_SUPERMARKET,   PROX_GOOD_PRISCH,          PROX_CBD,            PROX_CHAS, WITHIN_350M_KINDERGARTEN, WITHIN_350M_CHILDCARE, ... 
min values  :       250000,             70,            1,                  534, 4.28110190953706e-07, 0.030603180648446, 0.02179340216212, 0.0441643213745594,                0, 1.2170877176887e-07, 0.0652540365486641, 0.999393538715878, 2.99641325933321e-09,                        0,                     0, ... 
max values  :      1370000,            145,           17,                 1168,     3.30163731686804,  4.76638932311821, 2.12908586250823,   2.44018373848866, 2.66857012072661,    1.57131703659268,   10.6223726149914,  19.6500691667807,    0.869920099464176,                        7,                    20, ... 

Next, bw.gwr() of GWmodel package will be used to determine the optimal bandwidth to be used.

bw_adaptive <- bw.gwr(resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths+ PROX_ELDERLYCARE+ PROX_HAWKERCENTRE+ PROX_MRT+ PROX_NATIONALPARKS+ PROX_MALL+ PROX_SUPERMARKET+ PROX_GOOD_PRISCH+ PROX_CBD+ PROX_CHAS+ WITHIN_350M_KINDERGARTEN+ WITHIN_350M_CHILDCARE+ WITHIN_1KM_PRISCH,
                  data=train_data_sp,
                  approach="CV",
                  kernel="gaussian",
                  adaptive=TRUE,
                  longlat=FALSE)

The result shows that 119 nearest neighbours is the optimal bandwidth when an adaptive bandwidth is used for this data set.
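
An adaptive bandwidth is a number of neighbours rather than a fixed distance, so the kernel reaches farther where points are sparse and less far where they are dense. The base R sketch below illustrates the idea with a Gaussian kernel. It is a conceptual illustration on simulated coordinates, not GWmodel's internal code; the function adaptive_gaussian_weights and the value k = 20 are assumptions made for the example.

```r
# Conceptual sketch of adaptive Gaussian kernel weights (not GWmodel internals).
set.seed(1)
pts <- cbind(x = runif(200, 0, 1000),
             y = runif(200, 0, 1000))   # hypothetical coordinates in metres
k <- 20                                 # hypothetical number of nearest neighbours

adaptive_gaussian_weights <- function(pts, i, k) {
  # Euclidean distance from regression point i to every point
  d <- sqrt((pts[, 1] - pts[i, 1])^2 + (pts[, 2] - pts[i, 2])^2)
  # Local bandwidth: distance to the k-th nearest point
  # (point i itself is counted, at distance 0)
  b <- sort(d)[k]
  # Gaussian kernel: weight decays smoothly with distance
  exp(-0.5 * (d / b)^2)
}

w <- adaptive_gaussian_weights(pts, i = 1, k = k)
summary(w)
```

Every observation receives a positive weight under a Gaussian kernel, but observations beyond the local bandwidth contribute very little to the local fit.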

write_rds(bw_adaptive, "data/model/rds/bw_adaptive.rds")

10.0.4 Constructing the adaptive bandwidth gwr model

First, let us load the saved bandwidth by using the code chunk below.

bw_adaptive <- read_rds("data/model/rds/bw_adaptive.rds")
bw_adaptive
[1] 119

Now, we can go ahead to calibrate the gwr-based hedonic pricing model by using adaptive bandwidth and Gaussian kernel as shown in the code chunk below.

gwr_adaptive <- gwr.basic(formula = resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths+ PROX_ELDERLYCARE+ PROX_HAWKERCENTRE+ PROX_MRT+ PROX_NATIONALPARKS+ PROX_MALL+ PROX_SUPERMARKET+ PROX_GOOD_PRISCH+ PROX_CBD+ PROX_CHAS+ WITHIN_350M_KINDERGARTEN+ WITHIN_350M_CHILDCARE+ WITHIN_1KM_PRISCH,
                          data=train_data_sp,
                          bw=bw_adaptive, 
                          kernel = 'gaussian', 
                          adaptive=TRUE,
                          longlat = FALSE)

write_rds(gwr_adaptive, "data/model/rds/gwr_adaptive.rds")
gwr_adaptive <- read_rds("data/model/rds/gwr_adaptive.rds")
gwr_adaptive
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2023-03-25 09:29:04 
   Call:
   gwr.basic(formula = resale_price ~ floor_area_sqm + storey_order + 
    remaining_lease_mths + PROX_ELDERLYCARE + PROX_HAWKERCENTRE + 
    PROX_MRT + PROX_NATIONALPARKS + PROX_MALL + PROX_SUPERMARKET + 
    PROX_GOOD_PRISCH + PROX_CBD + PROX_CHAS + WITHIN_350M_KINDERGARTEN + 
    WITHIN_350M_CHILDCARE + WITHIN_1KM_PRISCH, data = train_data_sp, 
    bw = bw_adaptive, kernel = "gaussian", adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  resale_price
   Independent variables:  floor_area_sqm storey_order remaining_lease_mths PROX_ELDERLYCARE PROX_HAWKERCENTRE PROX_MRT PROX_NATIONALPARKS PROX_MALL PROX_SUPERMARKET PROX_GOOD_PRISCH PROX_CBD PROX_CHAS WITHIN_350M_KINDERGARTEN WITHIN_350M_CHILDCARE WITHIN_1KM_PRISCH
   Number of data points: 23657
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-369455  -43776    -419   42986  402338 

   Coefficients:
                              Estimate Std. Error t value Pr(>|t|)    
   (Intercept)               47074.815   7853.110   5.994 2.07e-09 ***
   floor_area_sqm             3310.066     68.937  48.016  < 2e-16 ***
   storey_order              15938.921    225.295  70.747  < 2e-16 ***
   remaining_lease_mths        383.979      3.296 116.493  < 2e-16 ***
   PROX_ELDERLYCARE          -8815.071    786.568 -11.207  < 2e-16 ***
   PROX_HAWKERCENTRE        -15444.419    490.189 -31.507  < 2e-16 ***
   PROX_MRT                 -31864.117   1347.930 -23.639  < 2e-16 ***
   PROX_NATIONALPARKS        -4894.129   1111.347  -4.404 1.07e-05 ***
   PROX_MALL                  2795.289   1268.067   2.204   0.0275 *  
   PROX_SUPERMARKET         -17325.064   3298.170  -5.253 1.51e-07 ***
   PROX_GOOD_PRISCH           5205.552    271.027  19.207  < 2e-16 ***
   PROX_CBD                 -17481.136    183.724 -95.149  < 2e-16 ***
   PROX_CHAS                 39154.491   4431.045   8.836  < 2e-16 ***
   WITHIN_350M_KINDERGARTEN   6732.483    479.059  14.054  < 2e-16 ***
   WITHIN_350M_CHILDCARE     -1823.928    254.363  -7.171 7.69e-13 ***
   WITHIN_1KM_PRISCH         -5303.173    340.464 -15.576  < 2e-16 ***

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 68160 on 23641 degrees of freedom
   Multiple R-squared: 0.7232
   Adjusted R-squared: 0.723 
   F-statistic:  4117 on 15 and 23641 DF,  p-value: < 2.2e-16 
   ***Extra Diagnostic information
   Residual sum of squares: 1.098321e+14
   Sigma(hat): 68140.15
   AIC:  593740.4
   AICc:  593740.4
   BIC:  570391.8
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: gaussian 
   Adaptive bandwidth: 119 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                   Min.     1st Qu.      Median     3rd Qu.
   Intercept                -5.8551e+08 -6.8764e+05  8.1064e+04  2.1458e+06
   floor_area_sqm           -6.9631e+04  1.6903e+03  2.6952e+03  4.4414e+03
   storey_order              5.8459e+03  9.6146e+03  1.1699e+04  1.3838e+04
   remaining_lease_mths     -5.4730e+03  1.0667e+02  2.9043e+02  4.9196e+02
   PROX_ELDERLYCARE         -1.0862e+07 -3.1917e+04  2.1463e+04  7.0673e+04
   PROX_HAWKERCENTRE        -1.6519e+07 -6.0046e+04  1.0821e+03  6.2917e+04
   PROX_MRT                 -3.4125e+06 -1.0854e+05 -4.8551e+04  8.0842e+03
   PROX_NATIONALPARKS       -1.1960e+07 -7.1462e+04 -2.0273e+04  2.6099e+04
   PROX_MALL                -5.4578e+07 -5.9960e+04 -1.5869e+03  5.0609e+04
   PROX_SUPERMARKET         -3.0957e+11 -7.2296e+04 -1.0332e+04  3.5250e+04
   PROX_GOOD_PRISCH         -5.5050e+07 -1.1692e+05 -3.6473e+03  2.3493e+05
   PROX_CBD                 -6.2447e+07 -2.0084e+05 -1.9186e+04  8.3585e+04
   PROX_CHAS                -1.7514e+06 -5.3272e+04 -8.4917e+01  4.7771e+04
   WITHIN_350M_KINDERGARTEN -1.8384e+05 -7.8370e+03 -2.6234e+03  3.0178e+03
   WITHIN_350M_CHILDCARE    -8.9858e+04 -3.0748e+03  1.3592e+02  2.7617e+03
   WITHIN_1KM_PRISCH        -5.8136e+05 -6.9302e+03  3.5621e+02  6.2100e+03
                                  Max.
   Intercept                1.3615e+08
   floor_area_sqm           9.7805e+04
   storey_order             2.7306e+04
   remaining_lease_mths     8.4334e+02
   PROX_ELDERLYCARE         6.6168e+06
   PROX_HAWKERCENTRE        5.3428e+07
   PROX_MRT                 2.4047e+07
   PROX_NATIONALPARKS       1.2444e+07
   PROX_MALL                7.6041e+06
   PROX_SUPERMARKET         2.1228e+06
   PROX_GOOD_PRISCH         1.1384e+08
   PROX_CBD                 8.2911e+07
   PROX_CHAS                3.0957e+11
   WITHIN_350M_KINDERGARTEN 2.3725e+05
   WITHIN_350M_CHILDCARE    3.3137e+04
   WITHIN_1KM_PRISCH        4.2642e+06
   ************************Diagnostic information*************************
   Number of data points: 23657 
   Effective number of parameters (2trace(S) - trace(S'S)): 1439.61 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 22217.39 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 566701.7 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 565434 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 552192.9 
   Residual sum of squares: 3.166845e+13 
   R-square value:  0.9201759 
   Adjusted R-square value:  0.9150033 

   ***********************************************************************
   Program stops at: 2023-03-25 09:36:18 
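
The printout above reports diagnostics for both the global regression and the GWR model. The short sketch below places them side by side. The numbers are copied from the printout; the data frame and its column names are only for illustration.

```r
# Diagnostics copied from the gwr.basic() printout above.
diagnostics <- data.frame(
  model  = c("Global regression", "GWR (adaptive, 119 neighbours)"),
  AICc   = c(593740.4, 566701.7),
  adj_R2 = c(0.723, 0.9150033)
)

# Difference in AICc relative to the better (lower) model
diagnostics$delta_AICc <- diagnostics$AICc - min(diagnostics$AICc)
print(diagnostics)
```

A lower AICc and a higher adjusted R-square both point to the GWR model fitting the training data better than the global regression.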

Predicting with the GWR model

test_data <- read_rds("data/model/rds/test_data.rds")
test_data_sp <- as_Spatial(test_data)
test_data_sp
class       : SpatialPointsDataFrame 
features    : 1848 
extent      : 11655.33, 42444.75, 28287.8, 48675.05  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 17
names       : resale_price, floor_area_sqm, storey_order, remaining_lease_mths,     PROX_ELDERLYCARE,  PROX_HAWKERCENTRE,           PROX_MRT, PROX_NATIONALPARKS,         PROX_MALL,     PROX_SUPERMARKET,  PROX_GOOD_PRISCH,         PROX_CBD,            PROX_CHAS, WITHIN_350M_KINDERGARTEN, WITHIN_350M_CHILDCARE, ... 
min values  :       320000,             74,            1,                  517, 9.89208662673193e-07, 0.0503736883973231, 0.0294613865317787, 0.0441643213745594, 0.062935819053852, 2.98738633834197e-07, 0.137507022142069, 1.17059076246311, 1.59237979817407e-08,                        0,                     0, ... 
max values  :      1290000,            121,           16,                 1143,     3.26154359027314,   4.73879432069556,   1.98048290496911,   2.43949004787036,  2.61053688219547,     1.27671800614123,  10.4293237896033, 19.4036799155058,    0.869920099464176,                        5,                    16, ... 
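set.seed(1234)
# Note: no predictdata argument is supplied, so gwr.predict() calibrates on
# test_data_sp and predicts at those same locations. To predict from the
# training calibration, pass data = train_data_sp and predictdata = test_data_sp.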
gwr_prediction <- gwr.predict(formula = resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths+ PROX_ELDERLYCARE+ PROX_HAWKERCENTRE+ PROX_MRT+ PROX_NATIONALPARKS+ PROX_MALL+ PROX_SUPERMARKET+ PROX_GOOD_PRISCH+ PROX_CBD+ PROX_CHAS+ WITHIN_350M_KINDERGARTEN+ WITHIN_350M_CHILDCARE+ WITHIN_1KM_PRISCH,
                          kernel = 'gaussian',
                          data=test_data_sp,
                          bw=bw_adaptive, 
                          adaptive=TRUE,
                          longlat = FALSE)
gwr_prediction
write_rds(gwr_prediction, "data/model/rds/gwr_prediction.rds")
gwr_prediction <- read_rds("data/model/rds/gwr_prediction.rds")
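
For intuition, out-of-sample prediction with a geographically weighted model can be sketched in base R: weight the calibration points by their distance to the new location, fit a weighted least-squares regression, and evaluate it at the new point. This is a simplified illustration on simulated data with a fixed-bandwidth Gaussian kernel, not what gwr.predict() does internally; predict_local, the bandwidth of 1 and all data values are assumptions.

```r
# Simplified sketch of local (geographically weighted) prediction on simulated data.
set.seed(42)
n <- 300
train <- data.frame(
  x    = runif(n, 0, 10),      # hypothetical easting (km)
  y    = runif(n, 0, 10),      # hypothetical northing (km)
  area = runif(n, 70, 145)     # hypothetical floor area (sqm)
)
# Simulated price whose floor-area effect drifts from west to east
train$price <- 100000 + (2000 + 300 * train$x) * train$area + rnorm(n, sd = 5000)

predict_local <- function(train, new_pt, bw) {
  d <- sqrt((train$x - new_pt$x)^2 + (train$y - new_pt$y)^2)
  train$w <- exp(-0.5 * (d / bw)^2)                 # fixed-bandwidth Gaussian kernel
  fit <- lm(price ~ area, data = train, weights = w) # weighted least squares
  unname(predict(fit, newdata = new_pt))
}

new_pt <- data.frame(x = 8, y = 5, area = 100)       # location to predict at
local_pred  <- predict_local(train, new_pt, bw = 1)
global_pred <- unname(predict(lm(price ~ area, data = train), newdata = new_pt))
c(local = local_pred, global = global_pred)
```

In this simulation the calibration data and the prediction location are kept separate, and the local fit tracks the spatially varying floor-area effect that a single global regression averages away.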

10.0.5 Converting the predicting output into a data frame (GWR)

We cannot convert the whole gwr_prediction object into a data frame directly, as this throws an error. gwr_prediction is a list of 4 elements, so we select SDF, which is a SpatialPointsDataFrame, and then its prediction column, which holds all the predicted values.

gwr_pred_df <- as.data.frame(gwr_prediction$SDF$prediction)

Now, we combine gwr_pred_df (our predictions) with the test data (the actual resale prices) to assess the model’s performance. To combine them, we use cbind().

test_data <- read_rds("data/model/rds/test_data.rds")
test_data_gwr <- cbind(test_data, gwr_pred_df)
write_rds(test_data_gwr, "data/model/test_data_gwr.rds")
test_data_gwr <- read_rds("data/model/test_data_gwr.rds")

10.0.6 Calculating Root Mean Square Error (GWR)

rmse(test_data_gwr$resale_price, 
     test_data_gwr$gwr_prediction.SDF.prediction)
[1] 41250.16

10.0.7 Visualising predictions (GWR)

ggplot(data = test_data_gwr,
       aes(x = gwr_prediction.SDF.prediction,
           y = resale_price)) +
  geom_point()

10.0.5 Preparing coordinates data

Extracting coordinates data

train_data <- read_rds("data/model/rds/train_data.rds")
test_data <- read_rds("data/model/rds/test_data.rds")
coords_train <- st_coordinates(train_data)
coords_test <- st_coordinates(test_data)

Before continuing, we write the outputs to rds files for future use.

coords_train <- write_rds(coords_train,"data/model/rds/coords_train.rds" )

coords_test <- write_rds(coords_test,"data/model/rds/coords_test.rds" )
coords_train <- read_rds("data/model/rds/coords_train.rds")

coords_test <- read_rds("data/model/rds/coords_test.rds")

10.0.6 Dropping geometry field

First, we will drop the geometry column of the sf data.frame by using st_drop_geometry() of the sf package.

train_data_geo_dropped <- train_data %>% 
  st_drop_geometry()
head(train_data_geo_dropped)
# A tibble: 6 × 17
  resale_price floor_a…¹ store…² remai…³ PROX_…⁴ PROX_…⁵ PROX_…⁶ PROX_…⁷ PROX_…⁸
         <dbl>     <dbl>   <int>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1       370000        92       2     708  1.09     0.444   1.05    0.829   1.18 
2       375000        92       1     693  0.150    0.205   0.757   0.785   0.843
3       380000        91       1     702  0.722    0.450   0.457   0.380   0.354
4       385000        92       3     695  0.0982   0.319   0.887   0.924   0.930
5       410000        92       3     689  0.593    0.259   0.554   0.514   0.717
6       410000        98       3     681  0.317    0.432   0.841   0.229   0.412
# … with 8 more variables: PROX_SUPERMARKET <dbl>, PROX_GOOD_PRISCH <dbl>,
#   PROX_CBD <dbl>, PROX_CHAS <dbl>, WITHIN_350M_KINDERGARTEN <int>,
#   WITHIN_350M_CHILDCARE <int>, WITHIN_350M_BUSSTOP <int>,
#   WITHIN_1KM_PRISCH <int>, and abbreviated variable names ¹​floor_area_sqm,
#   ²​storey_order, ³​remaining_lease_mths, ⁴​PROX_ELDERLYCARE,
#   ⁵​PROX_HAWKERCENTRE, ⁶​PROX_MRT, ⁷​PROX_NATIONALPARKS, ⁸​PROX_MALL
write_rds(train_data_geo_dropped, "data/model/rds/train_data_geo_dropped.rds")
train_data_geo_dropped <- read_rds("data/model/rds/train_data_geo_dropped.rds")
train_data_geo_dropped
# A tibble: 23,657 × 17
   resale_price floor_…¹ store…² remai…³ PROX_…⁴ PROX_…⁵ PROX_…⁶ PROX_…⁷ PROX_…⁸
 *        <dbl>    <dbl>   <int>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1       370000       92       2     708  1.09     0.444   1.05    0.829   1.18 
 2       375000       92       1     693  0.150    0.205   0.757   0.785   0.843
 3       380000       91       1     702  0.722    0.450   0.457   0.380   0.354
 4       385000       92       3     695  0.0982   0.319   0.887   0.924   0.930
 5       410000       92       3     689  0.593    0.259   0.554   0.514   0.717
 6       410000       98       3     681  0.317    0.432   0.841   0.229   0.412
 7       410000       92       3     661  0.251    0.442   0.646   0.745   0.549
 8       418000       92       2     682  0.454    0.339   0.390   0.385   0.549
 9       420000       92       4     692  0.175    0.206   0.777   0.746   0.854
10       420000       92       3     692  0.0855   0.267   0.691   0.773   0.780
# … with 23,647 more rows, 8 more variables: PROX_SUPERMARKET <dbl>,
#   PROX_GOOD_PRISCH <dbl>, PROX_CBD <dbl>, PROX_CHAS <dbl>,
#   WITHIN_350M_KINDERGARTEN <int>, WITHIN_350M_CHILDCARE <int>,
#   WITHIN_350M_BUSSTOP <int>, WITHIN_1KM_PRISCH <int>, and abbreviated
#   variable names ¹​floor_area_sqm, ²​storey_order, ³​remaining_lease_mths,
#   ⁴​PROX_ELDERLYCARE, ⁵​PROX_HAWKERCENTRE, ⁶​PROX_MRT, ⁷​PROX_NATIONALPARKS,
#   ⁸​PROX_MALL

10.0.7 Calibrating Random Forest Model

In this section, we will calibrate a model to predict HDB resale prices by using the random forest function of the ranger package.

set.seed(1234)
rf <- ranger(resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths+ PROX_ELDERLYCARE+ PROX_HAWKERCENTRE+ PROX_MRT+ PROX_NATIONALPARKS+ PROX_MALL+ PROX_SUPERMARKET+ PROX_GOOD_PRISCH+ PROX_CBD+ PROX_CHAS+ WITHIN_350M_KINDERGARTEN+ WITHIN_350M_CHILDCARE+ WITHIN_1KM_PRISCH,
             data=train_data_geo_dropped)
write_rds(rf, "data/model/rds/rf.rds")
rf <- read_rds("data/model/rds/rf.rds")
print(rf)
test_data_rf <- read_rds("data/model/rds/test_data.rds")
test_data_rf <- test_data_rf %>% 
  st_drop_geometry()
set.seed(1234)
rf_pred <- predict(rf, test_data_rf)
rf_pred <- write_rds(rf_pred, "data/model/rds/rf_pred.rds")
rf_pred <- read_rds("data/model/rds/rf_pred.rds")

10.0.7.1 Converting the predicting output into a data frame

We convert the output of predict() into a data frame for easier visualisation and analysis.

rf_pred_df <- as.data.frame(rf_pred)

In the code chunk below, cbind() is used to append the predicted values onto test_data.

test_data_bind_rf <- read_rds("data/model/rds/test_data.rds")
test_data_bind_rf <- cbind(test_data_bind_rf, rf_pred_df)
write_rds(test_data_bind_rf, "data/model/rds/test_data_rf.rds")
test_data_bind_rf <- read_rds("data/model/rds/test_data_rf.rds")

10.0.7.2 Calculating Root Mean Square Error

rmse(test_data_bind_rf$resale_price, 
     test_data_bind_rf$prediction)
[1] 48912.52

10.0.7.3 Visualising the predicted values

ggplot(data = test_data_bind_rf,
       aes(x = prediction,
           y = resale_price)) +
  geom_point()

10.0.8 Calibrating Geographical Random Forest Model

In this section, we will learn how to calibrate a model to predict HDB resale price by using grf() of SpatialML package.

10.0.8.1 Calibrating using training data

10.0.8.1.1 Finding Optimal Bandwidth using grf.bw()

bwRF_adaptive <- grf.bw(formula = resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths+ PROX_ELDERLYCARE+ PROX_HAWKERCENTRE+ PROX_MRT+ PROX_NATIONALPARKS+ PROX_MALL+ PROX_SUPERMARKET+ PROX_GOOD_PRISCH+ PROX_CBD+ PROX_CHAS+ WITHIN_350M_KINDERGARTEN+ WITHIN_350M_CHILDCARE+ WITHIN_1KM_PRISCH,
                     train_data_geo_dropped, 
                     trees=30, # 30 or 50 would be good, no need 500
                     kernel="adaptive",
                     coords=coords_train)

The highest R2 value of 0.85120 appears when the bandwidth is 1187.

write_rds(1187, "data/model/rds/bwRF_adaptive_value.rds")
bwRF_adaptive_value <- read_rds("data/model/rds/bwRF_adaptive_value.rds")
bwRF_adaptive_value
[1] 1187

The code chunk below calibrates a geographical random forest model by using grf() of the SpatialML package.

set.seed(1234)
gwRF_adaptive <- grf(formula = resale_price ~ floor_area_sqm + storey_order + remaining_lease_mths+ PROX_ELDERLYCARE+ PROX_HAWKERCENTRE+ PROX_MRT+ PROX_NATIONALPARKS+ PROX_MALL+ PROX_SUPERMARKET+ PROX_GOOD_PRISCH+ PROX_CBD+ PROX_CHAS+ WITHIN_350M_KINDERGARTEN+ WITHIN_350M_CHILDCARE+ WITHIN_1KM_PRISCH,
                     dframe=train_data_geo_dropped, 
                     ntree = 30,
                     bw=bwRF_adaptive_value,
                     kernel="adaptive",
                     coords=coords_train)

write_rds(gwRF_adaptive, "data/model/rds/gwRF_adaptive.rds")
gwRF_adaptive <- read_rds("data/model/rds/gwRF_adaptive.rds")

10.0.8 Predicting by using test data

10.0.8.1 Preparing the test data

test_data_geo_dropped <- cbind(test_data, coords_test) %>%
  st_drop_geometry()
write_rds(test_data_geo_dropped, "data/model/rds/test_data_geo_dropped.rds")
test_data_geo_dropped <- read_rds("data/model/rds/test_data_geo_dropped.rds")

10.0.8.2 Predicting with test data

gwRF_pred <- predict.grf(gwRF_adaptive, 
                           test_data_geo_dropped, 
                           x.var.name="X",
                           y.var.name="Y", 
                           local.w=1,
                           global.w=0)
GRF_pred <- write_rds(gwRF_pred, "data/model/rds/GRF_pred.rds")
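
The local.w and global.w arguments control how much the local and global forests each contribute to the final prediction. The sketch below illustrates this as a weighted sum. It is an assumption about the behaviour based on the argument descriptions, not SpatialML's source code; the blend function and the prediction values are made up.

```r
# Hypothetical predictions for three flats from the local and global forests.
local_pred  <- c(510000, 620000, 455000)
global_pred <- c(500000, 600000, 470000)

# Weighted combination of the two sets of predictions (illustrative helper)
blend <- function(local_pred, global_pred, local.w = 1, global.w = 0) {
  local.w * local_pred + global.w * global_pred
}

local_only <- blend(local_pred, global_pred)            # local.w = 1, global.w = 0
half_half  <- blend(local_pred, global_pred, 0.5, 0.5)  # equal mix of the two
print(local_only)
print(half_half)
```

With local.w = 1 and global.w = 0, as used here, the prediction relies on the local forests alone.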

10.0.8.3 Converting the predicting output into a data frame

The output of predict.grf() is a vector of predicted values. We convert it into a data frame for easier visualisation and analysis.

GRF_pred <- read_rds("data/model/rds/GRF_pred.rds")
GRF_pred_df <- as.data.frame(GRF_pred)

In the code chunk below, cbind() is used to append the predicted values onto test_data_geo_dropped.

test_data_p <- cbind(test_data_geo_dropped, GRF_pred_df)
write_rds(test_data_p, "data/model/rds/test_data_p.rds")
test_data_p <- read_rds("data/model/rds/test_data_p.rds")

10.0.9 Calculating Root Mean Square Error

The root mean square error (RMSE) allows us to measure how far predicted values are from observed values in a regression analysis. In the code chunk below, rmse() of Metrics package is used to compute the RMSE.

rmse(test_data_p$resale_price, 
     test_data_p$GRF_pred)
[1] 43401.41

10.0.10 Visualising the predicted values

ggplot(data = test_data_p,
       aes(x = GRF_pred,
           y = resale_price)) +
  geom_point()

Comparing the models (OLS, GWR, RF, GRF)

Comparing Graphs

ggplot(data = test_data_mlr_binded,
       aes(x = ols_pred,
           y = resale_price)) +
  geom_point()

ggplot(data = test_data_gwr,
       aes(x = gwr_prediction.SDF.prediction,
           y = resale_price)) +
  geom_point()

ggplot(data = test_data_bind_rf,
       aes(x = prediction,
           y = resale_price)) +
  geom_point()

ggplot(data = test_data_p,
       aes(x = GRF_pred,
           y = resale_price)) +
  geom_point()

Comparing RMSE Values

rmse(test_data_mlr_binded$resale_price, 
     test_data_mlr_binded$ols_pred)
[1] 82104.39
rmse(test_data_gwr$resale_price, 
     test_data_gwr$gwr_prediction.SDF.prediction)
[1] 41250.16
rmse(test_data_bind_rf$resale_price, 
     test_data_bind_rf$prediction)
[1] 48912.52
rmse(test_data_p$resale_price, 
     test_data_p$GRF_pred)
[1] 43401.41
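
The four RMSE values can also be collected into one small table and ranked. The sketch below uses the values reported above; the data frame rmse_tbl and its column names are only for illustration.

```r
# RMSE values copied from the outputs above.
rmse_tbl <- data.frame(
  model = c("OLS", "GWR", "Random forest", "Geographical random forest"),
  rmse  = c(82104.39, 41250.16, 48912.52, 43401.41)
)

# Ratio of each RMSE to the lowest RMSE, then sort from best to worst
rmse_tbl$ratio_to_best <- round(rmse_tbl$rmse / min(rmse_tbl$rmse), 2)
rmse_tbl <- rmse_tbl[order(rmse_tbl$rmse), ]
print(rmse_tbl, row.names = FALSE)
```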

From the RMSE comparison, GWR has the lowest RMSE (41,250.16), followed by the geographical random forest (43,401.41), the random forest (48,912.52) and OLS (82,104.39). OLS is therefore the weakest predictor of resale prices among the four models, with an RMSE roughly twice that of GWR. One caveat applies to the GWR figure: gwr.predict() was called with the test data as data and without a predictdata argument, so the model was calibrated on the test set itself and its RMSE is not strictly an out-of-sample error.

When predicting resale prices, geographically weighted methods should be preferred over the OLS method, because they account for spatial variation in how each factor affects price.

Improvements that could have been made:
- A larger test set could have been used.
- Only 30 trees were used for this assignment. More trees would likely improve the forest-based models.
- Instead of analysing only 4-room flats, I could have analysed flats with other numbers of rooms.

References:

Thank you Prof Kam and seniors Megan and Aisyah for the material and useful pointers!