quick-start.Rmd
# Install the development version from GitHub
# (requires the remotes package: install.packages("remotes"))
remotes::install_github("spatialnetworkslab/homelocator")
# Load the homelocator package
library(homelocator)
#> Welcome to homelocator package!
The test data includes 100 random users and can be used as an example to get started with the homelocator package.
data("test_sample", package = "homelocator")
The validate_dataset() function makes sure your input dataset contains all three necessary variables: user, location and timestamp. Besides the dataset itself, the function takes four arguments:
user: name of the column that holds a unique identifier for each user.
timestamp: name of the column that holds the timestamp of each data point. The timestamp should be in POSIXct format.
location: name of the column that holds a unique identifier for each location.
keep_other_vars: whether to keep any other variables of the input dataset. The default is FALSE.
When validating the dataset, specify the column names for user, timestamp and location.
validate_dataset(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", keep_other_vars = FALSE)
#> 🎉 Congratulations!! Your dataset has passed validation.
#> 👤 There are 100 unique users in your dataset.
#> 🌏 Now start your journey identifying their meaningful location(s)!
#> 👏 Good luck!
#>
#> # A tibble: 16,300 × 3
#> u_id grid_id created_at
#> <int> <int> <dttm>
#> 1 92298565 1581 2016-04-17 22:43:06
#> 2 33908340 1461 2014-10-03 16:29:48
#> 3 92298565 1136 2014-02-07 07:26:15
#> 4 11616678 1375 2014-07-18 10:08:21
#> 5 11616678 1375 2013-11-24 23:16:24
#> 6 47727539 736 2016-06-21 15:59:49
#> 7 54875363 1572 2013-01-23 01:56:07
#> 8 40153763 759 2016-06-23 00:36:25
#> 9 9982726 1340 2015-03-12 17:00:39
#> 10 90403900 339 2014-08-04 19:10:12
#> # ℹ 16,290 more rows
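Because the timestamp column must be in POSIXct format, a dataset that stores timestamps as character strings needs converting before validation. The following is a minimal base R sketch, not part of homelocator: the data frame `raw_points` is hypothetical, and the timestamp format and the UTC time zone are assumptions that should be adjusted to your own data.

```r
# Hypothetical toy data: timestamps stored as character strings
raw_points <- data.frame(
  u_id = c(1L, 1L, 2L),
  grid_id = c(10L, 11L, 10L),
  created_at = c("2016-04-17 22:43:06", "2014-10-03 16:29:48", "2014-02-07 07:26:15"),
  stringsAsFactors = FALSE
)

# Convert to POSIXct; the format and time zone are assumptions about the data
raw_points$created_at <- as.POSIXct(
  raw_points$created_at,
  format = "%Y-%m-%d %H:%M:%S",
  tz = "UTC"
)

# Quick sanity checks before handing the data to validate_dataset()
required_cols <- c("u_id", "grid_id", "created_at")
has_all_cols <- all(required_cols %in% names(raw_points))
is_posixct <- inherits(raw_points$created_at, "POSIXct")
```

If both checks are TRUE and no timestamp was turned into NA by the conversion, the data frame has the shape that validate_dataset() expects.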
Weighs data points across multiple time frames to ‘score’ potentially meaningful locations for each user.
# recipe: homelocator -- HMLC
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", show_n_loc = 1, recipe = "HMLC")
Selects the most frequently ‘visited’ location assuming a user is active mainly around their home location.
# recipe: Frequency -- FREQ
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", show_n_loc = 1, recipe = "FREQ")
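To make the idea behind the frequency approach concrete, here is a minimal base R sketch that counts data points per user and location and keeps the location with the highest count. It only illustrates the principle on hypothetical toy data (`toy_points`); it is not the package's implementation, and the tie-breaking rule is an arbitrary choice made for this sketch.

```r
# Hypothetical toy data: one row per data point
toy_points <- data.frame(
  u_id = c(1L, 1L, 1L, 2L, 2L),
  grid_id = c(10L, 10L, 11L, 20L, 20L)
)

# Count data points per user and location
toy_points$n <- 1L
visit_counts <- aggregate(n ~ u_id + grid_id, data = toy_points, FUN = sum)

# Sort so that each user's most visited location comes first,
# then keep the first row per user (ties resolved by sort order)
visit_counts <- visit_counts[order(visit_counts$u_id, -visit_counts$n), ]
most_visited <- visit_counts[!duplicated(visit_counts$u_id), ]
```

In this toy example, user 1 is assigned location 10 and user 2 is assigned location 20.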
Finds the most ‘popular’ location during ‘rest’, ‘active’ and ‘leisure’ time. Here we focus on ‘rest’ and ‘leisure’ time to find the most likely home location for each user.
# recipe: Online Social Network Activity -- OSNA
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", show_n_loc = 1, recipe = "OSNA")
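The time-frame idea can be sketched in base R as follows: label each data point by the part of the day it falls in, keep only the ‘rest’ and ‘leisure’ points, and pick the most common location among them. The hour boundaries below are assumptions chosen purely for illustration, and `toy_points` is hypothetical data; neither is taken from the package.

```r
# Hypothetical toy data for a single user
toy_points <- data.frame(
  u_id = c(1L, 1L, 1L, 1L),
  grid_id = c(10L, 10L, 11L, 11L),
  created_at = as.POSIXct(
    c("2016-04-17 03:10:00", "2016-04-18 22:30:00",
      "2016-04-18 10:00:00", "2016-04-19 14:45:00"),
    tz = "UTC"
  )
)

# Hour of day (0-23) for each data point
hour_of_day <- as.integer(format(toy_points$created_at, "%H", tz = "UTC"))

# Assumed, illustrative boundaries (not taken from the package):
# rest 02:00-07:59, active 08:00-18:59, leisure 19:00-01:59
toy_points$time_frame <- ifelse(
  hour_of_day >= 2 & hour_of_day < 8, "rest",
  ifelse(hour_of_day >= 8 & hour_of_day < 19, "active", "leisure")
)

# Keep 'rest' and 'leisure' points, then pick each user's most common location
home_like <- toy_points[toy_points$time_frame %in% c("rest", "leisure"), ]
home_like$n <- 1L
counts <- aggregate(n ~ u_id + grid_id, data = home_like, FUN = sum)
counts <- counts[order(counts$u_id, -counts$n), ]
likely_home <- counts[!duplicated(counts$u_id), ]
```

Here location 11 is visited only during ‘active’ hours, so location 10 is the one that remains as the likely home.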
Calculates the average and standard deviation of the start times of a single user's data points at a single location.
# recipe: Anchor Point Determining Model -- APDM
## The APDM recipe strictly returns the most likely home location
## It is important to load the neighbors table before you use the recipe!
## example: st_queen <- function(a, b = a) st_relate(a, b, pattern = "F***T****")
## neighbors <- st_queen(df_sf) ===> convert the result to a data frame
data("df_neighbors", package = "homelocator")
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", recipe = "APDM", keep_score = FALSE)
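The per-user, per-location time statistics mentioned for this recipe can be illustrated with a small base R sketch. It computes the mean and standard deviation of the time of day (in decimal hours) for each user and location on hypothetical toy data (`toy_points`). This is an illustration of the statistic only, not the package's implementation, and it ignores the fact that time of day wraps around at midnight.

```r
# Hypothetical toy data for a single user at two locations
toy_points <- data.frame(
  u_id = c(1L, 1L, 1L, 1L),
  grid_id = c(10L, 10L, 11L, 11L),
  created_at = as.POSIXct(
    c("2016-04-17 22:00:00", "2016-04-18 23:00:00",
      "2016-04-18 09:00:00", "2016-04-19 15:00:00"),
    tz = "UTC"
  )
)

# Time of day as decimal hours (e.g. 22.5 means 22:30)
toy_points$hour <-
  as.numeric(format(toy_points$created_at, "%H", tz = "UTC")) +
  as.numeric(format(toy_points$created_at, "%M", tz = "UTC")) / 60

# Mean and standard deviation of the time of day per user and location
mean_hour <- aggregate(hour ~ u_id + grid_id, data = toy_points, FUN = mean)
sd_hour <- aggregate(hour ~ u_id + grid_id, data = toy_points, FUN = sd)
names(mean_hour)[3] <- "mean_hour"
names(sd_hour)[3] <- "sd_hour"
time_stats <- merge(mean_hour, sd_hour, by = c("u_id", "grid_id"))
```

In this toy example, location 10 has a late mean time with a small spread, while location 11 has a midday mean with a larger spread.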