Quick start • homelocator

Install package

# Install the development version from GitHub
# install.packages("remotes")  # run once if remotes is not installed
remotes::install_github("spatialnetworkslab/homelocator")

Load library

# load the homelocator package
library(homelocator)
#> Welcome to homelocator package!

# load other packages needed for the analysis
library(tidyverse)
library(here)

Load test data

The test data includes 100 random users and can be used as an example to get started with the homelocator package.

data("test_sample", package = "homelocator")

Validate test sample

The validate_dataset() function makes sure your input dataset contains all three necessary variables: user, location and timestamp. Besides the input dataset itself, the function takes four arguments:

user: name of the column that holds a unique identifier for each user.
timestamp: name of the column that holds the timestamp of each data point. The timestamp must be in POSIXct format.
location: name of the column that holds a unique identifier for each location.
keep_other_vars: whether to keep any other variables of the input dataset. The default is FALSE, which removes them.

When validating the dataset, specify the column names for user, timestamp and location.

validate_dataset(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", keep_other_vars = FALSE)
#> 🎉 Congratulations!! Your dataset has passed validation.
#> 👤 There are 100 unique users in your dataset.
#> 🌏 Now start your journey identifying their meaningful location(s)!
#> 👏 Good luck!
#> 
#> # A tibble: 16,300 × 3
#>        u_id grid_id created_at         
#>       <int>   <int> <dttm>             
#>  1 92298565    1581 2016-04-17 22:43:06
#>  2 33908340    1461 2014-10-03 16:29:48
#>  3 92298565    1136 2014-02-07 07:26:15
#>  4 11616678    1375 2014-07-18 10:08:21
#>  5 11616678    1375 2013-11-24 23:16:24
#>  6 47727539     736 2016-06-21 15:59:49
#>  7 54875363    1572 2013-01-23 01:56:07
#>  8 40153763     759 2016-06-23 00:36:25
#>  9  9982726    1340 2015-03-12 17:00:39
#> 10 90403900     339 2014-08-04 19:10:12
#> # ℹ 16,290 more rows
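
If the timestamps in your own data are stored as character strings, they need to be converted to POSIXct before validation. The following is a minimal base R sketch, assuming a hypothetical data frame `my_data` whose column names mirror the test sample. The time zone is also an assumption and should match the one your data was recorded in.

```r
# Hypothetical raw data: timestamps stored as character strings
my_data <- data.frame(
  u_id = c(1L, 1L, 2L),
  grid_id = c(10L, 11L, 10L),
  created_at = c("2016-04-17 22:43:06", "2014-10-03 16:29:48", "2014-02-07 07:26:15"),
  stringsAsFactors = FALSE
)

# Convert to POSIXct; "UTC" is an assumed time zone for illustration
my_data$created_at <- as.POSIXct(
  my_data$created_at,
  format = "%Y-%m-%d %H:%M:%S",
  tz = "UTC"
)

# Quick manual checks: the three columns exist and the timestamp is POSIXct
required <- c("u_id", "grid_id", "created_at")
has_columns <- all(required %in% names(my_data))
is_posixct <- inherits(my_data$created_at, "POSIXct")
```

After the conversion, the data frame can be passed to validate_dataset() in the same way as the test sample.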

Identify home locations with embedded recipes

Recipe: HMLC

Weighs data points across multiple time frames to ‘score’ potentially meaningful locations for each user.

# recipe: homelocator -- HMLC
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", show_n_loc = 1, recipe = "HMLC")

Recipe: FREQ

Selects the most frequently ‘visited’ location, assuming a user is active mainly around their home location.

# recipe: Frequency -- FREQ
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", show_n_loc = 1, recipe = "FREQ")
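
To illustrate the idea behind the frequency approach, here is a minimal base R sketch on hypothetical toy data. It is not the package's implementation; it only counts data points per user and location and keeps the location with the highest count.

```r
# Toy data (hypothetical): two users with a handful of data points each
visits <- data.frame(
  u_id = c(1, 1, 1, 2, 2, 2, 2),
  grid_id = c(10, 10, 11, 20, 21, 21, 21)
)

# Count data points per user-location pair
counts <- aggregate(n ~ u_id + grid_id, data = transform(visits, n = 1L), FUN = sum)

# For each user, keep the location with the highest count
most_frequent <- do.call(
  rbind,
  lapply(split(counts, counts$u_id), function(d) d[which.max(d$n), ])
)
most_frequent
```

Note that which.max() returns the first maximum, so ties are broken arbitrarily in this sketch.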

Recipe: OSNA (Efstathiades et al. 2015)

Finds the most ‘popular’ location during ‘rest’, ‘active’ and ‘leisure’ time. Here we focus on ‘rest’ and ‘leisure’ time to find the most likely home location for each user.

# recipe: Online Social Network Activity -- OSNA
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", show_n_loc = 1, recipe = "OSNA")
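
The time-frame idea can be sketched in base R as follows. This is an illustration on hypothetical toy data, not the package's implementation, and the hour boundaries for ‘rest’, ‘active’ and ‘leisure’ are assumptions chosen for the example.

```r
# Toy data (hypothetical) for a single user
obs <- data.frame(
  u_id = 1,
  grid_id = c(10, 10, 11, 11, 11, 10),
  created_at = as.POSIXct(
    c("2016-04-17 03:10:00", "2016-04-17 21:30:00", "2016-04-18 10:00:00",
      "2016-04-18 14:20:00", "2016-04-19 09:45:00", "2016-04-19 23:05:00"),
    tz = "UTC"
  )
)

# Assumed hour boundaries, for illustration only:
# rest 02:00-07:59, active 08:00-18:59, leisure 19:00-01:59
hour <- as.integer(format(obs$created_at, "%H"))
obs$time_frame <- ifelse(
  hour >= 2 & hour < 8, "rest",
  ifelse(hour >= 8 & hour < 19, "active", "leisure")
)

# Keep only rest and leisure data points, then take the most common location
home_window <- obs[obs$time_frame %in% c("rest", "leisure"), ]
likely_home <- names(which.max(table(home_window$grid_id)))
likely_home
```

In this toy example both locations have the same total number of data points, but only location 10 appears during ‘rest’ and ‘leisure’ time, so it is selected.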

Recipe: APDM (Ahas et al. 2010)

Calculates the average and standard deviation of the start times of data points for a single user at a single location.

# recipe: Anchor Point Determining Model -- APDM
## The APDM recipe strictly returns the most likely home location.
## Load the neighbors table before using this recipe.
## example: st_queen <- function(a, b = a) st_relate(a, b, pattern = "F***T****")
##          neighbors <- st_queen(df_sf)  # then convert the result to a data frame
data("df_neighbors", package = "homelocator")
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", recipe = "APDM", keep_score = FALSE)
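
The per-user, per-location summary statistics mentioned above can be sketched in base R. This is a rough illustration on hypothetical toy data, not the package's implementation. It assumes that each timestamp counts as a start time and expresses it as a decimal hour of the day.

```r
# Toy data (hypothetical): one user, two locations
obs <- data.frame(
  u_id = 1,
  grid_id = c(10, 10, 10, 11, 11),
  created_at = as.POSIXct(
    c("2016-04-17 20:00:00", "2016-04-18 22:00:00", "2016-04-19 21:00:00",
      "2016-04-18 09:00:00", "2016-04-19 11:00:00"),
    tz = "UTC"
  )
)

# Time of day as a decimal hour (assumption: each data point is a start time)
obs$hour <- as.numeric(format(obs$created_at, "%H")) +
  as.numeric(format(obs$created_at, "%M")) / 60

# Mean and standard deviation of the time of day per user-location pair
stats <- aggregate(
  hour ~ u_id + grid_id,
  data = obs,
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)
stats <- do.call(data.frame, stats)  # flatten the matrix column into hour.mean, hour.sd
stats
```

A plain arithmetic mean treats 23:00 and 01:00 as far apart, so a real analysis of times around midnight would need circular statistics or a shifted day boundary.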