A R package to compare data frames in R. The assumption is that the
user wants the two data frames to be the same. myrror()
highlights the differences between values. When there is no difference,
the comparison is “successful”.
You can install the released version of myrror from CRAN with:
install.packages("myrror")You can install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("PIP-Technical-Team/myrror")The main function is myrror(), which goes through each
single step of the comparison:
library(myrror)
myrror(survey_data, survey_data_all, by = c('country' = "COUNTRY", "year" = "YEAR"),
interactive = FALSE)
#>
#> ── Myrror Report ───────────────────────────────────────────────────────────────
#>
#> ── General Information: ──
#>
#> dfx: survey_data with 16 rows and 6 columns.
#> dfy: survey_data_all with 12 rows and 5 columns.
#> keys dfx: country and year.
#> keys dfy: COUNTRY and YEAR.
#>
#> ── Note: comparison is done for shared columns and rows. ──
#>
#> ✔ Total shared columns (no keys): 3
#> ! Non-shared columns in survey_data: 1 ("variable3")
#> ! Non-shared columns in survey_data_all: 0 ()
#>
#> ✔ Total shared rows: 12
#> ! Non-shared rows in survey_data: 4.
#> ! Non-shared rows in survey_data_all: 0.
#>
#> ℹ Note: run `extract_diff_rows()` to extract the missing/new rows.
#>
#> ── 1. Shared Columns Class Comparison ──────────────────────────────────────────
#>
#> ! 1 shared column(s) have different class(es):
#>
#> variable class_x class_y
#> <char> <char> <char>
#> 1: variable1 numeric character
#>
#> ── 2. Shared Columns Values Comparison ─────────────────────────────────────────
#>
#> ! 1 shared column(s) have different value(s):
#> ℹ Note: character-numeric comparison is allowed.
#>
#> ── Overview: ──
#>
#> # A tibble: 1 × 4
#> variable change_in_value na_to_value value_to_na
#> <fct> <int> <int> <int>
#> 1 variable2 12 0 0
#>
#> ── Value comparison: ──
#>
#> ! 1 shared column(s) have different value(s):
#> ℹ Note: Only first 5 rows shown for each variable.
#>
#> ── "variable2"
#> diff indexes country year variable2.x variable2.y
#> <char> <char> <char> <int> <num> <num>
#> 1: change_in_value 5 A 2014 -1.0678237 0.9222675
#> 2: change_in_value 6 A 2015 -0.2179749 2.0500847
#> 3: change_in_value 7 A 2016 -1.0260044 -0.4910312
#> 4: change_in_value 8 A 2017 -0.7288912 -2.3091689
#> 5: change_in_value 9 B 2010 -0.6250393 1.0057385
#> ...
#>
#> ℹ Note: run `extract_diff_values()` or `extract_diff_table()` to access the results in list or table format.
#>
#> ✔ End of Myrror Report.The auxiliary functions go through a specific step of the comparison, and can be used independently:
compare_type(): compares the type of shared
columns.
compare_values(): compares the values of shared
columns.
extract_diff_values(): extract the values that are
different between two data frames, returns a list of data frames with
the differences, one for each variable.
extract_diff_table(): extract the values that are
different between two data frames, returns a data.table with all
differences.
See more in the Get started vignette.