Methods Hub (beta)

rang

Abstract:

Resolve the dependency graph of R packages at a specific time point in order to reconstruct the R computational environment

Type: Method
License: GNU General Public License v3.0 only
Programming Language: R

Description

Resolve the dependency graph of R packages at a specific time point based on the information from various ‘R-hub’ web services https://blog.r-hub.io/. The dependency graph can then be used to reconstruct the R computational environment with ‘Rocker’ https://rocker-project.org.

Keywords

  • Computational Environment
  • Computational Reproducibility
  • Open Science

Use Cases

This package is designed to retrospectively construct a fixed computational environment for running shared R scripts for which the computational environment was not specified. Additional functions are provided for creating executable research compendia.

Input Data

The main function resolve() accepts various input data. One example is a path to a directory of R scripts.
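As a minimal sketch of the directory input, the snippet below writes a small script into a temporary directory and wraps resolve() in a helper. The directory name, the script contents, and the helper resolve_scripts() are hypothetical. The call itself needs the rang package and network access to the R-hub services, so it is left commented out.

```r
# Hypothetical directory holding shared R scripts (created here for illustration).
script_dir <- file.path(tempdir(), "shared_scripts")
dir.create(script_dir, showWarnings = FALSE)
writeLines('library("quanteda")', file.path(script_dir, "analysis.R"))

# Hypothetical helper: pass the directory path to resolve() as `pkgs`.
# Assumes rang is installed; calling it queries the R-hub web services.
resolve_scripts <- function(dir, snapshot_date, os = "ubuntu-18.04") {
  rang::resolve(pkgs = dir, snapshot_date = snapshot_date, os = os)
}

# Uncomment to run (requires network access):
# graph <- resolve_scripts(script_dir, snapshot_date = "2018-10-06")

# Show the scripts that would be scanned.
list.files(script_dir)
```

The snapshot date should be chosen close to when the scripts were written, since it determines which package versions are resolved.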

Output Data

The main function resolve() returns an S3 object representing the dependency graph. Please refer to the How to Use section for an example.

Hardware Requirements

rang runs on any hardware that can run R.

Environment Setup

With R installed:

install.packages("rang")

Installation of Docker or Singularity is strongly recommended.
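Whether a container runtime is available can be checked from within R before going further. The following is a small base R sketch; the helper name find_runtimes() is hypothetical, and it only looks for executables named docker and singularity on the PATH.

```r
# Hypothetical helper: report which container runtimes are visible on the PATH.
find_runtimes <- function(candidates = c("docker", "singularity")) {
  # Sys.which() returns "" for executables that cannot be found.
  paths <- unname(Sys.which(candidates))
  data.frame(
    runtime = candidates,
    found = nzchar(paths),
    path = paths,
    stringsAsFactors = FALSE
  )
}

print(find_runtimes())
```

A runtime that is installed but not on the PATH of the R session will be reported as not found.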

How to Use

Suppose you would like to run the following code snippet from the 2018 paper on quanteda (an R package for text analysis).

library("quanteda")
# construct the feature co-occurrence matrix
examplefcm <-
  tokens(data_corpus_irishbudget2010, remove_punct = TRUE) %>%
  tokens_tolower() %>%
  tokens_remove(stopwords("english"), padding = FALSE) %>%
  fcm(context = "window", window = 5, tri = FALSE)
# choose the 30 most frequent features
topfeats <- names(topfeatures(examplefcm, 30))
# select the top 30 features only, plot the network
set.seed(100)
textplot_network(fcm_select(examplefcm, topfeats), min_freq = 0.8)

This code cannot be executed with a recent version of quanteda. Because it was written in 2018, one can instead resolve the dependency graph of quanteda as it stood in 2018:

library(rang)
graph <- resolve(pkgs = "quanteda",
                 snapshot_date = "2018-10-06",
                 os = "ubuntu-18.04")
graph
#> resolved: 1 package(s). Unresolved package(s): 0
#> $`cran::quanteda`
#> The latest version of `quanteda` [cran] at 2018-10-06 was 1.3.4, which has 55 unique dependencies (29 with no dependencies.)
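Beyond the printed summary, the returned S3 object can be inspected like any R list. The sketch below defines a hypothetical helper, summarize_graph(). The element names it reads (snapshot_date, r_version, os, ranglets) are assumptions about the object's internals and should be confirmed with names(graph) before being relied on.

```r
# Hypothetical helper summarizing a resolved dependency graph.
# The element names used here are assumptions; verify them with names(graph).
summarize_graph <- function(graph) {
  list(
    class = class(graph),
    snapshot_date = graph$snapshot_date,
    r_version = graph$r_version,
    os = graph$os,
    n_resolved = length(graph$ranglets)
  )
}

# Only runs when a `graph` object from resolve() exists in the current environment.
if (exists("graph", inherits = FALSE)) {
  str(summarize_graph(graph))
}
```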

This dependency graph can be used to create a dockerized computational environment (in the form of a Dockerfile) for running the code above. Suppose one would like to generate the Dockerfile in the directory “quanteda_docker”.

dockerize(graph, "quanteda_docker", method = "evercran")

A Docker container can then be built and launched, e.g. from the shell:

cd quanteda_docker
docker build -t rang .
docker run --rm --name "rangtest" -ti rang

The launched container is based on R 3.5.1 and quanteda 1.3.4 and is able to run the code snippet above.
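Once inside the container, the versions can be confirmed from the R prompt. The following base R sketch uses a hypothetical helper, check_environment(), whose defaults are the versions stated above. It also runs outside the container, where it simply reports a mismatch.

```r
# Hypothetical helper: compare the running session against expected versions.
check_environment <- function(expected_r = "3.5.1",
                              pkg = "quanteda",
                              expected_pkg = "1.3.4") {
  pkg_installed <- requireNamespace(pkg, quietly = TRUE)
  pkg_version <- if (pkg_installed) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
  list(
    r_version = as.character(getRversion()),
    r_matches = getRversion() == expected_r,
    pkg_version = pkg_version,
    # FALSE when the package is missing or its version differs.
    pkg_matches = pkg_installed && pkg_version == expected_pkg
  )
}

str(check_environment())
```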

Please refer to the official website for further information.

Technical Details

See the publication for information about technical details.

References

Chan, C. H., & Schoch, D. (2023). rang: Reconstructing reproducible R computational environments. PLoS ONE, 18(6): e0286761. https://doi.org/10.1371/journal.pone.0286761.

Contact Details

Maintainer: Chung-hong Chan

Issue Tracker: https://github.com/gesistsa/rang/issues
