Massimo Caliman

Categories

  • R

Tags

  • code
  • data-analysis
  • en
  • r

In R, data is read through connection interfaces. A connection can point to a file or a URL, for example. The main functions for creating one are:

  • file, opens a connection to a file
  • gzfile, opens a connection to a gzipped file
  • bzfile, opens a connection to a file compressed with bzip2
  • url, opens a connection to a web page

To connect to a file, use the file function:

> str(file)
function (description = "", open = "", blocking = TRUE,
    encoding = getOption("encoding"))

Here description is the name of the file, and open is a code indicating the mode in which the connection is opened:

  • “r” read only
  • “w” write (and initialization of a new file)
  • “a” append
  • “rb”, “wb”, “ab” read, write, or append in binary mode (relevant on Windows)
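As a minimal sketch of how these mode codes behave, the snippet below writes, appends, and then reads a small text file. The temporary path and the two strings are assumptions made up for illustration; they do not come from the original notes.

```r
# Hypothetical temporary file, used only for this illustration
path <- tempfile(fileext = ".txt")

con <- file(path, "w")      # "w": create (or truncate) the file and write
writeLines("first", con)
close(con)

con <- file(path, "a")      # "a": append to the existing content
writeLines("second", con)
close(con)

con <- file(path, "r")      # "r": read only
contents <- readLines(con)
close(con)

print(contents)             # the "w" line followed by the "a" line
```

Opening the file again with “w” instead of “a” would discard the first line, which is why the append mode exists.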

In general, connections are powerful tools for navigating files and other external data sources. In practice, we often do not need to interact with the connection interface directly. For example,

con <- file('foo.txt', 'r')
data <- read.csv(con)
close(con)

is equivalent to

data <- read.csv("foo.txt")
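To check this equivalence without depending on an existing foo.txt, here is a small sketch that first creates a CSV file and then reads it both ways. The temporary path and the column names id and value are hypothetical, chosen only for the example.

```r
# Hypothetical temporary CSV file with made-up columns
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, value = c(2.5, 3.5, 4.5)),
          path, row.names = FALSE)

# Explicit connection: open, read, close
con <- file(path, "r")
via_connection <- read.csv(con)
close(con)

# Shortcut: read.csv handles the connection internally
via_path <- read.csv(path)

# Compare the two data frames
same <- isTRUE(all.equal(via_connection, via_path))
print(same)
```

The explicit form is mostly worth the extra lines when the connection needs a specific mode or has to be reused across several reads.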

A text file can also be read line by line. For example, to read the first 10 lines of a gzipped file:

con <- gzfile("example.gz")
x <- readLines(con, 10)  # read at most the first 10 lines
close(con)               # release the connection when done
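One case where holding the connection yourself pays off is reading in chunks: when the connection has been opened explicitly, successive readLines calls continue from where the previous one stopped. The sketch below assumes a small gzipped file created on the spot; the path and the five sample lines are invented for illustration.

```r
# Hypothetical gzipped file with five made-up lines
path <- tempfile(fileext = ".gz")
con <- gzfile(path, "w")
writeLines(paste("line", 1:5), con)
close(con)

# Open explicitly in read mode so the position is kept between calls
con <- gzfile(path, "r")
first_two <- readLines(con, 2)  # reads lines 1 and 2
rest <- readLines(con)          # continues with the remaining lines
close(con)

print(first_two)
print(rest)
```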

writeLines takes a character vector and writes each element as one line of a text file. readLines can also be useful for reading a web page line by line:

con <- url("http://google.com", "r")
x <- readLines(con)  # one element per line of the page
close(con)           # release the connection when done
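For the writeLines side, a rough sketch of the round trip could look like the following. The temporary path and the fruits vector are assumptions for the example only; readLines is called with a file name here, which it accepts in place of a connection.

```r
# Hypothetical temporary file and sample character vector
path <- tempfile(fileext = ".txt")
fruits <- c("apple", "banana", "cherry")

con <- file(path, "w")
writeLines(fruits, con)       # each element becomes one line
close(con)

read_back <- readLines(path)  # a file name works as well as a connection
round_trip_ok <- identical(read_back, fruits)
print(round_trip_ok)
```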