How to read a file with fread when the delimiter is a space and missing values are blank?

I have a space-delimited file in which some columns are blank, so some rows contain consecutive spaces. fread fails with an error, but read.table works fine. See the example:

library(data.table)
# R version 3.4.2 (2017-09-28)
# data.table_1.10.4-3

fread("A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE)
# Error in fread("A B C D\n1 2  3\n4 5 6 7") : 
#   Expected sep (' ') but new line, EOF (or other non printing character) ends field 2 when detecting types from point 0: 1 2  3

read.table(text ="A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE)
#   A B  C D
# 1 1 2 NA 3
# 2 4 5  6 7

How do we read this using fread? I tried setting sep = " " and na.strings = "", but that didn't help.

1 answer

  • answered 2018-01-11 20:20 zx8754

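    In fread, strip.white is TRUE by default, meaning leading and trailing spaces are stripped from each field. That is useful for reading fixed-width files or files with an irregular number of spaces as the separator. Here, though, it presumably swallows the empty field between the two consecutive spaces, so the row comes up one field short.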

    In read.table, by contrast, strip.white defaults to FALSE.

    fread("A B C D
    1 2  3
    4 5 6 7", sep = " ", header = TRUE, strip.white = FALSE)
    #    A B  C D
    # 1: 1 2 NA 3
    # 2: 4 5  6 7
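
    If strip.white = FALSE does not give this result in your setup (I only checked it on the data.table version mentioned in the question), a fallback is to go through read.table, which already handles the blank field, and convert the result. This is a minimal sketch, not something fread does for you; the names lines, df and dt are just placeholders for illustration.

    ```r
    library(data.table)

    # Same three lines as in the question; the second row has an empty third field.
    lines <- c("A B C D", "1 2  3", "4 5 6 7")

    # With sep = " ", read.table() treats each single space as a separator,
    # so the two consecutive spaces delimit a blank field, which becomes NA.
    df <- read.table(text = lines, sep = " ", header = TRUE)

    # Convert the data.frame to a data.table (as.data.table() makes a copy;
    # setDT(df) would convert in place instead).
    dt <- as.data.table(df)
    ```

    This gives up fread's speed, so it is probably only worth it for small files.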
    

    Note: I'm providing a self-answer because I couldn't find a relevant post, and this has tripped me up more than once.