Lines Matching refs:that
9 to find out whether the format of that file is plain text. Although
23 of false negatives is sometimes too high, which means that the recall
25 the false positives that may occur when binary files containing large
28 In this article we propose a new, simple detection scheme that features
48 If a file contains at least one byte that belongs to the white list and
49 no byte that belongs to the black list, then the file is categorized as
59 The first observation is that, although the full range of 7-bit codes
63 10 (LF) and 13 (CR). There are a few more control codes that are
70 The second observation is that most of the binary files tend to contain
74 labeled as textual, because the files that are genuinely binary tend to
86 There is an extra category of plain text files that are "polluted" with
88 considerations. In such cases, a scheme that tolerates a small fraction
91 false positives are more likely to appear in binary files that contain
95 Under this premise, it is safe to say that our detection method provides
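
The rule quoted at lines 48-49 (at least one white-list byte, no black-list byte) can be sketched as below. This is only an illustration under stated assumptions, not the article's own code. The matched lines do not define either list, so the byte sets here are assumed; the excerpt itself confirms only that 10 (LF) and 13 (CR) count as textual. The names WHITE, BLACK and looks_like_text are hypothetical, and the matched line breaks off before naming the resulting category, so it is taken here to be plain text.

```python
# Assumed byte classes. Only 10 (LF) and 13 (CR) are confirmed by the excerpt;
# every other member below is an assumption made for illustration.
WHITE = frozenset({9, 10, 13}) | frozenset(range(32, 256))

# Assumed black list: control codes other than the white ones, minus a few
# tolerated codes (7, 8, 11, 12, 26, 27) that are treated as neither white
# nor black in this sketch.
BLACK = (frozenset(range(0, 7)) | frozenset(range(14, 32))) - {26, 27}


def looks_like_text(data: bytes) -> bool:
    """Return True if data has at least one white byte and no black byte."""
    seen = set(data)  # distinct byte values (ints 0..255) present in data
    return bool(seen & WHITE) and not (seen & BLACK)
```

Under these assumed sets, an empty input or an input made only of tolerated control codes is not reported as text, because the rule requires at least one white-list byte.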