Category Archives: TSV / TAB / CSV

ASCII Delimited text

From Wikipedia

Delimiter characters such as commas and tabs can also appear inside the data itself. The ASCII and Unicode character sets address this problem by providing non-printing characters intended for use as delimiters. These are ASCII 28 to 31.

ASCII # | Unicode Name                | Common Name           | Usage
28      | INFORMATION SEPARATOR FOUR  | file separator (FS)   | End of file. Or between a concatenation of what might otherwise be separate files.
29      | INFORMATION SEPARATOR THREE | group separator (GS)  | Between sections of data. Not needed in simple data files.
30      | INFORMATION SEPARATOR TWO   | record separator (RS) | End of a record or row.
31      | INFORMATION SEPARATOR ONE   | unit separator (US)   | Between fields of a record, or members of a row.

Using ASCII 31 (unit separator) as the field separator and ASCII 30 (record separator) as the record separator solves the problem of field and record delimiters appearing inside the text data itself.
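As a rough illustration of the idea, here is a minimal Python sketch that writes and reads records using ASCII 31 between fields and ASCII 30 after each record. The helper names encode_records and decode_records are made up for this example, and it assumes the field values themselves never contain ASCII 30 or 31.

```python
# ASCII 31 (unit separator) goes between fields,
# ASCII 30 (record separator) goes after each record.
US = "\x1f"
RS = "\x1e"


def encode_records(rows):
    """Join fields with US and terminate every record with RS (hypothetical helper)."""
    return "".join(US.join(fields) + RS for fields in rows)


def decode_records(text):
    """Split on RS, drop the empty tail after the final RS, then split fields on US."""
    records = text.split(RS)
    if records and records[-1] == "":
        records.pop()
    return [record.split(US) for record in records]


# Commas, quotes, tabs and newlines inside fields need no escaping,
# because none of them is used as a delimiter.
rows = [
    ["name", "note"],
    ["Smith, John", "said \"hi\"\tthen left\nfor the day"],
]
encoded = encode_records(rows)
assert decode_records(encoded) == rows
```

The trade-off is that these characters are non-printing, so a file written this way is harder to inspect or edit by hand in a text editor or spreadsheet than a TSV or CSV file.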

C# write to TSV / Tab file

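using System.IO;

// The using block closes the file even if a write throws.
using (TextWriter writer = new StreamWriter(@"C:\test.tab"))
{
    // Each WriteLine emits one row, with fields joined by a tab character.
    writer.WriteLine(String.Join("\t", new[] { "abc", "def" }));
    writer.WriteLine(String.Join("\t", new[] { "123", "456" }));
}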

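The important thing to note is that it's "\t", not @"\t". In a verbatim string, @"\t" is a literal backslash followed by the letter t, not a tab character.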

MySQL to TSV / TAB / CSV

TSV (much easier to use in practice, since tabs rarely appear in field values and so little escaping is needed)

SELECT *
FROM pageviews
INTO OUTFILE '/tmp/pageviews.tsv'
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';

CSV (if you must)

SELECT *
FROM pageviews
INTO OUTFILE '/tmp/pageviews.csv'
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';

Note: INTO OUTFILE won’t include column names. There’s no easy way to add them from within MySQL, though it can be done from the command line (the mysql client prints a header row of column names when run in batch mode).