Announcing tsvutils


I think I have mentioned before how much I like Tab Separated Value (TSV) files. They are simple to write and read, which makes them easy to manipulate using standard Unix tools like awk. Here is an example TSV file:


Time (s)	Temperature	Distance (m)
0.0	29	0
0.5	29.5	0.8
2.2	31	3
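
To illustrate how little awk such a file needs, here is a minimal sketch that averages the second column (Temperature in the example above). The function name mean_col2 is hypothetical and not part of tsvutils.

```shell
# Hypothetical helper: print the mean of the second TSV column read from
# standard input, skipping the header line.
mean_col2() {
    awk -F '\t' '
        NR > 1 { sum += $2; n++ }                      # data rows only
        END    { if (n) printf "%.2f\n", sum / n }     # no output for empty input
    '
}
```

Running mean_col2 < file.tsv on the example file prints 29.83.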

Over the last few months my collection of TSV-related scripts grew. I thought I could write a few more and create a collection of Unix-like utilities for TSV files. The scripts try to mimic Unix utilities in behavior while taking into account the structure of TSV files. For example, tsvtail always prints the TSV header and doesn't count it in the number of lines to print, and tsvcut allows selecting columns by name instead of by index.
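
The header-aware behavior described above can be sketched in a few lines of POSIX shell and awk. This is only an illustration under simple assumptions (the whole input fits in memory, the first line is the header); it is not the actual tsvtail implementation, and the name tsv_tail_sketch is made up for the example.

```shell
# Hypothetical sketch: print the header plus the last N data rows of a TSV
# stream. N defaults to 10, like tail.
tsv_tail_sketch() {
    n="${1:-10}"
    awk -v n="$n" '
        NR == 1 { print; next }        # the header always passes through
        { rows[NR - 1] = $0 }          # buffer data rows, numbered from 1
        END {
            total = NR - 1             # number of data rows (header excluded)
            start = total - n + 1
            if (start < 1) start = 1
            for (i = start; i <= total; i++) print rows[i]
        }
    '
}
```

With the example file, tsv_tail_sketch 2 < file.tsv prints the header followed by the rows for 0.5 and 2.2 seconds.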


The utilities read data from standard input and write to standard output, so you can compose them into pipelines. They are all documented via man pages for quick, offline access to documentation. Most of the utilities are POSIX shell scripts, with the exception of the Comma Separated Value (CSV) converters, where I used Python's csv module, and tsvplot, which uses gnuplot. I have a few more utilities planned, so the list will grow.



Usage examples


Plot the data in file.tsv using the first column as the x-axis data and all other columns as y-axis data and save the result to plot.png.

tsvplot file.tsv > plot.png

Convert the data in file.csv to TSV, keep the columns whose names match one of the extended regular expressions Time and Distance and then plot the data.

csv2tsv file.csv | tsvcut Time Distance | tsvplot

Keep the columns of file.tsv whose names contain (m), format them as an HTML table, and display it in the lynx browser. The parentheses have to be backslash-escaped because they are special characters in extended regular expressions.

tsvcut '\(m\)' < file.tsv | tsv2html | lynx -stdin
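
Selecting columns by matching their names against an extended regular expression can be sketched with awk, whose regular expressions are also extended. This is a rough illustration that handles a single pattern, not the real tsvcut; the name tsv_cut_sketch and the PAT environment variable are assumptions made for the example. The pattern is passed through the environment because awk's -v option would interpret the backslashes as escape sequences.

```shell
# Hypothetical sketch: keep only the columns whose header matches the
# extended regular expression given as the first argument.
tsv_cut_sketch() {
    PAT="$1" awk -F '\t' '
        NR == 1 {
            # Remember the indices of the matching header fields.
            for (i = 1; i <= NF; i++)
                if ($i ~ ENVIRON["PAT"]) keep[++k] = i
        }
        {
            line = ""
            for (j = 1; j <= k; j++) {
                if (j > 1) line = line "\t"
                line = line $(keep[j])
            }
            print line
        }
    '
}
```

With the example file, tsv_cut_sketch '\(m\)' < file.tsv keeps only the Distance (m) column.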

Sort the rows of file.tsv in descending order based on the values of the column whose name is Temperature, keep the first 5 and save them as a Markdown table in top_5.md.

tsvsort -r '^Temperature$' < file.tsv | tsvhead -n 5 | tsv2md > top_5.md
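
The final stage of the pipeline above turns TSV into a Markdown table. As a rough idea of what such a conversion involves, here is a minimal awk sketch. It is not the real tsv2md, the name tsv2md_sketch is made up, and it assumes that no cell contains a | character.

```shell
# Hypothetical sketch: convert TSV on standard input to a Markdown table.
tsv2md_sketch() {
    awk -F '\t' '
        {
            line = "|"
            for (i = 1; i <= NF; i++) line = line " " $i " |"
            print line
        }
        NR == 1 {
            # Markdown needs a separator row right after the header.
            sep = "|"
            for (i = 1; i <= NF; i++) sep = sep " --- |"
            print sep
        }
    '
}
```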

I hope you enjoy using tsvutils as much as I do!


Sotiris 2022-07-07


tsvutils project page
