E Amazings
  • Home
  • Automotive
  • Business
  • CBD
  • Crypto
  • Education
  • Entertainment
  • Fashion
  • Finance
  • Health
  • Home Improvement
  • Law \ Legal
  • News
  • Shopping
  • Sports
  • Technology
  • Travel
  • Need Help?

Subscribe to Updates

Get the latest creative news from FooBar about art, design and business.

What's Hot

What Closing Costs Do Home Buyers Have?

February 25, 2023

What Is Realtek HD Audio Manager

February 2, 2023

A Basic Guide To Cell Tower Leasing

February 2, 2023
Facebook Twitter Instagram
E Amazings
  • Home
  • Automotive
  • Business
  • CBD
  • Crypto
  • Education
  • Entertainment
  • Fashion
  • Finance
  • Health
  • Home Improvement
  • Law \ Legal
  • News
  • Shopping
  • Sports
  • Technology
  • Travel
  • Need Help?
Facebook Twitter Instagram
E Amazings
You are at:Home»Technology»Linux Fu: Miller The Killer Makes CSV No Pest
Technology

Linux Fu: Miller The Killer Makes CSV No Pest

By December 28, 2022No Comments4 Mins Read
Facebook Twitter Pinterest LinkedIn Tumblr Email
Share
Facebook Twitter Pinterest WhatsApp Email

[ad_1]

Historically, one of the nice things about Unix and Linux is that everything is a file, and files are just sequences of characters. Of course, modern practice is that everything is not a file, and there is a proliferation of files with some imposed structure. However, if you’ve ever worked on old systems where your file access was by the block, you’ll appreciate the Unix-like files. Classic tools like awk, sed, and grep work with this idea. Files are just characters. But this sometimes has its problems. That’s the motivation behind a tool called Miller, and I think it deserves more attention because, for certain tasks, it is a lifesaver.

The Problem

Consider trying to process a comma-delimited file, known as a CSV file. There are a lot of variations to this type of file. Here’s one that defines two “columns.” I’ve deliberately used different line formats as a test, but most often, you get one format for the entire file:

Slot,String 
A,"Hello" 
"B",Howdy 
"C","Hello Hackaday" 
"D","""Madam, I'm Adam,"" he said." 
E 100,With some spaces!
X,"With a comma, or two, even"


The first column, Slot, has items A, B, C, D, and E 100. Note that some of the items are quoted, but others are not. In any event, the column content is B not “B” because the quotes are not part of the data.

The second column, String, has a mix of quotes, no quotes, spaces, and even commas inside quotes. Suppose you wanted to process this with awk. You can do it, but it is painful. Notice the quotes are escaped using double quotes, as is the custom in CSV files. Writing a regular expression to break that up is not impossible but painful. That’s where Miller comes in. It knows about data formats like CSV, JSON, KDVP8, and a few others. It can also output in those formats and others like Markdown, for example.

Simple Example Runs

Because it knows about the format, it can process the file handily:

$ mlr –icsv cat miller.in
Slot=A,String=Hello
Slot=B,String=Howdy
Slot=C,String=Hello Hackaday
Slot=D,String=”Madam, I’m Adam,” he said.
Slot=E 100,String=With some spaces!
Slot=X,String=With a comma, or two, even

Notice there is no command called “miller.” The command name is “mlr.” This output wouldn’t be a bad format to further process with awk, but we don’t have to. Miller can probably do everything we need. Before we look at that, though, consider what would happen if you just wanted a pretty format output:

Not too bad! Don’t forget, the tool would do the same trick with JSON and other formats, too.

So Many Options

The number of options can be daunting. There are options to pass or ignore comments, process compressed data, or customize the input or output file format a bit.

But the real power to Miller is the verbs. In the above example, the verb was cat. These are mostly named after the Linux commands they duplicate. For example, cut will remove certain fields from the data. The grep, head, and tail commands all do what you expect.

There are many new verbs, too. Count will give you a count of how much data has gone by and filter is a better version of grep. You can do database-like joins, sorting, and even statistics and generate text-based bar graphs.

The filter and put commands have an entire programming language at their disposal that has all the things you’d expect to find in a language like awk or Perl.

What’s nice is that when you want to remove a field or sort, you can refer to it by name (like “Slot”), and Miller will know what you mean. There is a way to refer to fields with numbers if you must, but that’s a rare thing in a Miller script.

For example, if you have some data with fields “stock” and “reserve” that you want to get rid of, you could write something like this:

mlr --icsv --opprint cut -f stock,reserve inventory.csv

Or, perhaps you want to select lines where stock is “N”:

mlr --icsv --opprint filter '$stock == "N"' inventory.csv

Go Read

There’s simply not enough room to cover all the features of this powerful program. I’d suggest you check out Miller in 10 Minutes which is part of the official documentation. You’ll still need to read the documentation further, but at least you’ll have a good start.

Don’t get me wrong, we still like awk. With a little work, you can make it do almost anything. But if you can do less work with Miller, why not?

[ad_2]

Source link

Related Posts

What Is Realtek HD Audio Manager

By Corbin BowenFebruary 2, 2023

A Basic Guide To Cell Tower Leasing

By Corbin BowenFebruary 2, 2023

The Flight Of The Dremel

By January 5, 2023

A White-Light Laser, On The Cheap

By January 5, 2023
Add A Comment

Comments are closed.

Our Picks

What Closing Costs Do Home Buyers Have?

By Corbin BowenFebruary 25, 2023

What Is Realtek HD Audio Manager

By Corbin BowenFebruary 2, 2023

A Basic Guide To Cell Tower Leasing

By Corbin BowenFebruary 2, 2023
Recent Posts
  • What Closing Costs Do Home Buyers Have? February 25, 2023
  • What Is Realtek HD Audio Manager February 2, 2023
  • A Basic Guide To Cell Tower Leasing February 2, 2023
  • Air Duct Repair 101: Everything You Need To Know February 2, 2023
  • Advantage LIC? How Budget Insurance Amendment Bill may benefit the PSU insurance giant January 5, 2023
  • The Flight Of The Dremel January 5, 2023
  • LIC offering multiple benefits on premium payment with co-branded credit cards with Axis Bank: Check features, offer January 5, 2023
Archives
  • February 2023
  • January 2023
  • December 2022
  • November 2022
  • October 2022
  • September 2022
  • August 2022
  • July 2022
  • June 2022
  • September 2021
Facebook Twitter Instagram Pinterest TikTok
© 2022 E Amazings - All Rights Reserved.

Type above and press Enter to search. Press Esc to cancel.