rustkas

Property-Based Testing with PropEr, Erlang, and Elixir: an implementation without the maps module restriction is a better choice for the CSV parser (page 148)

CSV parsing

Dear author, @ferd: purely out of goodwill and a desire to improve the book, I would like to suggest that in future editions of your very informative and useful book you consider rewriting this section without the maps module. As the working tests and my implementation show, using older modules (lists, proplists) instead, and as the last test below shows most convincingly, maps are neither the best nor the clearest solution for this task. Moreover, they impose a limitation (no duplicate keys) that is unnecessary here.

%% @doc this counterexample is taken literally from the RFC and cannot
%% work with the current implementation because maps have no dupe keys
dupe_keys_unsupported_test() ->
    CSV = "field_name,field_name,field_name\r\n"
          "aaa,bbb,ccc\r\n"
          "zzz,yyy,xxx\r\n",
    [Map1, Map2] = bday_csv:decode(CSV),
    %?debugFmt("Map1 = ~p~nMap2 = ~p~n", [Map1, Map2]),
    %?debugFmt("Map2 = ~p~n",[Map2]),
    ?assertEqual(1, length(maps:keys(Map1))),
    ?assertEqual(1, length(maps:keys(Map2))),
    ?assertMatch(#{"field_name" := _}, Map1),
    ?assertMatch(#{"field_name" := _}, Map2).

See what we can get by simplifying our CSV parser implementation:

%% @doc this counterexample is taken literally from the RFC
dupe_keys_unsupported_test() ->
    CSV = "field_name,field_name,field_name\r\n"
          "aaa,bbb,ccc\r\n"
          "zzz,yyy,xxx\r\n",
    Result = bday_csv_tuple:decode(CSV),
    List = lists:flatten(Result),
    ?assertEqual(6, length(List)),
    lists:foreach(fun(Elem) -> ?assertMatch({"field_name", _}, Elem) end, List).

Link to source code

Marked As Solved

ferd

Author of Property-Based Testing with PropEr, LYSE, & Erlang in Anger

That would make the implementation and testing shorter, but do note that the chapter chose maps as its data structure for their ease of use to callers.

That there is a mismatch between the chosen disk format and the useful code format is one of the interesting things that come up, and we have to adjust to it: either change the spec, or tweak the tests. You are suggesting the former; the book went for the latter.

There is a last gotcha implicit to the implementation of our CSV parser: since it uses maps, duplicate column names are not tolerated. Since our CSV files have to be used to represent a database, it is probably a fine assumption to make about the data set that column names are all unique. All in all, we’re probably good ignoring duplicate columns and single-columns CSV files since it’s unlikely database tables would be that way either, but it’s not fully CSV compliant.

If your CSV parser now supports duplicate columns, there is a new concern: either the code that uses the returned lists must be able to deal with the edge case of multiple keys being returned, or a conversion step that removes (or errors on) duplicates must be added and also tested. I tend to like narrowing all of this at the edge of the system (when converting from CSV to what is now safe internally).
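A conversion step of that kind could look roughly like the sketch below. It assumes each decoded row is a list of `{Key, Value}` tuples, which is what the `bday_csv_tuple` test above suggests; the module name `row_convert` and the function `to_map/1` are hypothetical and come from neither the book nor the linked source.

```erlang
%% Hypothetical helper, not part of the book's code or of bday_csv_tuple.
-module(row_convert).
-export([to_map/1]).

%% Converts one decoded row, given as a list of {Key, Value} pairs,
%% into a map. Rows with repeated keys are rejected instead of being
%% silently collapsed into a single map entry.
-spec to_map([{term(), term()}]) ->
          {ok, map()} | {error, {duplicate_keys, [term()]}}.
to_map(Pairs) ->
    Keys = [K || {K, _} <- Pairs],
    %% Removing one occurrence of every distinct key leaves only the
    %% extra occurrences, i.e. the duplicated keys.
    case Keys -- lists:usort(Keys) of
        [] ->
            {ok, maps:from_list(Pairs)};
        Dupes ->
            {error, {duplicate_keys, lists:usort(Dupes)}}
    end.
```

Calling `row_convert:to_map/1` on every row at the boundary keeps the duplicate-tolerant list format inside the parser, while the rest of the application only ever sees maps or an explicit error. Whether to error out or to drop duplicates is a design choice; this sketch errors.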

Your approach is fine and simplifies the CSV testing (your snippets are cleaner), but you should still expect to add specific testing elsewhere in the application to tackle the mismatch between what CSV supports and what the records represented by a database would support.
