rustkas

Property-Based Testing with PropEr, Erlang, and Elixir: an implementation without the maps module restriction is a better choice for the CSV parser (page 148)

CSV parsing

Dear author, @ferd, purely out of goodwill and a desire to improve the book, I would like to suggest that in future editions of your very informative and useful book you consider rewriting this section without relying on the maps module. As the working tests and my implementation show (it uses the older lists and proplists modules instead, and the last test demonstrates this convincingly), the maps module is neither the best nor the clearest solution for this task; moreover, it imposes a restriction that is not needed.

%% @doc this counterexample is taken literally from the RFC and cannot
%% work with the current implementation because maps have no dupe keys
dupe_keys_unsupported_test() ->
    CSV = "field_name,field_name,field_name\r\n"
          "aaa,bbb,ccc\r\n"
          "zzz,yyy,xxx\r\n",
    [Map1, Map2] = bday_csv:decode(CSV),
    %?debugFmt("Map1 = ~p~nMap2 = ~p~n", [Map1, Map2]),
    %?debugFmt("Map2 = ~p~n",[Map2]),
    ?assertEqual(1, length(maps:keys(Map1))),
    ?assertEqual(1, length(maps:keys(Map2))),
    ?assertMatch(#{"field_name" := _}, Map1),
    ?assertMatch(#{"field_name" := _}, Map2).

See what we can get by simplifying our CSV parser implementation:

%% @doc this counterexample is taken literally from the RFC
dupe_keys_unsupported_test() ->
    CSV = "field_name,field_name,field_name\r\n"
          "aaa,bbb,ccc\r\n"
          "zzz,yyy,xxx\r\n",
    Result = bday_csv_tuple:decode(CSV),
    List = lists:flatten(Result),
    ?assertEqual(6, length(List)),
    lists:foreach(fun(Elem) -> ?assertMatch({"field_name", _}, Elem) end, List).

Link to source code

Marked As Solved

ferd

Author of Property-Based Testing with PropEr, LYSE, & Erlang in Anger

That would make the implementation and testing shorter, but do note that the chapter chose maps as its data structure because they are easy for callers to use.

That there is a mismatch between the chosen disk format and the useful code format is one of the interesting things that come up and we have to adjust to: either change the spec, or tweak the tests. You are suggesting the former, the book went for the latter.

There is a last gotcha implicit to the implementation of our CSV parser: since it uses maps, duplicate column names are not tolerated. Since our CSV files have to be used to represent a database, it is probably a fine assumption to make about the data set that column names are all unique. All in all, we’re probably good ignoring duplicate columns and single-columns CSV files since it’s unlikely database tables would be that way either, but it’s not fully CSV compliant.
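To make the duplicate-column limitation concrete, here is a minimal sketch of how a row collapses when it is turned into a map. The module name `dupe_demo` and the function `row_to_map/2` are hypothetical and are not taken from the book's `bday_csv` module; the sketch only assumes that a row is built by pairing header names with values and passing the pairs to `maps:from_list/1`.

```erlang
-module(dupe_demo).
-export([row_to_map/2]).

%% Hypothetical helper, not the book's code: pair each header name with
%% the value in the same column, then build a map from the pairs.
%% maps:from_list/1 keeps only the right-most value for a repeated key,
%% so duplicate column names silently collapse into a single entry.
row_to_map(Header, Row) ->
    maps:from_list(lists:zip(Header, Row)).
```

With the RFC-style header from the test above, `dupe_demo:row_to_map(["field_name","field_name","field_name"], ["aaa","bbb","ccc"])` yields a map with the single key `"field_name"`, and the values `"aaa"` and `"bbb"` are lost.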

If your CSV parser now supports multiple duplicate columns, there is now a concern that the code that uses the returned lists is able to deal with the edge case of multiple keys being returned, or that a conversion step that removes (or errors on) duplicates is added and also tested. I tend to like narrowing all of this at the edge of the system (when converting from CSV to what is now safe internally).
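A conversion step of that kind could look roughly like the sketch below. It assumes that the tuple-based parser returns each row as a list of `{Key, Value}` tuples; the module name `csv_edge`, the function `to_map/1`, and the shape of the error tuple are all hypothetical choices made for illustration.

```erlang
-module(csv_edge).
-export([to_map/1]).

%% Hypothetical edge-of-system conversion: turn one decoded row, given
%% as a list of {Key, Value} tuples, into a map, and refuse rows that
%% contain duplicate keys instead of silently dropping values.
-spec to_map([{term(), term()}]) ->
    {ok, map()} | {error, {duplicate_keys, [term()]}}.
to_map(Pairs) ->
    Keys = [K || {K, _} <- Pairs],
    %% Removing one occurrence of every distinct key leaves only the
    %% extra (duplicated) occurrences; usort collapses them to a set.
    case lists:usort(Keys -- lists:usort(Keys)) of
        [] -> {ok, maps:from_list(Pairs)};
        Dupes -> {error, {duplicate_keys, Dupes}}
    end.
```

Returning an error is only one option; a variant that keeps the first or last value would also work, as long as the chosen behaviour is tested where the conversion happens.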

Your approach is fine and simplifies the CSV testing (your snippets are cleaner), but you should still expect to add specific tests elsewhere in the application to cover the mismatch between what CSV supports and what the records in a database would support.
