rustkas

rustkas

Property-Based Testing with PropEr, Erlang, and Elixir: implementation without map module restriction is better choise for CSV parser (page 148)

CSV parsing

Dear author, @ferd, I am solely out of positive motives and a desire to improve the books, I would like to suggest that you think about update the section without the help of the maps module functionality in future editions of your very informative and useful book. As shown by the working tests and the implementation that I did without using this module, but using an older ones (lists, proplists) (and as the last test convincingly showed) - the maps module is not the best and not very visual solution for this task, moreover, it has limitations in which there is no need.

%% @doc this counterexample is taken literally from the RFC and cannot
%% work with the current implementation because maps have no dupe keys
dupe_keys_unsupported_test() ->
    CSV = "field_name,field_name,field_name\r\n"
          "aaa,bbb,ccc\r\n"
          "zzz,yyy,xxx\r\n",
    [Map1, Map2] = bday_csv:decode(CSV),
    %?debugFmt("Map1 = ~p~nMap2 = ~p~n", [Map1, Map2]),
    %?debugFmt("Map2 = ~p~n",[Map2]),
    ?assertEqual(1, length(maps:keys(Map1))),
    ?assertEqual(1, length(maps:keys(Map2))),
    ?assertMatch(#{"field_name" := _}, Map1),
    ?assertMatch(#{"field_name" := _}, Map2).

See what we can get by simplifying our CSV parser implementation:

%% @doc this counterexample is taken literally from the RFC
dupe_keys_unsupported_test() ->
    CSV = "field_name,field_name,field_name\r\n"
          "aaa,bbb,ccc\r\n"
          "zzz,yyy,xxx\r\n",
    Result = bday_csv_tuple:decode(CSV),
    List = lists:flatten(Result),
    ?assertEqual(6, length(List)),
    lists:foreach(fun(Elem) -> ?assertMatch({"field_name", _}, Elem) end, List).

Link to source code

Marked As Solved

ferd

ferd

Author of Property-Based Testing with PropEr, LYSE, & Erlang in Anger

That would make the implementation and testing shorter, but do note that the chapter has chosen to use maps as a datastructure for its ease of use to the callers.

That there is a mismatch between the chosen disk format and the useful code format is one of the interesting things that come up and we have to adjust to: either change the spec, or tweak the tests. You are suggesting the former, the book went for the latter.

There is a last gotcha implicit to the implementation of our CSV parser: since it uses maps, duplicate column names are not tolerated. Since our CSV files have to be used to represent a database, it is probably a fine assumption to make about the data set that column names are all unique. All in all, we’re probably good ignoring duplicate columns and single-columns CSV files since it’s unlikely database tables would be that way either, but it’s not fully CSV compliant.

If your CSV parser now supports multiple duplicate columns, there is now a concern that the code that uses the returned lists is able to deal with the edge case of multiple keys being returned, or that a conversion step that removes (or errors on) duplicates is added and also tested. I tend to like narrowing all of this at the edge of the system (when converting from CSV to what is now safe internally).

Your approach is fine and simplifies the CSV testing (your snippets are cleaner), but you should still expect to add specific testing elsewhere in the application that tackles that mismatch between what CSV supports and what the records represented by a database would support somewhere.

Where Next?

Popular Pragmatic Bookshelf topics Top

johnp
Running the examples in chapter 5 c under pytest 5.4.1 causes an AttributeError: ‘module’ object has no attribute ‘config’. In particula...
New
brianokken
Many tasks_proj/tests directories exist in chapters 2, 3, 5 that have tests that use the custom markers smoke and get, which are not decl...
New
rmurray10127
Title: Intuitive Python: docker run… denied error (page 2) Attempted to run the docker command in both CLI and Powershell PS C:\Users\r...
New
Chrichton
Dear Sophie. I tried to do the “Authorization” exercise and have two questions: When trying to plug in an email-service, I found the ...
New
patoncrispy
I’m new to Rust and am using this book to learn more as well as to feed my interest in game dev. I’ve just finished the flappy dragon exa...
New
AndyDavis3416
@noelrappin Running the webpack dev server, I receive the following warning: ERROR in tsconfig.json TS18003: No inputs were found in c...
New
dsmith42
Hey there, I’m enjoying this book and have learned a few things alredayd. However, in Chapter 4 I believe we are meant to see the “>...
New
rainforest
Hi, I’ve got a question about the implementation of PubSub when using a Phoenix.Socket.Transport behaviour rather than channels. Before ...
New
tkhobbes
After some hassle, I was able to finally run bin/setup, now I have started the rails server but I get this error message right when I vis...
New
a.zampa
@mfazio23 I’m following the indications of the book and arriver ad chapter 10, but the app cannot be compiled due to an error in the Bas...
New

Other popular topics Top

PragmaticBookshelf
Machine learning can be intimidating, with its reliance on math and algorithms that most programmers don't encounter in their regular wor...
New
siddhant3030
I’m thinking of buying a monitor that I can rotate to use as a vertical monitor? Also, I want to know if someone is using it for program...
New
dasdom
No chair. I have a standing desk. This post was split into a dedicated thread from our thread about chairs :slight_smile:
New
New
Exadra37
I am asking for any distro that only has the bare-bones to be able to get a shell in the server and then just install the packages as we ...
New
PragmaticBookshelf
Build highly interactive applications without ever leaving Elixir, the way the experts do. Let LiveView take care of performance, scalabi...
New
DevotionGeo
The V Programming Language Simple language for building maintainable programs V is already mentioned couple of times in the forum, but I...
New
Margaret
Hello everyone! This thread is to tell you about what authors from The Pragmatic Bookshelf are writing on Medium.
1147 29994 760
New
foxtrottwist
A few weeks ago I started using Warp a terminal written in rust. Though in it’s current state of development there are a few caveats (tab...
New
xiji2646-netizen
Woke up to this today: Claude Code’s complete source code exposed via npm source map. Not a snippet. All 512,000 lines. 1,900 TypeScript ...
New

Sub Categories: