rustkas

rustkas

Property-Based Testing with PropEr, Erlang, and Elixir: implementation without map module restriction is better choise for CSV parser (page 148)

CSV parsing

Dear author, @ferd, I am solely out of positive motives and a desire to improve the books, I would like to suggest that you think about update the section without the help of the maps module functionality in future editions of your very informative and useful book. As shown by the working tests and the implementation that I did without using this module, but using an older ones (lists, proplists) (and as the last test convincingly showed) - the maps module is not the best and not very visual solution for this task, moreover, it has limitations in which there is no need.

%% @doc this counterexample is taken literally from the RFC and cannot
%% work with the current implementation because maps have no dupe keys
dupe_keys_unsupported_test() ->
    CSV = "field_name,field_name,field_name\r\n"
          "aaa,bbb,ccc\r\n"
          "zzz,yyy,xxx\r\n",
    [Map1, Map2] = bday_csv:decode(CSV),
    %?debugFmt("Map1 = ~p~nMap2 = ~p~n", [Map1, Map2]),
    %?debugFmt("Map2 = ~p~n",[Map2]),
    ?assertEqual(1, length(maps:keys(Map1))),
    ?assertEqual(1, length(maps:keys(Map2))),
    ?assertMatch(#{"field_name" := _}, Map1),
    ?assertMatch(#{"field_name" := _}, Map2).

See what we can get by simplifying our CSV parser implementation:

%% @doc this counterexample is taken literally from the RFC
dupe_keys_unsupported_test() ->
    CSV = "field_name,field_name,field_name\r\n"
          "aaa,bbb,ccc\r\n"
          "zzz,yyy,xxx\r\n",
    Result = bday_csv_tuple:decode(CSV),
    List = lists:flatten(Result),
    ?assertEqual(6, length(List)),
    lists:foreach(fun(Elem) -> ?assertMatch({"field_name", _}, Elem) end, List).

Link to source code

Marked As Solved

ferd

ferd

Author of Property-Based Testing with PropEr, LYSE, & Erlang in Anger

That would make the implementation and testing shorter, but do note that the chapter has chosen to use maps as a datastructure for its ease of use to the callers.

That there is a mismatch between the chosen disk format and the useful code format is one of the interesting things that come up and we have to adjust to: either change the spec, or tweak the tests. You are suggesting the former, the book went for the latter.

There is a last gotcha implicit to the implementation of our CSV parser: since it uses maps, duplicate column names are not tolerated. Since our CSV files have to be used to represent a database, it is probably a fine assumption to make about the data set that column names are all unique. All in all, we’re probably good ignoring duplicate columns and single-columns CSV files since it’s unlikely database tables would be that way either, but it’s not fully CSV compliant.

If your CSV parser now supports multiple duplicate columns, there is now a concern that the code that uses the returned lists is able to deal with the edge case of multiple keys being returned, or that a conversion step that removes (or errors on) duplicates is added and also tested. I tend to like narrowing all of this at the edge of the system (when converting from CSV to what is now safe internally).

Your approach is fine and simplifies the CSV testing (your snippets are cleaner), but you should still expect to add specific testing elsewhere in the application that tackles that mismatch between what CSV supports and what the records represented by a database would support somewhere.

Where Next?

Popular Pragmatic Bookshelf topics Top

New
telemachus
Python Testing With Pytest - Chapter 2, warnings for “unregistered custom marks” While running the smoke tests in Chapter 2, I get these...
New
GilWright
Working through the steps (checking that the Info,plist matches exactly), run the demo game and what appears is grey but does not fill th...
New
JohnS
I can’t setup the Rails source code. This happens in a working directory containing multiple (postgres) Rails apps. With: ruby-3.0.0 s...
New
joepstender
The generated iex result below should list products instead of product for the metadata. (page 67) iex> product = %Product{} %Pento....
New
swlaschin
The book has the same “Problem space/Solution space” diagram on page 18 as is on page 17. The correct Problem/Solution space diagrams ar...
New
jskubick
I think I might have found a problem involving SwitchCompat, thumbTint, and trackTint. As entered, the SwitchCompat changes color to hol...
New
dtonhofer
@parrt In the context of Chapter 4.3, the grammar Java.g4, meant to parse Java 6 compilation units, no longer passes ANTLR (currently 4....
New
SlowburnAZ
Getting an error when installing the dependencies at the start of this chapter: could not compile dependency :exla, "mix compile" failed...
New
dachristenson
@mfazio23 Android Studio will not accept anything I do when trying to use the Transformations class, as described on pp. 140-141. Googl...
New

Other popular topics Top

PragmaticBookshelf
Take your Go skills to the next level by learning how to design, develop, and deploy a distributed service. Start from the bare essential...
New
brentjanderson
Bought the Moonlander mechanical keyboard. Cherry Brown MX switches. Arms and wrists have been hurting enough that it’s time I did someth...
New
AstonJ
Continuing the discussion from Thinking about learning Crystal, let’s discuss - I was wondering which languages don’t GC - maybe we can c...
New
AstonJ
We’ve talked about his book briefly here but it is quickly becoming obsolete - so he’s decided to create a series of 7 podcasts, the firs...
New
PragmaticBookshelf
Build efficient applications that exploit the unique benefits of a pure functional language, learning from an engineer who uses Haskell t...
New
PragmaticBookshelf
Programming Ruby is the most complete book on Ruby, covering both the language itself and the standard library as well as commonly used t...
New
hilfordjames
There appears to have been an update that has changed the terminology for what has previously been known as the Taskbar Overflow - this h...
New
sir.laksmana_wenk
I’m able to do the “artistic” part of game-development; character designing/modeling, music, environment modeling, etc. However, I don’t...
New
AstonJ
If you’re getting errors like this: psql: error: connection to server on socket “/tmp/.s.PGSQL.5432” failed: No such file or directory ...
New
mindriot
Ok, well here are some thoughts and opinions on some of the ergonomic keyboards I have, I guess like mini review of each that I use enoug...
New

Sub Categories: