pieteeken

Machine Learning in Elixir: comments on chapter 11, version B 3.0

Comments to Chapter 11 version B 3.0

The dependencies from chapter 11 gave errors on 10 march 2024.
I followed the advices in the errors, but at the end it still didn’t work.
So I loaded this versions (They work with cuda120!)

 Mix.install([  
  {:bumblebee, "> 0.0.0"},  
  {:axon, "> 0.0.0"},  
  {:exla, "> 0.0.0"},  
  {:nx, "> 0.0.0"},  
  {:kino, "~> 0.8"},  
  {:kino_bumblebee, "> 0.0.0"}  
 ])  

Nx.global_default_backend(EXLA.Backend)

Section Generating Text

Bumblebee.Text.conversation does not exist in BumbleBee version 0.5
So I used Bumblebee.Text.generation()
Also the Kino part had to be changed!

With Bumblebee.Text.generation() the model with “microsoft/DialoGPT-medium” only finishes sentences. So it is not really a chatbot.

If you change the model to “gpt2” you get a real chatbot (but it is a weird bot!)
Sentences are repeated. Don’t know how to handle it.

# The weird bot  
{:ok, model} = Bumblebee.load_model({:hf, "gpt2"})  
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "gpt2"})  
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "gpt2"})  

# This model "microsoft/DialoGPT-medium"  only finishes sentences 
# {:ok, model} = Bumblebee.load_model({:hf, "microsoft/DialoGPT-medium"})  
# {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "gpt2"})  
# {:ok, generation_config} = Bumblebee.load_generation_config({:hf, "microsoft/DialoGPT-medium"})  

generation_config = Bumblebee.configure(generation_config, max_new_tokens: 100)

Also the serving statement mus be changed

serving =  
  Bumblebee.Text.generation(model, tokenizer, generation_config,  
    compile: [batch_size: 1, sequence_length: 1000],  
    defn_options: [compiler: EXLA]  
  )

The Kino part must be changed a bit.
In the form watch the “%{results…” line

frame = Kino.Frame.new()  
controls = [message: Kino.Input.text("New Message")]  
form = Kino.Control.form(controls, submit: "Send Message", reset_on_submit: [:message])  

form  
|> Kino.Control.stream()  
|> Kino.listen(nil, fn %{data: %{message: message}}, _token_summary ->  
  Kino.Frame.append(frame, Kino.Markdown.new("**Me:** #{message}"))  

  %{results: [%{text: text, token_summary: token_summary}]}= Nx.Serving.run(serving, %{text: message})  

  Kino.Frame.append(frame, Kino.Markdown.new("**Bot:** #{text}"))  
  {:cont, token_summary}  
end)  

Kino.Layout.grid([frame, form], gap: 16)

Section Classifying Images

In the first cell:
defn_options: [compiler: EXLA] gives an error, maybe because of use cuda120?
So I removed it

{:ok, model_info} = Bumblebee.load_model({:hf, "google/vit-base-patch16-224"})
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "google/vit-base-patch16-224"})
serving =
  Bumblebee.Vision.image_classification(model_info, featurizer,
    top_k: 1,
    compile: [batch_size: 1]
    # defn_options: [compiler: EXLA] #gives error
  )

The Kino part doesn’t work at all.
Dragging the image removes all the Evalute/Reevaluate knobs
Open camera and Upload work correct.
But pushing the Run button gives an error. I could not repair that.
So I replaced the Kino part by:

image =  
  "/home/piet/livebooks/ML_Elixir/Cat.jpg"  
  |> StbImage.read_file!()  
  |> StbImage.resize(224, 224)  
  |> StbImage.to_nx()  
  |> Nx.as_type({:u, 8})  
  
  output = Nx.Serving.run(serving, image)  

output.predictions  
|> Enum.map(&{&1.label, &1.score})  
|> Kino.Bumblebee.ScoredList.new()

Section Fine-tuning Pre-trained Models

In the beginning training seemed to go well, although the accuracy after batch 861 was 0.0920101!
And at that batch I got an error!
The error after Epoch: 0, Batch: 861:

  ** (MatchError) no match of right hand side value: ["\"2", "This must be the place that .....  
  ............  
  ..........."]  
     
   (stdlib 5.1.1) erl_eval.erl:498: :erl_eval.expr/6  
    #cell:33sgnufmubmhq4l6:11: (file)  
    #cell:33sgnufmubmhq4l6:10: (file)  
    #cell:33sgnufmubmhq4l6:14: (file)

Probably there is some error in the Yelp zip file.
I downloaded it twice but still an error.

Also every adding to the output gave the annoying warning:

warning: passing options to Bumblebee.apply_tokenizer/3 is deprecated, please use Bumblebee.  configure/2 to set tokenizer options  
  (bumblebee 0.5.3) lib/bumblebee.ex:856: Bumblebee.apply_tokenizer/3  
  (elixir 1.15.7) src/elixir.erl:396: :elixir.eval_external_handler/3  
  (stdlib 5.1.1) erl_eval.erl:750: :erl_eval.do_apply/7  
  (stdlib 5.1.1) erl_eval.erl:494: :erl_eval.expr/6  
  (stdlib 5.1.1) erl_eval.erl:136: :erl_eval.exprs/6  
  (elixir 1.15.7) lib/stream.ex:613: anonymous fn/4 in Stream.map/2

Since this is not the end version (I assume) and the dependencies are not right, the chapter, besides the first section, needs a rewrite.
If it is necessary I can send you my livebook, But I don`t know how to attach.

1 comment

#errata /book-machine-learning-in-elixir

1 343 1

2024-04-19 12:48:40 UTC

First Post!

jshprentz

@seanmor5

As Piet reported, the code on page 260 fails at

    image
    |> Nx.from_binary(:u8)

Based on the example at Examples — Bumblebee v0.5.3, I replaced the image line with this pipeline:

    image.file_ref
    |> Kino.Input.file_path()
    |> File.read!()

Now the Run button successfully ingests the image, runs the model, and reports the top predictions. The serving = ... code on page 259 contained the option top_k: 1, so only one prediction is shown. After changing 1 to 5 and uploading the book’s cat image, the five predictions match those shown on page 261, albeit with different probabilities.

Here is the complete modified code from page 260:

image_input = Kino.Input.image("Image", size: {224, 224})
form = Kino.Control.form([image: image_input], submit: "Run")
frame = Kino.Frame.new()

form
|> Kino.Control.stream()
|> Stream.filter(& &1.data.image)
|> Kino.listen(fn %{data: %{image: image}} ->
  Kino.Frame.render(frame, Kino.Markdown.new("Running..."))

  image =
    image.file_ref
    |> Kino.Input.file_path()
    |> File.read!()
    |> Nx.from_binary(:u8)
    |> Nx.reshape({image.height, image.width, 3})

  output = Nx.Serving.run(serving, image)

  output.predictions
  |> Enum.map(&{&1.label, &1.score})
  |> Kino.Bumblebee.ScoredList.new()
  |> then(&Kino.Frame.render(frame, &1))
end)

Kino.Layout.grid([form, frame], boxed: true, gap: 16)

Post #2