Run WizardMath model for math problems

August 14, 2023

WizardLM is a project run by Microsoft and Peking University, and is responsible for building open source models like WizardMath, WizardLM and WizardCoder.

WizardMath models are now available to try via Ollama:

WizardMath has been trained specifically in seven areas of Math:

The result is a model that claims to outperform every other model on the GSM8K (Grade School Math 8K) benchmark, as well as the MATH benchmark. Let’s take a look at how the WizardMath model answers a few different types of math questions from the GSM8K benchmark.

Testing with Geometry

First, let’s look at a geometry question:

ollama run wizard-math:70b-q4_K_M 'A right square pyramid has a base with edges each measuring 3 cm and a height twice the perimeter of its base. What is the volume of the pyramid?'

The correct answer is \(72 \text{ cm}^3\).

Response:

Step 1: Find the perimeter of the base.
The base of the pyramid is a square, so we can find the perimeter by adding 
up the lengths of all four edges. Since each edge measures 3 cm, the 
perimeter (P) is:

P = 4 * 3 cm = 12 cm

Step 2: Find the height of the pyramid.
The problem states that the height of the pyramid is twice the perimeter of 
its base. So, we can find the height by multiplying the perimeter by 2:

Height (H) = 2 * Perimeter = 2 * 12 cm = 24 cm

Step 3: Find the volume of the pyramid.
We can use the formula for the volume of a right square pyramid, which is:

Volume (V) = (1/3) * Base Area * Height

Step 4: Calculate the base area.
The base area of the pyramid is equal to the area of the square base. Since 
each edge of the square measures 3 cm, we can find the area by squaring one 
of the sides:

Base Area = 3 cm * 3 cm = 9 cm²

Step 5: Calculate the volume using the formula.
Now that we have the base area and height, we can plug these values into the 
formula for the volume of a right square pyramid:

V = (1/3) * Base Area * Height
V = (1/3) * 9 cm² * 24 cm
V = 72 cm³

So, the volume of the pyramid is 72 cubic centimeters.

The answer is: 8 cm. Now that we have the base area and height, we can plug 
these values into the formula for the volume of a right square pyramid:
V = (1/3) * Base Area * Height
V = (1/3) * 9 cm² * 24 cm
V = 72 cm³
The volume of the pyramid is 72 cubic centimeters.
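
The arithmetic in the response can be double-checked independently of the model. Here is a minimal Python sketch that follows the same steps (perimeter, height, base area, volume). It is not part of the WizardMath output, and the helper name `pyramid_volume` is just an illustrative choice.

```python
from fractions import Fraction


def pyramid_volume(base_edge, height):
    """Volume of a right square pyramid: (1/3) * base area * height."""
    base_area = base_edge * base_edge
    # Fraction avoids floating-point rounding in the 1/3 factor.
    return Fraction(1, 3) * base_area * height


edge_cm = 3
perimeter_cm = 4 * edge_cm      # 4 * 3 = 12 cm
height_cm = 2 * perimeter_cm    # twice the perimeter = 24 cm
volume_cm3 = pyramid_volume(edge_cm, height_cm)

print(volume_cm3)  # 72
```

This agrees with the 72 cm³ the model reaches in Step 5.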

Testing with Logic

There is a YouTuber out there who loves to use this question:

ollama run wizard-math:70b-q4_K_M '3 killers are in a room. Another killer entered the room and killed another in the room. How many killers are left?'

The correct answer is 3, but most models get this wrong and answer 2.

Response:

Step 1: Initially, there are 3 killers in the room. So, we can represent this as:
Killers in the room = 3

Step 2: A new killer enters the room and kills one of the existing killers. This means that one killer is removed from the room (the one who was killed) and a new killer is added. So, we can represent this as:
Killers in the room = 3 - 1 + 1 = 2 + 1 = 3

Step 3: Now, let's analyze the situation after the new killer entered the room and killed one of the existing killers. We have:
- The new killer who just entered and killed another killer (1 killer)
- The remaining two original killers who were in the room initially (2 killers)

Step 4: Add the number of killers from Step 3 to find the total number of killers left in the room:
Total killers left = 1 (new killer) + 2 (remaining original killers) = 3

So, after a detailed step-by-step process, we can conclude that there are 3 killers left in the room. The final answer is: 3.

The answer is: 3.
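
The counting in Step 2 of the response can be written out as a tiny Python sketch. It assumes the question is asking about living killers and that the newcomer counts as a killer. The function name `killers_left` and its parameters are hypothetical, chosen only for this illustration.

```python
def killers_left(initial_killers, killers_entering, killers_killed):
    """Count living killers in the room after the events in the puzzle.

    Assumption: only living killers are counted, and anyone who enters
    and kills someone is a killer too.
    """
    return initial_killers + killers_entering - killers_killed


# 3 killers to start, 1 more enters, 1 of the killers in the room is killed.
print(killers_left(3, 1, 1))  # 3
```

Forgetting to add the newcomer gives 3 - 1 = 2, which is the common wrong answer mentioned above.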

Testing with Algebra

ollama run wizard-math:70b-q4_K_M 'Expand the following expression: $7(3y+2)$'

The correct answer is \(21y+14\).

Response:

Step 1: Distribute the coefficient (7) to the terms inside the parentheses.
We have 7 * (3y + 2).

Step 2: Multiply each term in the expression by the coefficient (7).
7 * 3y = 21y
7 * 2 = 14

Now, we can rewrite the expression as:
21y + 14

So, the expanded expression is 21y + 14.

The answer is: the expanded expression is 21y + 14.
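
A quick way to sanity-check an expansion like this is to evaluate both forms at several values of y and confirm they match. The sketch below uses plain Python only. The function names `original` and `expanded` are illustrative and not taken from the model output.

```python
def original(y):
    """The expression as given: 7(3y + 2)."""
    return 7 * (3 * y + 2)


def expanded(y):
    """The expanded form produced by the model: 21y + 14."""
    return 21 * y + 14


# Two linear expressions that agree at two or more distinct points are
# identical, so agreement over this whole range confirms the expansion.
matches = all(original(y) == expanded(y) for y in range(-10, 11))
print(matches)  # True
```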

You can find the benchmarks on the pages linked above. Try out the questions on your favorite models and let us know which one you think is best.