**Spreadsheet Statistics with GeoGebra 4.0**

(Verified with release 3.9.189.)

Release 4 brings a significant shift in the capabilities of GeoGebra as a tool for teaching statistical concepts. This is a quick demonstration of some of the statistics features of GeoGebra 4.0. This worksheet covers what can be done from the spreadsheet view.

The easiest way to begin working with statistics in GeoGebra is from the spreadsheet view.

GeoGebra has commands for sampling from a number of random distributions. We start by entering the RandomBetween command in cell A1, which makes A1 a random integer drawn from a uniform distribution.

I then drag to do a quick fill to use the same command for the first 20 cells of column A.
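As a small illustration, this is the formula entered in A1 before the fill. The bounds 0 and 100 are the ones that appear later in this worksheet.

```
=RandomBetween[0, 100]
```

After the fill, each of the cells A1 through A20 holds the same command and draws its own integer, so the column is a sample of 20 values.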

It is useful to be aware of the options for viewing the Algebra Descriptions. This option also applies to the spreadsheet. While we are setting up the spreadsheet it is useful to see the commands. When we are doing data analysis it will be useful to see the values.

We fill in two more columns. In B we use a linear variable. In C we distort the linear variable by a normal variable.
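The worksheet does not record the exact formulas used for columns B and C, so the following is only one possible setup. The slope 2, the intercept 3, and the standard deviation 5 are assumptions chosen for illustration. The text before each colon is the cell the formula is typed into, not part of the formula. RandomNormal takes a mean and a standard deviation.

```
B1:  =1
B2:  =B1 + 1
C1:  =2*B1 + 3 + RandomNormal[0, 5]
```

Filling B2 down to B20 gives the values 1 through 20, and filling C1 down to C20 gives a noisy version of the line y = 2x + 3.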

I am ready to convert back to seeing the Algebra descriptions as values.

Notice that the toolbar has a different set of icons in the spreadsheet. Our first interest is the tool for one variable analysis. We select the data in the first column before choosing the tool.

Selecting the tool for "One Variable Analysis" produces a number of actions. In the algebra view, the selected data has become a list that is sorted. A new One Variable Statistics window pops up. On the left side of that window we have a summary of standard one variable statistics. On the right side we have a histogram of the data.
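Most of the summary values shown in that window can also be computed directly from the input bar. This is a minimal sketch, assuming the data sit in A1 through A20. The names on the left are hypothetical labels picked for this example, not names the tool creates.

```
data = A1:A20
sorted = Sort[data]
m = Mean[data]
med = Median[data]
s = SD[data]
lower = Q1[data]
upper = Q3[data]
```

Unlike the list created by the tool, objects defined this way stay in the algebra view until you delete them.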

The label "Histogram" is actually a drop down menu that lists a number of kinds of graphical presentations to choose from.

The tool icon on the right side of the screen opens a panel that gives you choices about the presentation of the graph.

The options button in the lower left corner of the window lets you see the data you are gathering. It also lets you display two graphs at once.

It should be noted that a stem and leaf plot also produces a text object for the plot in the graphics view. It is also worth noting that when we closed the analysis window, the sorted list also disappeared.

We now turn toward making some of the data in the spreadsheet available to the algebra view. We choose the tool for making a list. A pop up dialog gives us the opportunity to name the list.

Similarly, we can use the same tool to make a list of points. This tool takes 2 columns of numbers as input and turns them into ordered pairs to produce points. In the pop up dialog, you can specify if the first or second column corresponds to x.

It should be noted that at this point, use of these tools creates a number of auxiliary points, each named P with a subscript. You may want to hide these points.
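If the auxiliary points are a nuisance, one alternative is to build the list of points with commands instead of the tool. This is a sketch under the assumption that x is in B1 through B20 and y is in C1 through C20. The names xs, ys, and pts are hypothetical.

```
xs = B1:B20
ys = C1:C20
pts = Sequence[(Element[xs, i], Element[ys, i]), i, 1, 20]
```

Sequence evaluates the point expression for i from 1 to 20, and Element picks the i-th entry of each list, so pts is a single list of 20 ordered pairs.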

We are ready to look at the Two Variable Regression Analysis tool. Once two columns of data are selected, the tool starts by creating a scatterplot and showing some information about the distributions of x and y. While no regression formula is given at first, a drop down menu lets you choose the kind of curve you want to fit.

When a type of curve is chosen, GeoGebra computes the best fit curve.
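The same kinds of fits are available as commands from the input bar. The sketch below assumes pts is the name you gave to a list of points, for example in the dialog of the list of points tool. That name is hypothetical.

```
FitLine[pts]
FitPoly[pts, 2]
FitExp[pts]
```

FitLine gives the least squares regression line of y on x, FitPoly with degree 2 gives a quadratic, and FitExp gives an exponential curve. Each result appears as an object in the algebra view and is drawn in the graphics view.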

The multiple variable analysis tool lets you gather data about several distributions at a time.

There is also a tool to take a section of a spreadsheet and to turn it into a matrix.

We now want to look at the probability calculator tool. Clicking on this tool brings up a window for doing probability calculations. With a drop down menu you can choose one or two sided tests.

You can also do calculations for a variety of probability distributions. With each distribution you can input the appropriate parameters.
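For the normal distribution, the same kind of calculation can be done from the input bar. The parameters below (mean 0, standard deviation 1) and the values 1.96 and 0.975 are chosen only for illustration.

```
Normal[0, 1, 1.96]
InverseNormal[0, 1, 0.975]
```

The first command returns the probability that a standard normal variable is at most 1.96, which is roughly 0.975. The second goes in the other direction and returns roughly 1.96.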

A final issue we want to deal with is easily being able to get GeoGebra to recompute a specified set of random variables. In the example we have been working with we would like to be able to resample the values in column A or C. The trick is to include a value that has no impact on the value of the cell, but whose value can be changed to force the cell to be re-evaluated.

We change the formula for the A cells from RandomBetween[0,100] to RandomBetween[0,100]+0*D$1.

For the cells in C we append +0*D$2. Since 0 times any number is still 0 we have not changed the value of the cells.

However, when we change the value of D1 or D2 we force the taking of new random samples. (We do not even have to change the values of D1 and D2. It is enough to select them and hit return so they are re-evaluated.)
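Put together, the first row might look like the sketch below. The text before each colon is the cell the formula is typed into. The formula in C1 is an assumed example, since the worksheet does not record the original one. Only the appended term matters here.

```
A1:  =RandomBetween[0, 100] + 0*D$1
C1:  =2*B1 + 3 + RandomNormal[0, 5] + 0*D$2
```

The dollar sign fixes the row number, so when these formulas are filled down every cell in column A still refers to D1 and every cell in column C still refers to D2. That is what lets one cell control the resampling of a whole column.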

© 2011, Mike May, S.J.

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 license, Mike May, S.J. maymk@slu.edu
