
Page history last edited by Mike May, S.J. 9 years, 2 months ago

Spreadsheet Statistics with GeoGebra 4.0

 

(Verified with release 3.9.351.)

Release 4 brings a significant shift in the capabilities of GeoGebra as a tool for teaching statistical concepts. This is a quick demonstration of some of the statistics features of GeoGebra 4.0. This worksheet covers what can be done from the spreadsheet view.

 

The easiest way to start with statistics in GeoGebra is to start with the spreadsheet view.

 

Use the toggle to turn on the style bar. Then, in the style bar, select the toggle that turns on the formula input.

 

 

This sets us up with a very typical layout for a spreadsheet.

 

 

GeoGebra has commands for sampling from a number of random distributions. We start with the RandomBetween command in cell A1 to make that cell a random integer from a uniform distribution.

 

I then drag to do a quick fill to use the same command for the first 15 cells of column A.

 

It is useful to be aware of the options for viewing the Algebra Descriptions. This option also applies to the spreadsheet. While we are setting up the spreadsheet it is useful to see the commands. When we are doing data analysis it will be useful to see the values.

 

We fill in two more columns. In B we use a linear variable. In C we distort the linear variable by a normal variable.
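For readers who want to see the same setup outside GeoGebra, here is a minimal Python sketch of the three columns. The range 0 to 100 matches the RandomBetween[0,100] formula quoted later in this worksheet. The slope, intercept, and noise standard deviation are illustrative assumptions, since the worksheet does not state them, and the function name is hypothetical.

```python
import random

def make_columns(n=15, slope=2.0, intercept=5.0, noise_sd=3.0, seed=None):
    """Build three columns resembling A, B, and C in the worksheet.

    slope, intercept, and noise_sd are assumed values for illustration only.
    """
    rng = random.Random(seed)
    # Column A: random integers from a uniform distribution on 0..100
    col_a = [rng.randint(0, 100) for _ in range(n)]
    # Column B: a linear variable in the row number 1..n
    col_b = [slope * row + intercept for row in range(1, n + 1)]
    # Column C: the linear variable distorted by normally distributed noise
    col_c = [b + rng.gauss(0.0, noise_sd) for b in col_b]
    return col_a, col_b, col_c

col_a, col_b, col_c = make_columns(seed=1)
```

Column C scatters around the line in column B, which is what makes it a useful target for the regression tool discussed later.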

 

Notice that the menus in the tool bar are different in the spreadsheet view. The picture below gives the menus, both in position and with the usual descriptions.

 

 

I am ready to convert back to seeing the Algebra descriptions as values.

Our first interest is the tool for one variable analysis. We want to select the data in the first column before choosing the tool. You can select a column or a collection of cells, and the collection need not be in one column.

 

Selecting the tool for "One Variable Analysis" produces a number of actions. A new One Variable Statistics window pops up. On the left side of that window we have a summary of standard one variable statistics. On the right side we have a histogram of the data.
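As a rough sketch of the kind of summary such a window reports, the following Python code computes common one variable statistics with the standard library. The exact list of statistics and the quartile convention GeoGebra uses are not stated in this worksheet, so the dictionary keys and the default quartile method here are assumptions.

```python
import statistics

def one_variable_summary(data):
    """Return common one variable statistics for a list of numbers."""
    values = sorted(data)
    # Quartile conventions differ between tools; this uses Python's default method
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sample_sd": statistics.stdev(values),       # divides by n - 1
        "population_sd": statistics.pstdev(values),  # divides by n
        "min": values[0],
        "Q1": q1,
        "median": statistics.median(values),
        "Q3": q3,
        "max": values[-1],
    }

# Made-up data, used only to exercise the function
summary = one_variable_summary([2, 4, 4, 4, 5, 5, 7, 9])
```

For this made-up data the mean is 5 and the population standard deviation is 2, which makes the output easy to check by hand.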

 

 

The label "Histogram" is actually a drop-down menu that lists a number of kinds of graphical presentations to choose from.

 

The tool icon on the right side of the screen opens a panel that gives you choices about the presentation of the graph.

 

 

The options button in the lower right corner of the window lets you see the data you are gathering. This lets you selectively remove some data from the collection. (You may want to exclude outliers.) It also allows you to put in two graphs at once.

 

We now turn toward making some of the data in the spreadsheet available to the algebra view. We first choose the tool for making a list. A pop up dialog gives us the opportunity to name the list.

 

Similarly, we can use the tool to make a list of points. This tool takes 2 columns of numbers as input and turns them into ordered pairs to produce points. In the pop-up dialog, you can specify whether the first or second column corresponds to x. The choice of dependent or free determines whether the points change with the values in the spreadsheet or stay fixed.

 

Additionally, we can bring information across as either a matrix, or as a table. (A table is actually a text object that shows up in the graphics window.)

 

We are ready to look at the Two Variable Regression Analysis tool. Once two columns of data are selected, the tool starts by giving a scatterplot and some information about the distributions of x and y. While no regression formula is given at first, a drop-down menu lets you choose the kind of curve you want to fit.

When a type of curve is chosen, GeoGebra computes the best fit curve.

 

The evaluate field lets you evaluate that curve at a specified value of x. Once again, you can show the data and remove selected points from the computations.
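To make the idea of a best fit curve and its evaluation concrete, here is a minimal sketch for the linear case only, using the usual least squares formulas. It is not GeoGebra's implementation, the function names are hypothetical, and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least squares line y = slope * x + intercept through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross deviations in x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def evaluate(slope, intercept, x):
    """Value of the fitted line at a specified x."""
    return slope * x + intercept

# Made-up data that lies close to a line
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
prediction = evaluate(slope, intercept, 6)
```

Removing a point from `xs` and `ys` and refitting mirrors the effect of excluding a selected point from the computation.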

 

The multiple variable analysis tool lets you gather data about several distributions at a time.

 

One detail to note about this tool: you select columns by clicking on the column header. The pop-up window gives a stacked box plot of the selected distributions along with statistics on each.

 

The drop down menu for statistics lets you replace that with several tests comparing the samples.

 

 

Finally, for this set of tools, we want to look at the probability calculator tool. Clicking on this tool brings up a window for doing probability calculations. With a drop down menu you can choose one or two sided tests.

 

The calculator allows you to do calculations for a variety of probability distributions. With each distribution you can input the appropriate parameters. As the picture indicates, for discrete distributions you are also given the probability values for the points.
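As an illustration of the point probabilities for a discrete distribution, the sketch below works through a binomial example. The choice of a binomial distribution with n = 10 and p = 0.5 is an assumption made for the example, and the function names are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_interval(lo, hi, n, p):
    """P(lo <= X <= hi), found by adding the point probabilities."""
    return sum(binomial_pmf(k, n, p) for k in range(lo, hi + 1))

n, p = 10, 0.5
# Probability of each possible value 0..n
point_probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
# A left-sided probability, P(X <= 3)
left_tail = binomial_interval(0, 3, n, p)
```

Here `left_tail` works out to 176/1024, since the counts for k = 0 through 3 are 1, 10, 45, and 120 out of 1024 equally likely outcomes.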

 

The last column of tools in the spreadsheet view allows basic computations on data sets.

 

If data in a column or block is selected, the computation is made in each column, with the answer put in the cell beneath the block.

 

If the cells are selected in a row, the computations are done along the row.
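The difference between the two directions can be sketched in a few lines of Python. The block is stored as a list of rows, the default operation is a sum, and both function names are hypothetical.

```python
def column_results(block, op=sum):
    """Apply op down each column; one result per column, as if written beneath the block."""
    return [op(column) for column in zip(*block)]

def row_results(block, op=sum):
    """Apply op along each row; one result per row."""
    return [op(row) for row in block]

# A made-up block with two rows and three columns
block = [[1, 2, 3],
         [4, 5, 6]]
per_column = column_results(block)       # [5, 7, 9]
per_row = row_results(block)             # [6, 15]
per_column_max = column_results(block, op=max)
```

Passing a different function as `op`, such as `max` or `min`, swaps in another basic computation without changing the direction logic.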

 

 

 

A final issue we want to deal with is easily being able to get GeoGebra to recompute a specified set of random variables. In the example we have been working with we would like to be able to resample the values in column A or C. The trick is to include a value that has no impact on the value of the cell, but whose value can be changed to force the cell to be re-evaluated.

 

We change the formula for the A cells from RandomBetween[0,100] to RandomBetween[0,100]+0*D$1.

For the cells in C we append +0*D$2. Since 0 times any number is still 0, we have not changed the value of the cells.

 

However, when we change the value of D1 or D2 we force the taking of new random samples. (We do not even have to change the values of D1 and D2. It is enough to select the cell, enter the formula line, and hit return so the cell is re-evaluated.)
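The reason this trick works can be illustrated with a toy model of dependency-driven recalculation. This is a hypothetical sketch, not GeoGebra's actual engine: the `Sheet` class and its methods are invented for the example, and it assumes only that entering a cell re-evaluates every formula that reads it.

```python
import random

class Sheet:
    """Toy spreadsheet: entering a cell re-evaluates the formulas that read it."""

    def __init__(self):
        self.values = {}
        self.formulas = {}  # cell name -> (function of values, names of cells it reads)

    def set_value(self, name, value):
        self.values[name] = value
        # Re-evaluate every formula cell that depends on the cell just entered
        for cell, (func, deps) in self.formulas.items():
            if name in deps:
                self.values[cell] = func(self.values)

    def set_formula(self, name, func, deps):
        self.formulas[name] = (func, deps)
        self.values[name] = func(self.values)

sheet = Sheet()
sheet.set_value("D1", 0)
for row in range(1, 16):
    # Same shape as RandomBetween[0,100]+0*D$1: D1 never affects the result
    sheet.set_formula(f"A{row}",
                      lambda v: random.randint(0, 100) + 0 * v["D1"],
                      ["D1"])

before = [sheet.values[f"A{r}"] for r in range(1, 16)]
sheet.set_value("D1", 0)  # re-entering D1, even with the same value, resamples column A
after = [sheet.values[f"A{r}"] for r in range(1, 16)]
```

Because the term multiplied by 0 contributes nothing, every value stays in the range 0 to 100, yet the dependency on D1 is enough to trigger a fresh sample.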

 

 

© 2011, Mike May, S.J.

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 license, Mike May, S.J. maymk@slu.edu

 

 
