Reply by Bassetto et al.


A statistical analysis of magnetic field effects for migratory monarch butterflies. II: The positive conditioning controls and the negative geotaxis experiments in Oxford and Oldenburg

Our conclusion was that the magnetic field effects reported by Gegear et al.4 were false positives arising from an inappropriate choice of statistical tests. We have covered the matter in the Supplementary Information of our paper3 and in ref. 8; here we provide only a brief description.

The Comment berates us for carrying out the positive conditioning controls and the magnetic exposure experiments in different locations, on the grounds that the assays are sensitive to temperature and humidity. There is no evidence in Gegear et al.4 to support such a contention. We chose to carry out the olfactory controls3 in Scott Waddell’s laboratory in Oxford specifically to take advantage of his facilities and expertise in working with odour stimuli. For similar reasons, we carried out all of the magnetic stimulus tests3 in Oldenburg, where the experimental facilities for controlling magnetic fields are second to none13,14.

A non-parametric analysis is what led us to conclude that the approach of Bassetto et al.1 is overly conservative. By this analysis, the naive and trained groups of flies in the synthetic dataset differ highly significantly (P < 0.0001).

We did not consider the monarch butterfly a model for studying light-dependent magnetoreception. According to reports, monarchs should be able to orient in the Earth’s magnetic field. However, in two separate studies16,17, we found no evidence that monarchs have such an ability: 140 migratory monarch butterflies tested with access to only natural geomagnetic field cues showed random orientation, whereas monarchs tested with celestial cues showed a clearly directed southwest orientation16. When monarchs are kept flying in the rotating field for 2 h, they do not react to a horizontal 120° turn of the field, even when first flown in the normal magnetic field.

Kyriacou1 remarks on the accuracy of the video tracking of fly movements in our negative geotaxis experiments3. The number of frames logged simply reflects how often flies were out of bounds. Frames in which flies were hidden by the stoppers at the top or the supports at the base of the tubes were not included in the analysis, and we did not log flies that had reached the top of the tubes and started to descend. The lower frame count does not imply that flies were not tracked while climbing. Moreover, it is not necessary to track a fly in every frame to determine its climbing rate. By contrast, Fedele et al.7 simply reported the proportion of flies that climbed to an arbitrarily chosen height, with no further data or photographic documentation.
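The point that a climbing rate does not require every frame can be illustrated with a minimal sketch. It fits a least-squares slope of height against time using only the frames in which a fly was logged. The frame rate, tube geometry, true climbing rate and pattern of dropped frames are all hypothetical values chosen for illustration; this is not the tracking pipeline used in our experiments.

```python
def climbing_rate(times_s, heights_mm):
    """Least-squares slope of height versus time (mm per second)."""
    n = len(times_s)
    if n < 2:
        raise ValueError("need at least two tracked frames")
    mean_t = sum(times_s) / n
    mean_h = sum(heights_mm) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("all tracked frames share the same timestamp")
    sxy = sum((t - mean_t) * (h - mean_h) for t, h in zip(times_s, heights_mm))
    return sxy / sxx


# Hypothetical values, for illustration only.
FPS = 30.0             # assumed camera frame rate
TRUE_RATE_MM_S = 10.0  # assumed constant climbing speed
START_MM = 2.0         # assumed starting height

frames = list(range(60))
all_obs = [(f / FPS, START_MM + TRUE_RATE_MM_S * f / FPS) for f in frames]

# Drop every third frame and a contiguous block of ten frames,
# mimicking a fly that is temporarily hidden or not logged.
tracked = [obs for f, obs in zip(frames, all_obs)
           if f % 3 != 0 and not 20 <= f < 30]

rate_full = climbing_rate(*zip(*all_obs))
rate_sparse = climbing_rate(*zip(*tracked))
print(f"all frames: {rate_full:.3f} mm/s, tracked frames only: {rate_sparse:.3f} mm/s")
```

With noise-free positions the two estimates coincide; with real, noisy positions, dropping frames would widen the uncertainty of the slope but would not by itself bias it.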

The Student’s t-test (and similarly analysis of variance) proposed by Krashes and Waddell9 and used by Gegear et al.4 to analyse group T-maze data is strongly affected by pseudo-replication and is fundamentally wrong for analysing preference indices3,8. As a result, significance was exaggerated and claimed in ref. 4 for small proportion contrasts (45% naive versus 55% trained). We used a statistical framework for proportions and avoided pseudo-replication by using the biological replicate as the independent statistical unit. This correction, albeit conservative, nonetheless yielded significant results in our positive control experiment (odour-conditioned flies). The additional statistical tests presented in the Comment are based on synthetic data and suffer from the same problem as the t-test. Statistical tests are based on frameworks of assumptions that are appropriate for specific problems and data structures and cannot be applied out of context. We cannot comment further on these analyses because neither the original data4 nor the new synthetic data2 have been made available.
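The effect of pseudo-replication can be sketched with a small simulation. It is an illustration only and not the framework used in our paper: the replicate-level analysis here is a swapped-in exact permutation test, and the number of replicates, flies per replicate and between-replicate variability are hypothetical. Both groups are simulated with no true difference, so every significant result is a false positive.

```python
import itertools
import math
import random


def simulate_group(rng, n_reps, n_flies, p_mean, batch_sd):
    """Per-replicate counts of flies choosing one arm of the T-maze."""
    counts = []
    for _ in range(n_reps):
        # Each replicate (batch) has its own choice probability.
        p = min(0.95, max(0.05, rng.gauss(p_mean, batch_sd)))
        counts.append(sum(rng.random() < p for _ in range(n_flies)))
    return counts


def pooled_z_test_p(counts_a, counts_b, n_flies):
    """Two-proportion z-test that treats every fly as independent."""
    n_a, n_b = len(counts_a) * n_flies, len(counts_b) * n_flies
    p_a, p_b = sum(counts_a) / n_a, sum(counts_b) / n_b
    p_pool = (sum(counts_a) + sum(counts_b)) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    return math.erfc(abs(p_a - p_b) / se / math.sqrt(2))  # two-sided


def replicate_permutation_p(counts_a, counts_b):
    """Exact permutation test with the replicate as the statistical unit."""
    values = counts_a + counts_b
    n_a, n_b, total = len(counts_a), len(counts_b), sum(values)

    def mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = mean_diff(sum(counts_a))
    hits = n_perm = 0
    for combo in itertools.combinations(values, n_a):
        n_perm += 1
        if mean_diff(sum(combo)) >= observed - 1e-12:
            hits += 1
    return hits / n_perm


# Hypothetical design: 6 replicates per group, 100 flies per replicate.
rng = random.Random(0)
N_SIM, N_REPS, N_FLIES, BATCH_SD = 300, 6, 100, 0.10
pooled_hits = replicate_hits = 0
for _ in range(N_SIM):
    a = simulate_group(rng, N_REPS, N_FLIES, 0.5, BATCH_SD)  # no true effect
    b = simulate_group(rng, N_REPS, N_FLIES, 0.5, BATCH_SD)
    pooled_hits += pooled_z_test_p(a, b, N_FLIES) < 0.05
    replicate_hits += replicate_permutation_p(a, b) < 0.05

pooled_fp_rate = pooled_hits / N_SIM
replicate_fp_rate = replicate_hits / N_SIM
print(f"false-positive rate, flies pooled: {pooled_fp_rate:.2f}")
print(f"false-positive rate, replicate as unit: {replicate_fp_rate:.2f}")
```

Under these assumptions, the pooled test ignores the variation between replicates and so underestimates the standard error, whereas the replicate-level test keeps the false-positive rate near the nominal 5%.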

Bassetto et al.1 re-examined the statistical analysis in Gegear et al. 2008 (ref. 2). Their reanalysis is flawed and does not support the contention that most of the original results were not statistically significant and were instead false positives.

When an ordinal logistic fit model, which is equivalent to the type of generalized linear model used by Bassetto et al.1, is used to assess the synthetic dataset (based on the group averages in Gegear et al. 2008, Fig. 1b2; discussed in the text and Supplementary Fig.), the results depend on which variables are included in the model. The effect of training is not significant when ‘batch’ is included as a variable, but is significant when ‘batch’ is omitted. Presumably, Bassetto et al.1 chose the former option.

Mimicking the approach that Bassetto et al.1 used to generate a single synthetic dataset, we generated an additional 20 synthetic datasets. When using the very conservative approach of Bassetto et al.1, 5 of the 20 datasets showed a significant effect of training, whereas 15 did not. When using other approaches, all of the synthetic replicates showed highly significant differences between the groups. Thus, Bassetto et al.1 seem to have selected a statistical approach with extremely poor sensitivity for detecting differences when reanalysing the data in Gegear et al. 2008 (ref. 2), and on that basis concluded that the results in Gegear et al. represent a false positive. Moreover, if false positives occurred in previous studies, they would be expected to occur in a variety of treatments and not in a way that consistently provides evidence for magnetosensitivity.
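The tally over repeated synthetic datasets can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of either analysis: the data generator, the 45% versus 55% choice probabilities, the number of replicates and flies, and the between-replicate variability are hypothetical, and the test shown is an exact two-sided rank-sum test on replicate-level preferences, chosen only as one example of a non-parametric approach.

```python
import itertools
import random


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_rank_sum_p(group_a, group_b):
    """Exact two-sided rank-sum test by enumerating all group assignments."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    expected = n_a * (len(ranks) + 1) / 2
    observed = abs(sum(ranks[:n_a]) - expected)
    hits = total = 0
    for combo in itertools.combinations(ranks, n_a):
        total += 1
        if abs(sum(combo) - expected) >= observed - 1e-9:
            hits += 1
    return hits / total


def synthetic_preferences(rng, n_reps, n_flies, p_choice, batch_sd):
    """Hypothetical generator: per-replicate proportion of flies choosing one arm."""
    prefs = []
    for _ in range(n_reps):
        p = min(0.95, max(0.05, rng.gauss(p_choice, batch_sd)))
        chosen = sum(rng.random() < p for _ in range(n_flies))
        prefs.append(chosen / n_flies)
    return prefs


# Hypothetical design values, for illustration only.
rng = random.Random(1)
N_DATASETS, N_REPS, N_FLIES, BATCH_SD = 20, 6, 100, 0.05
p_values = []
for _ in range(N_DATASETS):
    naive = synthetic_preferences(rng, N_REPS, N_FLIES, 0.45, BATCH_SD)
    trained = synthetic_preferences(rng, N_REPS, N_FLIES, 0.55, BATCH_SD)
    p_values.append(exact_rank_sum_p(naive, trained))

n_significant = sum(p < 0.05 for p in p_values)
print(f"{n_significant} of {N_DATASETS} synthetic datasets significant at 0.05")
```

The count obtained depends strongly on the assumed effect size, replicate number and batch variability, so a tally of this kind speaks to the sensitivity of a test only for the generator that produced the datasets.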