## (a) Testing continuous piecewise linear model for a typical sample size

To demonstrate our approach, we first generate a true (toy) population curve, which comprises a 3-CPL model PDF between 5.5 and 7.5 kyr BP. We then randomly sample N = 1500 dates under this true (toy) population curve, ‘uncalibrate’ these dates, apply an arbitrary ¹⁴C error of 25 years, then calibrate. We then conduct a parameter search for the best fitting 1-CPL, 2-CPL, 3-CPL, 4-CPL and 5-CPL models. The BIC is calculated using $\ln(n)\,k - 2\ln(L)$, where $k$ is the number of parameters ($k = 2p - 1$, where $p$ is the number of phases), $n$ is the number of ¹⁴C dates and $L$ is the ML. Table 1 gives the results of this model comparison and shows that the model fits closer to the data as its complexity increases. However, the BIC shows that the model is overfitted beyond a 3-CPL model. Therefore, the model selection process successfully recovered the 3-CPL model from which the data were generated.

Table 1. The 3-CPL model is selected as the best, since it has the lowest BIC (italics). As the number of parameters in the model increases, the likelihood of the model given the data increases. However, the BIC shows that this improvement is only justified up to the 3-CPL model, after which the more complex models are overfit to the data.
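
The BIC comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the log-likelihood values are invented placeholders chosen only to show the mechanics (they are not the values in Table 1). The sketch assumes the maximised log-likelihood ln(L) of each model has already been obtained from a parameter search.

```python
import math
from typing import Mapping


def cpl_param_count(phases: int) -> int:
    """Number of free parameters of an n-CPL model, k = 2p - 1."""
    if phases < 1:
        raise ValueError("a CPL model needs at least one phase")
    return 2 * phases - 1


def bic(max_log_likelihood: float, n_dates: int, k: int) -> float:
    """BIC = ln(n) * k - 2 * ln(L); lower values are preferred."""
    if n_dates < 1:
        raise ValueError("n_dates must be positive")
    return math.log(n_dates) * k - 2.0 * max_log_likelihood


def select_model(max_log_likelihoods: Mapping[int, float], n_dates: int) -> int:
    """Return the number of phases whose model has the lowest BIC."""
    scores = {
        phases: bic(loglik, n_dates, cpl_param_count(phases))
        for phases, loglik in max_log_likelihoods.items()
    }
    return min(scores, key=scores.get)


# Hypothetical maximised log-likelihoods, keyed by number of phases.
# The fit keeps improving with complexity, but only marginally beyond 3 phases.
hypothetical_loglik = {1: -12000.0, 2: -11950.0, 3: -11900.0, 4: -11899.0, 5: -11898.5}
best_phases = select_model(hypothetical_loglik, n_dates=1500)
```

With these placeholder values the likelihood gain from 3 to 4 phases is smaller than the extra penalty of 2 ln(n) incurred by the two additional parameters, so the simpler model is retained.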

We then assess the accuracy of the parameter estimates by generating five more random datasets under our true (toy) population curve and applying a parameter search to each dataset. Figure 1 illustrates the best 3-CPL model for each dataset; all are qualitatively similar to the true population curve. Each is the most likely model for its own dataset, and the differences between the datasets are represented with SPDs.

Figure 1. 3-CPL models best fitted to five randomly sampled datasets of N = 1500 ¹⁴C dates. SPDs of each calibrated dataset illustrate the variation from generating random samples. This variation between random datasets is the underlying cause of the small differences between the hinge-point dates in each ML model. (Online version in colour.)
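
The first step of each simulation, drawing calendar dates at random under a piecewise linear population curve, can be sketched with inverse-transform sampling. This is an illustrative sketch under stated assumptions, not the authors' code: the function name and the hinge coordinates below are hypothetical (the text gives only the 5.5 to 7.5 kyr BP range, not the hinge points of the toy curve), and the subsequent ‘uncalibrate’, error-assignment and calibration steps require a calibration curve and are not shown.

```python
import math
import random
from typing import List, Sequence


def sample_piecewise_linear(
    xs: Sequence[float], ys: Sequence[float], n: int, rng: random.Random
) -> List[float]:
    """Draw n values from a continuous piecewise linear density.

    xs are strictly increasing hinge positions and ys the (unnormalised,
    non-negative) density heights at those hinges.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need matching xs and ys with at least two hinges")
    if any(b <= a for a, b in zip(xs, xs[1:])):
        raise ValueError("xs must be strictly increasing")
    if any(y < 0 for y in ys):
        raise ValueError("density heights must be non-negative")

    # Area of each trapezoidal segment gives the probability of landing in it.
    areas = [0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]
    if sum(areas) <= 0:
        raise ValueError("density must have positive total area")

    segments = rng.choices(range(len(areas)), weights=areas, k=n)
    samples = []
    for i in segments:
        a, b = ys[i], ys[i + 1]
        u = rng.random()
        # Invert the within-segment CDF of a linear density rising from a to b.
        s = math.sqrt((1.0 - u) * a * a + u * b * b)
        t = u * (a + b) / (a + s) if (a + s) > 0 else 0.0
        samples.append(xs[i] + t * (xs[i + 1] - xs[i]))
    return samples


# Hypothetical 3-phase curve (four hinges) in cal yr BP; heights are arbitrary.
toy_xs = [5500.0, 6000.0, 6800.0, 7500.0]
toy_ys = [1.0, 3.0, 0.5, 2.0]
toy_dates = sample_piecewise_linear(toy_xs, toy_ys, 1500, random.Random(1))
```

Repeating the call with different seeds yields independent datasets under the same curve, which is the source of the dataset-to-dataset variation the SPDs display.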

## (b) Testing continuous piecewise linear model with small sample size

We continue with the same true (toy) population curve and test the behaviour of both the model selection and parameter estimation with smaller sample sizes. As before, N dates are randomly sampled under the population curve, ‘uncalibrated’, assigned an error and calibrated. Figure 2 shows that for N = 329 and N = 454 the 3-CPL model is successfully selected, and its shape is similar to the true population. For N = 154, the lack of information content favours a 1-CPL model, which successfully avoids overfitting, and for N = 47 and smaller, the even simpler uniform model is selected. For N = 6, the modelled date range is reduced to only encompass the range of the data (see ‘Avoiding edge effects’). These results demonstrate that this approach provides robust inferences of the underlying population dynamics, avoids the misinterpretation inherent in small datasets and approaches the true population dynamics as sample sizes increase.

Figure 2. Model selection naturally guards against overfitting with small sample sizes since the lack of information content favours simple models. By contrast, the SPDs suggest interesting population dynamics that in fact are merely the artefacts of small sample sizes and calibration wiggles. (a) The best model (red) selected using BIC between a uniform distribution and five increasingly complex n-CPL models. (b) SPD (blue) generated from calibrated ¹⁴C dates randomly sampled from the same true (toy) population curve (black), and best CPL model PDF (red) constructed from ML parameters. Note, the slight bend in the black and red lines is merely a consequence of the nonlinear y-axis used. (Online version in colour.)