# Discontinuities due to survey redesigns : a structural time series approach

Here, $x_1, \ldots, x_n$ are the sample observations for the auxiliary variables and $\hat{\bar{x}}_{HT}$ is the HT estimator of the known $\bar{X}$. Särndal et al. (1992) proposed $\lambda_k = 1/\sigma_k^2$, where $\sigma_k^2$ can be seen as the variance of the independent random variable $Y_k$ defined in a superpopulation model $\Re$ of which the $y_k$ are assumed to be the outcomes. More specifically, the choice of $\hat{b}$ and $\sigma_k^2$ is based on an assumption about the shape of the finite population point scatter $(Y_k, X_k)$, $k = 1, \ldots, N$.

As proposed by Särndal et al. (1992), this assumption is expressed in terms of model $\Re$: $y_1, \ldots, y_N$ are considered as realized values of independent random variables $Y_1, \ldots, Y_N$, with

$$E_{\Re}(Y_k) = \beta' x_k \quad \text{and} \quad \mathrm{Var}_{\Re}(Y_k) = \sigma_k^2, \qquad k = 1, \ldots, N.$$

Here, $\beta$ and $\sigma_1^2, \ldots, \sigma_N^2$ are the model parameters. In case of a hypothetical complete enumeration of the entire population, the weighted least squares estimator of $\beta$ would have been

$$B = \Big( \sum_{k \in U} X_k \lambda_k X_k' \Big)^{-1} \sum_{k \in U} X_k \lambda_k Y_k.$$

It can be shown that $B$ is the best linear unbiased estimator of $\beta$ under the model. However, the entire population cannot be observed, so $B$ is an unknown population characteristic. Using sample $s$, $B$ can be estimated by $\hat{b}$ as defined in (2.3).
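As an illustration of the weighted least squares estimator $B$, the following minimal sketch computes it for a small synthetic population. The data, the variance structure and the function name `wls_coefficients` are assumptions made for this example only; they are not taken from the surveys discussed in this paper.

```python
import numpy as np


def wls_coefficients(X, y, lam):
    """Return (sum_k x_k lam_k x_k')^{-1} sum_k x_k lam_k y_k.

    X   : (N, p) array, one row x_k' per unit
    y   : (N,) array of target values
    lam : (N,) array of weights lambda_k = 1 / sigma_k^2
    """
    XtL = X.T * lam              # column k of X.T is scaled by lambda_k
    return np.linalg.solve(XtL @ X, XtL @ y)


# Hypothetical population: intercept plus one auxiliary variable.
rng = np.random.default_rng(0)
N = 200
x_aux = rng.uniform(1.0, 10.0, size=N)
X = np.column_stack([np.ones(N), x_aux])
sigma2 = 0.5 * x_aux                      # assumed variance structure
beta_true = np.array([2.0, 3.0])          # assumed model parameters
y = X @ beta_true + rng.normal(0.0, np.sqrt(sigma2))

lam = 1.0 / sigma2
B = wls_coefficients(X, y, lam)

# Cross-check: WLS equals ordinary least squares on rows scaled by sqrt(lambda_k).
scale = np.sqrt(lam)
B_check, *_ = np.linalg.lstsq(X * scale[:, None], y * scale, rcond=None)
print(B, B_check)
```

Solving the normal equations with `np.linalg.solve` avoids forming the matrix inverse explicitly; the result is the same as in the formula for $B$.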

The variance of $\hat{\bar{y}}_{greg}$ is

$$\mathrm{Var}(\hat{\bar{y}}_{greg}) = \frac{0.5}{N^2} \sum_{k=1}^{N} \sum_{l=1}^{N} (\pi_k \pi_l - \pi_{kl}) \Big( \frac{\Xi_k}{\pi_k} - \frac{\Xi_l}{\pi_l} \Big)^2,$$

which can be estimated by

$$\widehat{\mathrm{Var}}(\hat{\bar{y}}_{greg}) = \frac{0.5}{N^2} \sum_{k=1}^{n} \sum_{l=1}^{n} \frac{\pi_k \pi_l - \pi_{kl}}{\pi_{kl}} \Big( \frac{\xi_k}{\pi_k} - \frac{\xi_l}{\pi_l} \Big)^2.$$

Here, $\Xi_t = Y_t - B'X_t$ and $\xi_t = y_t - \hat{b}'x_t$ are the population and sample residuals, respectively. Note the difference with the variance formulas of the HT estimator, where the target variables are used instead of the residuals. Hence, if model $\Re$ is properly specified, the residuals vary less than the target variables and the variance of the general regression estimator is lower than the variance of the HT estimator (Särndal et al., 1992).
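The double sum in the variance estimator can be sketched in a few lines of code. The example below is a minimal illustration with hypothetical residuals and assumes simple random sampling without replacement, for which $\pi_k = n/N$ and $\pi_{kl} = n(n-1)/(N(N-1))$. In that special case the double sum should reduce to the familiar expression $(1 - n/N)\,s^2/n$, which the sketch uses as a cross-check. The function name `syg_variance_of_mean` is an assumption of this example.

```python
from statistics import variance


def syg_variance_of_mean(values, pi, pi_joint, N):
    """Double-sum variance estimate for an estimated mean.

    values   : sample values (target values for HT, residuals for GREG)
    pi       : first-order inclusion probabilities pi_k of the sampled units
    pi_joint : pi_joint[k][l] = pi_kl for sampled units k and l
    N        : population size
    """
    n = len(values)
    total = 0.0
    for k in range(n):
        for l in range(n):
            if k == l:
                continue  # the squared difference is zero when k == l
            factor = (pi[k] * pi[l] - pi_joint[k][l]) / pi_joint[k][l]
            diff = values[k] / pi[k] - values[l] / pi[l]
            total += factor * diff ** 2
    return 0.5 * total / N ** 2


# Hypothetical residuals from a simple random sample of n = 5 out of N = 50.
residuals = [1.2, -0.7, 0.3, -1.1, 0.4]
N, n = 50, len(residuals)
pi = [n / N] * n
pi_kl = n * (n - 1) / (N * (N - 1))
pi_joint = [[pi_kl] * n for _ in range(n)]

v_syg = syg_variance_of_mean(residuals, pi, pi_joint, N)
v_srs = (1 - n / N) * variance(residuals) / n   # textbook SRS expression
print(v_syg, v_srs)
```

Passing the target values instead of the residuals gives the corresponding estimate for the HT estimator, which makes the comparison between the two estimators in the text concrete.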

The estimator (2.3) can also be represented in terms of weights:

$$\hat{\bar{y}}_{greg} = \sum_{k \in s} w_k y_k, \qquad (2.4)$$

where

$$w_k = \frac{1}{\pi_k} \Big( 1 + \lambda_k x_k' \Big( \sum_{l \in s} \frac{x_l \lambda_l x_l'}{\pi_l} \Big)^{-1} (\bar{X} - \hat{\bar{x}}_{HT}) \Big) = \frac{g_k}{\pi_k}.$$

Here, $w_k$ are called final weights, $1/\pi_k$ the inclusion weights and $g_k$ the correction weights or simply g-weights. More on the general regression estimator and its alternative expressions can be found in sections 6.4 - 6.6 of Särndal et al. (1992).
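The weighting representation can be illustrated with a short numerical sketch. The code below is an assumption-laden example, not the weighting scheme used by Statistics Netherlands: the population is synthetic, the sample is a simple random sample, $\lambda_k = 1$ for all units, and the weights are written for population totals, so that the estimated mean is obtained by dividing by $N$. This normalisation is a choice made for the sketch and may differ by a constant factor from the notation in (2.4). The function name `greg_weights` is hypothetical.

```python
import numpy as np


def greg_weights(x, pi, lam, X_total):
    """Final weights w_k = g_k / pi_k for estimating a population total.

    x       : (n, p) auxiliary values of the sampled units
    pi      : (n,) inclusion probabilities
    lam     : (n,) lambda_k = 1 / sigma_k^2
    X_total : (p,) known population totals of the auxiliary variables
    """
    d = 1.0 / pi                                  # inclusion weights
    x_ht = x.T @ d                                # HT estimate of X_total
    T = (x.T * (lam * d)) @ x                     # sum_k x_k lam_k x_k' / pi_k
    g = 1.0 + lam * (x @ np.linalg.solve(T, X_total - x_ht))
    return d * g, g


# Hypothetical population of N = 100 units, simple random sample of n = 20.
rng = np.random.default_rng(1)
N, n = 100, 20
x_pop = np.column_stack([np.ones(N), rng.uniform(18, 80, size=N)])
y_pop = 5.0 + 0.2 * x_pop[:, 1] + rng.normal(0, 1, size=N)
sample = rng.choice(N, size=n, replace=False)

x, y = x_pop[sample], y_pop[sample]
pi = np.full(n, n / N)
lam = np.ones(n)
w, g = greg_weights(x, pi, lam, x_pop.sum(axis=0))

y_mean_greg = (w @ y) / N      # weighted form of the estimated mean
print(y_mean_greg, y_pop.mean())
```

A useful property of the final weights is that, applied to the auxiliary variables themselves, they reproduce the known population totals exactly. This is why the g-weights are also called correction weights.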


For simple random sampling it holds that

$$\hat{\bar{y}}_{greg} = \hat{\bar{y}} + \hat{b}'(\bar{X} - \hat{\bar{x}}), \qquad (2.5)$$

with $\hat{b}$ as in (2.3). Note that in case no auxiliary information is available, (2.5) boils down to the sample mean $\hat{\bar{y}}$ as discussed in section 2.3.2. For stratified sampling two estimators exist, depending on whether all strata are estimated separately or on the national level:

$$\hat{\bar{y}}_{sepgreg} = \sum_{h=1}^{H} \frac{N_h}{N} \Big[ \hat{\bar{y}}^{(h)} + \hat{b}_h' (\bar{X}_h - \hat{\bar{x}}^{(h)}) \Big], \quad \text{or} \quad \hat{\bar{y}}_{comgreg} = \hat{\bar{y}}_{STR} + \hat{b}'(\bar{X} - \hat{\bar{x}}_{STR}), \qquad (2.6)$$

with

$$\hat{b}_h = \Big( \sum_{k=1}^{n_h} x_k \lambda_k x_k' \Big)^{-1} \sum_{k=1}^{n_h} x_k \lambda_k y_k, \quad \text{and} \quad \hat{b} = \Big( \sum_{h=1}^{H} \frac{N_h}{N} \sum_{k=1}^{n_h} x_k \lambda_k x_k' \Big)^{-1} \sum_{h=1}^{H} \frac{N_h}{N} \sum_{k=1}^{n_h} x_k \lambda_k y_k.$$

In equation (2.6), the first estimator, $\hat{\bar{y}}_{sepgreg}$, is the separate general regression estimator, whereas $\hat{\bar{y}}_{comgreg}$ is the combined general regression estimator.
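The difference between the separate and the combined estimator can be made concrete with a small sketch. The code below follows the structure of (2.6) under several assumptions that are not part of the original text: two synthetic strata, simple random sampling within each stratum (so that stratum sample means are used for $\hat{\bar{y}}^{(h)}$ and $\hat{\bar{x}}^{(h)}$), $\lambda_k = 1$, and a target variable that is an exact linear function of the auxiliary variables. In that idealised case both estimators should reproduce the population mean exactly. All function and variable names are hypothetical.

```python
import numpy as np


def wls(x, y, lam):
    """Weighted least squares coefficients for rows x_k' and weights lam_k."""
    xl = x.T * lam
    return np.linalg.solve(xl @ x, xl @ y)


def separate_greg(strata, N):
    """Sum over strata of (N_h / N) * [ybar_h + b_h'(Xbar_h - xbar_h)]."""
    est = 0.0
    for s in strata:
        b_h = wls(s["x"], s["y"], s["lam"])
        correction = b_h @ (s["Xbar_h"] - s["x"].mean(axis=0))
        est += s["N_h"] / N * (s["y"].mean() + correction)
    return est


def combined_greg(strata, N):
    """ybar_STR + b'(Xbar - xbar_STR) with one pooled coefficient vector b."""
    W = [s["N_h"] / N for s in strata]
    A = sum(w * (s["x"].T * s["lam"]) @ s["x"] for w, s in zip(W, strata))
    c = sum(w * (s["x"].T * s["lam"]) @ s["y"] for w, s in zip(W, strata))
    b = np.linalg.solve(A, c)
    y_str = sum(w * s["y"].mean() for w, s in zip(W, strata))
    x_str = sum(w * s["x"].mean(axis=0) for w, s in zip(W, strata))
    X_bar = sum(w * s["Xbar_h"] for w, s in zip(W, strata))
    return y_str + b @ (X_bar - x_str)


# Hypothetical strata with an exact common linear relation y = beta'x.
rng = np.random.default_rng(2)
beta = np.array([1.0, 0.5])
strata, pops = [], []
for N_h, n_h in [(60, 8), (40, 6)]:
    x_pop = np.column_stack([np.ones(N_h), rng.uniform(0, 10, size=N_h)])
    idx = rng.choice(N_h, size=n_h, replace=False)
    strata.append({"N_h": N_h, "x": x_pop[idx], "y": x_pop[idx] @ beta,
                   "lam": np.ones(n_h), "Xbar_h": x_pop.mean(axis=0)})
    pops.append(x_pop @ beta)

N = 100
true_mean = np.concatenate(pops).mean()
est_sep = separate_greg(strata, N)
est_com = combined_greg(strata, N)
print(est_sep, est_com, true_mean)
```

With real data the two estimators differ: the separate estimator fits one coefficient vector per stratum and needs the auxiliary means $\bar{X}_h$ for every stratum, whereas the combined estimator pools the strata and only needs $\bar{X}$.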

3 Data on victimization

3.1 Application on data of Statistics Netherlands

The theory described in the previous section is applied to the data gathered by Statistics Netherlands. More specifically, Statistics Netherlands has collected crime statistics from 1980 onwards. Since then, five different survey designs have been used in total, which are summarized in table 1. The variable considered in this paper is victimization, and all surveys in table 1 collected data on this variable. Victimization is defined as the percentage of the Dutch population that has been a victim of a criminal offense at least once. The reference period of the surveys is the 12 months before the interview date, unless stated otherwise. Below, the surveys are discussed in greater detail. Note that people in detention and in psychiatric institutions have been excluded from the target population for all surveys.

CSV survey design

The CSV was conducted yearly from 1981 to 1985 and every two years from 1987 to 1993. The main challenge was the collection of victimization data on persons aged 15 years and older residing in a Dutch household. The CSV was conducted in the first quarter of the year following the reference year; for example, the data for 1990 were collected in January-March 1991 (Huys and Rooduijn, 1993). The interviewees were visited at their homes and, except for 1991 and 1993, the data were collected manually on paper. In 1991 and 1993 the data were collected using Computer Assisted Personal Interviewing (CAPI). The sample frame consisted of addresses as provided by the Dutch post office. The sampling design of the CSV was stratified sampling, where the stratification variables gender, age, marital status, urbanization and region yielded 154 strata in total. The sample size varied between 4000 and 10000 respondents on a yearly basis (Statistics Netherlands, 2011b).

Table 1 – Data Summary

| Survey | Collection period | Data collection mode | English name | Dutch name |
|---|---|---|---|---|
| CSV | 1980-1993 | CAPI | Crime Victims Survey | Enquête Slachtoffers Misdrijven |
| LPSS | 1992-1996 | CAPI | Legal Protection and Safety Survey | Enquête Rechtsbescherming en Veiligheid |
| PSLC | 1997-2004 | CAPI | Permanent Study Living Conditions | Permanent Onderzoek Leefsituatie |
| SM | 2005-2007 | CATI/CAPI (mixed mode) | Safety Monitor | Veiligheidsmonitor Rijk |
| ISM | 2008-current* | CAWI/PAPI or CATI/CAPI (mixed mode) | Integral Safety Monitor | Integrale Veiligheidsmonitor |
| $SM_{IV}$ | 2008-2010 | CATI/CAPI (mixed mode) | Safety Monitor | Veiligheidsmonitor Rijk |

* In this paper the ISM is only considered until 2010.

LPSS survey design

The LPSS was conducted on a yearly basis between 1992 and 1996, so that it ran parallel to the CSV for one year. Both surveys focused on collecting crime statistics. The target population of the LPSS consisted of persons aged 15 years and older in private households in the Netherlands, and if possible two respondents were interviewed per household. The sample frame again consisted of addresses as provided by the Dutch post office. In contrast to the CSV, the LPSS was collected throughout the entire calendar year. The reference period was changed to 12+i months, where i was the month in which the survey was taken. Hence, the reference period varied between 13 and 23 months. Furthermore, the entire survey was collected by CAPI. The LPSS used two-stage stratified sampling as sampling design, where four stratification variables yielded 80 strata in total (Huys and Rooduijn, 1993). The sample size was around 5000 respondents on a yearly basis (Statistics Netherlands, 2011a).

PSLC survey design

The PSLC survey is a "module-based integrated survey combining various themes concerning living conditions and quality of life" (van den Brakel et al., 2008). Hence, the PSLC is a more general survey on living conditions, in which the Justice and Security module (JSM) focused on publishing crime statistics. The PSLC changed the age limit of the target population, lowering it from 15 years and older to 12 years and older. Similar to the LPSS, this survey was collected throughout the calendar year using the CAPI interview method. As sampling design, two-stage stratified sampling was used with Dutch provinces and municipalities as stratification variables. For the PSLC, the proportional allocation scheme was used, as described in section 2.3.3. Collected from 1997 to 2004, the PSLC had a relatively high yearly sample size of around 10000 (Statistics Netherlands, 2011d).


SM survey design

To reduce the response burden and costs, the JSM of the PSLC and the Population Police Monitor (PPM), a survey that was conducted by two Dutch ministries and collected data on the same topics, were merged in 2004 into the SM, to be conducted by Statistics Netherlands. Another purpose of this merger was to obtain one consistent victimization rate. Whereas the PSLC was a more general survey, the SM is again focused on collecting crime statistics, so that the questionnaire and the context of the survey changed compared to the PSLC (van den Brakel et al., 2008). Questions from the JSM were dropped and questions from the PPM were added. The SM was collected from 2005 to 2008 in January-March using a mixed-mode survey design. First, all interviewees for whom Statistics Netherlands has a phone number are called (from 2007 onwards this includes both landline and mobile numbers) to complete a questionnaire using Computer Assisted Telephone Interviewing (CATI); the remaining interviewees are visited by interviewers at their homes to complete a questionnaire using CAPI. With the SM the target population changed to the entire population of persons aged 15 years and older. Stratified two-stage sampling was used for the SM, with equal sample sizes in all strata, independent of stratum size. As a result, the sampling error is of similar size in every stratum. This allocation deviates from the allocations described in section 2.3.3, which are optimal for estimation at the national level. The sample size in 2005 was 5000, whereas in the following years it fluctuated around 20000 (Statistics Netherlands, 2011e).

ISM survey design

The ISM has a more complex setup than its predecessors. The national sample is conducted by Statistics Netherlands, and some local authorities conduct extra samples at the local level. The target population of the ISM consists of the entire population aged 15 years and older. The data were collected from September until December on a yearly basis from 2008 onwards.

The data collection for the national sample conducted by Statistics Netherlands is set up as follows: first, the interviewees are given the opportunity to fill in the survey electronically using Computer Assisted Web Interviewing (CAWI) or on paper using Paper and Pencil Interviewing (PAPI). After two reminders, the non-responding interviewees are either called (CATI) or visited at their homes (CAPI), depending on whether Statistics Netherlands has their phone number. The sample size of the ISM as conducted by Statistics Netherlands has been constant throughout the years at roughly 19000 respondents.

Local authorities also contributed to the ISM by local oversampling. Oversampling occurs when extra samples are drawn in addition to the original sample. The sample size of the ISM conducted by the local authorities varied from year to year between 20000 and 180000. The amount of oversampling also varied between the Dutch municipalities within a particular year. In addition, the allocation of the respondents over the data collection modes changed from year to year. The data collection modes used by local authorities are CAWI, PAPI and CATI; CAPI is not used (Statistics Netherlands, 2011c).

The ISM had by far the largest sample size of all surveys considered, ranging from 40000 to 200000 respondents.

Concerning the sampling design, two-stage stratified sampling was used for the ISM, as described in section 2.3.4. The sample sizes were set equal for all strata, independent of stratum size. The oversampling by the local authorities has an effect on the variance of the HT estimator. As mentioned before, the HT estimator is designed in such a way that the closer the inclusion expectations $\pi_k$ are to being proportional to the target variables, the smaller its variance. However, the oversampling in the ISM was not spread proportionally over the entire population; it occurred only in some municipalities. In terms of the variance formula of (2.1), this means that $\mathrm{Var}(\hat{\bar{y}}_{HT})$ is not minimized, and therefore the HT estimator (2.1) does not benefit from a variance reduction in the case of the ISM.
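The role of proportionality between the inclusion expectations and the target variable can be illustrated with a toy calculation. The sketch below uses a hypothetical population of four units and enumerates all samples of fixed size $n = 2$. It only evaluates the HT estimator of the population total for each possible sample and does not specify a sampling design that realises the proportional inclusion expectations. The values and names are assumptions of the example.

```python
from itertools import combinations

# Hypothetical population of target values (all strictly positive).
y = [2.0, 4.0, 6.0, 8.0]
N, n = len(y), 2
Y_total = sum(y)


def ht_total(sample, pi):
    """Horvitz-Thompson estimate of the population total."""
    return sum(y[k] / pi[k] for k in sample)


# Inclusion expectations proportional to y (each must be at most 1).
pi_prop = [n * yk / Y_total for yk in y]
# Equal inclusion expectations, as under simple random sampling.
pi_equal = [n / N] * N

samples = list(combinations(range(N), n))
est_prop = [ht_total(s, pi_prop) for s in samples]
est_equal = [ht_total(s, pi_equal) for s in samples]

print(est_prop)                          # the true total for every sample, up to rounding
print(min(est_equal), max(est_equal))    # estimates vary from sample to sample
```

When $\pi_k$ is exactly proportional to $y_k$, every term $y_k/\pi_k$ is the same constant, so the estimate does not depend on which units are drawn. Oversampling that is unrelated to the target variable, as in some municipalities only, does not move the inclusion expectations towards this situation.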

Additionally, parallel to the ISM, the SM was also conducted between 2008 and 2010, with a smaller sample size of around 6000 respondents on a yearly basis. This adjusted SM was collected in the fourth quarter of the year (September-December) instead of the first quarter (January-March). Therefore, the SM collected in the period 2008-2010 will be referred to as $SM_{IV}$. By conducting $SM_{IV}$, Statistics Netherlands always had a backup survey and comparison material in case something went wrong with the ISM. In this paper $SM_{IV}$ is used as a parallel run for the ISM, which will be discussed in the section on state space models.

The estimation of the victimization rates for all surveys discussed in this subsection is done by the

generalized regression estimator discussed in section 2.4. Using this methodology, for every respondent

in the sample a weight is calculated such that the sum of the weighted target variables results in an approximately

unbiased estimator for the unknown population parameters. This so-called weighting scheme

accounts for the differences in response between respondents by using auxiliary variables such as age, sex,

income and data collection mode. For the exact weighting schemes the reader is referred to the yearly

National Reports on the crime surveys of Statistics Netherlands, such as Statistics Netherlands (2008)

for the SM 2008 and Statistics Netherlands (2010) for the ISM 2009.

3.2 Time series

In this subsection the time series analyzed in this paper are discussed in depth.

3.2.1 Data on victimization

As mentioned above, this paper uses the victimization series as collected by Statistics Netherlands from 1980 onwards. The victimization series are classified in five subcategories or breakdowns, given in table 2. The sum of the separate breakdowns does not equal the value for total victimization, due to the definition of victimization. Namely, if a respondent has been a victim of, e.g., both a property offense and vandalism, he or she is counted only once in total victimization. In this way, victims of multiple crimes are counted only once in the total. This means that the sum of all breakdowns is necessarily higher than or equal to the total value. The complete list of crimes belonging to each breakdown, as well as the questionnaires belonging to the respective surveys, can be found on www.cbs.nl/en-gb or in the national reports and report tables which are published yearly by Statistics Netherlands.
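The relation between the breakdowns and total victimization can be sketched with a few hypothetical respondents. The example below ignores survey weights and uses made-up answers; it only shows why the sum of the breakdown rates cannot be lower than the total rate.

```python
# Hypothetical respondents; each set lists the breakdowns in which the
# respondent reported at least one offense (empty set = not a victim).
respondents = [
    {"property", "vandalism"},
    {"violence"},
    set(),
    {"vandalism"},
    set(),
]
breakdowns = ["violence", "property", "vandalism"]
n = len(respondents)

# Rate per breakdown: share of respondents reporting that type of offense.
rates = {b: 100 * sum(b in r for r in respondents) / n for b in breakdowns}

# Total victimization: a respondent counts once, however many types occur.
total_rate = 100 * sum(len(r) > 0 for r in respondents) / n

print(rates, sum(rates.values()), total_rate)
```

The first respondent contributes to two breakdowns but only once to the total, which is exactly the source of the difference between the summed breakdowns and total victimization.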

The victimization rates are also classified in subpopulations, such that it is possible to investigate whether

there are any differences between population classes. Three classifications are represented in table 3. The

subpopulation classification used in this paper is the gender classification, since for the age subpopulation

the classes are different in the LPSS and PSLC and the urbanization subpopulation was not available.

For the ISM, the numbers of the subpopulations for the year 2010 are not available yet.

Table 2 – Breakdowns in types of victimization

| Dutch version | English translation |
|---|---|
| Geweldsdelicten | Violence offenses |
| Vermogensdelicten | Property offenses |
| Vandalisme | Vandalism |
| Doorrijden na een aanrijding | Failure to stop after an accident |
| Overige delicten* | Other offences* |

* Not available for the subpopulations in table 3

Table 3 – Subpopulations of victimization

| Subpopulation type | Number of classes | Classes |
|---|---|---|
| Gender | 2 | Male, Female |
| Urbanization | 5 | Very highly urban, Highly urban, Moderate urban, Little urban, Not urban |
| Age* | 8 | 15-18, 18-25, 25-35, 35-45, 45-55, 55-65, 65-75, 75+ |

* This subpopulation is different for the LPSS and the PSLC

For the CSV, the first survey design, the definition of the breakdowns turned out to differ substantially from the definition of the breakdowns in the other four surveys, such that it was not possible to compare them in any way. Therefore, it was decided to exclude the CSV from the analysis. Additionally, the ISM did not report the values for Failure to stop after an accident.


[Figure 1: Victimization data : 1992-2010. Vertical axis: victimization rate (%); horizontal axis: T (Years). Series shown: Violence, Property, Vandalism, Failure to stop, Other, Total and $SM_{IV}$. A vertical line represents the introduction of a new survey.]

[Figure 2: Total victimization series and police series : 1992-2010. Panel "Total victimization series": victimization rate (%) against T (Years). Panel "Police series": number of offenses per 1000 inhabitants against T (Years).]

Table 12 on the first page of the Appendix and figure 1 in this section show the victimization series from 1992 to 2010 with the breakdowns used in this paper. Each vertical line in figure 1 marks the introduction of a new survey: the PSLC in 1997, the SM in 2005 and the ISM in 2008. The parallel run of the $SM_{IV}$ with the ISM in 2008-2010 is displayed with dots. The series appears to jump at most of the points where a new survey is introduced.

3.2.2 Police series

Additionally, the Dutch police also provide data on victimization. The police series is not a survey but a registration: it is the number of criminal offenses reported at Dutch police stations, and therefore it does not contain any sampling error. The police series was redesigned in 2005 after a new computer system was introduced, which gives rise to a possible discontinuity as well. The police report the series as the number of offenses per 1000 inhabitants. It is available for the entire time span from 1992 to 2010. The series is shown together with the total victimization series in figure 2. Note the similar movements of both series and the rise of the police series in 2005, the year the series was redesigned.

3.3 Discontinuities

Looking at figure 1 it seems that, especially for total victimization, the value of victimization tends to increase every time a new survey is introduced. However, it is not clear whether these increases are due to the real development of the victimization variable or to the introduction of a new survey. As discussed in section 3.1, no two surveys considered in this paper have had exactly the same survey design. More specifically, several elements have varied across surveys, including the target population, the data collection mode, the exact period in which the survey was taken and whether or not the survey was combined with another survey. These factors could lead to the observed differences in figure 1 in the following ways.

1. Differences between target populations

The main difference in the target population in the series considered is the larger target population of the PSLC, starting from 12 years instead of 15 years old. As a result, the rates may differ because the target population is larger.

2. Differences in data collection mode

The differences in data collection modes are probably one of the main reasons behind the discontinuities. As shown by de Leeuw (2005) and Dillman and Christian (2005), so-called mode effects appear to have a large influence on the way the interviewee answers the questionnaire. For example, compared to CATI, the interview speed with CAPI is lower, which might result in a lower measurement error. Another example is that with electronic questionnaires, i.e. CAPI, CAWI and CATI, more advanced questionnaire setups are possible, because some questions can be skipped once the respondent has answered "No". This is not possible with PAPI. Furthermore, under CAPI fewer socially desirable answers are obtained due to the personal contact with the interviewer.

3. Differences between data collection periods

The data collection period varied across the surveys considered in this paper, ranging from a short

period of a couple of months (SM, ISM) to the entire year (LPSS, PSLC). Moreover, the SM has

been conducted from January to March, while the ISM is conducted from September till December.

Note that since the respondent is asked about criminal offenses in the past year, seasonal effects

should not be present.

4. Differences between context of surveys

An example of a survey redesign which could have an effect on the parameter values is the changeover

from the PSLC to the SM. The PSLC also contained topics not directly related to criminal

offenses like health care and living conditions, whereas the SM was explicitly focused on criminal

offenses. The context of the survey is changed, such that a victim of a criminal offense is more

likely to participate in the SM than in the PSLC. Note that this effect can be seen as a response

selection effect.

5. Differences in questionnaire design

In general, the questionnaire changed with every new survey design as well. This was less the case with the PSLC, where the questions in the JSM stayed more or less the same as in the LPSS. Continuing the previous example, the questionnaire of the PSLC focused on more general subjects, whereas in the SM only questions related to crime were asked. Additionally, with the ISM the questions on car theft and vandalism changed substantially compared to the SM. These redesigns might have systematic effects on the outcomes of these surveys.

Another example of a change in the questionnaire is the addition or deletion of questions in order to reduce so-called telescoping effects, which occur when respondents misplace past events in time. More specifically, if a person is asked about criminal offenses in the past 12 months and experienced a criminal offense 13 months ago, he or she might still report that offense. To minimize such telescoping effects, the first questions in the questionnaire should structure the respondent's mind, after which the questions about events in the last 12 months are asked.

4 Quantifying the discontinuities

The preceding effects of a survey redesign might result in discontinuities. Users of official statistics are usually interested not only in the value of a target parameter at one point in time, but also in its development over the years. When a series contains a discontinuity, estimates from before and after the redesign cannot be compared directly. Several methods have been developed to quantify these discontinuities.


The recalculation method can be applied if the micro data of the survey remain consistent after the redesign. This is the case, for example, if only the classification of publication domains of the victimization variable has changed. One can then still use the observed data and simply apply the new classification system using a domain indicator variable (van den Brakel et al., 2008).

However, in many real-life situations the micro data collected are not consistent after a redesign, for one of the reasons mentioned in section 3.3. In that case two other methods can be applied. One approach is to conduct an experiment, in which the regular and the new survey are run concurrently for some period. This allows the researcher to estimate the main survey parameters under both survey designs and to test whether these parameters differ significantly from each other. Another feature of the experimental approach is that, in case the new survey fails to produce an appropriate estimate for the parameter of interest, there is always a backup survey which still produces an estimate under the regular design. More on the experimental approach can be found in van den Brakel et al. (2008). A major drawback of the experimental approach is that two surveys have to be conducted, which is not always possible since statistical offices are generally bound by a limited budget.
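The comparison made in a parallel run can be sketched as a simple test on the difference between two estimates. The example below is a minimal illustration, not the analysis of van den Brakel et al. (2008): the victimization rates and standard errors are hypothetical, the two samples are assumed to be independent, and a normal approximation is used. The function name `discontinuity_test` is an assumption of the example.

```python
from math import erf, sqrt


def discontinuity_test(est_old, se_old, est_new, se_new):
    """Difference between two independent estimates with a two-sided z-test."""
    diff = est_new - est_old
    se_diff = sqrt(se_old ** 2 + se_new ** 2)   # independence assumed
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution function.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return diff, se_diff, z, p_value


# Hypothetical victimization rates (%) and standard errors from a parallel run.
diff, se_diff, z, p = discontinuity_test(est_old=25.0, se_old=0.6,
                                         est_new=27.0, se_new=0.4)
print(diff, se_diff, z, p)
```

Here `diff` plays the role of the estimated discontinuity. Its standard error depends on the sample sizes of both surveys, which is why a parallel run of sufficient size is costly.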

An alternative to the experimental approach is the structural time series approach. With this approach

no parallel run is necessary. This approach is discussed further in this section and will be used to analyze

the victimization series discussed in section 3.2.1.

First, section 4.1 will discuss the notations used in this section. Then, in section 4.2 the structural time

series framework and the corresponding state space representation as discussed by Durbin and Koopman

(2001) will be addressed. Section 4.3 will present the application of the Kalman filter and the parameter

estimation.

4.1 Notational conventions

The notation as discussed in section 2.2 and used in section 2 is applied in this section as well. Hence, uppercase letters refer to population parameters and lowercase letters refer to sample parameters. The circumflex, as in $\hat{y}$, is used to denote estimators.

In this section a special font is used to denote matrices only. For example, a k by k identity matrix is represented by $I_k$ and the system matrices in the state space model are represented by $Z_t$ and $T_t$. Exact definitions of these matrices will be provided at their first use in the discussion.

4.2 Time series analysis

4.2.1 Structural time series models for sampling surveys

Structural time series models as described by Durbin and Koopman (2001) propose that a series $y_t$ can be decomposed in a trend component $\mu_t$, a seasonal component $\iota_t$, other cyclic components $\gamma_t$ and
