In this section we explain the concepts that are used in the documentation of the Variables API. Understanding these will help you better understand the API.
Variables & Domains
By a variable we mean a function, i.e. something that operates on a set of inputs and, for each element in the input set, returns an output value.
Examples of variables are:
- Age of a person
- Income of household
- Street name of an address
- Industry of a company
In maths, the input set of a function is called the domain of the function, and this is also the term we will adopt here.
This is an important concept. Take, for example, a variable like the age of a person. This variable is defined both by what it returns (here the age) and by its domain, in this case the set of persons. The age of a company and the age of a person are different variables. So defining the domain as part of the variable is essential - it tells what the variable "applies to".
Examples of domains are persons, addresses, companies and so on.
We use the term domain element when we want to talk about a particular object within a domain. (Again we adopt the term from maths - a domain is a set, and a set contains elements - hence domain elements.) So in the domain of all Danish persons, a specific person is a domain element of that particular domain.
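To make the idea concrete, here is a minimal sketch of a variable as a function paired with its domain. The class names, the fields and the reference year are hypothetical and chosen only for illustration; they are not part of the Variables API.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

E = TypeVar("E")  # type of a domain element, e.g. a person or a company
V = TypeVar("V")  # type of the value the variable returns


@dataclass(frozen=True)
class Person:
    name: str
    age: int


@dataclass(frozen=True)
class Company:
    name: str
    founded_year: int


@dataclass(frozen=True)
class Variable(Generic[E, V]):
    """A variable is a function together with the domain it applies to."""
    name: str
    domain: str  # hypothetical label for the input set
    func: Callable[[E], Optional[V]]

    def value_for(self, element: E) -> Optional[V]:
        # Apply the underlying function to one domain element.
        return self.func(element)


REFERENCE_YEAR = 2020  # arbitrary year, used only for this illustration

# Same name, different domains: these are two different variables.
age_of_person = Variable("age", "persons", lambda p: p.age)
age_of_company = Variable(
    "age", "companies", lambda c: REFERENCE_YEAR - c.founded_year
)
```

Here a single Person or Company object plays the role of a domain element, and the two variables differ only because their domains differ.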
Aggregated Values & Data Levels
When working with demographic variables, data for those variables is often not available on a 1-1 level, but only as aggregated values. For example, we cannot provide information about the income of a specific person, but we often have the average income of a group of persons or the income distribution of a group of persons. We call such a distribution or average an aggregated value.
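As a rough illustration of the two kinds of aggregated values mentioned above, the sketch below computes an average and a distribution for a group of incomes. The function names, the bracket size and the income figures are made up for this example and do not reflect how the data is actually produced.

```python
from collections import Counter
from statistics import mean
from typing import Dict, Sequence


def average_income(incomes: Sequence[int]) -> float:
    """Average income of a group: a single aggregated value."""
    return mean(incomes)


def income_distribution(incomes: Sequence[int], bracket_size: int) -> Dict[int, float]:
    """Share of the group falling in each income bracket.

    Each bracket is identified by its lower bound.
    """
    counts = Counter((income // bracket_size) * bracket_size for income in incomes)
    total = len(incomes)
    return {bracket: count / total for bracket, count in sorted(counts.items())}


# Invented incomes for a small group of persons.
group = [200_000, 250_000, 300_000, 450_000]
group_average = average_income(group)
group_distribution = income_distribution(group, bracket_size=100_000)
```

Neither result reveals the income of a specific person; both describe the group as a whole.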
When providing an aggregated value, it is interesting to know what kind of grouping was used to produce it. E.g. was it the average income for all persons in a municipality, or was it the average income for a group of 50 persons living on the same street? The latter would naturally be a better value to work with than the former, since it describes a smaller and more local group. We call this grouping the data level of the aggregated value.
Examples of data levels that are being used are:
- Street side intervals (i.e. a range of houses lying next to each other on the same side of a street)
- Cells, e.g. 100m squares.
Data levels will differ from variable to variable and from country to country. But when defining a variable the data level is part of its definition.
Sometimes a variable can have data on more than one data level. In this case, when a value for a variable is returned, the most specific among all applicable data levels will be used for retrieving the value. Through the API it is always possible to see which data level was used for retrieving a particular value of a variable.
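The selection rule described above can be sketched as follows. This is only an illustration under stated assumptions: the numeric specificity rank, the level names and the values are hypothetical, and the API may determine specificity in a different way.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LevelValue:
    data_level: str
    specificity: int  # hypothetical rank: higher means a smaller, more specific group
    value: float


def most_specific(candidates: Sequence[LevelValue]) -> Optional[LevelValue]:
    """Return the value from the most specific applicable data level, if any."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.specificity)


# Invented values for one domain element that has data on two levels.
candidates = [
    LevelValue("municipality", 1, 310_000.0),
    LevelValue("cell_100m", 2, 342_500.0),
]
chosen = most_specific(candidates)
```

Returning the whole LevelValue rather than just the number keeps the data level visible to the caller, mirroring the fact that the data level used can always be seen together with the value.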
Note that a variable can, when being updated, change its data levels. This can occur due to changes in the way our data provider delivers data to us, or because we find better clustering methods that improve the accuracy of the data.
As mentioned above, a variable can be considered a function which, based on some input, returns a value - the output.
Below is a table of the different value types that exist for our variables.
If a value exists, then this is either true or false.
A variable of type category has a collection of categories, and it will return one of those categories as the value for a specific element of the domain. (It can also return null if no category applies to an element.)
A category has an id and a name and optionally a description, and these can always be seen as part of the definition of a variable. Examples of category variables are conzoom type, age factor and municipality.
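A category variable could be modelled roughly as below. The class names, the string ids and the lookup by postal code are assumptions made for this sketch; the actual representation in the API may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class Category:
    id: str  # the id type is an assumption; the API may use numbers instead
    name: str
    description: Optional[str] = None  # the description is optional


class CategoryVariable:
    """A variable whose value is one of a fixed collection of categories."""

    def __init__(
        self,
        name: str,
        categories: Iterable[Category],
        assign: Callable[[object], Optional[str]],
    ) -> None:
        self.name = name
        # The categories are part of the definition of the variable.
        self.categories: Dict[str, Category] = {c.id: c for c in categories}
        self._assign = assign  # maps a domain element to a category id or None

    def value_for(self, element: object) -> Optional[Category]:
        category_id = self._assign(element)
        if category_id is None:
            return None  # no category applies to this element
        return self.categories.get(category_id)


# Illustrative categories and lookup table, invented for this example.
municipality = CategoryVariable(
    "municipality",
    [Category("A", "Municipality A"), Category("B", "Municipality B", "Example")],
    assign=lambda address: {"1000": "A", "8000": "B"}.get(address),
)
```

Calling municipality.value_for("8000") returns the Category with id "B", while an element with no matching category yields None, which corresponds to the null value mentioned above.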
This can be a date, month or a year, depending on the accuracy of the variable.
A fraction is always a number between 0 and 1.
This is simply an unformatted text string. The maximum length of the text string is defined by the variable.
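The constraints on fractions and text strings can be expressed as simple checks. The helper names below are hypothetical, and treating the bounds 0 and 1 as inclusive is an assumption; the sketch only shows what a client might verify, not how the API validates values.

```python
def check_fraction(value: float) -> float:
    """Accept a fraction only if it lies between 0 and 1 (assumed inclusive)."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"fraction out of range: {value}")
    return value


def check_text(value: str, max_length: int) -> str:
    """Accept a text value only if it respects the maximum length of the variable."""
    if len(value) > max_length:
        raise ValueError(f"text longer than {max_length} characters")
    return value
```

The max_length argument stands in for the maximum length that each text variable defines for itself.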