Title: | Simulation of Populations by Sampling Waiting-Time Distributions |
---|---|
Description: | Generates lifespans and fertility histories in continuous time using individual-level state transition (multi-state) models and data from the Human Mortality Database and the Human Fertility Database. To facilitate virtual population analysis, data on virtual individuals are stored in a data structure commonly used in sample surveys. Life histories are generated for multiple generations. The genealogies that result facilitate the study of family ties. |
Authors: | Frans Willekens [aut, cre] , Tim Riffe [ctb] |
Maintainer: | Frans Willekens <[email protected]> |
License: | GPL-2 |
Version: | 1.0.3 |
Built: | 2024-11-02 03:42:55 UTC |
Source: | https://github.com/willekens/virtualpop |
Individual fertility histories
Children(dat0, rates)
dat0 |
Data frame with base individual data on members of virtual population |
rates |
Mortality and fertility rates. The object 'rates' is produced by Getrates_refyear.R |
List object with two objects: (a) data frame with individual info and fertility history of egos and (b) children data frame
Frans Willekens
utils::data(dataLH)
utils::data(rates)
dat0 <- dataLH[1:10,]
out <- Children(dat0=dat0,rates=rates)
simulated population of four generations
A data frame with data on 29954 individuals (10000 in initial cohort).
Identification number
Generation
Sex. A factor with levels Males and Females
Date of birth (decimal date)
Date of death (decimal date)
Age at death (decimal number)
ID of partner
ID of mother
ID of father
Child's line number in the household
Number of children ever born
ID of first child
ID of 2nd child
ID of 3rd child
ID of 4th child
ID of 5th child
ID of 6th child
ID of 7th child
ID of 8th child
ID of 9th child
Age of mother at birth of first child
Age of mother at birth of 2nd child
Age of mother at birth of 3rd child
Age of mother at birth of 4th child
Age of mother at birth of 5th child
Age of mother at birth of 6th child
Age of mother at birth of 7th child
Age of mother at birth of 8th child
Age of mother at birth of 9th child
Simulation uses period mortality rates and fertility rates by birth order from the United States 2019. The data are downloaded from the Human Mortality Database (HMD) and the Human Fertility Database (HFD).
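Because each record carries the ID of the mother, family ties across generations can be traced with ordinary lookups. The following is a minimal sketch on a toy data frame; the column names `ID`, `gen` and `IDmother` are assumptions made for illustration and may differ from the names used in dataLH.

```r
# Toy genealogy with hypothetical column names (not taken from dataLH)
d <- data.frame(
  ID       = 1:6,
  gen      = c(1, 1, 2, 2, 3, 3),
  IDmother = c(NA, NA, 1, 1, 3, 4)
)

# Maternal grandmother: look up the mother's record, then read her mother's ID.
# match() returns NA when the mother is not in the data (first generation).
d$IDgrandmother <- d$IDmother[match(d$IDmother, d$ID)]

d
```

Individuals 5 and 6 are cousins in this toy example: they have different mothers (3 and 4) but share the maternal grandmother with ID 1.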
dpopus data
Population of the United States in 2019 reported in the HMD (Population.txt file)
A data frame with 111 age groups (single years of age).
Female population
Male population
The data are downloaded from the Human Mortality Database (HMD). Country: USA. Year: 2019
Reads data from the HMD and HFD
GetData(country, user, pw_HMD, pw_HFD)
country |
country |
user |
Name of the user, used at registration with the HMD and HFD. It is assumed that the same name is used for both HMD and HFD. |
pw_HMD |
Password to access HMD, provided at registration |
pw_HFD |
Password to access HFD, provided at registration |
data_raw |
5 objects: country, life tables females, life tables males, fertility rates, female population (from HFD): exposures |
Frans Willekens
## Not run:
dataLH <- GetData(country="USA",user,pw_HMD,pw_HFD)
Creates database 'dataLH' from mortality rates by age and sex, and fertility rates by age of mother and birth order
GetGenerations( rates, ncohort, ngen, age_end_perc = NULL, iages = NULL, ID1 = NULL )
rates |
List object with death rates (ASDR) and birth rates (ASFR) |
ncohort |
Size of hypothetical birth cohort |
ngen |
Number of generations to be simulated |
age_end_perc |
If age_end_perc is not missing (NULL), then the simulated ages at death are replaced by the age distribution given by age_end_perc. The age distribution is a matrix with 2 columns, one for females (column 1) and one for males (column 2). The distribution is given by single years of age. |
iages |
If iages is not missing, the vector of simulated ages at death is replaced by the vector of individual ages at censoring |
ID1 |
Identification number of first person in virtual population being created (optional) |
age_end_perc or iages are used to simulate ages at censoring. For instance, to compare the virtual population with a real population for which information is collected retrospectively in a cross-sectional survey, the simulation window must be equal to the observation window. In other words, the virtual population and the real population must have the same censoring.
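The effect of censoring can be illustrated on a few made-up numbers. This is a generic sketch of right-censoring at the survey date, not the internals of GetGenerations; the vectors `age_death` and `age_cens` are hypothetical.

```r
# Hypothetical simulated ages at death and ages at the survey date
age_death <- c(82.4, 35.1, 67.9, 91.0)
age_cens  <- c(45, 45, 30, 60)

# The life history is observed only up to the earlier of death and censoring
age_end  <- pmin(age_death, age_cens)

# TRUE when the individual is still alive at the survey (death not observed)
censored <- age_death > age_cens

data.frame(age_death, age_cens, age_end, censored)
```

Events that would occur after `age_end` fall outside the observation window and are therefore absent from both the real and the virtual population.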
dataLH |
The database of simulated individual lifespans and fertility histories. The object 'dataLH' has two attributes: (a) the calendar year of period rates and (b) the country |
Frans Willekens
# The object rates is produced by the function GetRates.
utils::data(rates)
dLH <- GetGenerations(rates=rates,ncohort=100,ngen=4)
(a) Retrieves rates, the period life tables and the period fertility tables. (b) Computes death rates by age and sex, and birth rates by age and birth order.
GetRates(data, refyear)
data |
data |
refyear |
Reference year, which is the year of period data |
The user needs to register as a new user before data can be downloaded. To register with HMD, go to https://www.mortality.org. To register with HFD, go to https://www.humanfertility.org/cgi-bin/main.php.
ASDR |
Age-specific death rates, by sex (for reference year or all years) |
ASFR |
Age-specific birth rates by birth order (for reference year or all years) |
e0 |
REMOVE |
To access the HMD and HFD, the function uses the HMDHFDplus package, written by Tim Riffe and others at the Max Planck Institute for Demographic Research, Rostock, Germany.
Frans Willekens
## Not run:
ratesR <- GetRates(data,refyear)
Computes cumulative hazard at duration t from age-specific demographic rates.
H_pw(t, breakpoints, rates)
t |
Duration at which cumulative hazard is required. |
breakpoints |
Breakpoints: values of x at which piecewise-constant rates change. |
rates |
Piecewise-constant rates |
Cumulative hazard at duration t
Frans Willekens
Function H_pw is called by pw_root, which is called by r.pw_exp.
breakpoints <- c(0, 10, 20, 30, 60)
rates <- c(0.01,0.02,0.04,0.15)
z <- H_pw(t=0:40, breakpoints=breakpoints, rates=rates)
utils::data(rates)
ages <- as.numeric(rownames(rates$ASDR))
breakpoints <- c(ages,120)
zz <- H_pw(t=ages, breakpoints=breakpoints, rates=rates$ASDR[,1])
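For readers who want to see what a cumulative hazard under piecewise-constant rates amounts to, the following is a minimal stand-alone sketch. The function name `cumhaz_pw` is hypothetical and the code is not the package's H_pw; it only assumes that the rate in each interval is multiplied by the time spent in that interval and the products are summed.

```r
# Cumulative hazard for a piecewise-constant hazard (illustrative sketch).
# breakpoints: increasing interval limits, one more element than rates
# rates: hazard rate within each interval
cumhaz_pw <- function(t, breakpoints, rates) {
  lower <- breakpoints[-length(breakpoints)]
  upper <- breakpoints[-1]
  sapply(t, function(ti) {
    # time spent in each interval up to duration ti
    exposure <- pmin(pmax(ti - lower, 0), upper - lower)
    sum(rates * exposure)
  })
}

breakpoints <- c(0, 10, 20, 30, 60)
rates <- c(0.01, 0.02, 0.04, 0.15)
H <- cumhaz_pw(c(0, 10, 25, 40), breakpoints, rates)
S <- exp(-H)  # corresponding survival probabilities
```

At duration 25, for example, the cumulative hazard is 10 × 0.01 + 10 × 0.02 + 5 × 0.04 = 0.5.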
Simulates length of life using age-specific death rates. Generates date of death and age at death. The function uses the rpexp function from the package msm and uniroot of base R.
Lifespan(data, ASDR)
data |
Data frame with individual data |
ASDR |
Age-specific death rates |
data: data frame 'dataLH' with date of death and age of death completed.
Frans Willekens
utils::data(dataLH)
utils::data(rates)
z <- Lifespan(dataLH[1:5,],ASDR=rates$ASDR)
Randomly allocates partners to egos
Partnership(dLH)
dLH |
Database |
Updated version of database (dLH), which includes the IDs of partners.
Frans Willekens
utils::data(dataLH)
dLH <- dataLH[1:10,]
# Remove current partner
dLH$IDpartner <- NA
d <- Partnership(dLH=dLH)
# NOTE: partners are randomly selected from the individuals documented in dLH.
Equation: cumulative hazard function + log(uu) = 0
pw_root(t, breakpoints, rates, uu)
t |
Vector of durations to be considered in determining root. |
breakpoints |
Breakpoints |
rates |
Piecewise-constant rates |
uu |
Random draw from standard uniform distribution. |
The function is called by function uniroot (base R), which is called by r.pw_exp
Vector of differences between cumulative hazard and -log(uu) for different values of t.
Frans Willekens
Functions H_pw and r.pw_exp
breakpoints <- c(0, 10, 20, 30, 60)
rates <- c(0.01,0.02,0.04,0.15)
z <- pw_root(t=c(10,18.3,23.6,54.7),breakpoints,rates,uu=0.43)
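The role of the equation H(t) + log(uu) = 0 can be sketched without the package: the waiting time associated with a uniform draw uu is the duration at which the residual changes sign. The helper names `cumhaz_pw` and `residual_pw` below are hypothetical stand-ins, not the package's H_pw and pw_root.

```r
# Cumulative hazard under piecewise-constant rates (hypothetical helper)
cumhaz_pw <- function(t, breakpoints, rates) {
  lower <- breakpoints[-length(breakpoints)]
  upper <- breakpoints[-1]
  sapply(t, function(ti) sum(rates * pmin(pmax(ti - lower, 0), upper - lower)))
}

# Residual whose root is the simulated waiting time
residual_pw <- function(t, breakpoints, rates, uu) {
  cumhaz_pw(t, breakpoints, rates) + log(uu)
}

breakpoints <- c(0, 10, 20, 30, 60)
rates <- c(0.01, 0.02, 0.04, 0.15)
uu <- 0.43

# The residual is negative at t = 0 and positive at t = 60, so a root exists
root <- uniroot(residual_pw, interval = c(0, 60),
                breakpoints = breakpoints, rates = rates, uu = uu,
                tol = 1e-10)$root
```

Here -log(0.43) is about 0.844, which exceeds the cumulative hazard of 0.7 reached at duration 30, so the root lies in the last interval, a little below 31.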
Takes n random draws from a piecewise-constant exponential distribution.
r.pw_exp(n, breakpoints, rates)
n |
Number of random draws required |
breakpoints |
Breakpoints in piecewise-constant exponential distribution |
rates |
Piecewise-constant rates |
Vector of waiting times, drawn from piecewise-exponential survival function.
Frans Willekens
breakpoints <- c(0, 10, 20, 30, 60)
rates <- c(0.01,0.02,0.04,0.15)
pw_sample <- r.pw_exp(n=10, breakpoints, rates=rates)
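As a cross-check on sampled waiting times, the piecewise-exponential distribution can also be inverted in closed form instead of numerically. The sketch below uses that alternative technique, not the uniroot-based approach of r.pw_exp; the function name `invert_pw` is hypothetical, and the code assumes that the last rate continues to apply beyond the final breakpoint.

```r
# Closed-form inverse of the piecewise-exponential survival function.
# u: uniform draws in (0, 1]; returns the duration t with H(t) = -log(u)
invert_pw <- function(u, breakpoints, rates) {
  lower <- breakpoints[-length(breakpoints)]
  # cumulative hazard at the start of each interval
  H_start <- c(0, cumsum(rates * diff(breakpoints)))[seq_along(rates)]
  target <- -log(u)
  # interval in which the target cumulative hazard is reached
  k <- findInterval(target, H_start)
  # assumption: the last rate is extrapolated past the final breakpoint
  lower[k] + (target - H_start[k]) / rates[k]
}

breakpoints <- c(0, 10, 20, 30, 60)
rates <- c(0.01, 0.02, 0.04, 0.15)

set.seed(123)
draws <- invert_pw(runif(10), breakpoints, rates)
```

Because the hazard is constant within intervals, each draw only requires locating the interval and solving a linear equation, which makes this form convenient for validating a numerical sampler on large samples.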
Mortality rates by age and sex; fertility rates by age and birth order
A list of three objects.
Mortality rates
Fertility rates
Multistate transition rates
The data are downloaded from the Human Mortality Database (HMD) and the Human Fertility Database (HFD). Country: USA. Year: 2019
The function is called from the function Children. It uses the rpexp function of the msm package.
Sim_bio(datsim, ratesM)
datsim |
Data frame with individual data |
ratesM |
Multistate transition rates in standard (multistate) format |
age_startSim |
Age at start of simulation |
age_endSim |
Age at end of simulation |
nstates |
Number of states |
path |
path: sequence of states occupied |
ages_trans |
Ages at transition |
Frans Willekens
# Generates single fertility history from mortality rates by age
# and fertility rates by age and parity
# Fertility history is simulated from starting age to ending age
# Individual starts in state "par0"
# ratesM is an object with the rates in the proper format for multistate analysis
utils::data(rates)
popsim <- data.frame(ID=1,born=2000.450,start=0,end=80,st_start="par0")
ch <- Sim_bio(datsim=popsim,ratesM=rates$ratesM)