API Reference
This page documents the main public-facing functions shipped with twowaypanel. The focus is on (i) data loaders and generators used in the examples and replication materials, and (ii) the estimation interface.
For most applied users, the central function is twowaypanel.fit(). Since it is
introduced in detail in the Quick Start Tutorial, we provide only a concise technical
summary here and refer readers to the tutorial for intuition and recommended workflows.
Data generation for examples and testing
twowaypanel.demo.PanelGenData
- twowaypanel.demo.PanelGenData(N, T, seed='2025', model='logit', dynamic=0)
Generate an artificial panel dataset for nonlinear models with two-way fixed effects.
This helper is primarily intended for: (i) quick demonstrations in the Examples section, (ii) unit tests / sanity checks, and (iii) Monte Carlo experiments.
Parameters
- N : int
Number of individuals.
- T : int
Number of time periods.
- seed : str or int, default="2025"
Random seed used for reproducibility.
- model : {"logit", "probit", "mlogit", "ologit"}, default="logit"
Model class used for the data generating process.
- dynamic : {0, 1}, default=0
If dynamic=0, the regressors are strictly exogenous. If dynamic=1, the regressor matrix includes a lagged dependent variable (as an additional covariate), producing a specification with predetermined regressors.
Returns
- FEi : array_like
Individual-effect dummy matrix of dimension (N·T) × N (long form).
- FEt : array_like
Time-effect dummy matrix of dimension (N·T) × T (long form).
- FE : array_like
Stacked fixed-effect matrix of dimension (N·T) × (N+T), constructed by concatenating FEi and FEt.
- index_i : array_like
Long-form individual indices of dimension (N·T) × 1, taking values in {1, ..., N}.
- index_t : array_like
Long-form time indices of dimension (N·T) × 1, taking values in {1, ..., T}.
- Y : array_like
Outcome variable in long form of dimension (N·T) × 1.
- X : array_like
Regressors in long form.
Static model (dynamic=0): (N·T) × 1.
Dynamic model (dynamic=1): (N·T) × 2, where the first column is the lagged dependent variable and the second column is an exogenous/predetermined covariate.
- Ystar : array_like
Latent index in long form (when relevant to the chosen model), dimension (N·T) × 1.
- alphas0 : array_like
True individual fixed effects (DGP values), typically length N.
- gammas0 : array_like
True time fixed effects (DGP values), typically length T.
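The relationship between the index vectors and the dummy matrices listed above can be sketched with NumPy. This is an illustrative reconstruction, not the package's own code: the toy sizes and the individual-major row ordering (all periods of individual 1, then individual 2, and so on) are assumptions made for the example.

```python
import numpy as np

N, T = 3, 2  # toy panel size (illustrative values only)

# Long-form indices, ASSUMING individual-major ordering:
# rows are (i=1,t=1), (i=1,t=2), (i=2,t=1), ...
index_i = np.repeat(np.arange(1, N + 1), T).reshape(-1, 1)  # (N*T, 1), values 1..N
index_t = np.tile(np.arange(1, T + 1), N).reshape(-1, 1)    # (N*T, 1), values 1..T

# One-hot dummy matrices: each row has a single 1 in the column
# of its individual (FEi) or its time period (FEt).
FEi = (index_i == np.arange(1, N + 1)).astype(float)  # (N*T, N)
FEt = (index_t == np.arange(1, T + 1)).astype(float)  # (N*T, T)

# Stacked fixed-effect matrix, concatenated column-wise.
FE = np.hstack([FEi, FEt])  # (N*T, N+T)
```

Every row of FE sums to 2 in this construction, since each observation belongs to exactly one individual and one period.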
Notes
The returned objects are in long form (dimension proportional to N·T). In typical usage with twowaypanel.fit(), you will reshape the outcome and regressors into:
- Y as (N, T)
- X as (N, T, K)
where K is the number of covariates implied by the DGP.
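A minimal NumPy sketch of the long-to-array reshape follows. It uses synthetic arrays (Y_long and X_long are hypothetical names) and assumes the long-form rows are ordered individual by individual; the documentation does not state the ordering used by PanelGenData, so it is worth checking against index_i and index_t before reshaping real output.

```python
import numpy as np

N, T, K = 4, 3, 2  # toy dimensions (illustrative)

# Hypothetical long-form data, ASSUMED sorted individual-major.
index_i = np.repeat(np.arange(1, N + 1), T)
index_t = np.tile(np.arange(1, T + 1), N)

# Encode (i, t) in the value itself so the layout can be verified afterwards.
Y_long = (10 * index_i + index_t).astype(float).reshape(-1, 1)  # (N*T, 1)
X_long = np.column_stack([Y_long[:, 0], -Y_long[:, 0]])         # (N*T, K)

# Row-major reshape: each consecutive block of T rows becomes one individual.
Y = Y_long.reshape(N, T)
X = X_long.reshape(N, T, K)

# Entry [i, t] should come from individual i+1 in period t+1.
assert Y[2, 1] == 10 * 3 + 2
assert np.array_equal(X[2, 1], [32.0, -32.0])
```

If the long-form rows were instead ordered period by period, `Y_long.reshape(T, N).T` would produce the same (N, T) layout.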
Examples
Generate a dynamic ordered-logit panel and reshape it into the format expected by twowaypanel.fit():

    import twowaypanel

    # Simulate a dynamic ordered-logit panel with N=45 individuals and T=15 periods.
    FEi, FEt, FE, index_i, index_t, Y, X, Ystar, alphas0, gammas0 = twowaypanel.demo.PanelGenData(
        N=45, T=15, seed=10, model="ologit", dynamic=1
    )

    # Reshape from long form to (N, T) and (N, T, K); K=2 because dynamic=1.
    Y = Y.reshape(45, 15)
    X = X.reshape(45, 15, 2)

    res = twowaypanel.fit(Y, X, model="ologit", prior="Generic", algorithm="JML", cutoff0=-2.5, ape=True)
Empirical data loader
twowaypanel.database.angristevans98
- twowaypanel.database.angristevans98()
Load the processed panel dataset based on Angrist and Evans (1998), Children and their parents’ labor supply: Evidence from exogenous variation in family size (American Economic Review).
The dataset is pre-processed and packaged for immediate use in binary logit/probit examples. It is intended to demonstrate typical panel-data workflows and serves as a convenient benchmark dataset for the package.
Returns
- data : object
A processed dataset object (typically a pandas DataFrame) containing the variables needed to construct the panel outcome and regressors used in the examples and replication materials.
Notes
The repository contains the Stata preprocessing scripts and notes used to construct the packaged dataset. Please refer to the following folder in the GitHub repository:
twowaypanel/demo/application_angristevans98
Source repository: https://github.com/zizhongyan/twowaypanel
Users typically sort by individual and time indices and reshape the data into (N, T) and (N, T, K) arrays before calling twowaypanel.fit().
Examples
    import twowaypanel

    data = twowaypanel.database.angristevans98()

    # Example workflow (variable names depend on the packaged dataset):
    # data = data.sort_values(["id", "year"])
    # N = data["id"].nunique()
    # T = data["year"].nunique()
    # Y = data["lfp"].to_numpy().reshape(N, T)
    # X = data[["kids0_2", "kids3_5", "kids6_17", "ln_hus_inc"]].to_numpy().reshape(N, T, 4)
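The sort-and-reshape workflow can also be tried on a toy DataFrame without the packaged data. The sketch below is not part of twowaypanel: the column names ("id", "year", "y", "x1", "x2") and values are made up for illustration, and the balance check reflects the assumption that a plain reshape is only meaningful when every individual is observed in every period.

```python
import numpy as np
import pandas as pd

# Toy balanced panel with HYPOTHETICAL column names and random values.
rng = np.random.default_rng(0)
ids, years = [101, 102, 103], [1990, 1991]
df = pd.DataFrame([(i, t) for i in ids for t in years], columns=["id", "year"])
df["y"] = rng.integers(0, 2, size=len(df))
df["x1"] = rng.normal(size=len(df))
df["x2"] = rng.normal(size=len(df))
df = df.sample(frac=1.0, random_state=0)  # shuffle rows to mimic unsorted input

# Sort so rows run individual by individual, then period by period.
df = df.sort_values(["id", "year"]).reset_index(drop=True)
N, T = df["id"].nunique(), df["year"].nunique()

# Guard: reshape would silently misalign rows in an unbalanced panel.
periods_per_id = df.groupby("id")["year"].nunique()
if len(df) != N * T or not (periods_per_id == T).all():
    raise ValueError("panel is unbalanced; cannot reshape to (N, T)")

x_cols = ["x1", "x2"]
Y = df["y"].to_numpy().reshape(N, T)                  # (N, T)
X = df[x_cols].to_numpy().reshape(N, T, len(x_cols))  # (N, T, K)
```

After this step, Y[i, t] and X[i, t, :] refer to the same individual and period, which is the alignment that the (N, T) and (N, T, K) inputs rely on.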
Estimation interface
twowaypanel.fit
- twowaypanel.fit(Y, X=None, model=None, prior=None, lag=0, ac=False, algorithm='JML', X_names=None, sv=None, silent=False, ape=True, cutoff0=0, mcmc_iters=16000, mcmc_burnin=1000, mcmc_skipsize=2, mcmc_timer=1, mcmc_sv_mle=True, mcmc_diagnosis=True, beta_variance=None, fe_variance=0.3, block_size=8)
Fit nonlinear panel data models with two-way (individual and time) fixed effects, with optional likelihood-based and/or analytical bias correction.
Quick pointer.
For a user-oriented description of the inputs, options, and outputs, please see the Quick Start Tutorial section of this documentation.
For complete argument descriptions in Python, run help(twowaypanel.fit) in an interactive session.
Technical notes
- The function expects Y in (N, T) form and X in (N, T, K) form.
- Under algorithm="JML", likelihood-based correction is implemented as a penalty term in the objective.
- Under algorithm="MCMC", the routine runs Metropolis–Hastings sampling and can report convergence diagnostics (e.g., Geweke-type checks and trace, histogram, and ACF plots).
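To give a sense of what a Geweke-type check measures, here is a simplified stand-alone sketch. It is not the package's diagnostic: geweke_z is a hypothetical helper, the variances are plain sample variances that ignore autocorrelation (the standard diagnostic uses spectral-density estimates), and reading mcmc_burnin and mcmc_skipsize as "discard the first draws" and "keep every k-th draw" is an assumption.

```python
import numpy as np

def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke-type z-score comparing early and late chain segments.

    ASSUMPTION: uses plain sample variances, which ignore autocorrelation.
    """
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    a = chain[: int(first * n)]     # early segment (first 10% by default)
    b = chain[n - int(last * n):]   # late segment (last 50% by default)
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return (a.mean() - b.mean()) / se

rng = np.random.default_rng(0)
draws = rng.normal(size=16000)  # stand-in for raw sampler output
kept = draws[1000::2]           # ASSUMED: drop 1000 burn-in draws, keep every 2nd

z_stationary = geweke_z(kept)   # stable chain: z should be moderate
z_drifting = geweke_z(kept + np.linspace(0.0, 3.0, kept.size))  # trending chain
```

A z-score of large magnitude, as in the drifting case, suggests that the early and late parts of the chain have different means and that the sampler may not have converged.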