# Tannenbaum Paper
## PISA vs. Survey Data
The Tannenbaum paper aimed to validate survey measures of social capital via their correlation with wallet reporting rates. Since PISA education scores are not explicit, direct measurements of social capital, it is not obvious that they could also serve to validate the survey measures referenced in the paper. As the table below shows, however, the correlations with the survey measures are surprisingly consistent between wallet reporting rates and PISA scores. This suggests that the two contain not only a similar amount of information about social capital but also a similar type of information.
import pandas as pd
import plotly.express as px
import plotly.io as pio

pio.templates.default = "plotly_white"
pio.renderers.default = "sphinx_gallery"

import statsmodels.api as sm

# Data import
survey_cols = [
    "general_trust",
    "GPS_trust",
    "general_morality",
    "MFQ_genmorality",
    "civic_cooperation",
    "GPS_posrecip",
    "GPS_altruism",
    "stranger1",
]
cat_cols = [
    "country",
    "response",
    "male",
    "above40",
    "computer",
    "coworkers",
    "other_bystanders",
    "institution",
    "cond",
    "security_cam",
    "security_guard",
    "local_recipient",
    "no_english",
    "understood_situation",
]
sc_cols = [
    "log_gdp",
    "log_tfp",
    "gee",
    "letter_grading",
]

# Import Tannenbaum data
df = pd.read_csv(
    "../data/tannenbaum_data.csv",
    dtype={col: "category" for col in cat_cols},
)

# Import PISA data
pisa = pd.read_csv("../data/pisa_data.csv").rename(columns={"mean_score": "pisa_score"})
df = df.merge(pisa, how="left", on="country")

# Columns we want to see correlations for.
cols_for_country_avg_corr = ["response", "pisa_score"] + survey_cols
df_corr = df.copy().astype({"response": int})

# Calculate country averages for these measures
country_avg_data = df_corr.groupby("country")[cols_for_country_avg_corr].mean()

# Compute the correlation matrix
comprehensive_corr_matrix = country_avg_data.corr()

# Show correlations of interest
comprehensive_corr_matrix.columns = pd.MultiIndex.from_product(
    [["Correlation (r)"], comprehensive_corr_matrix.columns]
)
comprehensive_corr_matrix.iloc[:2, 2:]
| Correlation (r) | general_trust | GPS_trust | general_morality | MFQ_genmorality | civic_cooperation | GPS_posrecip | GPS_altruism | stranger1 |
|---|---|---|---|---|---|---|---|---|
| response | 0.603736 | 0.023510 | 0.612047 | 0.461323 | 0.391755 | 0.050279 | -0.214705 | 0.645001 |
| pisa_score | 0.633428 | 0.122152 | 0.659558 | 0.364130 | 0.395210 | -0.156832 | -0.159437 | 0.665572 |
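One way to make the consistency claim concrete is to correlate the two rows of the table: if wallet reporting rates and PISA scores rank the survey measures similarly, the two correlation profiles should themselves be highly correlated. A minimal sketch, with the values hard-coded from the table above:

```python
import numpy as np

# Correlations of each survey measure with wallet reporting rates and with
# PISA scores, copied (rounded) from the table above, in column order:
# general_trust, GPS_trust, general_morality, MFQ_genmorality,
# civic_cooperation, GPS_posrecip, GPS_altruism, stranger1.
r_response = np.array([0.6037, 0.0235, 0.6120, 0.4613, 0.3918, 0.0503, -0.2147, 0.6450])
r_pisa = np.array([0.6334, 0.1222, 0.6596, 0.3641, 0.3952, -0.1568, -0.1594, 0.6656])

# Correlate the two correlation profiles: a value near 1 means both
# criteria agree on which survey measures track social capital.
profile_similarity = np.corrcoef(r_response, r_pisa)[0, 1]
print(f"{profile_similarity:.2f}")
```

The profile correlation comes out around 0.96, which matches the visual similarity of the two facet plots below.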
# Reshape dataframe for graphing ease.
df_reshaped = country_avg_data.reset_index().melt(
    id_vars=["country", "response", "pisa_score"]
)

# Calculate sample size for each survey measure and wallet report rates
ens_wallet = pd.DataFrame(
    {
        col: country_avg_data[["response", col]].dropna().shape[0]
        for col in survey_cols
    },
    index=["N"],
)

# Wallet report rate vs survey measure facet plot.
fig = px.scatter(
    df_reshaped,
    x="value",
    y="response",
    facet_col="variable",
    facet_col_wrap=4,
    trendline="ols",
    facet_col_spacing=0.06,
    facet_row_spacing=0.15,
)
fig.update_xaxes(showline=True, linecolor="darkgray")
fig.update_yaxes(showline=True, linecolor="darkgray")
fig.for_each_annotation(
    lambda a: a.update(
        text=a.text.split("=")[-1]
        + " (N="
        + str(ens_wallet.loc["N", a.text.split("=")[-1]])
        + ")"
    )
)
fig.show()
The facet plot above replicates Figure 3 from Tannenbaum and can be compared with the plot below, which puts PISA scores instead of wallet reporting rates on the y-axes.
# Calculate sample size for each survey measure and PISA
ens_pisa = pd.DataFrame(
    {
        col: country_avg_data[["pisa_score", col]].dropna().shape[0]
        for col in survey_cols
    },
    index=["N"],
)

# PISA vs Survey measure facet plot
fig = px.scatter(
    df_reshaped,
    x="value",
    y="pisa_score",
    facet_col="variable",
    facet_col_wrap=4,
    trendline="ols",
    facet_col_spacing=0.06,
    facet_row_spacing=0.15,
)
fig.update_xaxes(showline=True, linecolor="darkgray")
fig.update_yaxes(showline=True, linecolor="darkgray")
fig.for_each_annotation(
    lambda a: a.update(
        text=a.text.split("=")[-1]
        + " (N="
        + str(ens_pisa.loc["N", a.text.split("=")[-1]])
        + ")"
    )
)
fig.show()
## PISA as a Predictor of Economic and Institutional Performance
To address the second topic of the Tannenbaum paper, we ask: how well do PISA scores (compared with lost-wallet reporting rates) explain variation in economic development?

The second part of the paper uses four measures of "Economic and Institutional Performance": GDP per capita (log_gdp), productivity (log_tfp), government effectiveness (gee), and letter grade efficiency (letter_grading). If PISA scores are treated as a fifth measure of the same sort, wallet reporting rates turn out to be an equally effective predictor of them. When paired with any of the survey measures of social capital, the coefficient on wallet reporting rates is always statistically significant at p<0.01, and the fitted model's \(R^2\) exceeds that of most of the other models.
### Regression Results
The table below replicates part of Table 2 from Tannenbaum and adds two new columns (Model 9 and Model 10), which report the corresponding OLS results with PISA scores as the outcome variable. Models 7 and 8 are recreated here to show that the table is generated by the same process that produced Table 2 in Tannenbaum.
import pandas as pd
import statsmodels.api as sm
from great_tables import GT

# Data import
survey_cols = [
    "general_trust",
    "GPS_trust",
    "general_morality",
    "MFQ_genmorality",
    "civic_cooperation",
    "GPS_posrecip",
    "GPS_altruism",
    "stranger1",
]
econ_cols = [
    "log_gdp",
    "log_tfp",
    "gee",
    "letter_grading",
]
df = pd.read_csv(
    "../data/tannenbaum_data.csv",
)

# Add PISA data
pisa = pd.read_csv("../data/pisa_data.csv")
pisa = pisa.loc[pisa["year"] == 2015]
pisa = pisa.groupby("country")["pisa_score"].mean().reset_index()
df = df.merge(pisa, how="left", on="country")

# p-value stars to award to each parameter coef estimate.
def stars(p):
    if p > 0.1:
        return ""
    elif p > 0.05:
        return "*"
    elif p > 0.01:
        return "**"
    else:
        return "***"

# Run regression for each survey measure.
def get_model_results_no_pred(survey_measure, econ_measure):
    regression_df = (
        df.groupby("country")[[econ_measure, survey_measure]].mean().dropna()
    )
    y = regression_df[econ_measure]
    X = regression_df[[survey_measure]]
    # Standardize predictors
    X_std = (X - X.mean()) / X.std()
    X_std = sm.add_constant(X_std)
    model = sm.OLS(y, X_std)
    results = model.fit(cov_type="HC1")  # Robust standard errors, same as in Tannenbaum
    result_df = (
        pd.DataFrame(
            {
                "param": pd.Series(
                    [
                        f"{v:.3f}{stars(p)}"
                        for v, p in zip(results.params[1:], results.pvalues[1:])
                    ],
                    index=results.params.index[1:],
                ),
                "se": results.bse[1:].apply(lambda x: f"({x:.3f})"),
            }
        )
        .astype({"param": object, "se": object})
        .stack()
    )
    result_df.loc[("<i>N</i>", "")] = X.shape[0]
    result_df.loc[("<i>R</i><sup>2</sup>", "")] = f"{results.rsquared:.3f}"
    result_df = result_df.reset_index()
    result_df["measure"] = survey_measure
    return result_df

# Run regression for each survey measure with wallet reporting rate as an
# additional predictor.
def get_model_results(survey_measure, econ_measure):
    regression_df = (
        df.groupby("country")[["response", econ_measure, survey_measure]]
        .mean()
        .dropna()
    )
    y = regression_df[econ_measure]
    X = regression_df[[survey_measure, "response"]]
    # Standardize predictors
    X_std = (X - X.mean()) / X.std()
    X_std = sm.add_constant(X_std)
    model = sm.OLS(y, X_std)
    results = model.fit(cov_type="HC1")  # Robust standard errors, same as in Tannenbaum
    result_df = (
        pd.DataFrame(
            {
                "param": pd.Series(
                    [
                        f"{v:.3f}{stars(p)}"
                        for v, p in zip(results.params[1:], results.pvalues[1:])
                    ],
                    index=results.params.index[1:],
                ),
                "se": results.bse[1:].apply(lambda x: f"({x:.3f})"),
            }
        )
        .astype({"param": object, "se": object})
        .stack()
    )
    result_df.loc[("<i>N</i>", "")] = X.shape[0]
    result_df.loc[("<i>R</i><sup>2</sup>", "")] = f"{results.rsquared:.3f}"
    result_df = result_df.reset_index()
    result_df["measure"] = survey_measure
    return result_df

model_7_results = [
    get_model_results_no_pred(col, "letter_grading") for col in survey_cols
]
model_7 = pd.concat(model_7_results)
model_7 = model_7.rename(columns={0: "Model 7"})

model_8_results = [get_model_results(col, "letter_grading") for col in survey_cols]
model_8 = pd.concat(model_8_results)
model_8 = model_8.rename(columns={0: "Model 8"})

model_9_results = [get_model_results_no_pred(col, "pisa_score") for col in survey_cols]
model_9 = pd.concat(model_9_results)
model_9 = model_9.rename(columns={0: "Model 9"})

model_10_results = [get_model_results(col, "pisa_score") for col in survey_cols]
model_10 = pd.concat(model_10_results)
model_10 = model_10.rename(columns={0: "Model 10"})

# Combine results and make pretty.
display_df = (
    model_7.merge(model_8, on=["level_0", "level_1", "measure"], how="right")
    .merge(model_9, on=["level_0", "level_1", "measure"], how="left")
    .merge(model_10, on=["level_0", "level_1", "measure"], how="right")
    .iloc[:, [0, 2, 3, 4, 5, 6]]
)
# Blank out repeated row labels so each appears only once.
display_df.loc[:, "level_0"] = display_df.loc[:, "level_0"].where(
    display_df.loc[:, "level_0"] != display_df.loc[:, "level_0"].shift(), ""
)
(
    GT(display_df)
    .tab_header(title="TABLE 2.—PREDICTIVE VALUE OF WALLET REPORTING RATES")
    .tab_stub(rowname_col="level_0", groupname_col="measure")
    .tab_spanner(label="Letter grade efficiency", columns=["Model 7", "Model 8"])
    .tab_spanner(label="PISA Score", columns=["Model 9", "Model 10"])
    .tab_options(
        table_body_hlines_style="none",
    )
    .cols_align(align="center", columns=["Model 7", "Model 8", "Model 9", "Model 10"])
)
TABLE 2.—PREDICTIVE VALUE OF WALLET REPORTING RATES
(Models 7 and 8 predict letter grade efficiency; Models 9 and 10 predict PISA score.)

| | Model 7 | Model 8 | Model 9 | Model 10 |
|---|---|---|---|---|
| **general_trust** | | | | |
| general_trust | 0.077* | -0.013 | 26.665*** | 9.099 |
| | (0.041) | (0.040) | (4.461) | (6.632) |
| response | | 0.148*** | | 24.956*** |
| | | (0.050) | | (5.819) |
| N | 39 | 39 | 32 | 32 |
| R² | 0.078 | 0.263 | 0.455 | 0.656 |
| **GPS_trust** | | | | |
| GPS_trust | -0.016 | -0.018 | 3.309 | 4.477 |
| | (0.050) | (0.039) | (7.499) | (3.642) |
| response | | 0.125*** | | 31.574*** |
| | | (0.041) | | (3.851) |
| N | 36 | 36 | 29 | 29 |
| R² | 0.003 | 0.213 | 0.007 | 0.628 |
| **general_morality** | | | | |
| general_morality | 0.080** | -0.012 | 25.366*** | 9.868** |
| | (0.036) | (0.047) | (4.857) | (4.125) |
| response | | 0.150*** | | 25.321*** |
| | | (0.055) | | (3.718) |
| N | 38 | 38 | 32 | 32 |
| R² | 0.083 | 0.268 | 0.412 | 0.669 |
| **MFQ_genmorality** | | | | |
| MFQ_genmorality | 0.118*** | 0.069* | 13.271* | 2.465 |
| | (0.041) | (0.041) | (7.993) | (3.818) |
| response | | 0.107*** | | 31.432*** |
| | | (0.032) | | (3.616) |
| N | 35 | 35 | 31 | 31 |
| R² | 0.219 | 0.360 | 0.110 | 0.656 |
| **civic_cooperation** | | | | |
| civic_cooperation | 0.089* | 0.038 | 19.470*** | 2.093 |
| | (0.045) | (0.059) | (5.560) | (4.312) |
| response | | 0.130** | | 30.727*** |
| | | (0.054) | | (4.505) |
| N | 37 | 37 | 31 | 31 |
| R² | 0.100 | 0.283 | 0.235 | 0.633 |
| **GPS_posrecip** | | | | |
| GPS_posrecip | 0.009 | 0.003 | 3.255 | 1.411 |
| | (0.040) | (0.044) | (7.924) | (4.321) |
| response | | 0.125*** | | 31.325*** |
| | | (0.042) | | (3.888) |
| N | 36 | 36 | 29 | 29 |
| R² | 0.001 | 0.209 | 0.007 | 0.617 |
| **GPS_altruism** | | | | |
| GPS_altruism | -0.033 | -0.006 | 0.699 | 3.908 |
| | (0.037) | (0.037) | (8.108) | (5.177) |
| response | | 0.124*** | | 31.803*** |
| | | (0.040) | | (3.547) |
| N | 36 | 36 | 29 | 29 |
| R² | 0.015 | 0.209 | 0.000 | 0.625 |
| **stranger1** | | | | |
| stranger1 | 0.091** | 0.001 | 28.408*** | 12.071*** |
| | (0.036) | (0.062) | (4.186) | (4.536) |
| response | | 0.140** | | 23.480*** |
| | | (0.061) | | (4.237) |
| N | 39 | 39 | 32 | 32 |
| R² | 0.110 | 0.263 | 0.508 | 0.687 |