Introduction

In the paper “What Do Cross-country Surveys Tell Us about Social Capital?”, Tannenbaum et al. use the Wallet Return Dataset as a direct measure of civic honesty to investigate two types of indirect social capital measures. First, they analyze lost wallet reporting rates and their correlation with survey measures of social capital, showing the quantitative extent to which survey measures contain legitimate information about social capital. Second, they show that lost wallet reporting rates may be used as effective predictors of “Economic and Institutional Performance”, confirming social capital’s economic explanatory value.

I became curious about how educational assessment data would relate to these findings. The Programme for International Student Assessment (PISA) measures the effectiveness of national education programs by testing 15-year-olds, and is a standard dataset for comparing education outcomes between countries. Surprisingly, PISA scores (averaged across reading, math, and science for each country) were very strongly correlated with lost wallet reporting rates. This bears on both aims of the Tannenbaum paper:

  1. PISA scores correlated with survey measures of social capital in largely the same manner as lost wallet reporting rates.

  2. Lost wallet reporting rates proved to be as strong a predictor of PISA scores as any of the other measures of “Economic and Institutional Performance”.

Also, for kicks, I tried to predict whether a wallet would be returned using machine learning techniques. Accuracy reached around 0.7, which is what one would expect given that most of the data was at the country level. There likely isn’t enough individual-level information to increase accuracy much further.
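
To make the country-level argument concrete, here is a minimal sketch of the accuracy ceiling a classifier faces when every wallet in a country shares the same features. In that case the best rule simply predicts each country's majority outcome. The function name, the rates, and the counts below are hypothetical and are not taken from the Wallet Return Dataset.

```python
def country_level_accuracy_ceiling(return_rates, counts):
    """Best achievable accuracy if features are identical within each country.

    return_rates: dict mapping country -> proportion of wallets returned (0..1)
    counts: dict mapping country -> number of wallets dropped
    """
    total = sum(counts.values())
    # Predicting the majority outcome in a country is correct for a
    # share max(p, 1 - p) of that country's wallets.
    correct = sum(counts[c] * max(p, 1 - p) for c, p in return_rates.items())
    return correct / total


# Hypothetical rates and counts, for illustration only
rates = {"A": 0.9, "B": 0.6, "C": 0.3}
counts = {"A": 100, "B": 100, "C": 100}
ceiling = country_level_accuracy_ceiling(rates, counts)
print(f"accuracy ceiling: {ceiling:.3f}")
```

Countries with return rates near 50% contribute the most error, so any gain beyond this ceiling has to come from wallet-level or location-level features.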

Preliminary Inspection of PISA Data

Wallet reporting rates (proportion of ‘100’ responses) were calculated per country and joined with the PISA data.

import pandas as pd
import plotly.express as px
import plotly.io as pio

pio.templates.default = "plotly_white"
pio.renderers.default = "sphinx_gallery"

import statsmodels.api as sm

# Import Tannenbaum data
df = pd.read_csv(
    "../data/tannenbaum_data.csv",
)

# Import PISA data
pisa = pd.read_csv("../data/pisa_data.csv")
pisa = pisa.loc[pisa['year'] == 2015]
pisa = pisa.groupby('country')['pisa_score'].mean().reset_index()

df = df.merge(pisa, how="left", on="country")
wallet_pisa = df.groupby(["country"])[["response", "pisa_score"]].mean()

# Calculate sample size
sample_size = wallet_pisa[["response", "pisa_score"]].reset_index().dropna().shape[0]

# Scatter plot
fig_scatter_pisa_wallet = px.scatter(
    wallet_pisa.reset_index(),
    x="response",
    y="pisa_score",
    hover_data=["country"],
    title="PISA 2015 Average Score vs. Wallet Return Rate by Country",
    subtitle=f"r={wallet_pisa.corr().iloc[0,1]:.3f}" + ", N=" + str(sample_size),
    trendline="ols",
)
fig_scatter_pisa_wallet.update_xaxes(showline=True, mirror=True, linecolor="darkgray")
fig_scatter_pisa_wallet.update_yaxes(showline=True, mirror=True, linecolor="darkgray")

fig_scatter_pisa_wallet.show()

There is a strong positive correlation between PISA scores and wallet reporting rates (r = 0.799). This is greater than the correlation with any other “Economic and Institutional Performance” measure, as seen later in a correlation matrix plot. This suggests a close link between societal honesty (as measured by wallet returns) and educational outcomes.

Since PISA scores and lost wallet return rates are so closely correlated, it isn’t surprising that they relate similarly to other variables, as the next section shows.
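
Because the correlation is computed over countries rather than individual wallets, the sample is small, and it can help to put an interval around r. The following is a rough sketch using the standard Fisher z-transform. The function name is hypothetical, and n = 35 is only a placeholder: substitute the `sample_size` value computed in the cell above.

```python
import math
from statistics import NormalDist


def pearson_r_confidence_interval(r, n, level=0.95):
    """Approximate confidence interval for a Pearson r via the Fisher z-transform."""
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)                # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)      # standard error on the z scale
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    # Transform the interval endpoints back to the correlation scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# r is the value reported above; n = 35 is a placeholder, not the real sample size
low, high = pearson_r_confidence_interval(0.799, 35)
print(f"95% CI for r: [{low:.3f}, {high:.3f}]")
```

The interval is asymmetric around r because the transform is nonlinear, and it assumes roughly bivariate normal data, which is itself an approximation for country-level averages.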

Missing Countries from PISA Study

Caution

The PISA data used in this report is from 2015, but similar results are found using the 2018 PISA data. Note that the PISA study is missing data for some important countries that are included in the Wallet Return Dataset (China, India, Ghana, Kenya, and South Africa). The 2015 PISA results do include measures for Chinese cities; however, China proves to be an extreme outlier, with very high education scores and a very low lost wallet reporting rate. Tannenbaum’s paper also noted China as a special case. East Asian countries are generally underrepresented in the wallet dataset, and the three other East Asian countries included (Malaysia, Thailand, Indonesia) are very different from China culturally, economically, and governmentally.
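
One way to list the countries that drop out of the analysis is to pass `indicator=True` to the left merge and keep the rows with no PISA match. This is a small sketch on toy frames: the column names mirror the ones used earlier, but the rows and scores are made up for illustration and are not real data.

```python
import pandas as pd

# Toy stand-ins for the two datasets (all values are made up for illustration)
wallets = pd.DataFrame({
    "country": ["Norway", "Norway", "Kenya", "Kenya", "Chile"],
    "response": [100, 0, 0, 100, 100],
})
pisa = pd.DataFrame({
    "country": ["Norway", "Chile"],
    "pisa_score": [500.0, 440.0],  # placeholder scores, not actual PISA results
})

# indicator=True adds a "_merge" column recording where each row found a match
merged = wallets.merge(pisa, how="left", on="country", indicator=True)

# Rows marked "left_only" are wallet countries with no PISA score
missing = sorted(merged.loc[merged["_merge"] == "left_only", "country"].unique())
print(missing)
```

Running the same check on the real merge would show exactly which wallet countries are excluded from the correlation, rather than relying on rows being dropped silently.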